From patchwork Sun Apr 23 19:26:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83B2CC6FD18 for ; Sun, 23 Apr 2023 19:31:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230082AbjDWTba (ORCPT ); Sun, 23 Apr 2023 15:31:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56240 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229516AbjDWTb0 (ORCPT ); Sun, 23 Apr 2023 15:31:26 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17183E63; Sun, 23 Apr 2023 12:31:23 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 09AE75FD0B; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278279; bh=6GXN8Z4A5pzJKwUwAblmfS0l0b76F1IUad3+Vj3aKY0=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=shjYlIjxj3hxvvpP8No56KwLSqVFM8t313Q1+mBmJNJQdBgLY3Oi3yaijVYy6E27f t3vBh11K6UBYy9b2OnhUmmehr8/4E5sfctj3W/D2pRoqvTHTRM3lwcoN9gpNp4g2U7 xrlxvjrjFfXOWam3GK6mqZPBOKEGGga3j7YPBHsUbtxaoI/deDMr436K4+SheZe/MW eXmVsPUWh4jspoID9+ew3p0hWmbUIitC9XR9bxfzOYWY88gMszgKEGhPU9j+zovkS6 +iLCOfnj8nCPIRyiO7HA/leS4h9+s/qVhR9WTvgCz1q0vQuTOqb4Esc8Xc1C9oTpyr 8WKMRLB4JdofQ== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:18 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 01/15] vsock/virtio: prepare for non-linear skb support Date: Sun, 23 Apr 2023 22:26:29 +0300 Message-ID: <20230423192643.1537470-2-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This is preparation patch for non-linear skbuff handling. It does two things: 1) Handles freeing of non-linear skbuffs. 2) Adds copying from non-linear skbuffs to user's buffer. Signed-off-by: Arseniy Krasnov --- include/linux/virtio_vsock.h | 7 +++ net/vmw_vsock/virtio_transport_common.c | 84 +++++++++++++++++++++++-- 2 files changed, 87 insertions(+), 4 deletions(-) diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h index c58453699ee9..848ec255e665 100644 --- a/include/linux/virtio_vsock.h +++ b/include/linux/virtio_vsock.h @@ -12,6 +12,10 @@ struct virtio_vsock_skb_cb { bool reply; bool tap_delivered; + /* Current fragment in 'frags' of skb. */ + u32 curr_frag; + /* Offset from 0 in current fragment. */ + u32 frag_off; }; #define VIRTIO_VSOCK_SKB_CB(skb) ((struct virtio_vsock_skb_cb *)((skb)->cb)) @@ -246,4 +250,7 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit); void virtio_transport_deliver_tap_pkt(struct sk_buff *skb); int virtio_transport_purge_skbs(void *vsk, struct sk_buff_head *list); int virtio_transport_read_skb(struct vsock_sock *vsk, skb_read_actor_t read_actor); +int virtio_transport_nl_skb_to_iov(struct sk_buff *skb, + struct iov_iter *iov_iter, size_t len, + bool peek); #endif /* _LINUX_VIRTIO_VSOCK_H */ diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index dde3c870bddd..b901017b9f92 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -337,6 +337,60 @@ static int virtio_transport_send_credit_update(struct vsock_sock *vsk) return virtio_transport_send_pkt_info(vsk, &info); } +int virtio_transport_nl_skb_to_iov(struct sk_buff *skb, + struct iov_iter *iov_iter, + size_t len, + bool peek) +{ + unsigned int skb_len; + size_t rest_len = len; + int curr_frag; + int curr_offs; + int err = 0; + + skb_len = skb->len; + curr_frag = VIRTIO_VSOCK_SKB_CB(skb)->curr_frag; + curr_offs = VIRTIO_VSOCK_SKB_CB(skb)->frag_off; + + while (rest_len && skb->len) { + struct bio_vec *curr_vec; + size_t curr_vec_end; + size_t to_copy; + void *data; + + curr_vec = &skb_shinfo(skb)->frags[curr_frag]; + curr_vec_end = curr_vec->bv_offset + curr_vec->bv_len; + to_copy = min(rest_len, (size_t)(curr_vec_end - curr_offs)); + data = kmap_local_page(curr_vec->bv_page); + + if (copy_to_iter(data + curr_offs, to_copy, iov_iter) != to_copy) + err = -EFAULT; + + kunmap_local(data); + + if (err) + break; + + rest_len -= to_copy; + skb_len -= to_copy; + curr_offs += to_copy; + + if (curr_offs == (curr_vec_end)) { + curr_frag++; + curr_offs = 0; + } + } + + if (!peek) { + skb->len = skb_len; + VIRTIO_VSOCK_SKB_CB(skb)->curr_frag = curr_frag; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off = curr_offs; + } + + return err; +} +EXPORT_SYMBOL_GPL(virtio_transport_nl_skb_to_iov); + static ssize_t virtio_transport_stream_do_peek(struct vsock_sock *vsk, struct msghdr *msg, @@ -365,7 +419,14 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk, */ spin_unlock_bh(&vvs->rx_lock); - err = memcpy_to_msg(msg, skb->data + off, bytes); + if (skb_is_nonlinear(skb)) { + err = virtio_transport_nl_skb_to_iov(skb, + &msg->msg_iter, + bytes, + true); + } else { + err = memcpy_to_msg(msg, skb->data + off, bytes); + } if (err) goto out; @@ -417,14 +478,22 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk, */ spin_unlock_bh(&vvs->rx_lock); - err = memcpy_to_msg(msg, skb->data, bytes); + if (skb_is_nonlinear(skb)) { + err = virtio_transport_nl_skb_to_iov(skb, &msg->msg_iter, + bytes, false); + } else { + err = memcpy_to_msg(msg, skb->data, bytes); + } + if (err) goto out; spin_lock_bh(&vvs->rx_lock); total += bytes; - skb_pull(skb, bytes); + + if (!skb_is_nonlinear(skb)) + skb_pull(skb, bytes); if (skb->len == 0) { u32 pkt_len = le32_to_cpu(virtio_vsock_hdr(skb)->len); @@ -498,7 +567,14 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk, */ spin_unlock_bh(&vvs->rx_lock); - err = memcpy_to_msg(msg, skb->data, bytes_to_copy); + if (skb_is_nonlinear(skb)) { + err = virtio_transport_nl_skb_to_iov(skb, + &msg->msg_iter, + bytes_to_copy, + false); + } else { + err = memcpy_to_msg(msg, skb->data, bytes_to_copy); + } if (err) { /* Copy of message failed. Rest of * fragments will be freed without copy. From patchwork Sun Apr 23 19:26:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221452 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14E09C6FD18 for ; Sun, 23 Apr 2023 19:31:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230158AbjDWTbf (ORCPT ); Sun, 23 Apr 2023 15:31:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56242 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229751AbjDWTb0 (ORCPT ); Sun, 23 Apr 2023 15:31:26 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 37358E68; Sun, 23 Apr 2023 12:31:23 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 2E8365FD0C; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278279; bh=XExB9Kow9SS0d9XJh7TqIuND1hr0gu/engpR4A64Lyc=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=X/bZIwe55jdH2NExosg3J+Iw19MR3DHcCjmw8Ea0ezbbSGk3vvZNJQiAHa5iouzCH qK5dAjXFLqfOK1VWkggxaFPc74VJktlv52Sec955qL7FN5f9w61RVNKzmGxO3QtI/T MmJ9b/7kyZv+9Cmol3FLkMSohtakkdXyEaHa033WJwK0pKMthZBbPj/3eWL442Ww10 KEl2bLvVwKQRalZix6K1HuCwhBek+fyCAGr0XvGatGNDMI6QqM6HXkpkFNlhcvGw9L hfQ7DZyQeIEF/Otf8fJAXABkKAhwW2nopclEq03uNA973L95EtQ0HPE7BU1pzEEr6N 29fsQVpCHpSFw== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 02/15] vhost/vsock: non-linear skb handling support Date: Sun, 23 Apr 2023 22:26:30 +0300 Message-ID: <20230423192643.1537470-3-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds copying to guest's virtio buffers from non-linear skbs. Such skbs are created by protocol layer when MSG_ZEROCOPY flags is used. Signed-off-by: Arseniy Krasnov --- drivers/vhost/vsock.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 6578db78f0ae..1e70aa390e44 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -197,11 +197,20 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, break; } - nbytes = copy_to_iter(skb->data, payload_len, &iov_iter); - if (nbytes != payload_len) { - kfree_skb(skb); - vq_err(vq, "Faulted on copying pkt buf\n"); - break; + if (skb_is_nonlinear(skb)) { + if (virtio_transport_nl_skb_to_iov(skb, &iov_iter, + payload_len, + false)) { + vq_err(vq, "Faulted on copying pkt buf from page\n"); + break; + } + } else { + nbytes = copy_to_iter(skb->data, payload_len, &iov_iter); + if (nbytes != payload_len) { + kfree_skb(skb); + vq_err(vq, "Faulted on copying pkt buf\n"); + break; + } } /* Deliver to monitoring devices all packets that we @@ -212,7 +221,9 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock, vhost_add_used(vq, head, sizeof(*hdr) + payload_len); added = true; - skb_pull(skb, payload_len); + if (!skb_is_nonlinear(skb)) + skb_pull(skb, payload_len); + total_len += payload_len; /* If we didn't send all the payload we can requeue the packet From patchwork Sun Apr 23 19:26:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BA183C77B7C for ; Sun, 23 Apr 2023 19:31:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230044AbjDWTb3 (ORCPT ); Sun, 23 Apr 2023 15:31:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56244 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229493AbjDWTb0 (ORCPT ); Sun, 23 Apr 2023 15:31:26 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4319E10DB; Sun, 23 Apr 2023 12:31:23 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 52CBD5FD0D; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278279; bh=B6LHD0ddTDQ3vaHpouYR+VjbT71Q46Di1RjaF1nYxV8=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=WsvAG0VQW75XxdmEQP9cn5KvjdEn2udS90X105MAb2RPeqUqbAaNs4/bGFh+7rlRl tvJUZ+yTDc+vzC5nxub9oDTOoK+/pcBcYnVlY6XIoAhDsX+af9GIFAz6z+lByxU8yH eGL5HyMZWWVwjdxx6VsF602YcAQNjg/XTcyOhbJx3YopQzEgW3mKCznQ6c+IU/SyDx aofbWaiuiFjUZlh3GUpJrDucSGzrVrb86Fi+oDz7A+SAIuWReWtpo9MeGxY/ZPAgh+ bVQ3yoCrEWMJyjX8YtYBBt5HR2XXw8ZPutKa22jCz2jZWy1CfgkmpzUOdnfj+VR46R UWy3boLVXCK4A== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 03/15] vsock/virtio: non-linear skb handling support Date: Sun, 23 Apr 2023 22:26:31 +0300 Message-ID: <20230423192643.1537470-4-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use pages of non-linear skb as buffers in virtio tx queue. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport.c | 32 ++++++++++++++++++++++++++------ 1 file changed, 26 insertions(+), 6 deletions(-) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index e95df847176b..1c269c3f010d 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -100,7 +100,9 @@ virtio_transport_send_pkt_work(struct work_struct *work) vq = vsock->vqs[VSOCK_VQ_TX]; for (;;) { - struct scatterlist hdr, buf, *sgs[2]; + /* +1 is for packet header. */ + struct scatterlist *sgs[MAX_SKB_FRAGS + 1]; + struct scatterlist bufs[MAX_SKB_FRAGS + 1]; int ret, in_sg = 0, out_sg = 0; struct sk_buff *skb; bool reply; @@ -111,12 +113,30 @@ virtio_transport_send_pkt_work(struct work_struct *work) virtio_transport_deliver_tap_pkt(skb); reply = virtio_vsock_skb_reply(skb); + sg_init_one(&bufs[0], virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb))); + sgs[out_sg++] = &bufs[0]; + + if (skb_is_nonlinear(skb)) { + int i; + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + struct page *data_page = skb_shinfo(skb)->frags[i].bv_page; + + /* We will use 'page_to_virt()' for userspace page here, + * because virtio layer will call 'virt_to_phys()' later + * to fill buffer descriptor. We don't touch memory at + * "virtual" address of this page. + */ + sg_init_one(&bufs[i + 1], + page_to_virt(data_page), PAGE_SIZE); + sgs[out_sg++] = &bufs[i + 1]; + } + } else { + if (skb->len > 0) { + sg_init_one(&bufs[1], skb->data, skb->len); + sgs[out_sg++] = &bufs[1]; + } - sg_init_one(&hdr, virtio_vsock_hdr(skb), sizeof(*virtio_vsock_hdr(skb))); - sgs[out_sg++] = &hdr; - if (skb->len > 0) { - sg_init_one(&buf, skb->data, skb->len); - sgs[out_sg++] = &buf; } ret = virtqueue_add_sgs(vq, sgs, out_sg, in_sg, skb, GFP_KERNEL); From patchwork Sun Apr 23 19:26:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4613C7EE22 for ; Sun, 23 Apr 2023 19:31:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230134AbjDWTbc (ORCPT ); Sun, 23 Apr 2023 15:31:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56230 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229727AbjDWTb0 (ORCPT ); Sun, 23 Apr 2023 15:31:26 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D199CE5D; Sun, 23 Apr 2023 12:31:22 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 76FF95FD0F; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278279; bh=128uVYJJSNoHFD3kGZjfuSpswzUfi8bmAzZ5GBW0jlI=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=avki5sF9HFF3jZi/0aQea8KpgbSoH5UEWz/W5/c3O+fuiFMsdoFX0oLgSGnMwm050 fHcKKs1S6q9nu8kYHYbUt2a9stGdqZr4VUDENZ8IcbYeZ+deDsOO6KhSIBqDqxq4J3 cCEkYmgGDyzGAkLeli+LH0afWcB3FMDmG292rAYcGZ8fmrQ+BiXRt7dKabhuneG6O7 ivSI/+IjtlGzBsyWRjyEQtVL5IQtyGkWzmxJENuioSPsGlIGhOKOi4rVw7+mppNiw1 ZiXpiNBHW1FqysL3IYXutNppnyXPRlA9I2hvhMD5/fecckih6ib9AD8pqQ9TVpIHz3 S3zwbbroUAA7g== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 04/15] vsock/virtio: non-linear skb handling for tap Date: Sun, 23 Apr 2023 22:26:32 +0300 Message-ID: <20230423192643.1537470-5-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org For tap device new skb is created and data from the current skb is copied to it. This adds copying data from non-linear skb to new the skb. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport_common.c | 31 ++++++++++++++++++++++--- 1 file changed, 28 insertions(+), 3 deletions(-) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index b901017b9f92..280497d97076 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -101,6 +101,27 @@ virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, return NULL; } +int virtio_transport_nl_skb_to_iov(struct sk_buff *skb, + struct iov_iter *iov_iter, + size_t len, + bool peek); +static void virtio_transport_copy_nonlinear_skb(struct sk_buff *skb, + void *dst, + size_t len) +{ + struct iov_iter iov_iter = { 0 }; + struct iovec iovec; + + iovec.iov_base = dst; + iovec.iov_len = len; + + iov_iter.iter_type = ITER_IOVEC; + iov_iter.iov = &iovec; + iov_iter.nr_segs = 1; + + virtio_transport_nl_skb_to_iov(skb, &iov_iter, len, false); +} + /* Packet capture */ static struct sk_buff *virtio_transport_build_skb(void *opaque) { @@ -109,7 +130,6 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) struct af_vsockmon_hdr *hdr; struct sk_buff *skb; size_t payload_len; - void *payload_buf; /* A packet could be split to fit the RX buffer, so we can retrieve * the payload length from the header and the buffer pointer taking @@ -117,7 +137,6 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) */ pkt_hdr = virtio_vsock_hdr(pkt); payload_len = pkt->len; - payload_buf = pkt->data; skb = alloc_skb(sizeof(*hdr) + sizeof(*pkt_hdr) + payload_len, GFP_ATOMIC); @@ -160,7 +179,13 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque) skb_put_data(skb, pkt_hdr, sizeof(*pkt_hdr)); if (payload_len) { - skb_put_data(skb, payload_buf, payload_len); + if (skb_is_nonlinear(pkt)) { + void *data = skb_put(skb, payload_len); + + virtio_transport_copy_nonlinear_skb(pkt, data, payload_len); + } else { + skb_put_data(skb, pkt->data, payload_len); + } } return skb; From patchwork Sun Apr 23 19:26:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221457 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8B4EC77B7C for ; Sun, 23 Apr 2023 19:31:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230303AbjDWTbu (ORCPT ); Sun, 23 Apr 2023 15:31:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230074AbjDWTba (ORCPT ); Sun, 23 Apr 2023 15:31:30 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 685DFE5D; Sun, 23 Apr 2023 12:31:27 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id A87C85FD10; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278279; bh=soNB+BivCyZDwOU52K+dmSsvfGU1rviaXB5JmG9YPLE=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=sVFBzGusith+KND7O+E/ZALbE8y2S5UsYJCSUt3FwEp9tLCITYSg91C1iRpmvoHq+ GW0gZ5zxE+dkp3+Uu3j8ip8K+T3FVw3Idm+4BCJgq2Wb+uh/HgHnA6mxNfxuGJ/hw2 xqMfEd79puhH4s3s7GeTI8u/LvxtctsGHi+93BJOuFTCy4D2HERq8uFCBYPWs/Azi4 a9LmjBEsKdwZVtk+R3ey6hlwbfX8rxYCaUnWIYWgpezs8XGTqXsX9cRFlcHq/Ksgai vJKKS/VVICtX8LEGA2iwm+HNNJwCAK1oLFDfii6ch1yNzQMzzCZM/HvR7HnOti5V/G bcAQpDkkYJimA== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:19 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 05/15] vsock/virtio: MSG_ZEROCOPY flag support Date: Sun, 23 Apr 2023 22:26:33 +0300 Message-ID: <20230423192643.1537470-6-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds handling of MSG_ZEROCOPY flag on transmission path, by alloc non-linear skbuffs and filling it with user's pages. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport_common.c | 390 ++++++++++++++++++++---- 1 file changed, 332 insertions(+), 58 deletions(-) diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c index 280497d97076..3c024d0d795c 100644 --- a/net/vmw_vsock/virtio_transport_common.c +++ b/net/vmw_vsock/virtio_transport_common.c @@ -37,27 +37,227 @@ virtio_transport_get_ops(struct vsock_sock *vsk) return container_of(t, struct virtio_transport, transport); } -/* Returns a new packet on success, otherwise returns NULL. - * - * If NULL is returned, errp is set to a negative errno. - */ -static struct sk_buff * -virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, - size_t len, - u32 src_cid, - u32 src_port, - u32 dst_cid, - u32 dst_port) -{ - const size_t skb_len = VIRTIO_VSOCK_SKB_HEADROOM + len; - struct virtio_vsock_hdr *hdr; - struct sk_buff *skb; +static bool virtio_transport_can_zcopy(struct virtio_vsock_pkt_info *info, + size_t max_to_send) +{ + struct iov_iter *iov_iter; + size_t max_skb_cap; + size_t bytes; + int i; + + if (!(info->flags & MSG_ZEROCOPY)) + return false; + + if (!info->msg) + return false; + + iov_iter = &info->msg->msg_iter; + + if (!iter_is_iovec(iov_iter)) + return false; + + if (iov_iter->iov_offset) + return false; + + /* We can't send whole iov. */ + if (iov_iter->count > max_to_send) + return false; + + for (bytes = 0, i = 0; i < iov_iter->nr_segs; i++) { + const struct iovec *iovec; + int pages_in_elem; + + iovec = &iov_iter->iov[i]; + + /* Base must be page aligned. */ + if (offset_in_page(iovec->iov_base)) + return false; + + /* Only last element could have non page aligned size. */ + if (i != (iov_iter->nr_segs - 1)) { + if (offset_in_page(iovec->iov_len)) + return false; + + pages_in_elem = iovec->iov_len >> PAGE_SHIFT; + } else { + pages_in_elem = round_up(iovec->iov_len, PAGE_SIZE); + pages_in_elem >>= PAGE_SHIFT; + } + + bytes += (pages_in_elem * PAGE_SIZE); + } + + /* How many bytes we can pack to single skb. Maximum packet + * buffer size is needed to allow vhost handle such packets, + * otherwise they will be dropped. + */ + max_skb_cap = min((unsigned int)(MAX_SKB_FRAGS * PAGE_SIZE), + (unsigned int)VIRTIO_VSOCK_MAX_PKT_BUF_SIZE); + + return true; +} + +static int virtio_transport_init_zcopy_skb(struct vsock_sock *vsk, + struct sk_buff *skb, + struct iov_iter *iter, + bool zerocopy) +{ + struct ubuf_info_msgzc *uarg_zc; + struct ubuf_info *uarg; + + uarg = msg_zerocopy_realloc(sk_vsock(vsk), + iov_length(iter->iov, iter->nr_segs), + NULL); + + if (!uarg) + return -1; + + uarg_zc = uarg_to_msgzc(uarg); + uarg_zc->zerocopy = zerocopy ? 1 : 0; + + skb_zcopy_init(skb, uarg); + + return 0; +} + +static int get_iov_elem(struct iov_iter *iter, ssize_t *offset) +{ + int i; + + *offset = iter->iov_offset; + + for (i = 0; i < iter->nr_segs; i++) { + if (*offset - (ssize_t)iter->iov[i].iov_len < 0) + return i; + + *offset -= iter->iov[i].iov_len; + } + + return -1; +} + +static int virtio_transport_fill_nonlinear_skb(struct sk_buff *skb, + struct vsock_sock *vsk, + struct virtio_vsock_pkt_info *info, + size_t payload_len) +{ + size_t payload_rest_len; + int frag_idx; + int err = 0; + + if (!info->msg) + return 0; + + frag_idx = 0; + VIRTIO_VSOCK_SKB_CB(skb)->curr_frag = 0; + VIRTIO_VSOCK_SKB_CB(skb)->frag_off = 0; + payload_rest_len = payload_len; + + while (payload_rest_len) { + struct page *user_pages[MAX_SKB_FRAGS]; + const struct iovec *iovec; + struct iov_iter *iter; + size_t last_frag_len; + ssize_t offs_in_iov; + size_t curr_iov_len; + size_t pages_in_seg; + long pinned_pages; + int page_idx; + int seg_idx; + + iter = &info->msg->msg_iter; + seg_idx = get_iov_elem(iter, &offs_in_iov); + if (seg_idx < 0) { + err = -1; + break; + } + + iovec = &iter->iov[seg_idx]; + curr_iov_len = min(iovec->iov_len - offs_in_iov, + payload_rest_len); + pages_in_seg = curr_iov_len >> PAGE_SHIFT; + + if (curr_iov_len % PAGE_SIZE) { + last_frag_len = curr_iov_len % PAGE_SIZE; + pages_in_seg++; + } else { + last_frag_len = PAGE_SIZE; + } + + pinned_pages = pin_user_pages((unsigned long)iovec->iov_base + + offs_in_iov, pages_in_seg, + FOLL_ANON, user_pages, NULL); + + if (pinned_pages != pages_in_seg) { + /* Unpin partially pinned pages. */ + unpin_user_pages(user_pages, pinned_pages); + err = -1; + break; + } + + for (page_idx = 0; page_idx < pages_in_seg; page_idx++) { + int frag_len = PAGE_SIZE; + + if (page_idx == (pages_in_seg - 1)) + frag_len = last_frag_len; + + /* 'get_page()' as pair to 'put_page()' during + * this non-linear skbuff deallocation. + */ + get_page(user_pages[page_idx]); + skb_fill_page_desc(skb, frag_idx, + user_pages[page_idx], 0, + frag_len); + skb_len_add(skb, frag_len); + frag_idx++; + } + + iter->iov_offset += curr_iov_len; + payload_rest_len -= curr_iov_len; + } + + return err; +} + +static int virtio_transport_fill_linear_skb(struct sk_buff *skb, + struct vsock_sock *vsk, + struct virtio_vsock_pkt_info *info, + size_t len) +{ void *payload; int err; - skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL); - if (!skb) - return NULL; + payload = skb_put(skb, len); + err = memcpy_from_msg(payload, info->msg, len); + if (err) + return -1; + + if (msg_data_left(info->msg)) + return 0; + + if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET) { + struct virtio_vsock_hdr *hdr; + + hdr = virtio_vsock_hdr(skb); + + hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOM); + + if (info->msg->msg_flags & MSG_EOR) + hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOR); + } + + return 0; +} + +static void virtio_transport_init_hdr(struct sk_buff *skb, + struct virtio_vsock_pkt_info *info, + u32 src_cid, + u32 src_port, + u32 dst_cid, + u32 dst_port, + size_t len) +{ + struct virtio_vsock_hdr *hdr; hdr = virtio_vsock_hdr(skb); hdr->type = cpu_to_le16(info->type); @@ -68,37 +268,6 @@ virtio_transport_alloc_skb(struct virtio_vsock_pkt_info *info, hdr->dst_port = cpu_to_le32(dst_port); hdr->flags = cpu_to_le32(info->flags); hdr->len = cpu_to_le32(len); - - if (info->msg && len > 0) { - payload = skb_put(skb, len); - err = memcpy_from_msg(payload, info->msg, len); - if (err) - goto out; - - if (msg_data_left(info->msg) == 0 && - info->type == VIRTIO_VSOCK_TYPE_SEQPACKET) { - hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOM); - - if (info->msg->msg_flags & MSG_EOR) - hdr->flags |= cpu_to_le32(VIRTIO_VSOCK_SEQ_EOR); - } - } - - if (info->reply) - virtio_vsock_skb_set_reply(skb); - - trace_virtio_transport_alloc_pkt(src_cid, src_port, - dst_cid, dst_port, - len, - info->type, - info->op, - info->flags); - - return skb; - -out: - kfree_skb(skb); - return NULL; } int virtio_transport_nl_skb_to_iov(struct sk_buff *skb, @@ -209,6 +378,88 @@ static u16 virtio_transport_get_type(struct sock *sk) return VIRTIO_VSOCK_TYPE_SEQPACKET; } +static void virtio_transport_unpin_skb(struct sk_buff *skb) +{ + int i; + + if (!skb_is_nonlinear(skb)) + return; + + for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) { + struct bio_vec *frag; + int nr_pages; + int page; + + frag = &skb_shinfo(skb)->frags[i]; + nr_pages = frag->bv_len / PAGE_SIZE; + + if (frag->bv_len % PAGE_SIZE) + nr_pages++; + + for (page = 0; page < nr_pages; page++) + unpin_user_page(&frag->bv_page[page]); + } +} + +/* Returns a new packet on success, otherwise returns NULL. + * + * If NULL is returned, errp is set to a negative errno. + */ +static struct sk_buff *virtio_transport_alloc_skb(struct vsock_sock *vsk, + struct virtio_vsock_pkt_info *info, + size_t payload_len, + bool zcopy, + u32 dst_cid, + u32 dst_port, + u32 src_cid, + u32 src_port) +{ + struct sk_buff *skb; + size_t skb_len; + + skb_len = VIRTIO_VSOCK_SKB_HEADROOM; + + if (!zcopy) + skb_len += payload_len; + + skb = virtio_vsock_alloc_skb(skb_len, GFP_KERNEL); + if (!skb) + return NULL; + + virtio_transport_init_hdr(skb, info, src_cid, src_port, + dst_cid, dst_port, + payload_len); + + if (info->msg && payload_len > 0) { + int err; + + if (zcopy) { + skb->destructor = virtio_transport_unpin_skb; + err = virtio_transport_fill_nonlinear_skb(skb, vsk, info, payload_len); + } else { + err = virtio_transport_fill_linear_skb(skb, vsk, info, payload_len); + } + + if (err) + goto out; + } + + if (info->reply) + virtio_vsock_skb_set_reply(skb); + + trace_virtio_transport_alloc_pkt(src_cid, src_port, + dst_cid, dst_port, + payload_len, + info->type, + info->op, + info->flags); + + return skb; +out: + kfree_skb(skb); + return NULL; +} + /* This function can only be used on connecting/connected sockets, * since a socket assigned to a transport is required. * @@ -221,6 +472,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, const struct virtio_transport *t_ops; struct virtio_vsock_sock *vvs; u32 pkt_len = info->pkt_len; + bool can_zcopy = false; + u32 max_skb_cap; u32 rest_len; int ret; @@ -230,6 +483,9 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, if (unlikely(!t_ops)) return -EFAULT; + if (!sock_flag(sk_vsock(vsk), SOCK_ZEROCOPY)) + info->flags &= ~MSG_ZEROCOPY; + src_cid = t_ops->transport.get_local_cid(); src_port = vsk->local_addr.svm_port; if (!info->remote_cid) { @@ -249,22 +505,36 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk, if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW) return pkt_len; + can_zcopy = virtio_transport_can_zcopy(info, pkt_len); + if (can_zcopy) + max_skb_cap = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE, + (MAX_SKB_FRAGS * PAGE_SIZE)); + else + max_skb_cap = VIRTIO_VSOCK_MAX_PKT_BUF_SIZE; + rest_len = pkt_len; do { struct sk_buff *skb; size_t skb_len; - skb_len = min_t(u32, VIRTIO_VSOCK_MAX_PKT_BUF_SIZE, rest_len); + skb_len = min(max_skb_cap, rest_len); - skb = virtio_transport_alloc_skb(info, skb_len, - src_cid, src_port, - dst_cid, dst_port); + skb = virtio_transport_alloc_skb(vsk, info, skb_len, can_zcopy, + dst_cid, dst_port, + src_cid, src_port); if (!skb) { ret = -ENOMEM; break; } + if (skb_len == rest_len && + info->flags & MSG_ZEROCOPY && + info->op == VIRTIO_VSOCK_OP_RW) + virtio_transport_init_zcopy_skb(vsk, skb, + &info->msg->msg_iter, + can_zcopy); + virtio_transport_inc_tx_pkt(vvs, skb); ret = t_ops->send_pkt(skb); @@ -945,6 +1215,7 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk, .msg = msg, .pkt_len = len, .vsk = vsk, + .flags = msg->msg_flags, }; return virtio_transport_send_pkt_info(vsk, &info); @@ -988,6 +1259,7 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t, .reply = true, }; struct sk_buff *reply; + int res; /* Send RST only if the original pkt is not a RST pkt */ if (le16_to_cpu(hdr->op) == VIRTIO_VSOCK_OP_RST) @@ -996,15 +1268,17 @@ static int virtio_transport_reset_no_sock(const struct virtio_transport *t, if (!t) return -ENOTCONN; - reply = virtio_transport_alloc_skb(&info, 0, - le64_to_cpu(hdr->dst_cid), - le32_to_cpu(hdr->dst_port), + reply = virtio_transport_alloc_skb(NULL, &info, 0, false, le64_to_cpu(hdr->src_cid), - le32_to_cpu(hdr->src_port)); + le32_to_cpu(hdr->src_port), + le64_to_cpu(hdr->dst_cid), + le32_to_cpu(hdr->dst_port)); if (!reply) return -ENOMEM; - return t->send_pkt(reply); + res = t->send_pkt(reply); + + return res; } /* This function should be called with sk_lock held and SOCK_DONE set */ From patchwork Sun Apr 23 19:26:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221458 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B1B2BC6FD18 for ; Sun, 23 Apr 2023 19:31:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230315AbjDWTbw (ORCPT ); Sun, 23 Apr 2023 15:31:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56254 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229908AbjDWTb1 (ORCPT ); Sun, 23 Apr 2023 15:31:27 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DD59E10E3; Sun, 23 Apr 2023 12:31:26 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 845135FD11; Sun, 23 Apr 2023 22:31:20 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278280; bh=92TjpAFzgN19cdeV3DB5ob03wq+mklevZg38Rj0IMpE=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=Mng9PSOvlrxvx5X/lA8GlGv+KKcbYKMDrsWhqZzW/HNfBLp1wasviZQwRM4f6Q7eG 0U0mIPJ2/rUrBhRf0NueNKj9bNnR9jTVb0c/HaG89B2FGYkI23Ax/XhcPq0EAAhaxD q5gceOhOxvJmuBrEr10WX1yMjKRmchmucWX4w0zShfGACUvI9dsqzdPT5iMjCkGrgn eule7pCMmBOedTnBw4/ljqYCZsF1lv6u4lg/pqxvEQGZFhHLm/5Ar5OdhcFPDnnfax Q1xB4wlEKLQz7oQ444+Z1OqPw9fy6lXqVgBWDEmK8F3Rox4PwAyB5jWaNhH22jsAgg sHvwA9O0Pz06Q== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:20 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 06/15] vsock: check error queue to set EPOLLERR Date: Sun, 23 Apr 2023 22:26:34 +0300 Message-ID: <20230423192643.1537470-7-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org If socket's error queue is not empty, EPOLLERR must be set. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/af_vsock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 413407bb646c..137a0db6eaac 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1030,8 +1030,8 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock, poll_wait(file, sk_sleep(sk), wait); mask = 0; - if (sk->sk_err) - /* Signify that there has been an error on this socket. */ + /* Signify that there has been an error on this socket. */ + if (sk->sk_err || !skb_queue_empty_lockless(&sk->sk_error_queue)) mask |= EPOLLERR; /* INET sockets treat local write shutdown and peer write shutdown as a From patchwork Sun Apr 23 19:26:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221454 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83FADC77B76 for ; Sun, 23 Apr 2023 19:31:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230249AbjDWTbn (ORCPT ); Sun, 23 Apr 2023 15:31:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229994AbjDWTb2 (ORCPT ); Sun, 23 Apr 2023 15:31:28 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80CEEE63; Sun, 23 Apr 2023 12:31:27 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 639DD5FD12; Sun, 23 Apr 2023 22:31:21 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278281; bh=dnEZbD4yrTqC14p1P2CFEotu3kmN6tK9cyCJAJm4wDc=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=jQXpFIwQCXOjyCIt0VXPV/8YCiHqow3hUlSjenrDP5aaB6A1mjdXcGsk59qe3yNMX T5rBLrYzGhygTEOrYoXMDIKPuOwvWfHS4dZ2ToSzchl93UL+f1B9FSTZnCuVVzkdiB vhaXzmxFcz2Tu2I8yYF9jgRVNLLCt4In2FRE3eumShb36Fv1xIH350eI74YIJcbAKD 0lTJl1tUFSP1mX2lCeDyr6sz6H7NI23CV2ZpvsxGiWjlA2hXVC0Fg0HaVtdmQ8OYaz Tl9QjX6ADDmcvbFSSdQoUykRNNdZrSopwjpcvHNCGk/oK8q/6fozOimeI8dwbKh5UT Hp/Efnh/iaySg== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:21 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 07/15] vsock: read from socket's error queue Date: Sun, 23 Apr 2023 22:26:35 +0300 Message-ID: <20230423192643.1537470-8-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds handling of MSG_ERRQUEUE input flag for receive call, thus skb from socket's error queue is read. Signed-off-by: Arseniy Krasnov --- include/linux/socket.h | 1 + net/vmw_vsock/af_vsock.c | 5 +++++ 2 files changed, 6 insertions(+) diff --git a/include/linux/socket.h b/include/linux/socket.h index 13c3a237b9c9..19a6f39fa014 100644 --- a/include/linux/socket.h +++ b/include/linux/socket.h @@ -379,6 +379,7 @@ struct ucred { #define SOL_MPTCP 284 #define SOL_MCTP 285 #define SOL_SMC 286 +#define SOL_VSOCK 287 /* IPX options */ #define IPX_TYPE 1 diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index 137a0db6eaac..c50d2632a75f 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -110,6 +110,7 @@ #include #include #include +#include static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr); static void vsock_sk_destruct(struct sock *sk); @@ -2135,6 +2136,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len, int err; sk = sock->sk; + + if (unlikely(flags & MSG_ERRQUEUE)) + return sock_recv_errqueue(sk, msg, len, SOL_VSOCK, 0); + vsk = vsock_sk(sk); err = 0; From patchwork Sun Apr 23 19:26:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EDA7FC6FD18 for ; Sun, 23 Apr 2023 19:31:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230219AbjDWTbl (ORCPT ); Sun, 23 Apr 2023 15:31:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56270 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229968AbjDWTb2 (ORCPT ); Sun, 23 Apr 2023 15:31:28 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 54FDFE52; Sun, 23 Apr 2023 12:31:27 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 4CF2A5FD13; Sun, 23 Apr 2023 22:31:22 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278282; bh=8MkOLhUem8zje6YCZe3k8/M7+vTx+mrb7f0DcsdGdpY=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=KoO6dzXSzca3ttXVlKbvpwC3NwC5pb5eblm52zMokNdxh1SlYEEMB5LTkTdBoPwhd LKrrWG1ggHFLSx28sULqj7Fv7R2u36fHrcRfT7KWZXODEV/uujk9ikrmVrobYOZy/X h3KvORYZ3TyrHc9hiM1Klf/YJNRJq50O2R7uLL9t+k8uINNLiPz+6fUkRpwWUZ31fn m7TIlwljA/lnsHZaVQ3CgPLktq1gunvZJE+0SxvPQbHAso7iNSfkD2cmvRZniGyRbm AZKz3FwsuMvL+kmXyIdYB/LVJMqm9S2ltOGpSki/TL9iqwRbFHwO2T/6bRHqvczvGd AgBDybI60Gzfg== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:22 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 08/15] vsock: check for MSG_ZEROCOPY support Date: Sun, 23 Apr 2023 22:26:36 +0300 Message-ID: <20230423192643.1537470-9-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This feature totally depends on transport, so if transport doesn't support it, return error. Signed-off-by: Arseniy Krasnov --- include/net/af_vsock.h | 3 +++ net/vmw_vsock/af_vsock.c | 7 +++++++ 2 files changed, 10 insertions(+) diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h index 0e7504a42925..270ca54cfab8 100644 --- a/include/net/af_vsock.h +++ b/include/net/af_vsock.h @@ -177,6 +177,9 @@ struct vsock_transport { /* Read a single skb */ int (*read_skb)(struct vsock_sock *, skb_read_actor_t); + + /* Zero-copy. */ + bool (*msgzerocopy_allow)(void); }; /**** CORE ****/ diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c index c50d2632a75f..eac8c1affd6a 100644 --- a/net/vmw_vsock/af_vsock.c +++ b/net/vmw_vsock/af_vsock.c @@ -1824,6 +1824,13 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg, goto out; } + if (msg->msg_flags & MSG_ZEROCOPY && + (!transport->msgzerocopy_allow || + !transport->msgzerocopy_allow())) { + err = -EOPNOTSUPP; + goto out; + } + /* Wait for room in the produce queue to enqueue our user's data. */ timeout = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT); From patchwork Sun Apr 23 19:26:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221456 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FC69C77B76 for ; Sun, 23 Apr 2023 19:31:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230293AbjDWTbs (ORCPT ); Sun, 23 Apr 2023 15:31:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56282 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230034AbjDWTb3 (ORCPT ); Sun, 23 Apr 2023 15:31:29 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D38EB10DB; Sun, 23 Apr 2023 12:31:27 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 3154C5FD14; Sun, 23 Apr 2023 22:31:23 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278283; bh=I6s5f+VVEU4mKHhe9+5VqHMuX1hjVKhfruoOIGVs/SQ=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=lg8IECuEd4AdXNXlmfSFk/lwSrSIW9NV7ikX3i5WjR0bXNc99qZGI22kPh7CmGrFa 5v6VDUX9GHXHLqO1eCh1ZIo8OV/5fQU55STHIy0jdTEd98UXKDacNXuPbSpxNarrK3 R1Si42FZjWRIg+bAOm77sTJmW0WMFzb4lHAy/dSCI3RL40irScDxddAdMroruV8ojK tjSqJ6XFiyI+sEO1AVrBzk5mJgJIBamF+dG8SNFNhb0lThzSknyiz0Xs2mlA8biGFA zaCdRBB8v1+IZhj7BMOqssJ0Taxvo1nK1xW2Xa0xRCe/ZivNvHq8tdcqAPOhExJtqy LPaXPel3vJebw== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:23 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 09/15] vhost/vsock: support MSG_ZEROCOPY for transport Date: Sun, 23 Apr 2023 22:26:37 +0300 Message-ID: <20230423192643.1537470-10-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add 'msgzerocopy_allow()' callback for vhost transport. Signed-off-by: Arseniy Krasnov --- drivers/vhost/vsock.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 1e70aa390e44..4a33940a6020 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -405,6 +405,11 @@ static bool vhost_vsock_more_replies(struct vhost_vsock *vsock) return val < vq->num; } +static bool vhost_transport_msgzerocopy_allow(void) +{ + return true; +} + static bool vhost_transport_seqpacket_allow(u32 remote_cid); static struct virtio_transport vhost_transport = { @@ -451,6 +456,7 @@ static struct virtio_transport vhost_transport = { .notify_buffer_size = virtio_transport_notify_buffer_size, .read_skb = virtio_transport_read_skb, + .msgzerocopy_allow = vhost_transport_msgzerocopy_allow, }, .send_pkt = vhost_transport_send_pkt, From patchwork Sun Apr 23 19:26:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 24BF4C77B7E for ; Sun, 23 Apr 2023 19:31:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230197AbjDWTbi (ORCPT ); Sun, 23 Apr 2023 15:31:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230020AbjDWTb3 (ORCPT ); Sun, 23 Apr 2023 15:31:29 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1E98E68; Sun, 23 Apr 2023 12:31:27 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 13D5A5FD15; Sun, 23 Apr 2023 22:31:24 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278284; bh=+AZ60y8UsiiQtGZ1nWzLqmHATqRhZ0dBt3UOXPrC2rk=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=d/VPhNtAMtBpRA5MucHR3KjV+GUgJtfr0fqhy6CWt6Dk7XdFQs/G4sYxX0BCr1gTn hfxr6H53MR/toifCGYbnDaaWJPBtEk0QQENh1AIoHHY/PfIw93//F3/cTrVb7OywEd F98c+EduzDEKhUj97YHSauFsP9xKHZa/LgvH0auExDoiPLZ8sOuCQqVkKfXjl5aZhi 8wvYA9Ss0uHNzv+d0pI1f0Nu7B+wRQ41lQ1HBCwfs7X2oHYxG2rUdozDG23kE6lw4y IPdu7gqwQpXXMl8oJTalIAzOnaxttJK431jOEjEG9VebsbiCkGaosoCts+7XTcn+Kq YT1/Iy2FQLNbQ== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:24 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 10/15] vsock/virtio: support MSG_ZEROCOPY for transport Date: Sun, 23 Apr 2023 22:26:38 +0300 Message-ID: <20230423192643.1537470-11-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add 'msgzerocopy_allow()' callback for virtio transport. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/virtio_transport.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c index 1c269c3f010d..ca12db84e053 100644 --- a/net/vmw_vsock/virtio_transport.c +++ b/net/vmw_vsock/virtio_transport.c @@ -433,6 +433,11 @@ static void virtio_vsock_rx_done(struct virtqueue *vq) queue_work(virtio_vsock_workqueue, &vsock->rx_work); } +static bool virtio_transport_msgzerocopy_allow(void) +{ + return true; +} + static bool virtio_transport_seqpacket_allow(u32 remote_cid); static struct virtio_transport virtio_transport = { @@ -479,6 +484,8 @@ static struct virtio_transport virtio_transport = { .notify_buffer_size = virtio_transport_notify_buffer_size, .read_skb = virtio_transport_read_skb, + + .msgzerocopy_allow = virtio_transport_msgzerocopy_allow, }, .send_pkt = virtio_transport_send_pkt, From patchwork Sun Apr 23 19:26:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EC9FDC77B7C for ; Sun, 23 Apr 2023 19:31:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230231AbjDWTbp (ORCPT ); Sun, 23 Apr 2023 15:31:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230051AbjDWTb3 (ORCPT ); Sun, 23 Apr 2023 15:31:29 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A7DC9E52; Sun, 23 Apr 2023 12:31:28 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 9D4F95FD16; Sun, 23 Apr 2023 22:31:26 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278286; bh=qGsMRA60tSQDR+BPleHY9nYAYvbNU1tckKrhNTP0SDo=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=hTeHw7Gyig17Rh0mpISZGE99kkYSORIyjUQU/oxAJL+ZMm2iUbWYadLUbtgblaGD2 iW2oxzo1xGBN6GvcHN8vyMaHyhd2GEtUbLbiNTo4hKxAavjPDrs+6R1scaGhA88ytf /wUrqdztc9E2Y09icc80u1iO6H5DdE2NhkJs5U8p425iDkIJRVhBuBOcMqceo/OkUW K+pRiQZEzMVe1Lrgo93JKhs28D/DjqpXnywFXHIXnvurDDr1i0dz0hSoAtXRqETxmS 7ChMq2g2217N58ROTQ8IGbVFamr9bXsGkwUkavSg+p24hou/6HC64ePqZzaUPoUn4R lTdysUiI2h2Tw== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:26 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 11/15] vsock/loopback: support MSG_ZEROCOPY for transport Date: Sun, 23 Apr 2023 22:26:39 +0300 Message-ID: <20230423192643.1537470-12-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Add 'msgzerocopy_allow()' callback for loopback transport. Signed-off-by: Arseniy Krasnov --- net/vmw_vsock/vsock_loopback.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c index e3afc0c866f5..0de1436c7d4f 100644 --- a/net/vmw_vsock/vsock_loopback.c +++ b/net/vmw_vsock/vsock_loopback.c @@ -48,6 +48,7 @@ static int vsock_loopback_cancel_pkt(struct vsock_sock *vsk) } static bool vsock_loopback_seqpacket_allow(u32 remote_cid); +static bool vsock_loopback_msgzerocopy_allow(void); static struct virtio_transport loopback_transport = { .transport = { @@ -93,11 +94,18 @@ static struct virtio_transport loopback_transport = { .notify_buffer_size = virtio_transport_notify_buffer_size, .read_skb = virtio_transport_read_skb, + + .msgzerocopy_allow = vsock_loopback_msgzerocopy_allow, }, .send_pkt = vsock_loopback_send_pkt, }; +static bool vsock_loopback_msgzerocopy_allow(void) +{ + return true; +} + static bool vsock_loopback_seqpacket_allow(u32 remote_cid) { return true; From patchwork Sun Apr 23 19:26:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28EAEC77B7C for ; Sun, 23 Apr 2023 19:31:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230326AbjDWTbz (ORCPT ); Sun, 23 Apr 2023 15:31:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230107AbjDWTbb (ORCPT ); Sun, 23 Apr 2023 15:31:31 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AA864E5D; Sun, 23 Apr 2023 12:31:30 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id F30245FD17; Sun, 23 Apr 2023 22:31:28 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278289; bh=DOvMiyMl9T/2K7i2my/swBI2rgDyc1aBJR26UiP7x5o=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=auT+o/tP2A15cToEEKYnRmS6p89PBmSxxoNkCqqy/pfLfuQm1n+5C0ErRkET2f8V+ eVlCQgwsymUWz/ckTl53hj5S1Xqk67lgvwffW4s2t2KiCLQVPcK/dyiXfcKHjm2a7s cD0RSajJpZPU8baow1lhPGQJK+0PD7UBF9TPiP7n0WVrdE1gpXtmFxpUndloNR42FU dCD+aC9Mx7/KHyyqPxLi8y/yBBK2W6lQB4S9MX51u57GyZTUMu97omMSZIH9yFReNA rSCXAcj/ied3rx9aDgsGrmrGiuS8BA7AOxGxK8Nq/L3BZbzljNV3zprHSZ7ow/yHvU 5x68HWXO7FXZg== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:28 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 12/15] net/sock: enable setting SO_ZEROCOPY for PF_VSOCK Date: Sun, 23 Apr 2023 22:26:40 +0300 Message-ID: <20230423192643.1537470-13-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org PF_VSOCK supports MSG_ZEROCOPY transmission, so SO_ZEROCOPY could be enabled. Signed-off-by: Arseniy Krasnov --- net/core/sock.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/net/core/sock.c b/net/core/sock.c index c25888795390..13a89c6cbfb8 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1457,9 +1457,11 @@ int sk_setsockopt(struct sock *sk, int level, int optname, (sk->sk_type == SOCK_DGRAM && sk->sk_protocol == IPPROTO_UDP))) ret = -EOPNOTSUPP; - } else if (sk->sk_family != PF_RDS) { + } else if (sk->sk_family != PF_RDS && + sk->sk_family != PF_VSOCK) { ret = -EOPNOTSUPP; } + if (!ret) { if (val < 0 || val > 1) ret = -EINVAL; From patchwork Sun Apr 23 19:26:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221460 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 03E04C77B60 for ; Sun, 23 Apr 2023 19:32:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230359AbjDWTb6 (ORCPT ); Sun, 23 Apr 2023 15:31:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56348 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230147AbjDWTbe (ORCPT ); Sun, 23 Apr 2023 15:31:34 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8BE6810EB; Sun, 23 Apr 2023 12:31:31 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id EDC965FD18; Sun, 23 Apr 2023 22:31:29 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278290; bh=PTO+88z7ZzGCowyLUeWnfeKiSAjG0RLn8oFYE0Qu8bY=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=sH69E1M6FNJ7rC2dzF58ROHqeXt5VLytPBTkz4GNy/gUcUsUQHI6VDPSw6D+FAy4i lkqnX8+ihuGZJSbcWCfTbWkH5iMWRtbpunO0NoKHsoIF78PHVGEJ/YNKAnMfV4Jt2+ ITLL4NP6ux9+VBqK/WW2SfG6Okpr3s8QUhTkqvOzBnK2SmMjkjrERAWCm4/pC/yk4J GlAvTLvuIxU1ecQcdRnsv/ZR09+e3TjZJwbfzmNfiqbU0zf8lwL3yU4QamD455jTJS 2jFibJD/YzMeh9Ci8SsEaXjAokyPT9Unt1b7ZgfZGv+2Z4V2nwQT42eEkdGFmnCifr 1/9vteiuVySyQ== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:29 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 13/15] test/vsock: MSG_ZEROCOPY flag tests Date: Sun, 23 Apr 2023 22:26:41 +0300 Message-ID: <20230423192643.1537470-14-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds set of tests for MSG_ZEROCOPY flag. Signed-off-by: Arseniy Krasnov --- tools/testing/vsock/Makefile | 2 +- tools/testing/vsock/util.h | 1 + tools/testing/vsock/vsock_test.c | 11 + tools/testing/vsock/vsock_test_zerocopy.c | 501 ++++++++++++++++++++++ tools/testing/vsock/vsock_test_zerocopy.h | 12 + 5 files changed, 526 insertions(+), 1 deletion(-) create mode 100644 tools/testing/vsock/vsock_test_zerocopy.c create mode 100644 tools/testing/vsock/vsock_test_zerocopy.h diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile index 43a254f0e14d..0a78787d1d92 100644 --- a/tools/testing/vsock/Makefile +++ b/tools/testing/vsock/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only all: test vsock_perf test: vsock_test vsock_diag_test -vsock_test: vsock_test.o timeout.o control.o util.o +vsock_test: vsock_test.o vsock_test_zerocopy.o timeout.o control.o util.o vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o vsock_perf: vsock_perf.o diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h index fb99208a95ea..46ba1d3202b8 100644 --- a/tools/testing/vsock/util.h +++ b/tools/testing/vsock/util.h @@ -2,6 +2,7 @@ #ifndef UTIL_H #define UTIL_H +#include #include #include diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c index ac1bd3ac1533..d9bddb643794 100644 --- a/tools/testing/vsock/vsock_test.c +++ b/tools/testing/vsock/vsock_test.c @@ -20,6 +20,7 @@ #include #include +#include "vsock_test_zerocopy.h" #include "timeout.h" #include "control.h" #include "util.h" @@ -1128,6 +1129,16 @@ static struct test_case test_cases[] = { .run_client = test_stream_virtio_skb_merge_client, .run_server = test_stream_virtio_skb_merge_server, }, + { + .name = "SOCK_STREAM MSG_ZEROCOPY", + .run_client = test_stream_msg_zcopy_client, + .run_server = test_stream_msg_zcopy_server, + }, + { + .name = "SOCK_STREAM MSG_ZEROCOPY empty MSG_ERRQUEUE", + .run_client = test_stream_msg_zcopy_empty_errq_client, + .run_server = test_stream_msg_zcopy_empty_errq_server, + }, {}, }; diff --git a/tools/testing/vsock/vsock_test_zerocopy.c b/tools/testing/vsock/vsock_test_zerocopy.c new file mode 100644 index 000000000000..de44587bff26 --- /dev/null +++ b/tools/testing/vsock/vsock_test_zerocopy.c @@ -0,0 +1,501 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* MSG_ZEROCOPY feature tests for vsock + * + * Copyright (C) 2023 SberDevices. + * + * Author: Arseniy Krasnov + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "control.h" +#include "vsock_test_zerocopy.h" + +#ifndef SOL_VSOCK +#define SOL_VSOCK 287 +#endif + +#define PAGE_SIZE 4096 +#define POLL_TIMEOUT_MS 100 +#define SENDMSG_RES_IOV_LEN (-2) + +struct zerocopy_test_data { + bool zerocopied; + bool completion; + int sendmsg_errno; + ssize_t sendmsg_res; + int vecs_cnt; + struct iovec vecs[3]; +}; + +static void do_recv_completion(int fd, bool zerocopied, bool completion) +{ + struct sock_extended_err *serr; + struct msghdr msg = { 0 }; + struct pollfd fds = { 0 }; + char cmsg_data[128]; + struct cmsghdr *cm; + uint32_t hi, lo; + ssize_t res; + + fds.fd = fd; + fds.events = 0; + + if (poll(&fds, 1, POLL_TIMEOUT_MS) < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + if (!(fds.revents & POLLERR)) { + if (completion) { + fprintf(stderr, "POLLERR expected\n"); + exit(EXIT_FAILURE); + } else { + return; + } + } + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + res = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (res) { + fprintf(stderr, "failed to read error queue: %zi\n", res); + exit(EXIT_FAILURE); + } + + cm = CMSG_FIRSTHDR(&msg); + if (!cm) { + fprintf(stderr, "cmsg: no cmsg\n"); + exit(EXIT_FAILURE); + } + + if (cm->cmsg_level != SOL_VSOCK) { + fprintf(stderr, "cmsg: unexpected 'cmsg_level'\n"); + exit(EXIT_FAILURE); + } + + if (cm->cmsg_type != 0) { + fprintf(stderr, "cmsg: unexpected 'cmsg_type'\n"); + exit(EXIT_FAILURE); + } + + serr = (void *)CMSG_DATA(cm); + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + printf("serr: wrong origin: %u\n", serr->ee_origin); + exit(EXIT_FAILURE); + } + + if (serr->ee_errno) { + printf("serr: wrong error code: %u\n", serr->ee_errno); + exit(EXIT_FAILURE); + } + + hi = serr->ee_data; + lo = serr->ee_info; + if (hi != lo) { + fprintf(stderr, "serr: expected hi == lo\n"); + exit(EXIT_FAILURE); + } + + if (hi) { + fprintf(stderr, "serr: expected hi == lo == 0\n"); + exit(EXIT_FAILURE); + } + + if (zerocopied && (serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) { + fprintf(stderr, "serr: was copy instead of zerocopy\n"); + exit(EXIT_FAILURE); + } + + if (!zerocopied && !(serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) { + fprintf(stderr, "serr: was zerocopy instead of copy\n"); + exit(EXIT_FAILURE); + } +} + +static void enable_so_zerocopy(int fd) +{ + int val = 1; + + if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &val, sizeof(val))) + error(1, errno, "setsockopt"); +} + +static void *mmap_no_fail(size_t bytes) +{ + void *res; + + res = mmap(NULL, bytes, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0); + if (res == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + + return res; +} + +static size_t iovec_bytes(const struct iovec *iov, size_t iovnum) +{ + size_t bytes; + int i; + + for (bytes = 0, i = 0; i < iovnum; i++) + bytes += iov[i].iov_len; + + return bytes; +} + +static void iovec_random_init(struct iovec *iov, + const struct zerocopy_test_data *test_data) +{ + int i; + + for (i = 0; i < test_data->vecs_cnt; i++) { + int j; + + if (test_data->vecs[i].iov_base == MAP_FAILED) + continue; + + for (j = 0; j < iov[i].iov_len; j++) + ((uint8_t *)iov[i].iov_base)[j] = rand() & 0xff; + } +} + +static unsigned long iovec_hash_djb2(struct iovec *iov, size_t iovnum) +{ + unsigned long hash; + size_t iov_bytes; + size_t offs; + void *tmp; + int i; + + iov_bytes = iovec_bytes(iov, iovnum); + + tmp = malloc(iov_bytes); + if (!tmp) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + for (offs = 0, i = 0; i < iovnum; i++) { + memcpy(tmp + offs, iov[i].iov_base, iov[i].iov_len); + offs += iov[i].iov_len; + } + + hash = hash_djb2(tmp, iov_bytes); + free(tmp); + + return hash; +} + +static struct zerocopy_test_data test_data_array[] = { + /* Last element has non-page aligned size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE }, + { NULL, 200 } + } + }, + /* All elements have page aligned base and size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE * 2 }, + { NULL, PAGE_SIZE * 3 } + } + }, + /* All elements have page aligned base and size. But + * data length is bigger than 64Kb. + */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE * 16 }, + { NULL, PAGE_SIZE * 16 }, + { NULL, PAGE_SIZE * 16 } + } + }, + /* All elements have page aligned base and size. */ + { + .zerocopied = true, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE }, + { NULL, PAGE_SIZE } + } + }, + /* Middle element has non-page aligned size. */ + { + .zerocopied = false, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { NULL, 100 }, + { NULL, PAGE_SIZE } + } + }, + /* Middle element has both non-page aligned base and size. */ + { + .zerocopied = false, + .completion = true, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { (void *)1, 100 }, + { NULL, PAGE_SIZE } + } + }, + /* One element has invalid base. */ + { + .zerocopied = false, + .completion = false, + .sendmsg_errno = ENOMEM, + .sendmsg_res = -1, + .vecs_cnt = 3, + { + { NULL, PAGE_SIZE }, + { MAP_FAILED, PAGE_SIZE }, + { NULL, PAGE_SIZE } + } + }, + /* Valid data, but SO_ZEROCOPY is off. */ + { + .zerocopied = true, + .completion = false, + .sendmsg_errno = 0, + .sendmsg_res = SENDMSG_RES_IOV_LEN, + .vecs_cnt = 1, + { + { NULL, PAGE_SIZE } + } + }, +}; + +static void __test_stream_msg_zerocopy_client(const struct test_opts *opts, + const struct zerocopy_test_data *test_data) +{ + struct msghdr msg = { 0 }; + ssize_t sendmsg_res; + struct iovec *iovec; + int fd; + int i; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + if (test_data->completion) + enable_so_zerocopy(fd); + + iovec = malloc(sizeof(*iovec) * test_data->vecs_cnt); + if (!iovec) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + for (i = 0; i < test_data->vecs_cnt; i++) { + iovec[i].iov_len = test_data->vecs[i].iov_len; + iovec[i].iov_base = mmap_no_fail(test_data->vecs[i].iov_len); + } + + for (i = 0; i < test_data->vecs_cnt; i++) { + if (test_data->vecs[i].iov_base == MAP_FAILED) { + if (munmap(iovec[i].iov_base, iovec[i].iov_len)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + } + } + + iovec_random_init(iovec, test_data); + + msg.msg_iov = iovec; + msg.msg_iovlen = test_data->vecs_cnt; + + errno = 0; + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + sendmsg_res = iovec_bytes(iovec, test_data->vecs_cnt); + else + sendmsg_res = test_data->sendmsg_res; + + if (sendmsg(fd, &msg, MSG_ZEROCOPY) != sendmsg_res) { + perror("send"); + exit(EXIT_FAILURE); + } + + if (errno != test_data->sendmsg_errno) { + fprintf(stderr, "expected 'errno' == %i, got %i\n", + test_data->sendmsg_errno, errno); + exit(EXIT_FAILURE); + } + + do_recv_completion(fd, test_data->zerocopied, test_data->completion); + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + control_writeulong(iovec_hash_djb2(iovec, test_data->vecs_cnt)); + else + control_writeulong(0); + + for (i = 0; i < test_data->vecs_cnt; i++) { + if (test_data->vecs[i].iov_base != MAP_FAILED) { + if (munmap(iovec[i].iov_base, iovec[i].iov_len)) { + perror("munmap"); + exit(EXIT_FAILURE); + } + } + } + + free(iovec); + close(fd); + control_writeln("DONE"); +} + +static void __test_stream_msg_zerocopy_server(const struct test_opts *opts, + const struct zerocopy_test_data *test_data) +{ + unsigned long remote_hash; + unsigned long local_hash; + ssize_t total_bytes_rec; + unsigned char *data; + size_t data_len; + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + data_len = iovec_bytes(test_data->vecs, test_data->vecs_cnt); + + data = malloc(data_len); + if (!data) { + perror("malloc"); + exit(EXIT_FAILURE); + } + + total_bytes_rec = 0; + + while (total_bytes_rec != data_len) { + ssize_t bytes_rec; + + bytes_rec = read(fd, data + total_bytes_rec, + data_len - total_bytes_rec); + if (bytes_rec <= 0) + break; + + total_bytes_rec += bytes_rec; + } + + if (test_data->sendmsg_res == SENDMSG_RES_IOV_LEN) + local_hash = hash_djb2(data, data_len); + else + local_hash = 0; + + free(data); + + /* Waiting for some result. */ + remote_hash = control_readulong(); + if (remote_hash != local_hash) { + fprintf(stderr, "hash mismatch\n"); + exit(EXIT_FAILURE); + } + + close(fd); + control_expectln("DONE"); +} + +void test_stream_msg_zcopy_client(const struct test_opts *opts) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(test_data_array); i++) + __test_stream_msg_zerocopy_client(opts, &test_data_array[i]); +} + +void test_stream_msg_zcopy_server(const struct test_opts *opts) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(test_data_array); i++) + __test_stream_msg_zerocopy_server(opts, &test_data_array[i]); +} + +void test_stream_msg_zcopy_empty_errq_client(const struct test_opts *opts) +{ + struct msghdr msg = { 0 }; + char cmsg_data[128]; + ssize_t res; + int fd; + + fd = vsock_stream_connect(opts->peer_cid, 1234); + if (fd < 0) { + perror("connect"); + exit(EXIT_FAILURE); + } + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + res = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (res != -1) { + fprintf(stderr, "expected 'recvmsg(2)' failure, got %zi\n", + res); + exit(EXIT_FAILURE); + } + + control_writeln("DONE"); + close(fd); +} + +void test_stream_msg_zcopy_empty_errq_server(const struct test_opts *opts) +{ + int fd; + + fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL); + if (fd < 0) { + perror("accept"); + exit(EXIT_FAILURE); + } + + control_expectln("DONE"); + close(fd); +} diff --git a/tools/testing/vsock/vsock_test_zerocopy.h b/tools/testing/vsock/vsock_test_zerocopy.h new file mode 100644 index 000000000000..705a1e90f41a --- /dev/null +++ b/tools/testing/vsock/vsock_test_zerocopy.h @@ -0,0 +1,12 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef VSOCK_TEST_ZEROCOPY_H +#define VSOCK_TEST_ZEROCOPY_H +#include "util.h" + +void test_stream_msg_zcopy_client(const struct test_opts *opts); +void test_stream_msg_zcopy_server(const struct test_opts *opts); + +void test_stream_msg_zcopy_empty_errq_client(const struct test_opts *opts); +void test_stream_msg_zcopy_empty_errq_server(const struct test_opts *opts); + +#endif /* VSOCK_TEST_ZEROCOPY_H */ From patchwork Sun Apr 23 19:26:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 82E6DC77B76 for ; Sun, 23 Apr 2023 19:32:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230384AbjDWTcA (ORCPT ); Sun, 23 Apr 2023 15:32:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230155AbjDWTbe (ORCPT ); Sun, 23 Apr 2023 15:31:34 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9A3410FB; Sun, 23 Apr 2023 12:31:32 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 1B8BB5FD19; Sun, 23 Apr 2023 22:31:31 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278291; bh=FgPTGIGxeBtK5P9H+BLgs5wiavf5GzMitjxHRDrphMA=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=FHjHXh7ULgkJbmMnqQaApOZuUK8x7EdOh+AFBikzbDIPM2a5xBmz4zFIi0UDi01wc wBjO1OuxpAWGY9GaK1VsvN459xtd2fDuPxd+WQaFUqr/WKVgS03FrF2hNHdvNy1Rjv cMFJkUxDvUFyZ99ZOmC3fB3XUd7JuTXg7dfkNNBEm9OsMTqDzeYZEkixut5JMuQVzD WZVIqTxdJ+XA7d/QxsXk7PTwAM46g4sXrUTdrVnhYPOVZwUxQeDDGhRpOQgOf3fTMm bA2eM5owMHBnYHppSU+W6IFtRjE7fPvYXmtWxTG1/9oCcRYiNoYd8UaAnuPYUSVu6K OWHwEYRQgPrJA== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:31 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 14/15] test/vsock: MSG_ZEROCOPY support for vsock_perf Date: Sun, 23 Apr 2023 22:26:42 +0300 Message-ID: <20230423192643.1537470-15-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org To use this option pass '--zc' parameter: ./vsock_perf --zc --sender --port --bytes With this option MSG_ZEROCOPY flag will be passed to the 'send()' call. Signed-off-by: Arseniy Krasnov --- tools/testing/vsock/vsock_perf.c | 139 +++++++++++++++++++++++++++++-- 1 file changed, 130 insertions(+), 9 deletions(-) diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c index a72520338f84..7fd76f7a3c16 100644 --- a/tools/testing/vsock/vsock_perf.c +++ b/tools/testing/vsock/vsock_perf.c @@ -18,6 +18,8 @@ #include #include #include +#include +#include #define DEFAULT_BUF_SIZE_BYTES (128 * 1024) #define DEFAULT_TO_SEND_BYTES (64 * 1024) @@ -28,9 +30,14 @@ #define BYTES_PER_GB (1024 * 1024 * 1024ULL) #define NSEC_PER_SEC (1000000000ULL) +#ifndef SOL_VSOCK +#define SOL_VSOCK 287 +#endif + static unsigned int port = DEFAULT_PORT; static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES; static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES; +static bool zerocopy; static void error(const char *s) { @@ -247,15 +254,76 @@ static void run_receiver(unsigned long rcvlowat_bytes) close(fd); } +static void recv_completion(int fd) +{ + struct sock_extended_err *serr; + char cmsg_data[128]; + struct cmsghdr *cm; + struct msghdr msg = { 0 }; + ssize_t ret; + + msg.msg_control = cmsg_data; + msg.msg_controllen = sizeof(cmsg_data); + + ret = recvmsg(fd, &msg, MSG_ERRQUEUE); + if (ret) { + fprintf(stderr, "recvmsg: failed to read err: %zi\n", ret); + return; + } + + cm = CMSG_FIRSTHDR(&msg); + if (!cm) { + fprintf(stderr, "cmsg: no cmsg\n"); + return; + } + + if (cm->cmsg_level != SOL_VSOCK) { + fprintf(stderr, "cmsg: unexpected 'cmsg_level'\n"); + return; + } + + if (cm->cmsg_type) { + fprintf(stderr, "cmsg: unexpected 'cmsg_type'\n"); + return; + } + + serr = (void *)CMSG_DATA(cm); + if (serr->ee_origin != SO_EE_ORIGIN_ZEROCOPY) { + fprintf(stderr, "serr: wrong origin\n"); + return; + } + + if (serr->ee_errno) { + fprintf(stderr, "serr: wrong error code\n"); + return; + } + + if (zerocopy && (serr->ee_code & SO_EE_CODE_ZEROCOPY_COPIED)) + fprintf(stderr, "warning: copy instead of zerocopy\n"); +} + +static void enable_so_zerocopy(int fd) +{ + int val = 1; + + if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &val, sizeof(val))) + error("setsockopt(SO_ZEROCOPY)"); +} + static void run_sender(int peer_cid, unsigned long to_send_bytes) { time_t tx_begin_ns; time_t tx_total_ns; size_t total_send; + time_t time_in_send; void *data; int fd; - printf("Run as sender\n"); + if (zerocopy) + printf("Run as sender MSG_ZEROCOPY\n"); + else + printf("Run as sender\n"); + printf("Connect to %i:%u\n", peer_cid, port); printf("Send %lu bytes\n", to_send_bytes); printf("TX buffer %lu bytes\n", buf_size_bytes); @@ -265,38 +333,82 @@ static void run_sender(int peer_cid, unsigned long to_send_bytes) if (fd < 0) exit(EXIT_FAILURE); - data = malloc(buf_size_bytes); + if (zerocopy) { + enable_so_zerocopy(fd); - if (!data) { - fprintf(stderr, "'malloc()' failed\n"); - exit(EXIT_FAILURE); + data = mmap(NULL, buf_size_bytes, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (data == MAP_FAILED) { + perror("mmap"); + exit(EXIT_FAILURE); + } + } else { + data = malloc(buf_size_bytes); + + if (!data) { + fprintf(stderr, "'malloc()' failed\n"); + exit(EXIT_FAILURE); + } } memset(data, 0, buf_size_bytes); total_send = 0; + time_in_send = 0; tx_begin_ns = current_nsec(); while (total_send < to_send_bytes) { ssize_t sent; + size_t rest_bytes; + time_t before; - sent = write(fd, data, buf_size_bytes); + rest_bytes = to_send_bytes - total_send; + + before = current_nsec(); + sent = send(fd, data, (rest_bytes > buf_size_bytes) ? + buf_size_bytes : rest_bytes, + zerocopy ? MSG_ZEROCOPY : 0); + time_in_send += (current_nsec() - before); if (sent <= 0) error("write"); total_send += sent; + + if (zerocopy) { + struct pollfd fds = { 0 }; + + fds.fd = fd; + + if (poll(&fds, 1, -1) < 0) { + perror("poll"); + exit(EXIT_FAILURE); + } + + if (!(fds.revents & POLLERR)) { + fprintf(stderr, "POLLERR expected\n"); + exit(EXIT_FAILURE); + } + + recv_completion(fd); + } } tx_total_ns = current_nsec() - tx_begin_ns; printf("total bytes sent: %zu\n", total_send); printf("tx performance: %f Gbits/s\n", - get_gbps(total_send * 8, tx_total_ns)); - printf("total time in 'write()': %f sec\n", + get_gbps(total_send * 8, time_in_send)); + printf("total time in tx loop: %f sec\n", (float)tx_total_ns / NSEC_PER_SEC); + printf("time in 'send()': %f sec\n", + (float)time_in_send / NSEC_PER_SEC); close(fd); - free(data); + + if (zerocopy) + munmap(data, buf_size_bytes); + else + free(data); } static const char optstring[] = ""; @@ -336,6 +448,11 @@ static const struct option longopts[] = { .has_arg = required_argument, .val = 'R', }, + { + .name = "zc", + .has_arg = no_argument, + .val = 'Z', + }, {}, }; @@ -351,6 +468,7 @@ static void usage(void) " --help This message\n" " --sender Sender mode (receiver default)\n" " of the receiver to connect to\n" + " --zc Enable zerocopy\n" " --port Port (default %d)\n" " --bytes KMG Bytes to send (default %d)\n" " --buf-size KMG Data buffer size (default %d). In sender mode\n" @@ -413,6 +531,9 @@ int main(int argc, char **argv) case 'H': /* Help. */ usage(); break; + case 'Z': /* Zerocopy. */ + zerocopy = true; + break; default: usage(); } From patchwork Sun Apr 23 19:26:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Arseniy Krasnov X-Patchwork-Id: 13221462 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A5A6C6FD18 for ; Sun, 23 Apr 2023 19:32:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230392AbjDWTcD (ORCPT ); Sun, 23 Apr 2023 15:32:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230252AbjDWTbo (ORCPT ); Sun, 23 Apr 2023 15:31:44 -0400 Received: from mx.sberdevices.ru (mx.sberdevices.ru [45.89.227.171]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AD971706; Sun, 23 Apr 2023 12:31:34 -0700 (PDT) Received: from s-lin-edge02.sberdevices.ru (localhost [127.0.0.1]) by mx.sberdevices.ru (Postfix) with ESMTP id 3DE5F5FD1A; Sun, 23 Apr 2023 22:31:32 +0300 (MSK) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sberdevices.ru; s=mail; t=1682278292; bh=tL2E1Y90S5bBkA16ZM0oXYsz18KCFk1Zn9JPdTBbPT8=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=UWKVnL/mqB/IkLUAiJ6REH5PGQEFI8HKRqL0NYUgyNx0c+yMVMH2IUuxcTv/FQ5hr vLUYKUFdeoblK9qlxbSYEPdTMd4Y9oSE/Wq/tHUKjzGWI00ZdgCtgVqFK7NkK5SSFz M59/IMy6h283dZJIHH0qts1gV49pnoGSHwksrm2m65iXyrNIc+/NRR1oK0jtUjNI5/ K7+R/VEfDnaQhFdd//q4KXivkqljSkLzSlb6V9Ad+OEliSeA4CB79XruzGPxuWMNBN d925s3hTWXgCEiimcxPrfmMT7CFszN6duip0h8OnQlLYWGpVdZoNvn5sYa5vrLXIz4 FJaRfdRleALXQ== Received: from S-MS-EXCH01.sberdevices.ru (S-MS-EXCH01.sberdevices.ru [172.16.1.4]) by mx.sberdevices.ru (Postfix) with ESMTP; Sun, 23 Apr 2023 22:31:32 +0300 (MSK) From: Arseniy Krasnov To: Stefan Hajnoczi , Stefano Garzarella , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , "Michael S. Tsirkin" , Jason Wang , Bobby Eshleman CC: , , , , , , , Arseniy Krasnov Subject: [RFC PATCH v2 15/15] docs: net: description of MSG_ZEROCOPY for AF_VSOCK Date: Sun, 23 Apr 2023 22:26:43 +0300 Message-ID: <20230423192643.1537470-16-AVKrasnov@sberdevices.ru> X-Mailer: git-send-email 2.35.0 In-Reply-To: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> References: <20230423192643.1537470-1-AVKrasnov@sberdevices.ru> MIME-Version: 1.0 X-Originating-IP: [172.16.1.6] X-ClientProxiedBy: S-MS-EXCH01.sberdevices.ru (172.16.1.4) To S-MS-EXCH01.sberdevices.ru (172.16.1.4) X-KSMG-Rule-ID: 4 X-KSMG-Message-Action: clean X-KSMG-AntiSpam-Status: not scanned, disabled by settings X-KSMG-AntiSpam-Interceptor-Info: not scanned X-KSMG-AntiPhishing: not scanned, disabled by settings X-KSMG-AntiVirus: Kaspersky Secure Mail Gateway, version 1.1.2.30, bases: 2023/04/23 16:01:00 #21150277 X-KSMG-AntiVirus-Status: Clean, skipped Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This adds description of MSG_ZEROCOPY flag support for AF_VSOCK type of socket. Signed-off-by: Arseniy Krasnov --- Documentation/networking/msg_zerocopy.rst | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst index b3ea96af9b49..34bc7ff411ce 100644 --- a/Documentation/networking/msg_zerocopy.rst +++ b/Documentation/networking/msg_zerocopy.rst @@ -7,7 +7,8 @@ Intro ===== The MSG_ZEROCOPY flag enables copy avoidance for socket send calls. -The feature is currently implemented for TCP and UDP sockets. +The feature is currently implemented for TCP, UDP and VSOCK (with +virtio transport) sockets. Opportunity and Caveats @@ -174,7 +175,7 @@ read_notification() call in the previous snippet. A notification is encoded in the standard error format, sock_extended_err. The level and type fields in the control data are protocol family -specific, IP_RECVERR or IPV6_RECVERR. +specific, IP_RECVERR or IPV6_RECVERR (for TCP or UDP socket). Error origin is the new type SO_EE_ORIGIN_ZEROCOPY. ee_errno is zero, as explained before, to avoid blocking read and write system calls on @@ -201,6 +202,7 @@ undefined, bar for ee_code, as discussed below. printf("completed: %u..%u\n", serr->ee_info, serr->ee_data); +For VSOCK socket, cmsg_level will be SOL_VSOCK and cmsg_type will be 0. Deferred copies ~~~~~~~~~~~~~~~ @@ -235,12 +237,15 @@ Implementation Loopback -------- +For TCP and UDP: Data sent to local sockets can be queued indefinitely if the receive process does not read its socket. Unbound notification latency is not acceptable. For this reason all packets generated with MSG_ZEROCOPY that are looped to a local socket will incur a deferred copy. This includes looping onto packet sockets (e.g., tcpdump) and tun devices. +For VSOCK: +Data path sent to local sockets is the same as for non-local sockets. Testing ======= @@ -254,3 +259,6 @@ instance when run with msg_zerocopy.sh between a veth pair across namespaces, the test will not show any improvement. For testing, the loopback restriction can be temporarily relaxed by making skb_orphan_frags_rx identical to skb_orphan_frags. + +For VSOCK type of socket example can be found in tools/testing/vsock/ +vsock_test_zerocopy.c.