From patchwork Mon Oct 14 03:12:30 2024
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    virtualization@lists.linux.dev, "Si-Wei Liu"
Subject: [PATCH 1/5] virtio-net: fix overflow inside virtnet_rq_alloc
Date: Mon, 14 Oct 2024 11:12:30 +0800
Message-Id: <20241014031234.7659-2-xuanzhuo@linux.alibaba.com>
In-Reply-To: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>
References: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>

When the frag gets only a single page, this may lead to a regression on
the VM. In particular, if the sysctl net.core.high_order_alloc_disable
value is 1, the frag always gets a single page on refill.
In that case, we could see reliable crashes or scp failures (scp of a
100M file to the VM):

The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
of a new frag. When the frag size is larger than PAGE_SIZE, everything
is fine. However, if the frag is only one page and the total size of
the buffer plus virtnet_rq_dma is larger than one page, an overflow may
occur.

Fix this by reducing the buffer len when the frag size is not large
enough.

Fixes: f9dac92ba908 ("virtio_ring: enable premapped mode whatever use_dma_api")
Reported-by: "Si-Wei Liu"
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f8131f92a392..59a99bbaf852 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -926,9 +926,6 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
 	void *buf, *head;
 	dma_addr_t addr;
 
-	if (unlikely(!skb_page_frag_refill(size, alloc_frag, gfp)))
-		return NULL;
-
 	head = page_address(alloc_frag->page);
 
 	if (rq->do_dma) {
@@ -2423,6 +2420,9 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
 	len = SKB_DATA_ALIGN(len) +
 	      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
+	if (unlikely(!skb_page_frag_refill(len, &rq->alloc_frag, gfp)))
+		return -ENOMEM;
+
 	buf = virtnet_rq_alloc(rq, len, gfp);
 	if (unlikely(!buf))
 		return -ENOMEM;
@@ -2525,6 +2525,12 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
 	 */
 	len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
 
+	if (unlikely(!skb_page_frag_refill(len + room, alloc_frag, gfp)))
+		return -ENOMEM;
+
+	if (!alloc_frag->offset && len + room + sizeof(struct virtnet_rq_dma) > alloc_frag->size)
+		len -= sizeof(struct virtnet_rq_dma);
+
 	buf = virtnet_rq_alloc(rq, len + room, gfp);
 	if (unlikely(!buf))
 		return -ENOMEM;
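To make the overflow concrete, below is a userspace sketch of the
arithmetic (a toy program with illustrative sizes, not driver code;
only the 16-byte size of struct virtnet_rq_dma comes from the driver):

#include <stdio.h>

#define PAGE_SIZE 4096
#define DMA_HDR     16	/* sizeof(struct virtnet_rq_dma) */

int main(void)
{
	unsigned int frag_size = PAGE_SIZE;	/* one page: high_order_alloc_disable=1 */
	unsigned int len = 4064, room = 32;	/* hypothetical buffer request */

	/* before the fix: the 16-byte header at the start of a new frag
	 * pushes the buffer past the end of the single page
	 */
	if (DMA_HDR + len + room > frag_size)
		printf("overflow by %u bytes\n", DMA_HDR + len + room - frag_size);

	/* the fix: on a fresh frag, shrink len so everything fits */
	if (len + room + DMA_HDR > frag_size)
		len -= DMA_HDR;
	printf("adjusted len = %u, total = %u\n", len, DMA_HDR + len + room);
	return 0;
}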
From patchwork Mon Oct 14 03:12:31 2024
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    virtualization@lists.linux.dev
Subject: [PATCH 2/5] virtio_net: introduce vi->mode
Date: Mon, 14 Oct 2024 11:12:31 +0800
Message-Id: <20241014031234.7659-3-xuanzhuo@linux.alibaba.com>
In-Reply-To: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>
References: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>

Now, if we want to judge the rx work mode, we have to use expressions
like these:

1. merge mode: vi->mergeable_rx_bufs
2. big mode:   vi->big_packets && !vi->mergeable_rx_bufs
3. small mode: !vi->big_packets && !vi->mergeable_rx_bufs

This is inconvenient and unclear, and we also have this use case:

	if (vi->mergeable_rx_bufs)
		...
	else if (vi->big_packets)
		...
	else
		...

For such cases, switch-case is the better choice. So introduce vi->mode
to record the virtio-net work mode; it makes judging the mode and
choosing the right branch straightforward.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
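As a note for reviewers: a minimal sketch of the mapping this patch
introduces (virtnet_pick_mode() is a hypothetical helper for
illustration only; the patch open-codes the same logic in
virtnet_probe() below):

#include <stdbool.h>

enum virtnet_mode { VIRTNET_MODE_SMALL, VIRTNET_MODE_MERGE, VIRTNET_MODE_BIG };

static enum virtnet_mode virtnet_pick_mode(bool mergeable_rx_bufs, bool big_packets)
{
	if (mergeable_rx_bufs)
		return VIRTNET_MODE_MERGE;	/* merge: vi->mergeable_rx_bufs */
	if (big_packets)
		return VIRTNET_MODE_BIG;	/* big: vi->big_packets && !mergeable */
	return VIRTNET_MODE_SMALL;		/* small: neither flag is set */
}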
 drivers/net/virtio_net.c | 61 +++++++++++++++++++++++++++++++---------
 1 file changed, 47 insertions(+), 14 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 59a99bbaf852..14809b614d62 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -385,6 +385,12 @@ struct control_buf {
 	virtio_net_ctrl_ack status;
 };
 
+enum virtnet_mode {
+	VIRTNET_MODE_SMALL,
+	VIRTNET_MODE_MERGE,
+	VIRTNET_MODE_BIG
+};
+
 struct virtnet_info {
 	struct virtio_device *vdev;
 	struct virtqueue *cvq;
@@ -414,6 +420,8 @@ struct virtnet_info {
 	/* Host will merge rx buffers for big packets (shake it! shake it!) */
 	bool mergeable_rx_bufs;
 
+	enum virtnet_mode mode;
+
 	/* Host supports rss and/or hash report */
 	bool has_rss;
 	bool has_rss_hash_report;
@@ -643,12 +651,15 @@ static struct page *get_a_page(struct receive_queue *rq, gfp_t gfp_mask)
 static void virtnet_rq_free_buf(struct virtnet_info *vi,
 				struct receive_queue *rq, void *buf)
 {
-	if (vi->mergeable_rx_bufs)
+	switch (vi->mode) {
+	case VIRTNET_MODE_SMALL:
+	case VIRTNET_MODE_MERGE:
 		put_page(virt_to_head_page(buf));
-	else if (vi->big_packets)
+		break;
+	case VIRTNET_MODE_BIG:
 		give_pages(rq, buf);
-	else
-		put_page(virt_to_head_page(buf));
+		break;
+	}
 }
 
 static void enable_delayed_refill(struct virtnet_info *vi)
@@ -1315,7 +1326,8 @@ static void virtnet_receive_xsk_buf(struct virtnet_info *vi, struct receive_queue *rq,
 
 	flags = ((struct virtio_net_common_hdr *)(xdp->data - vi->hdr_len))->hdr.flags;
 
-	if (!vi->mergeable_rx_bufs)
+	/* We only support small and merge mode. */
+	if (vi->mode == VIRTNET_MODE_SMALL)
 		skb = virtnet_receive_xsk_small(dev, vi, rq, xdp, xdp_xmit, stats);
 	else
 		skb = virtnet_receive_xsk_merge(dev, vi, rq, xdp, xdp_xmit, stats);
@@ -2389,13 +2401,20 @@ static void receive_buf(struct virtnet_info *vi, struct receive_queue *rq,
 	 */
 	flags = ((struct virtio_net_common_hdr *)buf)->hdr.flags;
 
-	if (vi->mergeable_rx_bufs)
+	switch (vi->mode) {
+	case VIRTNET_MODE_MERGE:
 		skb = receive_mergeable(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
-	else if (vi->big_packets)
+		break;
+
+	case VIRTNET_MODE_BIG:
 		skb = receive_big(dev, vi, rq, buf, len, stats);
-	else
+		break;
+
+	case VIRTNET_MODE_SMALL:
 		skb = receive_small(dev, vi, rq, buf, ctx, len, xdp_xmit, stats);
+		break;
+	}
 
 	if (unlikely(!skb))
 		return;
@@ -2580,12 +2599,19 @@ static bool try_fill_recv(struct virtnet_info *vi, struct receive_queue *rq,
 	}
 
 	do {
-		if (vi->mergeable_rx_bufs)
+		switch (vi->mode) {
+		case VIRTNET_MODE_MERGE:
 			err = add_recvbuf_mergeable(vi, rq, gfp);
-		else if (vi->big_packets)
+			break;
+
+		case VIRTNET_MODE_BIG:
 			err = add_recvbuf_big(vi, rq, gfp);
-		else
+			break;
+
+		case VIRTNET_MODE_SMALL:
 			err = add_recvbuf_small(vi, rq, gfp);
+			break;
+		}
 
 		if (err)
 			break;
@@ -2703,7 +2729,7 @@ static int virtnet_receive_packets(struct virtnet_info *vi,
 	int packets = 0;
 	void *buf;
 
-	if (!vi->big_packets || vi->mergeable_rx_bufs) {
+	if (vi->mode != VIRTNET_MODE_BIG) {
 		void *ctx;
 
 		while (packets < budget &&
 		       (buf = virtnet_rq_get_buf(rq, &len, &ctx))) {
@@ -5510,7 +5536,7 @@ static int virtnet_xsk_pool_enable(struct net_device *dev,
 	/* In big_packets mode, xdp cannot work, so there is no need to
 	 * initialize xsk of rq.
 	 */
-	if (vi->big_packets && !vi->mergeable_rx_bufs)
+	if (vi->mode == VIRTNET_MODE_BIG)
 		return -ENOENT;
 
 	if (qid >= vi->curr_queue_pairs)
@@ -6007,7 +6033,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
 	vqs_info = kcalloc(total_vqs, sizeof(*vqs_info), GFP_KERNEL);
 	if (!vqs_info)
 		goto err_vqs_info;
 
-	if (!vi->big_packets || vi->mergeable_rx_bufs) {
+	if (vi->mode != VIRTNET_MODE_BIG) {
 		ctx = kcalloc(total_vqs, sizeof(*ctx), GFP_KERNEL);
 		if (!ctx)
 			goto err_ctx;
@@ -6480,6 +6506,13 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 	virtnet_set_big_packets(vi, mtu);
 
+	if (vi->mergeable_rx_bufs)
+		vi->mode = VIRTNET_MODE_MERGE;
+	else if (vi->big_packets)
+		vi->mode = VIRTNET_MODE_BIG;
+	else
+		vi->mode = VIRTNET_MODE_SMALL;
+
 	if (vi->any_header_sg)
 		dev->needed_headroom = vi->hdr_len;
Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , virtualization@lists.linux.dev Subject: [PATCH 3/5] virtio_net: big mode skip the unmap check Date: Mon, 14 Oct 2024 11:12:32 +0800 Message-Id: <20241014031234.7659-4-xuanzhuo@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com> References: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Git-Hash: bba499faae26 X-Patchwork-Delegate: kuba@kernel.org The virtio-net big mode did not enable premapped mode, so we did not need to check the unmap. And the subsequent commit will remove the failover code for failing enable premapped for merge and small mode. So we need to remove the checking do_dma code in the big mode path. Signed-off-by: Xuan Zhuo Acked-by: Jason Wang --- drivers/net/virtio_net.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 14809b614d62..cd90e77881df 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -998,7 +998,7 @@ static void virtnet_rq_unmap_free_buf(struct virtqueue *vq, void *buf) return; } - if (rq->do_dma) + if (vi->mode != VIRTNET_MODE_BIG) virtnet_rq_unmap(rq, buf, 0); virtnet_rq_free_buf(vi, rq, buf); @@ -2738,7 +2738,7 @@ static int virtnet_receive_packets(struct virtnet_info *vi, } } else { while (packets < budget && - (buf = virtnet_rq_get_buf(rq, &len, NULL)) != NULL) { + (buf = virtqueue_get_buf(rq->vq, &len)) != NULL) { receive_buf(vi, rq, buf, len, NULL, xdp_xmit, stats); packets++; } From patchwork Mon Oct 14 03:12:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 13834121 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B61F37641E for ; Mon, 14 Oct 2024 03:12:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.119 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728875568; cv=none; b=G0kfuoQ9bge/dDnRRcs7XTyYAfF08TcPelsL0WAXZo7rKI9I+ErNdcB1FZRyPpnuvwWFnYkYHO7LOmZcpk/scWBSTiK6Y+hWxD+A1uounb0nklnMSjDY7Nc3r7z6CxFJTmPg3jmourTf2a0jyxbP0wXQcqTnTNwI1KSIp/+sjVg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728875568; c=relaxed/simple; bh=CWFFvB/woliuCJsHHhnfJ92el+XRFSCnpT7eu7ZtNC8=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=s++cvsWQMsZpNoq4drJmrveYcU4g8iYciapajwXr6ddO9ISbQJprDARASiFAEWlEwiW+pO6C/nwPtPyf7AzC63qFNHat0hxXupdOJElljE8l7mv07PU1uIQrESQJ3+T2jaoZsWff77PxezIHgkwM4zciq/5QVjfRXxFN2blbPPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=UpRKPN2f; arc=none smtp.client-ip=115.124.30.119 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; 
 drivers/net/virtio_net.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 14809b614d62..cd90e77881df 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -998,7 +998,7 @@ static void virtnet_rq_unmap_free_buf(struct virtqueue *vq, void *buf)
 		return;
 	}
 
-	if (rq->do_dma)
+	if (vi->mode != VIRTNET_MODE_BIG)
 		virtnet_rq_unmap(rq, buf, 0);
 
 	virtnet_rq_free_buf(vi, rq, buf);
@@ -2738,7 +2738,7 @@ static int virtnet_receive_packets(struct virtnet_info *vi,
 		}
 	} else {
 		while (packets < budget &&
-		       (buf = virtnet_rq_get_buf(rq, &len, NULL)) != NULL) {
+		       (buf = virtqueue_get_buf(rq->vq, &len)) != NULL) {
 			receive_buf(vi, rq, buf, len, NULL, xdp_xmit, stats);
 			packets++;
 		}

From patchwork Mon Oct 14 03:12:33 2024
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    virtualization@lists.linux.dev
Subject: [PATCH 4/5] virtio_net: enable premapped mode for merge and small by default
Date: Mon, 14 Oct 2024 11:12:33 +0800
Message-Id: <20241014031234.7659-5-xuanzhuo@linux.alibaba.com>
In-Reply-To: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>
References: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>

Currently, the virtio core performs a dma operation for each buffer,
even though the same page may be mapped multiple times. In premapped
mode, we can perform only one dma operation for the pages of the alloc
frag. This is beneficial for iommu devices.

kernel command line: intel_iommu=on iommu.passthrough=0

       | strict=0   | strict=1
Before | 775496pps  | 428614pps
After  | 1109316pps | 742853pps

In 6.11, we disabled this feature because of a regression [1]. Now the
problem is fixed, so re-enable it.

[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang
---
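A userspace toy of the map-once-per-frag idea (not driver code;
dma_map()/dma_unmap() are printf stand-ins for the real dma API): every
buffer carved from a frag shares one mapping via a refcount instead of
being mapped individually:

#include <stdio.h>
#include <stdlib.h>

struct frag_dma {		/* stands in for struct virtnet_rq_dma */
	void *addr;		/* the one mapping for the whole frag */
	int ref;		/* buffers still sharing that mapping */
};

static void *dma_map(void *p)  { printf("dma map %p\n", p); return p; }
static void dma_unmap(void *p) { printf("dma unmap %p\n", p); }

int main(void)
{
	void *frag = malloc(4096);
	struct frag_dma dma = { .addr = dma_map(frag), .ref = 0 };
	int i;

	for (i = 0; i < 3; i++)	/* three buffers, still one mapping */
		dma.ref++;

	while (dma.ref)		/* buffers consumed one by one */
		if (--dma.ref == 0)
			dma_unmap(dma.addr);	/* last user tears it down */

	free(frag);
	return 0;
}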
 drivers/net/virtio_net.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cd90e77881df..8cf24b7b58bd 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6133,6 +6133,21 @@ static int virtnet_alloc_queues(struct virtnet_info *vi)
 	return -ENOMEM;
 }
 
+static void virtnet_rq_set_premapped(struct virtnet_info *vi)
+{
+	int i;
+
+	/* disable for big mode */
+	if (vi->mode == VIRTNET_MODE_BIG)
+		return;
+
+	for (i = 0; i < vi->max_queue_pairs; i++) {
+		/* error should never happen */
+		BUG_ON(virtqueue_set_dma_premapped(vi->rq[i].vq));
+		vi->rq[i].do_dma = true;
+	}
+}
+
 static int init_vqs(struct virtnet_info *vi)
 {
 	int ret;
@@ -6146,6 +6161,8 @@ static int init_vqs(struct virtnet_info *vi)
 	if (ret)
 		goto err_free;
 
+	virtnet_rq_set_premapped(vi);
+
 	cpus_read_lock();
 	virtnet_set_affinity(vi);
 	cpus_read_unlock();
From patchwork Mon Oct 14 03:12:34 2024
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
To: netdev@vger.kernel.org
Cc: "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    virtualization@lists.linux.dev
Subject: [PATCH 5/5] virtio_net: rx remove premapped failover code
Date: Mon, 14 Oct 2024 11:12:34 +0800
Message-Id: <20241014031234.7659-6-xuanzhuo@linux.alibaba.com>
In-Reply-To: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>
References: <20241014031234.7659-1-xuanzhuo@linux.alibaba.com>

Now, the premapped mode can be enabled unconditionally, so we can
remove the failover code for merge and small mode.

Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
 drivers/net/virtio_net.c | 80 +++++++++++++++++-----------------------
 1 file changed, 33 insertions(+), 47 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 8cf24b7b58bd..4d3e35b02478 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -356,9 +356,6 @@ struct receive_queue {
 	struct xdp_rxq_info xsk_rxq_info;
 
 	struct xdp_buff **xsk_buffs;
-
-	/* Do dma by self */
-	bool do_dma;
 };
 
 /* This structure can contain rss message with maximum settings for indirection table and keysize
@@ -899,7 +896,7 @@ static void *virtnet_rq_get_buf(struct receive_queue *rq, u32 *len, void **ctx)
 	void *buf;
 
 	buf = virtqueue_get_buf_ctx(rq->vq, len, ctx);
-	if (buf && rq->do_dma)
+	if (buf)
 		virtnet_rq_unmap(rq, buf, *len);
 
 	return buf;
@@ -912,11 +909,6 @@ static void virtnet_rq_init_one_sg(struct receive_queue *rq, void *buf, u32 len)
 	u32 offset;
 	void *head;
 
-	if (!rq->do_dma) {
-		sg_init_one(rq->sg, buf, len);
-		return;
-	}
-
 	head = page_address(rq->alloc_frag.page);
 
 	offset = buf - head;
@@ -939,44 +931,42 @@ static void *virtnet_rq_alloc(struct receive_queue *rq, u32 size, gfp_t gfp)
 
 	head = page_address(alloc_frag->page);
 
-	if (rq->do_dma) {
-		dma = head;
-
-		/* new pages */
-		if (!alloc_frag->offset) {
-			if (rq->last_dma) {
-				/* Now, the new page is allocated, the last dma
-				 * will not be used. So the dma can be unmapped
-				 * if the ref is 0.
-				 */
-				virtnet_rq_unmap(rq, rq->last_dma, 0);
-				rq->last_dma = NULL;
-			}
+	dma = head;
 
-			dma->len = alloc_frag->size - sizeof(*dma);
+	/* new pages */
+	if (!alloc_frag->offset) {
+		if (rq->last_dma) {
+			/* Now, the new page is allocated, the last dma
+			 * will not be used. So the dma can be unmapped
+			 * if the ref is 0.
+			 */
+			virtnet_rq_unmap(rq, rq->last_dma, 0);
+			rq->last_dma = NULL;
+		}
 
-			addr = virtqueue_dma_map_single_attrs(rq->vq, dma + 1,
-							      dma->len, DMA_FROM_DEVICE, 0);
-			if (virtqueue_dma_mapping_error(rq->vq, addr))
-				return NULL;
+		dma->len = alloc_frag->size - sizeof(*dma);
 
-			dma->addr = addr;
-			dma->need_sync = virtqueue_dma_need_sync(rq->vq, addr);
+		addr = virtqueue_dma_map_single_attrs(rq->vq, dma + 1,
+						      dma->len, DMA_FROM_DEVICE, 0);
+		if (virtqueue_dma_mapping_error(rq->vq, addr))
+			return NULL;
 
-			/* Add a reference to dma to prevent the entire dma from
-			 * being released during error handling. This reference
-			 * will be freed after the pages are no longer used.
-			 */
-			get_page(alloc_frag->page);
-			dma->ref = 1;
-			alloc_frag->offset = sizeof(*dma);
+		dma->addr = addr;
+		dma->need_sync = virtqueue_dma_need_sync(rq->vq, addr);
 
-			rq->last_dma = dma;
-		}
+		/* Add a reference to dma to prevent the entire dma from
+		 * being released during error handling. This reference
+		 * will be freed after the pages are no longer used.
+		 */
+		get_page(alloc_frag->page);
+		dma->ref = 1;
+		alloc_frag->offset = sizeof(*dma);
 
-		++dma->ref;
+		rq->last_dma = dma;
 	}
 
+	++dma->ref;
+
 	buf = head + alloc_frag->offset;
 
 	get_page(alloc_frag->page);
@@ -2452,8 +2442,7 @@ static int add_recvbuf_small(struct virtnet_info *vi, struct receive_queue *rq,
 
 	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
 	if (err < 0) {
-		if (rq->do_dma)
-			virtnet_rq_unmap(rq, buf, 0);
+		virtnet_rq_unmap(rq, buf, 0);
 		put_page(virt_to_head_page(buf));
 	}
 
@@ -2573,8 +2562,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
 	ctx = mergeable_len_to_ctx(len + room, headroom);
 	err = virtqueue_add_inbuf_ctx(rq->vq, rq->sg, 1, buf, ctx, gfp);
 	if (err < 0) {
-		if (rq->do_dma)
-			virtnet_rq_unmap(rq, buf, 0);
+		virtnet_rq_unmap(rq, buf, 0);
 		put_page(virt_to_head_page(buf));
 	}
 
@@ -5948,7 +5936,7 @@ static void free_receive_page_frags(struct virtnet_info *vi)
 	int i;
 	for (i = 0; i < vi->max_queue_pairs; i++)
 		if (vi->rq[i].alloc_frag.page) {
-			if (vi->rq[i].do_dma && vi->rq[i].last_dma)
+			if (vi->rq[i].last_dma)
 				virtnet_rq_unmap(&vi->rq[i], vi->rq[i].last_dma, 0);
 			put_page(vi->rq[i].alloc_frag.page);
 		}
@@ -6141,11 +6129,9 @@ static void virtnet_rq_set_premapped(struct virtnet_info *vi)
 	if (vi->mode == VIRTNET_MODE_BIG)
 		return;
 
-	for (i = 0; i < vi->max_queue_pairs; i++) {
+	for (i = 0; i < vi->max_queue_pairs; i++)
 		/* error should never happen */
 		BUG_ON(virtqueue_set_dma_premapped(vi->rq[i].vq));
-		vi->rq[i].do_dma = true;
-	}
 }
 
 static int init_vqs(struct virtnet_info *vi)