From patchwork Thu Feb 29 07:20:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xuan Zhuo X-Patchwork-Id: 13576694 Received: from out30-101.freemail.mail.aliyun.com (out30-101.freemail.mail.aliyun.com [115.124.30.101]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C8F64D9E3; Thu, 29 Feb 2024 07:20:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=115.124.30.101 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709191262; cv=none; b=SspdAuKPQ8h8/U1LebyK+wTZcMfCy1pvbQ9aMMqm3Fn3y4fgco8aaRsF1oqI09l5lmYEXBjW+U0uGKspVZN4t1MujyKtFyRRqDNSRPMV1T+RW0DMGEYQVxGed/7gGpffD2Po78kLzvFviC/VoaXIzMiM9iyFbn0NKN2xiXRO4Jg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709191262; c=relaxed/simple; bh=5ypWt7yf7hDToKKXL1Vw4VBylzx3H8eAF2+EmY2vrkA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=g5j7wsYoZwiRxkjzKvfmp55EbosIlVtu3VkV8Ctvwh3AvAiKbvT8NIss8V/2/ZK9PQPbtx4+G3thSiEaZBDn+8N66ei3WRf/81s86rIn7gFo55KDjk88YzZpkUE/8sH/cEBXgU7XwT6mDMJD3Z7gOFACVjeg0Jo/pSEhH1YbH8I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com; spf=pass smtp.mailfrom=linux.alibaba.com; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b=q9sa/G5n; arc=none smtp.client-ip=115.124.30.101 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.alibaba.com header.i=@linux.alibaba.com header.b="q9sa/G5n" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.alibaba.com; s=default; t=1709191257; h=From:To:Subject:Date:Message-Id:MIME-Version; bh=rZEA3HZfZtKV/P1RBQjdCvHf1tsCYt8NUzyvMGlh/fw=; b=q9sa/G5n//buskSUQyRscaYRWoRyH6myMBMZNa2vDLXCbuz4y7tNU0BuOlF4LQPiF3rF/c9MASZkJ9OloYB0QbK6KIpHn0j/+xYImShTi6iRlk0adSWsk1543Mu14pYywWnqycHS/YcpI+8hkPVulsK2XObFnXs3aCP7qSWNTVs= X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R951e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046060;MF=xuanzhuo@linux.alibaba.com;NM=1;PH=DS;RN=35;SR=0;TI=SMTPD_---0W1SDP6q_1709191252; Received: from localhost(mailfrom:xuanzhuo@linux.alibaba.com fp:SMTPD_---0W1SDP6q_1709191252) by smtp.aliyun-inc.com; Thu, 29 Feb 2024 15:20:54 +0800 From: Xuan Zhuo To: virtualization@lists.linux.dev Cc: Richard Weinberger , Anton Ivanov , Johannes Berg , "Michael S. Tsirkin" , Jason Wang , Xuan Zhuo , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Hans de Goede , =?utf-8?q?Ilpo_J=C3=A4rvinen?= , Vadim Pasternak , Bjorn Andersson , Mathieu Poirier , Cornelia Huck , Halil Pasic , Eric Farman , Heiko Carstens , Vasily Gorbik , Alexander Gordeev , Christian Borntraeger , Sven Schnelle , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , linux-um@lists.infradead.org, netdev@vger.kernel.org, platform-driver-x86@vger.kernel.org, linux-remoteproc@vger.kernel.org, linux-s390@vger.kernel.org, kvm@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH vhost v3 05/19] virtio_ring: split: structure the indirect desc table Date: Thu, 29 Feb 2024 15:20:30 +0800 Message-Id: <20240229072044.77388-6-xuanzhuo@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240229072044.77388-1-xuanzhuo@linux.alibaba.com> References: <20240229072044.77388-1-xuanzhuo@linux.alibaba.com> Precedence: bulk X-Mailing-List: bpf@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Git-Hash: e3a3e51d6b70 This commit structure the indirect desc table. Then we can get the desc num directly when doing unmap. And save the dma info to the struct, then the indirect will not use the dma fields of the desc_extra. The subsequent commits will make the dma fields are optional. But for the indirect case, we must record the dma info. Signed-off-by: Xuan Zhuo --- drivers/virtio/virtio_ring.c | 91 ++++++++++++++++++++++-------------- 1 file changed, 55 insertions(+), 36 deletions(-) diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c index 194bd877ab5a..9680b3827d5b 100644 --- a/drivers/virtio/virtio_ring.c +++ b/drivers/virtio/virtio_ring.c @@ -67,9 +67,16 @@ #define LAST_ADD_TIME_INVALID(vq) #endif +struct vring_split_desc_indir { + dma_addr_t addr; /* Descriptor Array DMA addr. */ + u32 len; /* Descriptor Array length. */ + u32 num; + struct vring_desc desc[]; +}; + struct vring_desc_state_split { void *data; /* Data for callback. */ - struct vring_desc *indir_desc; /* Indirect descriptor, if any. */ + struct vring_split_desc_indir *indir_desc; /* Indirect descriptor, if any. */ }; struct vring_packed_desc_indir { @@ -478,12 +485,16 @@ static unsigned int vring_unmap_one_split(const struct vring_virtqueue *vq, return extra[i].next; } -static struct vring_desc *alloc_indirect_split(struct virtqueue *_vq, - unsigned int total_sg, - gfp_t gfp) +static struct vring_split_desc_indir *alloc_indirect_split(struct virtqueue *_vq, + unsigned int total_sg, + gfp_t gfp) { + struct vring_split_desc_indir *in_desc; struct vring_desc *desc; unsigned int i; + u32 size; + + size = struct_size(in_desc, desc, total_sg); /* * We require lowmem mappings for the descriptors because @@ -492,13 +503,16 @@ static struct vring_desc *alloc_indirect_split(struct virtqueue *_vq, */ gfp &= ~__GFP_HIGHMEM; - desc = kmalloc_array(total_sg, sizeof(struct vring_desc), gfp); - if (!desc) + in_desc = kmalloc(size, gfp); + if (!in_desc) return NULL; + desc = in_desc->desc; + for (i = 0; i < total_sg; i++) desc[i].next = cpu_to_virtio16(_vq->vdev, i + 1); - return desc; + + return in_desc; } static inline unsigned int virtqueue_add_desc_split(struct virtqueue *vq, @@ -540,6 +554,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, gfp_t gfp) { struct vring_virtqueue *vq = to_vvq(_vq); + struct vring_split_desc_indir *in_desc; struct scatterlist *sg; struct vring_desc *desc; unsigned int i, n, avail, descs_used, prev, err_idx; @@ -562,9 +577,13 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, head = vq->free_head; - if (virtqueue_use_indirect(vq, total_sg)) - desc = alloc_indirect_split(_vq, total_sg, gfp); - else { + if (virtqueue_use_indirect(vq, total_sg)) { + in_desc = alloc_indirect_split(_vq, total_sg, gfp); + if (!in_desc) + desc = NULL; + else + desc = in_desc->desc; + } else { desc = NULL; WARN_ON_ONCE(total_sg > vq->split.vring.num && !vq->indirect); } @@ -637,10 +656,10 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, ~VRING_DESC_F_NEXT; if (indirect) { + u32 size = total_sg * sizeof(struct vring_desc); + /* Now that the indirect table is filled in, map it. */ - dma_addr_t addr = vring_map_single( - vq, desc, total_sg * sizeof(struct vring_desc), - DMA_TO_DEVICE); + dma_addr_t addr = vring_map_single(vq, desc, size, DMA_TO_DEVICE); if (vring_mapping_error(vq, addr)) { if (!vring_need_unmap_buffer(vq)) goto free_indirect; @@ -648,11 +667,20 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, goto unmap_release; } - virtqueue_add_desc_split(_vq, vq->split.vring.desc, - head, addr, - total_sg * sizeof(struct vring_desc), - VRING_DESC_F_INDIRECT, - false); + desc = &vq->split.vring.desc[head]; + + desc->flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_INDIRECT); + desc->addr = cpu_to_virtio64(_vq->vdev, addr); + desc->len = cpu_to_virtio32(_vq->vdev, size); + + vq->split.desc_extra[head].flags = VRING_DESC_F_INDIRECT; + + if (vq->use_dma_api) { + in_desc->addr = addr; + in_desc->len = size; + } + + in_desc->num = total_sg; } /* We're using some buffers from the free list. */ @@ -667,7 +695,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, /* Store token and indirect buffer state. */ vq->split.desc_state[head].data = data; if (indirect) - vq->split.desc_state[head].indir_desc = desc; + vq->split.desc_state[head].indir_desc = in_desc; else vq->split.desc_state[head].indir_desc = ctx; @@ -717,7 +745,7 @@ static inline int virtqueue_add_split(struct virtqueue *_vq, free_indirect: if (indirect) - kfree(desc); + kfree(in_desc); END_USE(vq); return -ENOMEM; @@ -782,32 +810,23 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head, if (ctx) *ctx = vq->split.desc_state[head].indir_desc; } else { - struct vring_desc *indir_desc = - vq->split.desc_state[head].indir_desc; - u32 len; + struct vring_split_desc_indir *in_desc; - if (vq->use_dma_api) { - struct vring_desc_extra *extra = vq->split.desc_extra; + in_desc = vq->split.desc_state[head].indir_desc; + if (vq->use_dma_api) { dma_unmap_single(vring_dma_dev(vq), - extra[i].addr, - extra[i].len, + in_desc->addr, in_desc->len, (flags & VRING_DESC_F_WRITE) ? DMA_FROM_DEVICE : DMA_TO_DEVICE); } - len = vq->split.desc_extra[head].len; - - BUG_ON(!(vq->split.desc_extra[head].flags & - VRING_DESC_F_INDIRECT)); - BUG_ON(len == 0 || len % sizeof(struct vring_desc)); - if (vring_need_unmap_buffer(vq)) { - for (j = 0; j < len / sizeof(struct vring_desc); j++) - vring_unmap_one_split_indirect(vq, &indir_desc[j]); + for (j = 0; j < in_desc->num; j++) + vring_unmap_one_split_indirect(vq, &in_desc->desc[j]); } - kfree(indir_desc); + kfree(in_desc); vq->split.desc_state[head].indir_desc = NULL; }