Message ID | 20230216083047.93525-1-xuanzhuo@linux.alibaba.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 9f78bf330a66cd400b3e00f370f597e9fa939207 |
Delegated to: | Netdev Maintainers |
Series | [net-next,v4] xsk: support use vaddr as ring |
From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Date: Thu, 16 Feb 2023 16:30:47 +0800

> When we try to start AF_XDP on machines with a long uptime, memory
> fragmentation means there may be no sufficient contiguous physical
> memory, which causes the start to fail.
>
> If the size of the queue is 8 * 1024, then the size of desc[] is
> 8 * 1024 * 8 bytes = 16 pages, and with the struct xdp_ring header
> added it is a bit more than 16 pages. That requires an order-5
> allocation. If there are a lot of queues, this is hard to satisfy on
> machines with a long uptime.
>
> We also waste 15 pages here: an order-5 allocation is 32 pages, but we
> only use 17 of them.
>
> This patch replaces __get_free_pages() with vmalloc() to allocate the
> memory and solve these problems.
>
> Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> ---

[...]

> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index c6fb6b763658..bfb2a7e50c26 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -45,6 +45,7 @@ struct xsk_queue {
>  	struct xdp_ring *ring;
>  	u64 invalid_descs;
>  	u64 queue_empty_descs;
> +	size_t ring_vmalloc_size;

The name looks a bit long to me, but that might be just personal
preference. The code itself now looks good to me.

Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>

>  };
>
>  /* The structure of the shared state of the rings are a simple

Next time pls make sure you added all of the reviewers to the Cc list
when sending a new revision. I noticed you posted v4 only by monitoring
the ML.

Thanks,
Olek
On Thu, 16 Feb 2023 14:04:47 +0100, Alexander Lobakin <aleksander.lobakin@intel.com> wrote:
> From: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> Date: Thu, 16 Feb 2023 16:30:47 +0800
>
> > When we try to start AF_XDP on machines with a long uptime, memory
> > fragmentation means there may be no sufficient contiguous physical
> > memory, which causes the start to fail.
> >
> > If the size of the queue is 8 * 1024, then the size of desc[] is
> > 8 * 1024 * 8 bytes = 16 pages, and with the struct xdp_ring header
> > added it is a bit more than 16 pages. That requires an order-5
> > allocation. If there are a lot of queues, this is hard to satisfy
> > on machines with a long uptime.
> >
> > We also waste 15 pages here: an order-5 allocation is 32 pages, but
> > we only use 17 of them.
> >
> > This patch replaces __get_free_pages() with vmalloc() to allocate
> > the memory and solve these problems.
> >
> > Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> > ---
>
> [...]
>
> > diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> > index c6fb6b763658..bfb2a7e50c26 100644
> > --- a/net/xdp/xsk_queue.h
> > +++ b/net/xdp/xsk_queue.h
> > @@ -45,6 +45,7 @@ struct xsk_queue {
> >  	struct xdp_ring *ring;
> >  	u64 invalid_descs;
> >  	u64 queue_empty_descs;
> > +	size_t ring_vmalloc_size;
>
> The name looks a bit long to me, but that might be just personal
> preference. The code itself now looks good to me.
>
> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>
> >  };
> >
> >  /* The structure of the shared state of the rings are a simple
>
> Next time pls make sure you added all of the reviewers to the Cc list
> when sending a new revision. I noticed you posted v4 only by
> monitoring the ML.

Oh, sorry. I always thought you were on the list and did not notice
this situation. I will pay attention next time.

Thank you for your reply.

Thanks.

> Thanks,
> Olek
Hello:

This patch was applied to netdev/net-next.git (master)
by David S. Miller <davem@davemloft.net>:

On Thu, 16 Feb 2023 16:30:47 +0800 you wrote:
> When we try to start AF_XDP on machines with a long uptime, memory
> fragmentation means there may be no sufficient contiguous physical
> memory, which causes the start to fail.
>
> If the size of the queue is 8 * 1024, then the size of desc[] is
> 8 * 1024 * 8 bytes = 16 pages, and with the struct xdp_ring header
> added it is a bit more than 16 pages. That requires an order-5
> allocation. If there are a lot of queues, this is hard to satisfy on
> machines with a long uptime.
>
> [...]

Here is the summary with links:
  - [net-next,v4] xsk: support use vaddr as ring
    https://git.kernel.org/netdev/net-next/c/9f78bf330a66

You are awesome, thank you!
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 9f0561b67c12..0a047a09a10f 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1295,8 +1295,6 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 	unsigned long size = vma->vm_end - vma->vm_start;
 	struct xdp_sock *xs = xdp_sk(sock->sk);
 	struct xsk_queue *q = NULL;
-	unsigned long pfn;
-	struct page *qpg;
 
 	if (READ_ONCE(xs->state) != XSK_READY)
 		return -EBUSY;
@@ -1319,13 +1317,10 @@ static int xsk_mmap(struct file *file, struct socket *sock,
 	/* Matches the smp_wmb() in xsk_init_queue */
 	smp_rmb();
-	qpg = virt_to_head_page(q->ring);
-	if (size > page_size(qpg))
+	if (size > q->ring_vmalloc_size)
 		return -EINVAL;
 
-	pfn = virt_to_phys(q->ring) >> PAGE_SHIFT;
-	return remap_pfn_range(vma, vma->vm_start, pfn,
-			       size, vma->vm_page_prot);
+	return remap_vmalloc_range(vma, q->ring, 0);
 }
 
 static int xsk_notifier(struct notifier_block *this,
diff --git a/net/xdp/xsk_queue.c b/net/xdp/xsk_queue.c
index 6cf9586e5027..f8905400ee07 100644
--- a/net/xdp/xsk_queue.c
+++ b/net/xdp/xsk_queue.c
@@ -6,6 +6,7 @@
 #include <linux/log2.h>
 #include <linux/slab.h>
 #include <linux/overflow.h>
+#include <linux/vmalloc.h>
 #include <net/xdp_sock_drv.h>
 
 #include "xsk_queue.h"
@@ -23,7 +24,6 @@ static size_t xskq_get_ring_size(struct xsk_queue *q, bool umem_queue)
 struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
 {
 	struct xsk_queue *q;
-	gfp_t gfp_flags;
 	size_t size;
 
 	q = kzalloc(sizeof(*q), GFP_KERNEL);
@@ -33,17 +33,16 @@ struct xsk_queue *xskq_create(u32 nentries, bool umem_queue)
 	q->nentries = nentries;
 	q->ring_mask = nentries - 1;
 
-	gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
-		    __GFP_COMP | __GFP_NORETRY;
 	size = xskq_get_ring_size(q, umem_queue);
+	size = PAGE_ALIGN(size);
 
-	q->ring = (struct xdp_ring *)__get_free_pages(gfp_flags,
-						      get_order(size));
+	q->ring = vmalloc_user(size);
 	if (!q->ring) {
 		kfree(q);
 		return NULL;
 	}
 
+	q->ring_vmalloc_size = size;
 	return q;
 }
 
@@ -52,6 +51,6 @@ void xskq_destroy(struct xsk_queue *q)
 	if (!q)
 		return;
 
-	page_frag_free(q->ring);
+	vfree(q->ring);
 	kfree(q);
 }
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index c6fb6b763658..bfb2a7e50c26 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -45,6 +45,7 @@ struct xsk_queue {
 	struct xdp_ring *ring;
 	u64 invalid_descs;
 	u64 queue_empty_descs;
+	size_t ring_vmalloc_size;
 };
 
 /* The structure of the shared state of the rings are a simple