Message ID | 20240224090630.605917-1-edumazet@google.com (mailing list archive) |
---|---|
State | Accepted |
Commit | f8cbf6bde4c8d5d32330bcceafa7b139fec89f97 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() | expand |
On 2024/2/24 17:06, Eric Dumazet wrote: > This is a followup of commit 234ec0b6034b ("netlink: fix potential > sleeping issue in mqueue_flush_file"), because vfree_atomic() > overhead is unfortunate for medium sized allocations. > > 1) If the allocation is smaller than PAGE_SIZE, do not bother > with vmalloc() at all. Some arches have 64KB PAGE_SIZE, > while NLMSG_GOODSIZE is smaller than 8KB. > > 2) Use kvmalloc(), which might allocate one high order page > instead of vmalloc if memory is not too fragmented. > > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Zhengchao Shao <shaozhengchao@huawei.com> > --- > net/netlink/af_netlink.c | 18 ++++++++---------- > 1 file changed, 8 insertions(+), 10 deletions(-) > > diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c > index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644 > --- a/net/netlink/af_netlink.c > +++ b/net/netlink/af_netlink.c > @@ -1206,23 +1206,21 @@ struct sock *netlink_getsockbyfilp(struct file *filp) > > struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast) > { > + size_t head_size = SKB_HEAD_ALIGN(size); > struct sk_buff *skb; > void *data; > > - if (size <= NLMSG_GOODSIZE || broadcast) > + if (head_size <= PAGE_SIZE || broadcast) > return alloc_skb(size, GFP_KERNEL); > > - size = SKB_DATA_ALIGN(size) + > - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > - > - data = vmalloc(size); > - if (data == NULL) > + data = kvmalloc(head_size, GFP_KERNEL); > + if (!data) > return NULL; > > - skb = __build_skb(data, size); > - if (skb == NULL) > - vfree(data); > - else > + skb = __build_skb(data, head_size); > + if (!skb) > + kvfree(data); > + else if (is_vmalloc_addr(data)) > skb->destructor = netlink_skb_destructor; > > return skb; LGTM, thanks. Reviewed-by: Zhengchao Shao <shaozhengchao@huawei.com>
On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote: > struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast) > { > + size_t head_size = SKB_HEAD_ALIGN(size); > struct sk_buff *skb; > void *data; > > - if (size <= NLMSG_GOODSIZE || broadcast) > + if (head_size <= PAGE_SIZE || broadcast) > return alloc_skb(size, GFP_KERNEL); > > - size = SKB_DATA_ALIGN(size) + > - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > - > - data = vmalloc(size); > - if (data == NULL) > + data = kvmalloc(head_size, GFP_KERNEL); > + if (!data) > return NULL; > > - skb = __build_skb(data, size); > - if (skb == NULL) > - vfree(data); > - else > + skb = __build_skb(data, head_size); Is this going to work with KFENCE? Don't we need similar size adjustment logic as we have in __slab_build_skb() ? > + if (!skb) > + kvfree(data); > + else if (is_vmalloc_addr(data)) > skb->destructor = netlink_skb_destructor;
On Tue, Feb 27, 2024 at 6:52 PM Jakub Kicinski <kuba@kernel.org> wrote: > > On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote: > > struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast) > > { > > + size_t head_size = SKB_HEAD_ALIGN(size); > > struct sk_buff *skb; > > void *data; > > > > - if (size <= NLMSG_GOODSIZE || broadcast) > > + if (head_size <= PAGE_SIZE || broadcast) > > return alloc_skb(size, GFP_KERNEL); > > > > - size = SKB_DATA_ALIGN(size) + > > - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); > > - > > - data = vmalloc(size); > > - if (data == NULL) > > + data = kvmalloc(head_size, GFP_KERNEL); > > + if (!data) > > return NULL; > > > > - skb = __build_skb(data, size); > > - if (skb == NULL) > > - vfree(data); > > - else > > + skb = __build_skb(data, head_size); > > Is this going to work with KFENCE? Don't we need similar size > adjustment logic as we have in __slab_build_skb() ? Note that the 2nd argument of __build_skb() has not been changed by my patch. SKB_HEAD_ALIGN(size) == SKB_DATA_ALIGN(size) + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); I do not expect kfence being a problem here ? Either data is vmalloc, and the patch is a no-op, either it is kmalloc(), and __build_skb() does nothing special, kfence magic already happened. > > > + if (!skb) > > + kvfree(data); Note that skb->head at this point must be equal to @data > > + else if (is_vmalloc_addr(data)) > > skb->destructor = netlink_skb_destructor;
Hello: This patch was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba@kernel.org>: On Sat, 24 Feb 2024 09:06:30 +0000 you wrote: > This is a followup of commit 234ec0b6034b ("netlink: fix potential > sleeping issue in mqueue_flush_file"), because vfree_atomic() > overhead is unfortunate for medium sized allocations. > > 1) If the allocation is smaller than PAGE_SIZE, do not bother > with vmalloc() at all. Some arches have 64KB PAGE_SIZE, > while NLMSG_GOODSIZE is smaller than 8KB. > > [...] Here is the summary with links: - [net-next] netlink: use kvmalloc() in netlink_alloc_large_skb() https://git.kernel.org/netdev/net-next/c/f8cbf6bde4c8 You are awesome, thank you!
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1206,23 +1206,21 @@ struct sock *netlink_getsockbyfilp(struct file *filp) struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast) { + size_t head_size = SKB_HEAD_ALIGN(size); struct sk_buff *skb; void *data; - if (size <= NLMSG_GOODSIZE || broadcast) + if (head_size <= PAGE_SIZE || broadcast) return alloc_skb(size, GFP_KERNEL); - size = SKB_DATA_ALIGN(size) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - - data = vmalloc(size); - if (data == NULL) + data = kvmalloc(head_size, GFP_KERNEL); + if (!data) return NULL; - skb = __build_skb(data, size); - if (skb == NULL) - vfree(data); - else + skb = __build_skb(data, head_size); + if (!skb) + kvfree(data); + else if (is_vmalloc_addr(data)) skb->destructor = netlink_skb_destructor; return skb;
This is a followup of commit 234ec0b6034b ("netlink: fix potential sleeping issue in mqueue_flush_file"), because vfree_atomic() overhead is unfortunate for medium sized allocations. 1) If the allocation is smaller than PAGE_SIZE, do not bother with vmalloc() at all. Some arches have 64KB PAGE_SIZE, while NLMSG_GOODSIZE is smaller than 8KB. 2) Use kvmalloc(), which might allocate one high order page instead of vmalloc if memory is not too fragmented. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Zhengchao Shao <shaozhengchao@huawei.com> --- net/netlink/af_netlink.c | 18 ++++++++---------- 1 file changed, 8 insertions(+), 10 deletions(-)