[net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()

Message ID 20240224090630.605917-1-edumazet@google.com (mailing list archive)
State Accepted
Commit f8cbf6bde4c8d5d32330bcceafa7b139fec89f97
Delegated to: Netdev Maintainers

Checks

Context                          Check     Description
netdev/series_format             success   Single patches do not need cover letters
netdev/tree_selection            success   Clearly marked for net-next
netdev/ynl                       success   Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present             success   Fixes tag not required for -next series
netdev/header_inline             success   No static functions without inline keyword in header files
netdev/build_32bit               success   Errors and warnings before: 945 this patch: 945
netdev/build_tools               success   No tools touched, skip
netdev/cc_maintainers            success   CCed 4 of 4 maintainers
netdev/build_clang               success   Errors and warnings before: 958 this patch: 958
netdev/verify_signedoff          success   Signed-off-by tag matches author and committer
netdev/deprecated_api            success   None detected
netdev/check_selftest            success   No net selftest shell script
netdev/verify_fixes              success   No Fixes tag
netdev/build_allmodconfig_warn   success   Errors and warnings before: 962 this patch: 962
netdev/checkpatch                success   total: 0 errors, 0 warnings, 0 checks, 31 lines checked
netdev/build_clang_rust          success   No Rust files in patch. Skipping build
netdev/kdoc                      success   Errors and warnings before: 0 this patch: 0
netdev/source_inline             success   Was 0 now: 0
netdev/contest                   success   net-next-2024-02-25--03-00 (tests: 1457)

Commit Message

Eric Dumazet Feb. 24, 2024, 9:06 a.m. UTC
This is a followup of commit 234ec0b6034b ("netlink: fix potential
sleeping issue in mqueue_flush_file"), because vfree_atomic()
overhead is unfortunate for medium sized allocations.

1) If the allocation is smaller than PAGE_SIZE, do not bother
   with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
   while NLMSG_GOODSIZE is smaller than 8KB.

2) Use kvmalloc(), which might allocate one high order page
   instead of vmalloc if memory is not too fragmented.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Zhengchao Shao <shaozhengchao@huawei.com>
---
 net/netlink/af_netlink.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)
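
Editorial illustration (not part of the patch): the size check described in
the commit message can be modelled in userspace. This is a minimal sketch
under stated assumptions. MODEL_CACHE_BYTES and MODEL_SHINFO_SIZE are
hypothetical stand-ins for SMP_CACHE_BYTES and
sizeof(struct skb_shared_info), whose real values depend on architecture
and kernel configuration, and every model_* name is invented for this
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; the kernel's differ per arch/config. */
#define MODEL_CACHE_BYTES 64u   /* stand-in for SMP_CACHE_BYTES */
#define MODEL_SHINFO_SIZE 320u  /* stand-in for sizeof(struct skb_shared_info) */

static_assert((MODEL_CACHE_BYTES & (MODEL_CACHE_BYTES - 1)) == 0,
	      "alignment must be a power of two");

enum alloc_path { PATH_ALLOC_SKB, PATH_KVMALLOC };

/* Models SKB_DATA_ALIGN(): round up to the assumed cache line size. */
static size_t model_data_align(size_t x)
{
	return (x + MODEL_CACHE_BYTES - 1) & ~((size_t)MODEL_CACHE_BYTES - 1);
}

/* Models SKB_HEAD_ALIGN(): aligned payload plus aligned shared info. */
static size_t model_head_align(size_t size)
{
	return model_data_align(size) + model_data_align(MODEL_SHINFO_SIZE);
}

/* Mirrors the patched condition: small heads and broadcasts use alloc_skb(). */
static enum alloc_path model_pick_path(size_t size, size_t page_size,
				       bool broadcast)
{
	size_t head_size = model_head_align(size);

	if (head_size <= page_size || broadcast)
		return PATH_ALLOC_SKB;
	return PATH_KVMALLOC;
}
```

With these assumed values, model_pick_path(8000, 4096, false) selects the
kvmalloc() path, while the same request with a 64KB page size
(model_pick_path(8000, 65536, false)) stays on alloc_skb(), which is the
situation point 1) of the commit message describes.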

Comments

shaozhengchao Feb. 26, 2024, 1:33 a.m. UTC | #1
On 2024/2/24 17:06, Eric Dumazet wrote:
> This is a followup of commit 234ec0b6034b ("netlink: fix potential
> sleeping issue in mqueue_flush_file"), because vfree_atomic()
> overhead is unfortunate for medium sized allocations.
> 
> 1) If the allocation is smaller than PAGE_SIZE, do not bother
>     with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
>     while NLMSG_GOODSIZE is smaller than 8KB.
> 
> 2) Use kvmalloc(), which might allocate one high order page
>     instead of vmalloc if memory is not too fragmented.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Zhengchao Shao <shaozhengchao@huawei.com>
> ---
>   net/netlink/af_netlink.c | 18 ++++++++----------
>   1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -1206,23 +1206,21 @@ struct sock *netlink_getsockbyfilp(struct file *filp)
>   
>   struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
>   {
> +	size_t head_size = SKB_HEAD_ALIGN(size);
>   	struct sk_buff *skb;
>   	void *data;
>   
> -	if (size <= NLMSG_GOODSIZE || broadcast)
> +	if (head_size <= PAGE_SIZE || broadcast)
>   		return alloc_skb(size, GFP_KERNEL);
>   
> -	size = SKB_DATA_ALIGN(size) +
> -	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> -
> -	data = vmalloc(size);
> -	if (data == NULL)
> +	data = kvmalloc(head_size, GFP_KERNEL);
> +	if (!data)
>   		return NULL;
>   
> -	skb = __build_skb(data, size);
> -	if (skb == NULL)
> -		vfree(data);
> -	else
> +	skb = __build_skb(data, head_size);
> +	if (!skb)
> +		kvfree(data);
> +	else if (is_vmalloc_addr(data))
>   		skb->destructor = netlink_skb_destructor;
>   
>   	return skb;
LGTM, thanks.

Reviewed-by: Zhengchao Shao <shaozhengchao@huawei.com>

Jakub Kicinski Feb. 27, 2024, 5:52 p.m. UTC | #2
On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote:
>  struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
>  {
> +	size_t head_size = SKB_HEAD_ALIGN(size);
>  	struct sk_buff *skb;
>  	void *data;
>  
> -	if (size <= NLMSG_GOODSIZE || broadcast)
> +	if (head_size <= PAGE_SIZE || broadcast)
>  		return alloc_skb(size, GFP_KERNEL);
>  
> -	size = SKB_DATA_ALIGN(size) +
> -	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> -
> -	data = vmalloc(size);
> -	if (data == NULL)
> +	data = kvmalloc(head_size, GFP_KERNEL);
> +	if (!data)
>  		return NULL;
>  
> -	skb = __build_skb(data, size);
> -	if (skb == NULL)
> -		vfree(data);
> -	else
> +	skb = __build_skb(data, head_size);

Is this going to work with KFENCE? Don't we need similar size
adjustment logic as we have in __slab_build_skb() ?

> +	if (!skb)
> +		kvfree(data);
> +	else if (is_vmalloc_addr(data))
>  		skb->destructor = netlink_skb_destructor;

Eric Dumazet Feb. 27, 2024, 6:15 p.m. UTC | #3
On Tue, Feb 27, 2024 at 6:52 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Sat, 24 Feb 2024 09:06:30 +0000 Eric Dumazet wrote:
> >  struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
> >  {
> > +     size_t head_size = SKB_HEAD_ALIGN(size);
> >       struct sk_buff *skb;
> >       void *data;
> >
> > -     if (size <= NLMSG_GOODSIZE || broadcast)
> > +     if (head_size <= PAGE_SIZE || broadcast)
> >               return alloc_skb(size, GFP_KERNEL);
> >
> > -     size = SKB_DATA_ALIGN(size) +
> > -            SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> > -
> > -     data = vmalloc(size);
> > -     if (data == NULL)
> > +     data = kvmalloc(head_size, GFP_KERNEL);
> > +     if (!data)
> >               return NULL;
> >
> > -     skb = __build_skb(data, size);
> > -     if (skb == NULL)
> > -             vfree(data);
> > -     else
> > +     skb = __build_skb(data, head_size);
>
> Is this going to work with KFENCE? Don't we need similar size
> adjustment logic as we have in __slab_build_skb() ?

Note that the 2nd argument of __build_skb() has not been changed by my patch.

  SKB_HEAD_ALIGN(size) == SKB_DATA_ALIGN(size) +
                          SKB_DATA_ALIGN(sizeof(struct skb_shared_info));

I do not expect KFENCE to be a problem here.

Either data comes from vmalloc(), and the patch is a no-op,
or it comes from kmalloc(), and __build_skb() does nothing special:
the KFENCE magic has already happened.
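
Editorial illustration (not part of the reply): the identity quoted above
can be checked with a small userspace model. This is a sketch under stated
assumptions. MODEL_CACHE_BYTES and MODEL_SHINFO_SIZE are hypothetical
stand-ins for SMP_CACHE_BYTES and sizeof(struct skb_shared_info), and the
MODEL_* macros and build_size_* helpers are names invented here, not kernel
symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; the kernel's differ per arch/config. */
#define MODEL_CACHE_BYTES 64u   /* stand-in for SMP_CACHE_BYTES */
#define MODEL_SHINFO_SIZE 320u  /* stand-in for sizeof(struct skb_shared_info) */

static_assert((MODEL_CACHE_BYTES & (MODEL_CACHE_BYTES - 1)) == 0,
	      "alignment must be a power of two");

/* Models of SKB_DATA_ALIGN() and SKB_HEAD_ALIGN(). */
#define MODEL_DATA_ALIGN(x) \
	(((size_t)(x) + MODEL_CACHE_BYTES - 1) & ~((size_t)MODEL_CACHE_BYTES - 1))
#define MODEL_HEAD_ALIGN(x) \
	(MODEL_DATA_ALIGN(x) + MODEL_DATA_ALIGN(MODEL_SHINFO_SIZE))

/* Size handed to __build_skb() before the patch (open-coded sum). */
static size_t build_size_before(size_t size)
{
	size = MODEL_DATA_ALIGN(size) + MODEL_DATA_ALIGN(MODEL_SHINFO_SIZE);
	return size;
}

/* Size handed to __build_skb() after the patch (head_size). */
static size_t build_size_after(size_t size)
{
	return MODEL_HEAD_ALIGN(size);
}

/* Returns 1 if both computations agree for every size in [lo, hi]. */
static int build_sizes_agree(size_t lo, size_t hi)
{
	for (size_t s = lo; s <= hi; s++)
		if (build_size_before(s) != build_size_after(s))
			return 0;
	return 1;
}
```

Under these assumptions a 5000-byte request gives 5056 + 320 = 5376 bytes
with either computation, so only the allocator behind the buffer changes,
not the size passed to __build_skb().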

>
> > +     if (!skb)
> > +             kvfree(data);

Note that skb->head at this point must be equal to @data

> > +     else if (is_vmalloc_addr(data))
> >               skb->destructor = netlink_skb_destructor;

patchwork-bot+netdevbpf@kernel.org Feb. 27, 2024, 7:20 p.m. UTC | #4
Hello:

This patch was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Sat, 24 Feb 2024 09:06:30 +0000 you wrote:
> This is a followup of commit 234ec0b6034b ("netlink: fix potential
> sleeping issue in mqueue_flush_file"), because vfree_atomic()
> overhead is unfortunate for medium sized allocations.
> 
> 1) If the allocation is smaller than PAGE_SIZE, do not bother
>    with vmalloc() at all. Some arches have 64KB PAGE_SIZE,
>    while NLMSG_GOODSIZE is smaller than 8KB.
> 
> [...]

Here is the summary with links:
  - [net-next] netlink: use kvmalloc() in netlink_alloc_large_skb()
    https://git.kernel.org/netdev/net-next/c/f8cbf6bde4c8

You are awesome, thank you!

Patch

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9c962347cf859f16fc76e4d8a2fd22cdb3d142d6..90ca4e0ed9b3632bf223bf29fd864dbb76f3c89c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1206,23 +1206,21 @@  struct sock *netlink_getsockbyfilp(struct file *filp)
 
 struct sk_buff *netlink_alloc_large_skb(unsigned int size, int broadcast)
 {
+	size_t head_size = SKB_HEAD_ALIGN(size);
 	struct sk_buff *skb;
 	void *data;
 
-	if (size <= NLMSG_GOODSIZE || broadcast)
+	if (head_size <= PAGE_SIZE || broadcast)
 		return alloc_skb(size, GFP_KERNEL);
 
-	size = SKB_DATA_ALIGN(size) +
-	       SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-
-	data = vmalloc(size);
-	if (data == NULL)
+	data = kvmalloc(head_size, GFP_KERNEL);
+	if (!data)
 		return NULL;
 
-	skb = __build_skb(data, size);
-	if (skb == NULL)
-		vfree(data);
-	else
+	skb = __build_skb(data, head_size);
+	if (!skb)
+		kvfree(data);
+	else if (is_vmalloc_addr(data))
 		skb->destructor = netlink_skb_destructor;
 
 	return skb;