diff mbox series

[NET,1/7] skbuff: introduce pskb_realloc_headroom()

Message ID 8049e16b-3d7a-64c3-c948-ec504590a136@virtuozzo.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [NET,1/7] skbuff: introduce pskb_realloc_headroom() | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply

Commit Message

Vasily Averin July 12, 2021, 1:26 p.m. UTC
Unlike skb_realloc_headroom, new helper does not allocate a new skb
if possible; copies skb->sk on new skb when as needed and frees
original skb in case of failures.

This helps to simplify ip[6]_finish_output2() and a few other similar cases.

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)

Comments

Jakub Kicinski July 12, 2021, 5:53 p.m. UTC | #1
On Mon, 12 Jul 2021 16:26:44 +0300 Vasily Averin wrote:
>  /**
> + *	pskb_realloc_headroom - reallocate header of &sk_buff
> + *	@skb: buffer to reallocate
> + *	@headroom: needed headroom
> + *
> + *	Unlike skb_realloc_headroom, this one does not allocate a new skb
> + *	if possible; copies skb->sk to new skb as needed
> + *	and frees original scb in case of failures.
> + *
> + *	It expect increased headroom, and generates warning otherwise.
> + */
> +
> +struct sk_buff *pskb_realloc_headroom(struct sk_buff *skb, unsigned int headroom)

I saw you asked about naming in a different sub-thread, what do you
mean by "'pskb_expand_head' have different semantic"? AFAIU the 'p'
in pskb stands for "private", meaning not shared. In fact
skb_realloc_headroom() should really be pskb... but it predates the 
'pskb' naming pattern by quite a while. Long story short
skb_expand_head() seems like a good name. With the current patch
pskb_realloc_headroom() vs skb_realloc_headroom() would give people
exactly the opposite intuition of what the code does.
Vasily Averin July 12, 2021, 6:45 p.m. UTC | #2
On 7/12/21 8:53 PM, Jakub Kicinski wrote:
> I saw you asked about naming in a different sub-thread, what do you
> mean by "'pskb_expand_head' have different semantic"? AFAIU the 'p'
> in pskb stands for "private", meaning not shared. In fact
> skb_realloc_headroom() should really be pskb... but it predates the 
> 'pskb' naming pattern by quite a while. Long story short
> skb_expand_head() seems like a good name. With the current patch
> pskb_realloc_headroom() vs skb_realloc_headroom() would give people
> exactly the opposite intuition of what the code does.

Thank you for feedback,
I'll change helper name back to skb_expand_head() in next patch version.

	Vasily Averin
Vasily Averin July 13, 2021, 8:57 p.m. UTC | #3
currently if skb does not have enough headroom skb_realloc_headrom is called.
It is not optimal because it creates new skb.

Unlike skb_realloc_headroom, new helper skb_учзфтв_head 
does not allocate a new skb if possible; 
copies skb->sk on new skb when as needed
and frees original skb in case of failures.

This helps to simplify ip[6]_finish_output2(), ip6_xmit() and a few other
functions in vrf, ax25 and bpf.

There are few other cases where this helper can be used but they require
an additional investigations. 

v2 changes:
 - helper's name was changed to skb_expand_head
 - fixed few mistakes inside skb_expand_head():
    skb_set_owner_w should set sk on nskb
    kfree was replaced by kfree_skb()
    improved warning message
 - added minor refactoring in changed functions in vrf and bpf patches
 - removed kfree_skb() in ax25_rt_build_path caller ax25_ip_xmit

NB: patch "ipv6: use skb_expand_head in ip6_finish_output2" depends on 
patch "ipv6: allocate enough headroom in ip6_finish_output2()" submitted separately
https://lkml.org/lkml/2021/7/12/732

Vasily Averin (7):
  skbuff: introduce skb_expand_head()
  ipv6: use skb_expand_head in ip6_finish_output2
  ipv6: use skb_expand_head in ip6_xmit refactoring
  ipv4: use skb_expand_head in ip_finish_output2
  vrf: use skb_expand_head in vrf_finish_output
  ax25: use skb_expand_head
  bpf: use skb_expand_head in bpf_out_neigh_v4/6

 drivers/net/vrf.c      | 21 +++++---------
 include/linux/skbuff.h |  1 +
 net/ax25/ax25_out.c    | 12 ++------
 net/ax25/ax25_route.c  | 13 ++-------
 net/core/filter.c      | 27 ++++-------------
 net/core/skbuff.c      | 42 +++++++++++++++++++++++++++
 net/ipv4/ip_output.c   | 13 ++-------
 net/ipv6/ip6_output.c  | 78 +++++++++++++++++---------------------------------
 8 files changed, 90 insertions(+), 117 deletions(-)
diff mbox series

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index dbf820a..381a219 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1174,6 +1174,8 @@  static inline struct sk_buff *__pskb_copy(struct sk_buff *skb, int headroom,
 int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail, gfp_t gfp_mask);
 struct sk_buff *skb_realloc_headroom(struct sk_buff *skb,
 				     unsigned int headroom);
+struct sk_buff *pskb_realloc_headroom(struct sk_buff *skb,
+				      unsigned int headroom);
 struct sk_buff *skb_copy_expand(const struct sk_buff *skb, int newheadroom,
 				int newtailroom, gfp_t priority);
 int __must_check skb_to_sgvec_nomark(struct sk_buff *skb, struct scatterlist *sg,
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bbc3b4b..13cbe98 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1769,6 +1769,47 @@  struct sk_buff *skb_realloc_headroom(struct sk_buff *skb, unsigned int headroom)
 EXPORT_SYMBOL(skb_realloc_headroom);
 
 /**
+ *	pskb_realloc_headroom - reallocate header of &sk_buff
+ *	@skb: buffer to reallocate
+ *	@headroom: needed headroom
+ *
+ *	Unlike skb_realloc_headroom, this one does not allocate a new skb
+ *	if possible; copies skb->sk to new skb as needed
+ *	and frees original scb in case of failures.
+ *
+ *	It expect increased headroom, and generates warning otherwise.
+ */
+
+struct sk_buff *pskb_realloc_headroom(struct sk_buff *skb, unsigned int headroom)
+{
+	int delta = headroom - skb_headroom(skb);
+
+	if (WARN_ONCE(delta <= 0, "%s expect positive delta", __func__))
+		return skb;
+
+	/* pskb_expand_head() might crash, if skb is shared */
+	if (skb_shared(skb)) {
+		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
+
+		if (likely(nskb)) {
+			if (skb->sk)
+				skb_set_owner_w(skb, skb->sk);
+			consume_skb(skb);
+		} else {
+			kfree(skb);
+		}
+		skb = nskb;
+	}
+	if (skb &&
+	    pskb_expand_head(skb, SKB_DATA_ALIGN(delta), 0, GFP_ATOMIC)) {
+		kfree(skb);
+		skb = NULL;
+	}
+	return skb;
+}
+EXPORT_SYMBOL(pskb_realloc_headroom);
+
+/**
  *	skb_copy_expand	-	copy and expand sk_buff
  *	@skb: buffer to copy
  *	@newheadroom: new free bytes at head