diff mbox series

[RFC,bpf-next,22/52] net, skbuff: add ability to skip skb metadata comparison

Message ID 20220628194812.1453059-23-alexandr.lobakin@intel.com (mailing list archive)
State RFC
Delegated to: BPF
Headers show
Series bpf, xdp: introduce and use Generic Hints/metadata | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 fail Logs for Kernel LATEST on ubuntu-latest with gcc
bpf/vmtest-bpf-next-VM_Test-2 fail Logs for Kernel LATEST on ubuntu-latest with llvm-15
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for Kernel LATEST on z15 with gcc
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/apply fail Patch does not apply to bpf-next

Commit Message

Alexander Lobakin June 28, 2022, 7:47 p.m. UTC
Some XDP metadata fields maybe be unique from frame to frame, not
necessarily indicating that it's from a different flow. This
includes frame checksums, timestamps etc.
The drivers usually carry the metadata to skbs along with the
payload, and the GRO layer tries to compare the metadata of
the frames. This not only leads to perf regressions (esp. given
that metadata can now be larger than 32 bytes -> a slower call to
memmp() will be used), but also breaks frame coalescing at all.
To avoid that, add an skb flag indicating that the metadata can
carry unique values and thus should not be compared. If at least
one of the skbs passed to skb_metadata_differs() carries it, the
function will then immediately return reporting that they're
identical.
The underscored version of the function is not affected, allowing
to explicitly compare the meta if needed. The flag is being cleared
on pskb_expand_head() when the skb_shared_info::meta_len gets
zeroed.

Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
---
 include/linux/skbuff.h | 18 ++++++++++++++++++
 net/core/skbuff.c      |  1 +
 2 files changed, 19 insertions(+)
diff mbox series

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a825ea7f375d..1c308511acbb 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -509,6 +509,11 @@  enum {
 	 * charged to the kernel memory.
 	 */
 	SKBFL_PURE_ZEROCOPY = BIT(2),
+
+	/* skb metadata may contain unique values such as checksums
+	 * and we should not compare it against others.
+	 */
+	SKBFL_METADATA_NOCOMP = BIT(3),
 };
 
 #define SKBFL_ZEROCOPY_FRAG	(SKBFL_ZEROCOPY_ENABLE | SKBFL_SHARED_FRAG)
@@ -4137,6 +4142,9 @@  static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
 
 	if (!(len_a | len_b))
 		return false;
+	if ((skb_shinfo(skb_a)->flags | skb_shinfo(skb_b)->flags) &
+	    SKBFL_METADATA_NOCOMP)
+		return false;
 
 	return len_a != len_b ?
 	       true : __skb_metadata_differs(skb_a, skb_b, len_a);
@@ -4152,6 +4160,16 @@  static inline void skb_metadata_clear(struct sk_buff *skb)
 	skb_metadata_set(skb, 0);
 }
 
+static inline void skb_metadata_nocomp_set(struct sk_buff *skb)
+{
+	skb_shinfo(skb)->flags |= SKBFL_METADATA_NOCOMP;
+}
+
+static inline void skb_metadata_nocomp_clear(struct sk_buff *skb)
+{
+	skb_shinfo(skb)->flags &= ~SKBFL_METADATA_NOCOMP;
+}
+
 struct sk_buff *skb_clone_sk(struct sk_buff *skb);
 
 #ifdef CONFIG_NETWORK_PHY_TIMESTAMPING
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 00bf35ee8205..5b23fc7f1157 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1750,6 +1750,7 @@  int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	atomic_set(&skb_shinfo(skb)->dataref, 1);
 
 	skb_metadata_clear(skb);
+	skb_metadata_nocomp_clear(skb);
 
 	/* It is not generally safe to change skb->truesize.
 	 * For the moment, we really care of rx path, or