diff mbox series

[net-next] netpoll: optimize struct layout for cache efficiency

Message ID 20250324-netpoll_structstruct-v1-1-ff78f8a88dbb@debian.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] netpoll: optimize struct layout for cache efficiency | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/build_tools success Errors and warnings before: 26 (+0) this patch: 26 (+0)
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 21 this patch: 21
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 30 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-03-24--15-00 (tests: 896)

Commit Message

Breno Leitao March 24, 2025, 12:29 p.m. UTC
The struct netpoll serves two distinct purposes: it contains
configuration data needed only during setup (in netpoll_setup()), and
runtime data that's accessed on every packet transmission (in
netpoll_send_udp()).

Currently, this structure spans three cache lines with suboptimal
organization, where frequently accessed fields are mixed with rarely
accessed ones.

This commit reorganizes the structure to place all runtime fields used
during packet transmission together in the first cache line, while
moving the setup-only configuration fields to subsequent cache lines.
This approach follows the principle of placing hot fields together for
better cache locality during the performance-critical path.

The restructuring also eliminates structural inefficiencies, reducing
the number of holes. This provides a more compact memory layout while
maintaining the same functionality, resulting in better cache
utilization and potentially improves performance during packet
transmission operations.

  -   /* sum members: 137, holes: 3, sum holes: 7 */
  +   /* sum members: 137, holes: 1, sum holes: 3 */

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 include/linux/netpoll.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)


---
base-commit: bfc17c1658353f22843c7c13e27c2d31950f1887
change-id: 20250324-netpoll_structstruct-6ff56d1cc4d8

Best regards,

Comments

Jakub Kicinski March 25, 2025, 3:48 p.m. UTC | #1
On Mon, 24 Mar 2025 05:29:13 -0700 Breno Leitao wrote:
> The struct netpoll serves two distinct purposes: it contains
> configuration data needed only during setup (in netpoll_setup()), and
> runtime data that's accessed on every packet transmission (in
> netpoll_send_udp()).
> 
> Currently, this structure spans three cache lines with suboptimal
> organization, where frequently accessed fields are mixed with rarely
> accessed ones.
> 
> This commit reorganizes the structure to place all runtime fields used
> during packet transmission together in the first cache line, while
> moving the setup-only configuration fields to subsequent cache lines.
> This approach follows the principle of placing hot fields together for
> better cache locality during the performance-critical path.
> 
> The restructuring also eliminates structural inefficiencies, reducing
> the number of holes. This provides a more compact memory layout while
> maintaining the same functionality, resulting in better cache
> utilization and potentially improves performance during packet
> transmission operations.

Netpoll shouldn't send too many packets, "not too many" for networking
means >100kpps. So I don't think the hot / close split matters?

> 
>   -   /* sum members: 137, holes: 3, sum holes: 7 */
>   +   /* sum members: 137, holes: 1, sum holes: 3 */
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
>  include/linux/netpoll.h | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> index 0477208ed9ffa..a8de41d84be52 100644
> --- a/include/linux/netpoll.h
> +++ b/include/linux/netpoll.h
> @@ -24,7 +24,16 @@ union inet_addr {
>  
>  struct netpoll {
>  	struct net_device *dev;
> +	u16 local_port, remote_port;
>  	netdevice_tracker dev_tracker;

It's a little odd to leave the tracker in hot data, if you do it
should at least be adjacent to the pointer it tracks?

> +	union inet_addr local_ip, remote_ip;
> +	bool ipv6;
Breno Leitao March 25, 2025, 4:50 p.m. UTC | #2
On Tue, Mar 25, 2025 at 08:48:38AM -0700, Jakub Kicinski wrote:
> On Mon, 24 Mar 2025 05:29:13 -0700 Breno Leitao wrote:
> > The struct netpoll serves two distinct purposes: it contains
> > configuration data needed only during setup (in netpoll_setup()), and
> > runtime data that's accessed on every packet transmission (in
> > netpoll_send_udp()).
> > 
> > Currently, this structure spans three cache lines with suboptimal
> > organization, where frequently accessed fields are mixed with rarely
> > accessed ones.
> > 
> > This commit reorganizes the structure to place all runtime fields used
> > during packet transmission together in the first cache line, while
> > moving the setup-only configuration fields to subsequent cache lines.
> > This approach follows the principle of placing hot fields together for
> > better cache locality during the performance-critical path.
> > 
> > The restructuring also eliminates structural inefficiencies, reducing
> > the number of holes. This provides a more compact memory layout while
> > maintaining the same functionality, resulting in better cache
> > utilization and potentially improves performance during packet
> > transmission operations.
> 
> Netpoll shouldn't send too many packets, "not too many" for networking
> means >100kpps. So I don't think the hot / close split matters?

I see your point. The gain is going to be marginal given the frequency
this netpoll is supposed to be called, for sure.

On the other side, I think this is still better than the current state,
given: 

 * it has no adverse effect
 * potential marginal performance win
 * structure packing, potentially saving 2 bytes.

> >   -   /* sum members: 137, holes: 3, sum holes: 7 */
> >   +   /* sum members: 137, holes: 1, sum holes: 3 */
> > 
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> > ---
> >  include/linux/netpoll.h | 16 +++++++++-------
> >  1 file changed, 9 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
> > index 0477208ed9ffa..a8de41d84be52 100644
> > --- a/include/linux/netpoll.h
> > +++ b/include/linux/netpoll.h
> > @@ -24,7 +24,16 @@ union inet_addr {
> >  
> >  struct netpoll {
> >  	struct net_device *dev;
> > +	u16 local_port, remote_port;
> >  	netdevice_tracker dev_tracker;
> 
> It's a little odd to leave the tracker in hot data, if you do it
> should at least be adjacent to the pointer it tracks?

Double-checking this better, netdevice_tracker is NOT on the hot path.
It is only used on the setup functions.

If you think this is not a total waste of time, I will send a v2 moving
it to the bottom.

Thanks for your review,
--breno
Jakub Kicinski March 25, 2025, 5 p.m. UTC | #3
On Tue, 25 Mar 2025 09:50:21 -0700 Breno Leitao wrote:
> > It's a little odd to leave the tracker in hot data, if you do it
> > should at least be adjacent to the pointer it tracks?  
> 
> Double-checking this better, netdevice_tracker is NOT on the hot path.
> It is only used on the setup functions.
> 
> If you think this is not a total waste of time, I will send a v2 moving
> it to the bottom.

I'd keep it next to dev. A bit of packing to save space is fine, 
but let's prioritize readability during the reshuffle.
Note that net-next is closed for the MW.
diff mbox series

Patch

diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 0477208ed9ffa..a8de41d84be52 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -24,7 +24,16 @@  union inet_addr {
 
 struct netpoll {
 	struct net_device *dev;
+	u16 local_port, remote_port;
 	netdevice_tracker dev_tracker;
+	union inet_addr local_ip, remote_ip;
+	bool ipv6;
+
+	/* Hot fields above */
+
+	const char *name;
+	struct sk_buff_head skb_pool;
+	struct work_struct refill_wq;
 	/*
 	 * Either dev_name or dev_mac can be used to specify the local
 	 * interface - dev_name is used if it is a nonempty string, else
@@ -32,14 +41,7 @@  struct netpoll {
 	 */
 	char dev_name[IFNAMSIZ];
 	u8 dev_mac[ETH_ALEN];
-	const char *name;
-
-	union inet_addr local_ip, remote_ip;
-	bool ipv6;
-	u16 local_port, remote_port;
 	u8 remote_mac[ETH_ALEN];
-	struct sk_buff_head skb_pool;
-	struct work_struct refill_wq;
 };
 
 struct netpoll_info {