Message ID:   20240925180017.82891-2-jdamato@fastly.com
State:        Superseded
Delegated to: Netdev Maintainers
Series:       idpf: Don't hardcode napi_struct size
On Wed, Sep 25, 2024 at 06:00:17PM +0000, Joe Damato wrote:
> The sizeof(struct napi_struct) can change. Don't hardcode the size to
> 400 bytes and instead use "sizeof(struct napi_struct)".
>
> While fixing this, also move other calculations into compile time
> defines.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>

Reviewed-by: Simon Horman <horms@kernel.org>
From: Joe Damato <jdamato@fastly.com>
Date: Wed, 25 Sep 2024 18:00:17 +0000

> The sizeof(struct napi_struct) can change. Don't hardcode the size to
> 400 bytes and instead use "sizeof(struct napi_struct)".

Just adjust the hardcoded value here when you submit your series.
I use sizeof() here only for structures whose size can change depending
on the .config, like ones containing spinlocks etc.
If you just add one field, it's easy to adjust the size here.

Otherwise, these assertions lose their purpose. They're used to make sure
the structures are of a *certain* *known* size, hardcoded inside them.
sizeof()s mean nothing here; they don't tell you how the cachelines are
organized after some code change.

>
> While fixing this, also move other calculations into compile time
> defines.
>
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>  drivers/net/ethernet/intel/idpf/idpf_txrx.h | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> index f0537826f840..d5e904ddcb6e 100644
> --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
> @@ -437,9 +437,13 @@ struct idpf_q_vector {
>  	cpumask_var_t affinity_mask;
>  	__cacheline_group_end_aligned(cold);
>  };
> -libeth_cacheline_set_assert(struct idpf_q_vector, 112,
> -			    424 + 2 * sizeof(struct dim),
> -			    8 + sizeof(cpumask_var_t));
> +
> +#define IDPF_Q_VECTOR_RO_SZ (112)
> +#define IDPF_Q_VECTOR_RW_SZ (sizeof(struct napi_struct) + 24 + \
> +			     2 * sizeof(struct dim))
> +#define IDPF_Q_VECTOR_COLD_SZ (8 + sizeof(cpumask_var_t))
> +libeth_cacheline_set_assert(struct idpf_q_vector, IDPF_Q_VECTOR_RO_SZ,
> +			    IDPF_Q_VECTOR_RW_SZ, IDPF_Q_VECTOR_COLD_SZ);
>
>  struct idpf_rx_queue_stats {
>  	u64_stats_t packets;

Thanks,
Olek
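For context, the mechanism behind such an assertion is roughly the
following -- a minimal standalone sketch (an assumption, not the actual
libeth macros), where zero-size marker arrays delimit a cacheline group
and a static_assert pins its size:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Sketch only: the real libeth/kernel group macros differ in detail. */
    struct qv_example {
    	char begin_rw[0];	/* zero-size group start marker */
    	uint64_t a;
    	uint64_t b;
    	char end_rw[0];		/* zero-size group end marker */
    };

    /* The build fails if the group's layout drifts from the pinned size. */
    static_assert(offsetof(struct qv_example, end_rw) -
    	      offsetof(struct qv_example, begin_rw) == 16,
    	      "read/write group must stay exactly 16 bytes");

With a hardcoded size, any layout change breaks the build loudly, which
is exactly the behavior Olek is arguing the assertion exists to provide.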
From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Mon, 30 Sep 2024 14:33:45 +0200

> From: Joe Damato <jdamato@fastly.com>
> Date: Wed, 25 Sep 2024 18:00:17 +0000
>
>> The sizeof(struct napi_struct) can change. Don't hardcode the size to
>> 400 bytes and instead use "sizeof(struct napi_struct)".
>
> Just adjust the hardcoded value here when you submit your series.
> I use sizeof() here only for structures whose size can change depending
> on the .config, like ones containing spinlocks etc.
> If you just add one field, it's easy to adjust the size here.
>
> Otherwise, these assertions lose their purpose. They're used to make sure
> the structures are of a *certain* *known* size, hardcoded inside them.
> sizeof()s mean nothing here; they don't tell you how the cachelines are
> organized after some code change.

struct dim depends on spinlock debug settings, lockdep etc. -- plenty of
different .config options which can change its size unpredictably.

cpumask_var_t directly depends on CONFIG_NR_CPUS, but it can also be a
pointer if CONFIG_CPUMASK_OFFSTACK. IOW, its size can vary from 4 bytes
to several Kbs.

struct napi_struct doesn't have any such fields and doesn't depend on
the kernel configuration; that's why it's hardcoded.
Please don't change that, just adjust the hardcoded values when needed.

Otherwise, if anyone has objections with strong arguments, I'd just
remove these assertions completely.

It's a common thing that if we change some generic structure in the
kernel, we need to adjust some driver code.

>
>>
>> While fixing this, also move other calculations into compile time
>> defines.
>>
>> Signed-off-by: Joe Damato <jdamato@fastly.com>
>> ---
>>  drivers/net/ethernet/intel/idpf/idpf_txrx.h | 10 +++++++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
>> index f0537826f840..d5e904ddcb6e 100644
>> --- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
>> +++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
>> @@ -437,9 +437,13 @@ struct idpf_q_vector {
>>  	cpumask_var_t affinity_mask;
>>  	__cacheline_group_end_aligned(cold);
>>  };
>> -libeth_cacheline_set_assert(struct idpf_q_vector, 112,
>> -			    424 + 2 * sizeof(struct dim),
>> -			    8 + sizeof(cpumask_var_t));
>> +
>> +#define IDPF_Q_VECTOR_RO_SZ (112)
>> +#define IDPF_Q_VECTOR_RW_SZ (sizeof(struct napi_struct) + 24 + \
>> +			     2 * sizeof(struct dim))
>> +#define IDPF_Q_VECTOR_COLD_SZ (8 + sizeof(cpumask_var_t))
>> +libeth_cacheline_set_assert(struct idpf_q_vector, IDPF_Q_VECTOR_RO_SZ,
>> +			    IDPF_Q_VECTOR_RW_SZ, IDPF_Q_VECTOR_COLD_SZ);
>>
>>  struct idpf_rx_queue_stats {
>>  	u64_stats_t packets;

Thanks,
Olek
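For reference, the two shapes of cpumask_var_t look roughly like this
(paraphrased from include/linux/cpumask.h; exact details may vary by
kernel version):

    #ifdef CONFIG_CPUMASK_OFFSTACK
    /* A bare pointer; the mask itself is allocated separately. */
    typedef struct cpumask *cpumask_var_t;
    #else
    /* Embedded array; its size scales with CONFIG_NR_CPUS. */
    typedef struct cpumask cpumask_var_t[1];
    #endif

So sizeof(cpumask_var_t) genuinely cannot be hardcoded, which is the
distinction being drawn against napi_struct.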
On 9/30/24 14:38, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Mon, 30 Sep 2024 14:33:45 +0200
>
>> From: Joe Damato <jdamato@fastly.com>
>> Date: Wed, 25 Sep 2024 18:00:17 +0000
>
> struct napi_struct doesn't have any such fields and doesn't depend on
> the kernel configuration; that's why it's hardcoded.
> Please don't change that, just adjust the hardcoded values when needed.

This is the crucial point, and I agree with Olek.

If you find it more readable/future-proof, feel free to add comments
like /* napi_struct */ near the "400" part of the hardcoded value.

Side note: you could just run this as a part of your netdev series,
given that you CC the right people.
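Przemek's suggestion would amount to something like this (illustrative
only; the 400 + 24 breakdown is implied by the RFC's replacement
expression, where 424 = 400 for napi_struct plus 24 for the other
fields):

    libeth_cacheline_set_assert(struct idpf_q_vector, 112,
    			    424 /* 400 (napi_struct) + 24 */ +
    			    2 * sizeof(struct dim),
    			    8 + sizeof(cpumask_var_t));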
On Mon, Sep 30, 2024 at 03:10:41PM +0200, Przemek Kitszel wrote:
> On 9/30/24 14:38, Alexander Lobakin wrote:
>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>> Date: Mon, 30 Sep 2024 14:33:45 +0200
>>
>>> From: Joe Damato <jdamato@fastly.com>
>>> Date: Wed, 25 Sep 2024 18:00:17 +0000
>>
>> struct napi_struct doesn't have any such fields and doesn't depend on
>> the kernel configuration; that's why it's hardcoded.
>> Please don't change that, just adjust the hardcoded values when needed.
>
> This is the crucial point, and I agree with Olek.
>
> If you find it more readable/future-proof, feel free to add comments
> like /* napi_struct */ near the "400" part of the hardcoded value.
>
> Side note: you could just run this as a part of your netdev series,
> given that you CC the right people.

I've already sent the official patch because I didn't hear back on
this RFC.

Sorry, but I respectfully disagree with you both on this; I don't
think it makes sense to have code that will break if fields are
added to napi_struct, thereby requiring anyone who works on the core
to update this code over and over again.

I understand that the sizeofs are "meaningless" because of your
desire to optimize cachelines, but IMHO, and again respectfully, it
seems strange that any adjustment to the core should require a change
to this code.

I really do not want to include a patch to change the size of
napi_struct in idpf as part of my RFC, which is totally unrelated to
idpf and would detract from the review of my core changes.

Perhaps my change is unacceptable, but there should be a way to deal
with this that doesn't require everyone working on core networking
code to update idpf, right?

- Joe
From: Joe Damato <jdamato@fastly.com>
Date: Mon, 30 Sep 2024 15:17:46 -0700

> On Mon, Sep 30, 2024 at 03:10:41PM +0200, Przemek Kitszel wrote:
>> On 9/30/24 14:38, Alexander Lobakin wrote:
>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> Date: Mon, 30 Sep 2024 14:33:45 +0200
>>>
>>>> From: Joe Damato <jdamato@fastly.com>
>>>> Date: Wed, 25 Sep 2024 18:00:17 +0000
>>
>>> struct napi_struct doesn't have any such fields and doesn't depend on
>>> the kernel configuration; that's why it's hardcoded.
>>> Please don't change that, just adjust the hardcoded values when needed.
>>
>> This is the crucial point, and I agree with Olek.
>>
>> If you find it more readable/future-proof, feel free to add comments
>> like /* napi_struct */ near the "400" part of the hardcoded value.
>>
>> Side note: you could just run this as a part of your netdev series,
>> given that you CC the right people.
>
> I've already sent the official patch because I didn't hear back on
> this RFC.
>
> Sorry, but I respectfully disagree with you both on this; I don't
> think it makes sense to have code that will break if fields are
> added to napi_struct, thereby requiring anyone who works on the core
> to update this code over and over again.
>
> I understand that the sizeofs are "meaningless" because of your
> desire to optimize cachelines, but IMHO, and again respectfully, it
> seems strange that any adjustment to the core should require a change
> to this code.

But if you change any core API -- let's say rename a field used in
several drivers -- you need to adjust the affected drivers anyway.
It's a common practice that some core changes require touching drivers.

Moreover, sizeof(struct napi_struct) doesn't get changed often, so I
don't see any issue in adjusting one line in idpf by just increasing
one value there by sizeof(your_new_field).

If you do that, then:

+ you get notified that you may affect the performance of different
  drivers (napi_struct is usually embedded into perf-critical
  structures in drivers);
+ I get notified (Cced) that someone's change will affect idpf, so
  I'll be aware of it and review the change;
- you need to adjust one line in idpf.

Is it just me, or do these '+'s easily outweigh that sole '-'?

> I really do not want to include a patch to change the size of
> napi_struct in idpf as part of my RFC, which is totally unrelated to
> idpf and would detract from the review of my core changes.

One line won't distract anyone. The kernel tree contains, let's say,
one patch from me where I needed to modify around 20 drivers within
one commit due to a core code structure change -- the number of locs I
changed in the drivers was way bigger than the number of locs I
changed in the core. And there's a ton of such commits in there.
Again, it's a common practice.

> Perhaps my change is unacceptable, but there should be a way to deal
> with this that doesn't require everyone working on core networking
> code to update idpf, right?

We can't isolate the core code from the drivers to the point where you
wouldn't need to touch the drivers at all when working on core changes;
we're not Windows. The drivers and the core are inside one tree so
that such changes can be made easily, and no code inside the whole
tree is ABI (excluding uAPI).

Thanks,
Olek
On Tue, Oct 01, 2024 at 03:14:07PM +0200, Alexander Lobakin wrote:
> From: Joe Damato <jdamato@fastly.com>
> Date: Mon, 30 Sep 2024 15:17:46 -0700
>
>> On Mon, Sep 30, 2024 at 03:10:41PM +0200, Przemek Kitszel wrote:
>>> On 9/30/24 14:38, Alexander Lobakin wrote:
>>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>>> Date: Mon, 30 Sep 2024 14:33:45 +0200
>>>>
>>>>> From: Joe Damato <jdamato@fastly.com>
>>>>> Date: Wed, 25 Sep 2024 18:00:17 +0000
>>>
>>>> struct napi_struct doesn't have any such fields and doesn't depend on
>>>> the kernel configuration; that's why it's hardcoded.
>>>> Please don't change that, just adjust the hardcoded values when needed.
>>>
>>> This is the crucial point, and I agree with Olek.
>>>
>>> If you find it more readable/future-proof, feel free to add comments
>>> like /* napi_struct */ near the "400" part of the hardcoded value.
>>>
>>> Side note: you could just run this as a part of your netdev series,
>>> given that you CC the right people.
>>
>> I've already sent the official patch because I didn't hear back on
>> this RFC.
>>
>> Sorry, but I respectfully disagree with you both on this; I don't
>> think it makes sense to have code that will break if fields are
>> added to napi_struct, thereby requiring anyone who works on the core
>> to update this code over and over again.
>>
>> I understand that the sizeofs are "meaningless" because of your
>> desire to optimize cachelines, but IMHO, and again respectfully, it
>> seems strange that any adjustment to the core should require a change
>> to this code.
>
> But if you change any core API -- let's say rename a field used in
> several drivers -- you need to adjust the affected drivers anyway.

Sorry, but that's a totally different argument.

There are obvious cases where touching certain parts of core would
require changes to drivers, yes. I agree that if I change an API or a
struct field name, or remove an enum, then this affects drivers which
must be updated.

idpf does not fall into this category, as it relies on the *size* of
the structure, not the field names. Adding a new field wouldn't break
any of your existing code accessing fields in the struct, since I
haven't removed a field. Adding a new field may adjust the size.

According to your previous email, idpf cares about the size because it
wants the cachelines to look a certain way in pahole. OK, but why is
that the concern of someone working on core code? It doesn't make
sense to me that if I add a new field, I now need to look at pahole
output for idpf to make sure you will approve of the cacheline
placement.

> It's a common practice that some core changes require touching
> drivers. Moreover, sizeof(struct napi_struct) doesn't get changed
> often, so I don't see any issue in adjusting one line in idpf by just
> increasing one value there by sizeof(your_new_field).

The problem is: what if everyone starts doing this? Trying to rely on
the specific size of some core data structure in their driver code for
cacheline placement? Do I then need to sift through each driver with
pahole and adjust the cacheline placement of each affected structure
because I added a field to napi_struct?

The burden of optimizing cacheline placement should fall on the driver
maintainers, not the person adding a field to napi_struct.

It would be different if I deleted a field or renamed a field. In
those cases, sure, that's my issue to deal with by changing the
affected drivers. Just like it would be if I removed an enum value.

> If you do that, then:
> + you get notified that you may affect the performance of different
>   drivers (napi_struct is usually embedded into perf-critical
>   structures in drivers);

But I don't want to think about idpf; it's not my responsibility at
all to ensure that the cacheline placement in idpf is optimal.

> + I get notified (Cced) that someone's change will affect idpf, so
>   I'll be aware of it and review the change;
> - you need to adjust one line in idpf.

Adjust one line in idpf and then go through another revision if the
maintainers don't like what the cachelines look like in pahole? And I
need to do this for something totally unrelated that idpf won't even
support (because I'm not adding support for it in the RFC)?

> Is it just me, or do these '+'s easily outweigh that sole '-'?

I disagree even more than I disagreed yesterday.
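For reference, the kind of layout inspection being argued about here is
typically done with pahole; a minimal invocation (the object path is
illustrative) looks like:

    $ pahole -C idpf_q_vector drivers/net/ethernet/intel/idpf/idpf.ko

which prints the struct with per-member offsets and a trailing
/* size: ..., cachelines: ... */ summary, making any group shift caused
by a growing core struct immediately visible.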
On Tue, 1 Oct 2024 07:44:36 -0700 Joe Damato wrote:
>> But if you change any core API -- let's say rename a field used in
>> several drivers -- you need to adjust the affected drivers anyway.
>
> Sorry, but that's a totally different argument.
>
> There are obvious cases where touching certain parts of core would
> require changes to drivers, yes. I agree that if I change an API or a
> struct field name, or remove an enum, then this affects drivers which
> must be updated.

+1

I fully agree with Joe. Drivers asserting the size of core structures
is both an undue burden on core changes and pointless. The former is
subjective; as for the latter: most core structures will contain cold /
slow-path data, usually at the end. If you care about the performance
of anything that follows a core struct, you need to align the next
field yourself.

IDK how you want to fit this into your magic macros, but complex
nested types should be neither ro, rw nor cold. They are separate.
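A sketch of that advice (a hypothetical layout, not actual idpf code):
keep the core struct outside the size-asserted groups and realign
whatever follows it, so its size can drift without perturbing the hot
fields:

    #include <linux/cache.h>	/* ____cacheline_aligned */
    #include <linux/netdevice.h>	/* struct napi_struct */
    #include <linux/types.h>

    struct example_q_vector {
    	/* hot, driver-owned fields, size-asserted as before */
    	u32 next_to_clean;

    	/* core struct whose size may change across kernel versions;
    	 * excluded from any size assertion
    	 */
    	struct napi_struct napi;

    	/* force the field after the core struct onto a fresh cacheline
    	 * no matter what sizeof(struct napi_struct) is
    	 */
    	u32 total_events ____cacheline_aligned;
    };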
From: Jakub Kicinski <kuba@kernel.org>
Date: Wed, 2 Oct 2024 10:17:27 -0700

> On Tue, 1 Oct 2024 07:44:36 -0700 Joe Damato wrote:
>>> But if you change any core API -- let's say rename a field used in
>>> several drivers -- you need to adjust the affected drivers anyway.
>>
>> Sorry, but that's a totally different argument.
>>
>> There are obvious cases where touching certain parts of core would
>> require changes to drivers, yes. I agree that if I change an API or a
>> struct field name, or remove an enum, then this affects drivers which
>> must be updated.
>
> +1
>
> I fully agree with Joe. Drivers asserting the size of core structures
> is both an undue burden on core changes and pointless. The former is
> subjective; as for the latter: most core structures will contain cold /
> slow-path data, usually at the end. If you care about the performance
> of anything that follows a core struct, you need to align the next
> field yourself.
>
> IDK how you want to fit this into your magic macros, but complex
> nested types should be neither ro, rw nor cold. They are separate.

Ok, I'm convinced enough. I've just imagined that if every NIC driver
had such assertions, that would've been hell.

napi_struct is the only generic struct whose size is hardcoded in the
macros (struct dim is already sizeof()ed, as well as cpumask_var_t), so
I'm fine with the change you proposed in your first RFC -- I mean:

	libeth_cacheline_set_assert(struct idpf_q_vector, 112,
-				    424 + 2 * sizeof(struct dim),
+				    24 + sizeof(struct napi_struct) +
+				    2 * sizeof(struct dim),
				    8 + sizeof(cpumask_var_t));

I may revise this later and put generic structs outside CL groups as
Jakub suggested.

Thanks,
Olek
On Thu, Oct 03, 2024 at 03:35:54PM +0200, Alexander Lobakin wrote:

[...]

> napi_struct is the only generic struct whose size is hardcoded in the
> macros (struct dim is already sizeof()ed, as well as cpumask_var_t), so
> I'm fine with the change you proposed in your first RFC -- I mean:
>
> 	libeth_cacheline_set_assert(struct idpf_q_vector, 112,
> -				    424 + 2 * sizeof(struct dim),
> +				    24 + sizeof(struct napi_struct) +
> +				    2 * sizeof(struct dim),
> 				    8 + sizeof(cpumask_var_t));

So you are saying to drop the other #defines I added in the RFC and
just embed a sizeof? I just want to be clear so that the v2 I send
will be correct.
diff --git a/drivers/net/ethernet/intel/idpf/idpf_txrx.h b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
index f0537826f840..d5e904ddcb6e 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_txrx.h
+++ b/drivers/net/ethernet/intel/idpf/idpf_txrx.h
@@ -437,9 +437,13 @@ struct idpf_q_vector {
 	cpumask_var_t affinity_mask;
 	__cacheline_group_end_aligned(cold);
 };
-libeth_cacheline_set_assert(struct idpf_q_vector, 112,
-			    424 + 2 * sizeof(struct dim),
-			    8 + sizeof(cpumask_var_t));
+
+#define IDPF_Q_VECTOR_RO_SZ (112)
+#define IDPF_Q_VECTOR_RW_SZ (sizeof(struct napi_struct) + 24 + \
+			     2 * sizeof(struct dim))
+#define IDPF_Q_VECTOR_COLD_SZ (8 + sizeof(cpumask_var_t))
+libeth_cacheline_set_assert(struct idpf_q_vector, IDPF_Q_VECTOR_RO_SZ,
+			    IDPF_Q_VECTOR_RW_SZ, IDPF_Q_VECTOR_COLD_SZ);
 
 struct idpf_rx_queue_stats {
 	u64_stats_t packets;
The sizeof(struct napi_struct) can change. Don't hardcode the size to
400 bytes and instead use "sizeof(struct napi_struct)".

While fixing this, also move other calculations into compile time
defines.

Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/idpf/idpf_txrx.h | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)