mbox series

[net-next,v2,00/12] ethtool: rss: driver tweaks and netlink context dumps

Message ID 20240803042624.970352-1-kuba@kernel.org (mailing list archive)
Headers show
Series ethtool: rss: driver tweaks and netlink context dumps | expand

Message

Jakub Kicinski Aug. 3, 2024, 4:26 a.m. UTC
This series is a semi-related collection of RSS patches.
Main point is supporting dumping RSS contexts via ethtool netlink.
At present additional RSS contexts can be queried one by one, and
assuming user know the right IDs. This series uses the XArray
added by Ed to provide netlink dump support for ETHTOOL_GET_RSS.

Patch 1 is a trivial selftest debug patch.
Patch 2 coverts mvpp2 for no real reason other than that I had
	a grand plan of converting all drivers at some stage.
Patch 3 removes a now moot check from mlx5 so that all tests
	can pass.
Patch 4 and 5 make a bit used for context support optional,
	for easier grepping of drivers which need converting
	if nothing else.
Patch 6 OTOH adds a new cap bit; some devices don't support
	using a different key per context and currently act
	in surprising ways.
Patch 7 and 8 update the RSS netlink code to use XArray.
Patch 9 and 10 add support for dumping contexts.
Patch 11 and 12 are small adjustments to spec and a new test.


I'm getting distracted with other work, so probably won't have
the time soon to complete next steps, but things which are missing
are (and some of these may be bad ideas):

 - better discovery

   Some sort of API to tell the user who many contexts the device
   can create. Upper bound, devices often share contexts between
   ports etc. so it's hard to tell exactly and upfront number of
   contexts for a netdev. But order of magnitude (4 vs 10s) may
   be enough for container management system to know whether to bother.

 - create/modify/delete via netlink
 
   The only question here is how to handle all the tricky IOCTL
   legacy. "No change" maps trivially to attribute not present.
   "reset" (indir_size = 0) probably needs to be a new NLA_FLAG?

 - better table size handling

   The current API assumes the LUT has fixed size, which isn't
   true for modern devices. We should have better APIs for the
   drivers to resize the tables, and in user facing API -
   the ability to specify pattern and min size rather than
   exact table expected (sort of like ethtool CLI already does).

 - recounted / socket-bound contexts

   Support for contexts which get "cleaned up" when their parent
   netlink socket gets closed. The major catch is that ntuple
   filters (which we don't currently track) depend on the context,
   so we need auto-removal for both.

v2:
 - fix bugs and build in mvpp2
v1: https://lore.kernel.org/20240802001801.565176-1-kuba@kernel.org

Jakub Kicinski (12):
  selftests: drv-net: rss_ctx: add identifier to traffic comments
  eth: mvpp2: implement new RSS context API
  eth: mlx5: allow disabling queues when RSS contexts exist
  ethtool: make ethtool_ops::cap_rss_ctx_supported optional
  eth: remove .cap_rss_ctx_supported from updated drivers
  ethtool: rss: don't report key if device doesn't support it
  ethtool: rss: move the device op invocation out of rss_prepare_data()
  ethtool: rss: report info about additional contexts from XArray
  ethtool: rss: support dumping RSS contexts
  ethtool: rss: support skipping contexts during dump
  netlink: specs: decode indirection table as u32 array
  selftests: drv-net: rss_ctx: test dumping RSS contexts

 Documentation/netlink/specs/ethtool.yaml      |  14 +-
 Documentation/networking/ethtool-netlink.rst  |  12 +-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_ethtool.c  |   1 +
 .../net/ethernet/marvell/mvpp2/mvpp2_cls.c    |  18 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_cls.h    |   2 +-
 .../net/ethernet/marvell/mvpp2/mvpp2_main.c   | 103 +++++---
 .../ethernet/mellanox/mlx5/core/en_ethtool.c  |  13 +-
 drivers/net/ethernet/sfc/ef100_ethtool.c      |   2 +-
 drivers/net/ethernet/sfc/ethtool.c            |   2 +-
 drivers/net/ethernet/sfc/siena/ethtool.c      |   1 +
 include/linux/ethtool.h                       |   6 +-
 include/uapi/linux/ethtool_netlink.h          |   1 +
 net/ethtool/ioctl.c                           |  31 ++-
 net/ethtool/netlink.c                         |   2 +
 net/ethtool/netlink.h                         |   4 +-
 net/ethtool/rss.c                             | 231 ++++++++++++++++--
 .../selftests/drivers/net/hw/rss_ctx.py       |  74 +++++-
 tools/testing/selftests/net/lib/py/ksft.py    |   6 +
 19 files changed, 434 insertions(+), 91 deletions(-)

Comments

Gal Pressman Aug. 4, 2024, 6:08 a.m. UTC | #1
On 03/08/2024 7:26, Jakub Kicinski wrote:
> This series is a semi-related collection of RSS patches.
> Main point is supporting dumping RSS contexts via ethtool netlink.
> At present additional RSS contexts can be queried one by one, and
> assuming user know the right IDs. This series uses the XArray
> added by Ed to provide netlink dump support for ETHTOOL_GET_RSS.
> 
> Patch 1 is a trivial selftest debug patch.
> Patch 2 coverts mvpp2 for no real reason other than that I had
> 	a grand plan of converting all drivers at some stage.
> Patch 3 removes a now moot check from mlx5 so that all tests
> 	can pass.
> Patch 4 and 5 make a bit used for context support optional,
> 	for easier grepping of drivers which need converting
> 	if nothing else.
> Patch 6 OTOH adds a new cap bit; some devices don't support
> 	using a different key per context and currently act
> 	in surprising ways.
> Patch 7 and 8 update the RSS netlink code to use XArray.
> Patch 9 and 10 add support for dumping contexts.
> Patch 11 and 12 are small adjustments to spec and a new test.

Very useful, I was messing around with the RSS code lately and was
thinking about these stuff too, thanks!

> 
> 
> I'm getting distracted with other work, so probably won't have
> the time soon to complete next steps, but things which are missing
> are (and some of these may be bad ideas):
> 
>  - better discovery
> 
>    Some sort of API to tell the user who many contexts the device
>    can create. Upper bound, devices often share contexts between
>    ports etc. so it's hard to tell exactly and upfront number of
>    contexts for a netdev. But order of magnitude (4 vs 10s) may
>    be enough for container management system to know whether to bother.
> 
>  - create/modify/delete via netlink

And actually plugging extack into set_rxfh :).

>  
>    The only question here is how to handle all the tricky IOCTL
>    legacy. "No change" maps trivially to attribute not present.
>    "reset" (indir_size = 0) probably needs to be a new NLA_FLAG?

FWIW, we have an incompatibility issue with the recent rxfh.input_xfrm
parameter.

In ethtool_set_rxfh():
	/* If either indir, hash key or function is valid, proceed further.
	 * Must request at least one change: indir size, hash key, function
	 * or input transformation.
	 */
	if ((rxfh.indir_size &&
	     rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE &&
	     rxfh.indir_size != dev_indir_size) ||
	    (rxfh.key_size && (rxfh.key_size != dev_key_size)) ||
	    (rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE &&
	     rxfh.key_size == 0 && rxfh.hfunc == ETH_RSS_HASH_NO_CHANGE &&
	     rxfh.input_xfrm == RXH_XFRM_NO_CHANGE))
		return -EINVAL;

When using a recent kernel with an old userspace ethtool,
rxfh.input_xfrm is treated as zero (which is different than
RXH_XFRM_NO_CHANGE) and passes the check, whereas the same command with
a recent userspace would result in an error.
This also makes it so old userspace always disables input_xfrm
unintentionally. I do not have any ideas on how to resolve this..

Regardless, I believe this check is wrong as it prevents us from
creating RSS context with no parameters (i.e. 'ethtool -X eth0 context
new', as done in selftests), it works by mistake with old userspace.
I plan to submit a patch soon to skip this check in case of context
creation.
Jakub Kicinski Aug. 5, 2024, 10:13 p.m. UTC | #2
On Sun, 4 Aug 2024 09:08:50 +0300 Gal Pressman wrote:
> >    The only question here is how to handle all the tricky IOCTL
> >    legacy. "No change" maps trivially to attribute not present.
> >    "reset" (indir_size = 0) probably needs to be a new NLA_FLAG?  
> 
> FWIW, we have an incompatibility issue with the recent rxfh.input_xfrm
> parameter.
> 
> In ethtool_set_rxfh():
> 	/* If either indir, hash key or function is valid, proceed further.
> 	 * Must request at least one change: indir size, hash key, function
> 	 * or input transformation.
> 	 */
> 	if ((rxfh.indir_size &&
> 	     rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE &&
> 	     rxfh.indir_size != dev_indir_size) ||
> 	    (rxfh.key_size && (rxfh.key_size != dev_key_size)) ||
> 	    (rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE &&
> 	     rxfh.key_size == 0 && rxfh.hfunc == ETH_RSS_HASH_NO_CHANGE &&
> 	     rxfh.input_xfrm == RXH_XFRM_NO_CHANGE))
> 		return -EINVAL;
> 
> When using a recent kernel with an old userspace ethtool,
> rxfh.input_xfrm is treated as zero (which is different than
> RXH_XFRM_NO_CHANGE) and passes the check, whereas the same command with
> a recent userspace would result in an error.
> This also makes it so old userspace always disables input_xfrm
> unintentionally. I do not have any ideas on how to resolve this..
> 
> Regardless, I believe this check is wrong as it prevents us from
> creating RSS context with no parameters (i.e. 'ethtool -X eth0 context
> new', as done in selftests), it works by mistake with old userspace.
> I plan to submit a patch soon to skip this check in case of context
> creation.

I guess we just need to throw "&& !create" into the condition?
Sounds good! We should probably split the "actual invalid" from 
the "nothing specified" checks.

Also - curious what you'll put under Fixes, looks like a pretty 
ancient bug :)
Gal Pressman Aug. 6, 2024, 12:22 p.m. UTC | #3
On 06/08/2024 1:13, Jakub Kicinski wrote:
> On Sun, 4 Aug 2024 09:08:50 +0300 Gal Pressman wrote:
>>>    The only question here is how to handle all the tricky IOCTL
>>>    legacy. "No change" maps trivially to attribute not present.
>>>    "reset" (indir_size = 0) probably needs to be a new NLA_FLAG?  
>>
>> FWIW, we have an incompatibility issue with the recent rxfh.input_xfrm
>> parameter.
>>
>> In ethtool_set_rxfh():
>> 	/* If either indir, hash key or function is valid, proceed further.
>> 	 * Must request at least one change: indir size, hash key, function
>> 	 * or input transformation.
>> 	 */
>> 	if ((rxfh.indir_size &&
>> 	     rxfh.indir_size != ETH_RXFH_INDIR_NO_CHANGE &&
>> 	     rxfh.indir_size != dev_indir_size) ||
>> 	    (rxfh.key_size && (rxfh.key_size != dev_key_size)) ||
>> 	    (rxfh.indir_size == ETH_RXFH_INDIR_NO_CHANGE &&
>> 	     rxfh.key_size == 0 && rxfh.hfunc == ETH_RSS_HASH_NO_CHANGE &&
>> 	     rxfh.input_xfrm == RXH_XFRM_NO_CHANGE))
>> 		return -EINVAL;
>>
>> When using a recent kernel with an old userspace ethtool,
>> rxfh.input_xfrm is treated as zero (which is different than
>> RXH_XFRM_NO_CHANGE) and passes the check, whereas the same command with
>> a recent userspace would result in an error.
>> This also makes it so old userspace always disables input_xfrm
>> unintentionally. I do not have any ideas on how to resolve this..
>>
>> Regardless, I believe this check is wrong as it prevents us from
>> creating RSS context with no parameters (i.e. 'ethtool -X eth0 context
>> new', as done in selftests), it works by mistake with old userspace.
>> I plan to submit a patch soon to skip this check in case of context
>> creation.
> 
> I guess we just need to throw "&& !create" into the condition?
> Sounds good! 

Yes.

> We should probably split the "actual invalid" from 
> the "nothing specified" checks.

And make the "no change" check return zero?

> 
> Also - curious what you'll put under Fixes, looks like a pretty 
> ancient bug :)

Maybe 84a1d9c48200 ("net: ethtool: extend RXNFC API to support RSS
spreading of filter matches")?
Jakub Kicinski Aug. 6, 2024, 2:20 p.m. UTC | #4
On Tue, 6 Aug 2024 15:22:07 +0300 Gal Pressman wrote:
> > I guess we just need to throw "&& !create" into the condition?
> > Sounds good!   
> 
> Yes.
> 
> > We should probably split the "actual invalid" from 
> > the "nothing specified" checks.  
> 
> And make the "no change" check return zero?

My knee jerk reaction would be to keep the error return code.
But I guess one could argue in either direction.

> > Also - curious what you'll put under Fixes, looks like a pretty 
> > ancient bug :)  
> 
> Maybe 84a1d9c48200 ("net: ethtool: extend RXNFC API to support RSS
> spreading of filter matches")?

Nod.
Gal Pressman Aug. 6, 2024, 3:14 p.m. UTC | #5
On 06/08/2024 17:20, Jakub Kicinski wrote:
> On Tue, 6 Aug 2024 15:22:07 +0300 Gal Pressman wrote:
>>> I guess we just need to throw "&& !create" into the condition?
>>> Sounds good!   
>>
>> Yes.
>>
>>> We should probably split the "actual invalid" from 
>>> the "nothing specified" checks.  
>>
>> And make the "no change" check return zero?
> 
> My knee jerk reaction would be to keep the error return code.
> But I guess one could argue in either direction.

Yea, maybe it's better to not risk upsetting users with a behavior change.
I'll start with a net submission for the fix, then figure out what
should be done for net-next.

Repeating what I said in my earlier mail, I do not think there's a way
to actually solve the compatibility issue. There is no way for the
kernel to differentiate between old userspace that is not aware of
input_xfrm vs. new userspace that explicitly sets it to zero, I guess
we're stuck with this bug :\.