diff mbox series

[net-queue,bugfix,RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset

Message ID: 1665701671-6353-1-git-send-email-jdamato@fastly.com (mailing list archive)
State: RFC
Delegated to: Netdev Maintainers
Series: [net-queue,bugfix,RFC] i40e: Clear IFF_RXFH_CONFIGURED when RSS is reset

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Single patches do not need cover letters
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           fail     1 blamed authors not CCed: helin.zhang@intel.com; 3 maintainers not CCed: helin.zhang@intel.com edumazet@google.com pabeni@redhat.com
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 8 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Joe Damato Oct. 13, 2022, 10:54 p.m. UTC
Sending this first as an RFC to ensure that this is the correct and desired
behavior when changing queue counts and flowhashes in i40e.

If this is approved, I can send an official "v1".

Before this change, reconfiguring the queue count using ethtool doesn't
always work, even for queue counts that were previously accepted, because
the IFF_RXFH_CONFIGURED bit is not cleared when the flow indirection hash
is cleared by the driver.

For example:

$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 34 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33     0     1     2     3     4     5
[...snip...]

As you can see, the flow indirection hash distributes flows to 34 queues.
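
For illustration, the round-robin layout shown above can be modelled in
userspace roughly as follows. This is only a sketch: fill_default_indir()
is a hypothetical helper written for this example, not a function from
i40e or the ethtool core, and the table size is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill an indirection table round-robin across
 * n_rx_rings queues, so that entry i points at queue (i % n_rx_rings).
 */
static void fill_default_indir(uint32_t *table, size_t size,
			       uint32_t n_rx_rings)
{
	assert(n_rx_rings > 0);

	for (size_t i = 0; i < size; i++)
		table[i] = (uint32_t)(i % n_rx_rings);
}
```

With 34 rings, entries 0 through 33 map to queues 0 through 33 and entry
34 wraps back to queue 0, which is the pattern in the "32:" row above.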

Increasing the number of queues from 34 to 64 works, and the flow
indirection hash is reset automatically:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
   32:     32    33    34    35    36    37    38    39
   40:     40    41    42    43    44    45    46    47
   48:     48    49    50    51    52    53    54    55
   56:     56    57    58    59    60    61    62    63

However, reducing the queue count back to 34 (which previously worked)
fails:

$ sudo ethtool -L eth0 combined 34
Cannot set device channel parameters: Invalid argument

This happens because the kernel thinks that the user configured the flow
hash (since the IFF_RXFH_CONFIGURED bit is not cleared by the driver when
the driver resets it) and thus returns -EINVAL, presumably to prevent the
driver from resizing the queues and resetting the user-defined flowhash.
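
A simplified userspace model of the kind of check described above might
look like the sketch below. The struct and function names are
hypothetical and exist only for this example; the real check lives in the
kernel's ethtool code and may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for a net device's RSS state. */
struct rss_model {
	bool rxfh_configured;	/* models IFF_RXFH_CONFIGURED */
	const uint32_t *indir;	/* indirection table (queue indexes) */
	size_t indir_size;
};

/* Highest queue index referenced by the indirection table. */
static uint32_t max_queue_in_use(const struct rss_model *m)
{
	uint32_t max = 0;

	for (size_t i = 0; i < m->indir_size; i++)
		if (m->indir[i] > max)
			max = m->indir[i];
	return max;
}

/* Reject a new RX queue count that would leave entries of a table marked
 * as user-configured pointing at queues that no longer exist.
 */
static int check_new_rx_count(const struct rss_model *m, uint32_t new_count)
{
	assert(m->indir != NULL && m->indir_size > 0);

	if (m->rxfh_configured && new_count <= max_queue_in_use(m))
		return -EINVAL;
	return 0;
}
```

In this model, a table that spans queues 0 to 63 with the flag still set
rejects a request for 34 queues, while the same table with the flag
cleared accepts it.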

With this patch applied, the queue count can be reduced below the number
of queues the flow indirection hash currently distributes flows to,
provided the flow indirection hash was not modified by the user.
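
The effect of the change can be sketched in userspace as below. The flag
values and struct vsi_model are stand-ins invented for this example (the
real IFF_RXFH_CONFIGURED is defined in the kernel's netdevice headers),
and only the LUT handling visible in the diff is modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bit values, for illustration only. */
#define MODEL_RXFH_CONFIGURED	(1u << 0)
#define MODEL_OTHER_FLAG	(1u << 1)

/* Hypothetical, heavily reduced stand-in for the VSI and its netdev. */
struct vsi_model {
	uint32_t priv_flags;
	uint8_t *rss_lut_user;	/* user-supplied table, or NULL */
};

/* Drop the stored user table and, as the patch proposes, also clear the
 * "user configured" bit while leaving every other flag untouched.
 */
static void clear_rss_config_user(struct vsi_model *vsi)
{
	assert(vsi != NULL);

	free(vsi->rss_lut_user);
	vsi->rss_lut_user = NULL;

	vsi->priv_flags &= ~MODEL_RXFH_CONFIGURED;
}
```

The mask-and-complement form only touches the one bit, so unrelated
private flags survive the reset.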

For example, with the patch applied:

$ sudo ethtool -L eth0 combined 32
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 32 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23
   24:     24    25    26    27    28    29    30    31
[..snip..]

I can now reduce the queue count to below 32 without error (unlike earlier):

$ sudo ethtool -L eth0 combined 24
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19    20    21    22    23

This works because I was using the default flow hash, so the driver discards
it and regenerates it.

However, if I manually set the flow hash to some user-defined value:

$ sudo ethtool -X eth0 equal 20
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 24 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3
[..snip..]

I am now unable to shrink the queue count again:

$ sudo ethtool -L eth0 combined 16
Cannot set device channel parameters: Invalid argument

But, I can increase the queue count and the flow hash is preserved:

$ sudo ethtool -L eth0 combined 64
$ sudo ethtool -x eth0
RX flow hash indirection table for eth0 with 64 RX ring(s):
    0:      0     1     2     3     4     5     6     7
    8:      8     9    10    11    12    13    14    15
   16:     16    17    18    19     0     1     2     3
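
The preserve-or-regenerate behavior seen in these examples can be
modelled roughly as follows. This is a sketch under the assumption that a
stored user table, when present, is reused as-is on rebuild;
rebuild_lut() is a hypothetical name and not the driver's actual
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical rebuild step: reuse the user table if one is stored,
 * otherwise spread entries round-robin over the new queue count.
 */
static void rebuild_lut(uint8_t *lut, size_t size,
			const uint8_t *user_lut, uint32_t n_queues)
{
	assert(n_queues > 0);

	if (user_lut) {
		memcpy(lut, user_lut, size);
		return;
	}

	for (size_t i = 0; i < size; i++)
		lut[i] = (uint8_t)(i % n_queues);
}
```

With a user table built by "equal 20", growing to 64 queues keeps entry
20 pointing at queue 0; with no user table, the entries are regenerated
for the new count.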

Fixes: 28c5869f2bc4 ("i40e: add new fields to store user configuration")
Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Jakub Kicinski Oct. 17, 2022, 7:45 p.m. UTC | #1
On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> Before this change, reconfiguring the queue count using ethtool doesn't
> always work, even for queue counts that were previously accepted because
> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> is cleared by the driver.

It's not cleared, but when was it set? Could you describe the flow that
gets this bit set in a bit more detail?

Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
acceptable on error recovery paths, and should come with a "this should
never happen" warning.

> For example:
> 
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 34 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33     0     1     2     3     4     5
> [...snip...]
> 
> As you can see, the flow indirection hash distributes flows to 34 queues.
> 
> Increasing the number of queues from 34 to 64 works, and the flow
> indirection hash is reset automatically:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19    20    21    22    23
>    24:     24    25    26    27    28    29    30    31
>    32:     32    33    34    35    36    37    38    39
>    40:     40    41    42    43    44    45    46    47
>    48:     48    49    50    51    52    53    54    55
>    56:     56    57    58    59    60    61    62    63

This is odd: if IFF_RXFH_CONFIGURED is set, the driver should not
re-initialize the indirection table. I believe that is what
you describe at the end of your message:

> But, I can increase the queue count and the flow hash is preserved:
> 
> $ sudo ethtool -L eth0 combined 64
> $ sudo ethtool -x eth0
> RX flow hash indirection table for eth0 with 64 RX ring(s):
>     0:      0     1     2     3     4     5     6     7
>     8:      8     9    10    11    12    13    14    15
>    16:     16    17    18    19     0     1     2     3

Jacob Keller Oct. 17, 2022, 8:25 p.m. UTC | #2
On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
>> Before this change, reconfiguring the queue count using ethtool doesn't
>> always work, even for queue counts that were previously accepted because
>> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
>> is cleared by the driver.
> 
> It's not cleared but when was it set? Could you describe the flow that
> gets us to this set a bit more?
> 
> Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> acceptable on error recovery paths, and should come with a "this should
> never happen" warning.
> 

Correct. The whole point of IFF_RXFH_CONFIGURED is to let the
driver know whether the current config is the default or a
user-specified value. If this flag is set, we should not be changing the
config except in exceptional circumstances.

>> For example:
>>
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 34 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33     0     1     2     3     4     5
>> [...snip...]
>>
>> As you can see, the flow indirection hash distributes flows to 34 queues.
>>
>> Increasing the number of queues from 34 to 64 works, and the flow
>> indirection hash is reset automatically:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19    20    21    22    23
>>    24:     24    25    26    27    28    29    30    31
>>    32:     32    33    34    35    36    37    38    39
>>    40:     40    41    42    43    44    45    46    47
>>    48:     48    49    50    51    52    53    54    55
>>    56:     56    57    58    59    60    61    62    63
> 
> This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> re-initialize the indirection table. Which I believe is what
> you describe at the end of your message:
> 

Right. It seems like the driver should actually be checking this flag
somewhere else and preventing the flow where we clear the indirection
table...

According to your report here, we do check it in at least some places,
but perhaps there is a gap....

>> But, I can increase the queue count and the flow hash is preserved:
>>
>> $ sudo ethtool -L eth0 combined 64
>> $ sudo ethtool -x eth0
>> RX flow hash indirection table for eth0 with 64 RX ring(s):
>>     0:      0     1     2     3     4     5     6     7
>>     8:      8     9    10    11    12    13    14    15
>>    16:     16    17    18    19     0     1     2     3

Joe Damato Oct. 17, 2022, 8:36 p.m. UTC | #3
On Mon, Oct 17, 2022 at 01:25:39PM -0700, Jacob Keller wrote:
> 
> 
> On 10/17/2022 12:45 PM, Jakub Kicinski wrote:
> > On Thu, 13 Oct 2022 15:54:31 -0700 Joe Damato wrote:
> >> Before this change, reconfiguring the queue count using ethtool doesn't
> >> always work, even for queue counts that were previously accepted because
> >> the IFF_RXFH_CONFIGURED bit was not cleared when the flow indirection hash
> >> is cleared by the driver.
> > 
> > It's not cleared but when was it set? Could you describe the flow that
> > gets us to this set a bit more?
> > 
> > Normally clearing the IFF_RXFH_CONFIGURED in the driver is _only_
> > acceptable on error recovery paths, and should come with a "this should
> > never happen" warning.
> > 
> 
> Correct. The whole point of IFF_RXFH_CONFIGURED is to be able for the
> driver to know whether or not the current config was the default or a
> user specified value. If this flag is set, we should not be changing the
> config except in exceptional circumstances.
> 
> >> For example:
> >>
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 34 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33     0     1     2     3     4     5
> >> [...snip...]
> >>
> >> As you can see, the flow indirection hash distributes flows to 34 queues.
> >>
> >> Increasing the number of queues from 34 to 64 works, and the flow
> >> indirection hash is reset automatically:
> >>
> >> $ sudo ethtool -L eth0 combined 64
> >> $ sudo ethtool -x eth0
> >> RX flow hash indirection table for eth0 with 64 RX ring(s):
> >>     0:      0     1     2     3     4     5     6     7
> >>     8:      8     9    10    11    12    13    14    15
> >>    16:     16    17    18    19    20    21    22    23
> >>    24:     24    25    26    27    28    29    30    31
> >>    32:     32    33    34    35    36    37    38    39
> >>    40:     40    41    42    43    44    45    46    47
> >>    48:     48    49    50    51    52    53    54    55
> >>    56:     56    57    58    59    60    61    62    63
> > 
> > This is odd, if IFF_RXFH_CONFIGURED is set driver should not
> > re-initialize the indirection table. Which I believe is what
> > you describe at the end of your message:
> > 
> 
> Right. It seems like the driver should actually be checking this flag
> somewhere else and preventing the flow where we clear the indirection
> table...
> 
> We are at least in some places according to your report here, but
> perhaps there is a gap....

Thanks for the comments / information. I noticed that one other driver
(mlx5) tweaks this bit, which is what led me down this rabbit hole.

I'll have to re-read the i40e code and re-run some experiments with the
queue count and flow hash to get a better understanding of the current
behavior and verify/double check the results.

I'll follow up with an email to intel-wired-lan about the current
(unpatched) behavior I'm seeing with i40e to double check if there's
a bug or if I've simply made a mistake somewhere in my testing.

I did run the experiments a few times, so it is possible I got into some
weird state. It is worth revisiting fresh from a reboot with a kernel built
from net-next.

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index feabd26..0e8dca7 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -11522,6 +11522,8 @@  static void i40e_clear_rss_config_user(struct i40e_vsi *vsi)
 
 	kfree(vsi->rss_lut_user);
 	vsi->rss_lut_user = NULL;
+
+	vsi->netdev->priv_flags &= ~IFF_RXFH_CONFIGURED;
 }
 
 /**