[net-next,V2,0/6] net/mlx5: hw counters refactor

Message ID 20241001103709.58127-1-tariqt@nvidia.com (mailing list archive)

Message

Tariq Toukan Oct. 1, 2024, 10:37 a.m. UTC
This is a patchset re-post, see:
https://lore.kernel.org/netdev/20240815054656.2210494-7-tariqt@nvidia.com/T/

In this patchset, Cosmin refactors hw counters and solves a perf scaling
issue.

Series generated against:
commit c824deb1a897 ("cxgb4: clip_tbl: Fix spelling mistake "wont" -> "won't"")

HW counters are central to mlx5 driver operations. They are hardware
objects created and used alongside most steering operations, and queried
from a variety of places. Most counters are queried in bulk from a
periodic task in fs_counters.c.

Counter performance is important, and as such a variety of improvements
have been made over the years. Currently, counters are allocated from
pools, which are bulk allocated to amortize the cost of firmware
commands. Counters are managed through an IDR, a doubly linked list and
two atomic singly linked lists. Adding/removing counters is a complex
dance between user contexts requesting it and the mlx5_fc_stats_work
task which does most of the work.
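
To illustrate the amortization idea only, here is a minimal userspace
sketch. It is not the mlx5 code: the demo_* names, the bulk size of 64
and the fw_cmds counter standing in for firmware commands are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_BULK_LEN  64	/* assumed number of counters per bulk */
#define DEMO_MAX_BULKS 16	/* assumed pool capacity, in bulks */

/* Hypothetical pool: one bitmask of free counters per allocated bulk. */
struct demo_pool {
	uint64_t free_mask[DEMO_MAX_BULKS];	/* bit i set => counter i is free */
	unsigned int nbulks;			/* bulks allocated so far */
	unsigned int fw_cmds;			/* simulated firmware commands issued */
};

/* Return a counter id, allocating a new bulk only when all are in use. */
static int demo_pool_get(struct demo_pool *p)
{
	unsigned int b;

	for (b = 0; b < p->nbulks; b++) {
		if (p->free_mask[b]) {
			/* __builtin_ctzll is a GCC/Clang builtin: lowest set bit. */
			int bit = __builtin_ctzll(p->free_mask[b]);

			p->free_mask[b] &= ~(1ULL << bit);
			return (int)(b * DEMO_BULK_LEN) + bit;
		}
	}

	if (p->nbulks == DEMO_MAX_BULKS)
		return -1;

	/* One simulated firmware command creates DEMO_BULK_LEN counters. */
	p->fw_cmds++;
	b = p->nbulks++;
	p->free_mask[b] = UINT64_MAX & ~1ULL;	/* hand out counter 0 of the bulk */
	return (int)(b * DEMO_BULK_LEN);
}

/* Return a counter to its bulk; no firmware command is needed. */
static void demo_pool_put(struct demo_pool *p, int id)
{
	assert(id >= 0 && (unsigned int)id < p->nbulks * DEMO_BULK_LEN);
	p->free_mask[id / DEMO_BULK_LEN] |= 1ULL << (id % DEMO_BULK_LEN);
}
```

In this sketch, handing out 100 counters costs two simulated firmware
commands instead of 100, and a released counter is reused without
issuing another one.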

Under high load (e.g. from connection tracking flow insertion/deletion),
the counter code becomes a bottleneck, as seen on flame graphs. Whenever
a counter is deleted, it gets added to a list and the wq task is
scheduled to run immediately to actually delete it. This is done via
mod_delayed_work which uses an internal spinlock. In some tests, waiting
for this spinlock took up to 66% of all samples.

This series refactors the counter code to use a more straightforward
approach, avoiding the mod_delayed_work problem and making the code
easier to understand. For that:

- patch #1 moves counters data structs to a more appropriate place.
- patch #2 simplifies the bulk query allocation scheme by using kvmalloc.
- patch #3 replaces the IDR+3 lists with an xarray. This is the main
  patch of the series, solving the spinlock congestion issue.
- patch #4 removes an unnecessary cacheline alignment causing a lot of
  memory to be wasted.
- patches #5 and #6 are small cleanups enabled by the refactoring.
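
As a rough illustration of the memory cost that patch #4 refers to, the
userspace sketch below compares a small struct with and without a
cacheline-aligned member. The demo_* structs are hypothetical and do not
reproduce the real counter layout; the 64-byte cacheline is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHELINE 64	/* assumed cacheline size in bytes */

/* Hypothetical counter without forced alignment. */
struct demo_fc_packed {
	uint32_t id;
	uint64_t packets;
	uint64_t bytes;
	uint64_t lastuse;
};

/* Same fields, but the stats start on their own cacheline. */
struct demo_fc_aligned {
	uint32_t id;
	_Alignas(DEMO_CACHELINE) uint64_t packets;
	uint64_t bytes;
	uint64_t lastuse;
};

/* Extra bytes spent on padding for ncounters aligned counters. */
static size_t demo_wasted_bytes(size_t ncounters)
{
	return ncounters * (sizeof(struct demo_fc_aligned) -
			    sizeof(struct demo_fc_packed));
}
```

On a typical x86-64 ABI the unaligned struct takes 32 bytes and the
aligned one 128, so under these assumptions each counter would carry 96
bytes of padding.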

Regards,
Tariq

V2:
- no changes, re-posting.

Cosmin Ratiu (6):
  net/mlx5: hw counters: Make fc_stats & fc_pool private
  net/mlx5: hw counters: Use kvmalloc for bulk query buffer
  net/mlx5: hw counters: Replace IDR+lists with xarray
  net/mlx5: hw counters: Drop unneeded cacheline alignment
  net/mlx5: hw counters: Don't maintain a counter count
  net/mlx5: hw counters: Remove mlx5_fc_create_ex

 .../ethernet/mellanox/mlx5/core/en/tc_ct.c    |   2 +-
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 387 +++++++-----------
 include/linux/mlx5/driver.h                   |  33 +-
 include/linux/mlx5/fs.h                       |   3 -
 4 files changed, 147 insertions(+), 278 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org Oct. 4, 2024, 6:50 p.m. UTC | #1
Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 1 Oct 2024 13:37:03 +0300 you wrote:
> This is a patchset re-post, see:
> https://lore.kernel.org/netdev/20240815054656.2210494-7-tariqt@nvidia.com/T/
> 
> In this patchset, Cosmin refactors hw counters and solves perf scaling
> issue.
> 
> Series generated against:
> commit c824deb1a897 ("cxgb4: clip_tbl: Fix spelling mistake "wont" -> "won't"")
> 
> [...]

Here is the summary with links:
  - [net-next,V2,1/6] net/mlx5: hw counters: Make fc_stats & fc_pool private
    https://git.kernel.org/netdev/net-next/c/5acd957a986c
  - [net-next,V2,2/6] net/mlx5: hw counters: Use kvmalloc for bulk query buffer
    https://git.kernel.org/netdev/net-next/c/10cd92df833c
  - [net-next,V2,3/6] net/mlx5: hw counters: Replace IDR+lists with xarray
    https://git.kernel.org/netdev/net-next/c/918af0219a4d
  - [net-next,V2,4/6] net/mlx5: hw counters: Drop unneeded cacheline alignment
    https://git.kernel.org/netdev/net-next/c/d95f77f1196a
  - [net-next,V2,5/6] net/mlx5: hw counters: Don't maintain a counter count
    https://git.kernel.org/netdev/net-next/c/4a67ebf85f38
  - [net-next,V2,6/6] net/mlx5: hw counters: Remove mlx5_fc_create_ex
    https://git.kernel.org/netdev/net-next/c/d1c9cffe4b01

You are awesome, thank you!