mbox series

[RFC,net-next,v3,0/2] mlx5: Add netdev-genl queue stats

Message ID 20240529031628.324117-1-jdamato@fastly.com (mailing list archive)
Headers show
Series mlx5: Add netdev-genl queue stats | expand

Message

Joe Damato May 29, 2024, 3:16 a.m. UTC
Greetings:

Switching to an RFC instead of a PATCH because even though Tariq
patiently explained the code to me, I'm sure I probably still missed
something ;)

If this turns out to be right and Tariq agrees, I can send a PATCH
net-next v4.

This change adds support for the per queue netdev-genl API to mlx5,
which seems to output stats:

./cli.py --spec ../../../Documentation/netlink/specs/netdev.yaml \
         --dump qstats-get --json '{"scope": "queue"}'

...snip
 {'ifindex': 7,
  'queue-id': 28,
  'queue-type': 'tx',
  'tx-bytes': 399462,
  'tx-packets': 3311},
...snip

I've used the suggested tooling to verify the per queue stats match
rtnl by doing this:

  NETIF=eth0 tools/testing/selftests/drivers/net/stats.py

I've tested the following scenarios:
  - The machine at boot (default queue configuration)
  - Adjusting the queue configuration to various amounts via ethtool
  - Add mqprio TCs
  - Removing the mqprio TCs

and in each scenario the stats script above reports that the stats match
rtnl.

Worth noting that Tariq suggested I also export HTB/QOS stats in
mlx5e_get_base_stats.

I am open to doing this, but I think if I were to do that, HTB/QOS queue
stats should also be exported by rtnl so that the script above will
continue to show that the output is correct.

I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl
code together at the same time, but a later time, separate from this
change.

Thanks,
Joe

v2 -> rfcv3:
 - Added patch 1/2 which creates some helpers for computing the txq_ix
   and ch_ix/tc_ix.

 - Patch 2/2 modified in several ways:
   - Fixed variable declarations in mlx5e_get_queue_stats_rx to be at
     the start of the function.
   - mlx5e_get_queue_stats_tx rewritten to access sq stats directly by
     using the helpers added in the previous patch.
   - mlx5e_get_base_stats modified in several ways:
     - Took the state_lock when accessing priv->channels.
     - For the base RX stats, code was simplified to call
       mlx5e_get_queue_stats_rx instead of repeating the same code.
     - For the base TX stats, I attempted to implement what I think
       Tariq suggested in the previous thread:
         - for available channels, only unavailable TC stats are summed
	 - for unavailable channels, all stats for TCs up to
	   max_opened_tc are summed.

v1 - > v2:
  - Essentially a full rewrite after comments from Jakub, Tariq, and
    Zhu.

Joe Damato (2):
  net/mlx5e: Add helpers to calculate txq and ch idx
  net/mlx5e: Add per queue netdev-genl stats

 .../net/ethernet/mellanox/mlx5/core/en_main.c | 150 +++++++++++++++++-
 1 file changed, 149 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski May 31, 2024, 12:11 a.m. UTC | #1
On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> Worth noting that Tariq suggested I also export HTB/QOS stats in
> mlx5e_get_base_stats.

Why to base, and not report them as queue stats?

Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
../mlx5/core/en/htb.c it seems that the driver will update the
real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
it seems like the inverse calculation is:

i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)

But really, isn't it enough to use priv->txq2sq[i] for the active
queues, and not active ones you've already covered?

> I am open to doing this, but I think if I were to do that, HTB/QOS queue
> stats should also be exported by rtnl so that the script above will
> continue to show that the output is correct.
> 
> I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl
> code together at the same time, but a later time, separate from this
> change.

Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.
Joe Damato May 31, 2024, 12:15 a.m. UTC | #2
On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote:
> On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> > Worth noting that Tariq suggested I also export HTB/QOS stats in
> > mlx5e_get_base_stats.
> 
> Why to base, and not report them as queue stats?
> 
> Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> ../mlx5/core/en/htb.c it seems that the driver will update the
> real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> it seems like the inverse calculation is:
> 
> i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> 
> But really, isn't it enough to use priv->txq2sq[i] for the active
> queues, and not active ones you've already covered?

This is what I proposed in the thread for the v2, but Tariq
suggested a different approach he liked more, please see this
message for more details:

  https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/

I attempted to implement option 1 as he described in his message.
 
> > I am open to doing this, but I think if I were to do that, HTB/QOS queue
> > stats should also be exported by rtnl so that the script above will
> > continue to show that the output is correct.
> > 
> > I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl
> > code together at the same time, but a later time, separate from this
> > change.
> 
> Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.

As far as I can tell (and I could be wrong) I didn't see them
included in the rtnl stats, but I'll take another look now.
Jakub Kicinski May 31, 2024, 12:39 a.m. UTC | #3
On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote:
> > Why to base, and not report them as queue stats?
> > 
> > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > ../mlx5/core/en/htb.c it seems that the driver will update the
> > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > it seems like the inverse calculation is:
> > 
> > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > 
> > But really, isn't it enough to use priv->txq2sq[i] for the active
> > queues, and not active ones you've already covered?  
> 
> This is what I proposed in the thread for the v2, but Tariq
> suggested a different approach he liked more, please see this
> message for more details:
> 
>   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> 
> I attempted to implement option 1 as he described in his message.

I see, although it sounds like option 2 would also work.

Saeed can you shine any light here? I'm worried Tariq is already AFK
for the weekend and we'll make no progress until Monday...
Joe Damato May 31, 2024, 12:56 a.m. UTC | #4
On Thu, May 30, 2024 at 05:39:02PM -0700, Jakub Kicinski wrote:
> On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote:
> > > Why to base, and not report them as queue stats?
> > > 
> > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > > ../mlx5/core/en/htb.c it seems that the driver will update the
> > > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > > it seems like the inverse calculation is:
> > > 
> > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > > 
> > > But really, isn't it enough to use priv->txq2sq[i] for the active
> > > queues, and not active ones you've already covered?  
> > 
> > This is what I proposed in the thread for the v2, but Tariq
> > suggested a different approach he liked more, please see this
> > message for more details:
> > 
> >   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> > 
> > I attempted to implement option 1 as he described in his message.
> 
> I see, although it sounds like option 2 would also work.

I don't really mind either way; from Tariq's message it sounded like
he preferred option 1, so I tried to implement that thinking that it would be
my best bet at getting this done.

If option 2 is easier/preferred for some reason... it seems like
(other than the locking I forgot to include) the base implementation
in v2 was correct and I could use what I proposed in the thread for
the tx stats, which was:

  mutex_lock(&priv->state_lock);
  if (priv->channels.num > 0) {
          sq = priv->txq2sq[i];
          stats->packets = sq->stats->packets;
          stats->bytes = sq->stats->bytes;
  }
  mutex_unlock(&priv->state_lock);

And I would have implemented option 2... IIUC.
Joe Damato May 31, 2024, 1:07 a.m. UTC | #5
On Thu, May 30, 2024 at 05:15:25PM -0700, Joe Damato wrote:
> On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote:
> > On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote:
> > > Worth noting that Tariq suggested I also export HTB/QOS stats in
> > > mlx5e_get_base_stats.
> > 
> > Why to base, and not report them as queue stats?
> > 
> > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in
> > ../mlx5/core/en/htb.c it seems that the driver will update the
> > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos()
> > it seems like the inverse calculation is:
> > 
> > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params)
> > 
> > But really, isn't it enough to use priv->txq2sq[i] for the active
> > queues, and not active ones you've already covered?
> 
> This is what I proposed in the thread for the v2, but Tariq
> suggested a different approach he liked more, please see this
> message for more details:
> 
>   https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/
> 
> I attempted to implement option 1 as he described in his message.
>  
> > > I am open to doing this, but I think if I were to do that, HTB/QOS queue
> > > stats should also be exported by rtnl so that the script above will
> > > continue to show that the output is correct.
> > > 
> > > I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl
> > > code together at the same time, but a later time, separate from this
> > > change.
> > 
> > Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.
> 
> As far as I can tell (and I could be wrong) I didn't see them
> included in the rtnl stats, but I'll take another look now.

I looked and it seems like the htb stats are included in ethtool's
output if you look at mlx5/core/en_stats.c, it appears that
mlx5e_stats_grp_sw_update_stats_qos rolls up the htb stats.

However, the rtnl stats, seem to be computed in mlx5/core/en_main.c
via mlx5e_get_stats calling mlx5e_fold_sw_stats64, which appears to
gather stats for rx, tx, and ptp, but it doesn't seem like qos/htb
is handled.

Unless I am missing something, I think mlx5e_fold_sw_stats64 would
need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
rtnl would account for htb stats.

That said: since it seems the htb numbers are not included right
now, I was proposing adding that in later both to rtnl and
netdev-genl together, hoping that would keep the proposed
simpler/easier to get accepted.

LMK what you think.
Jakub Kicinski May 31, 2024, 1:26 a.m. UTC | #6
On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote:
> Unless I am missing something, I think mlx5e_fold_sw_stats64 would
> need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
> rtnl would account for htb stats.

Hm, I think you're right. I'm just surprised this could have gone
unnoticed for so long.

> That said: since it seems the htb numbers are not included right
> now, I was proposing adding that in later both to rtnl and
> netdev-genl together, hoping that would keep the proposed
> simpler/easier to get accepted.

SGTM.
Joe Damato May 31, 2024, 6:30 p.m. UTC | #7
On Thu, May 30, 2024 at 06:26:30PM -0700, Jakub Kicinski wrote:
> On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote:
> > Unless I am missing something, I think mlx5e_fold_sw_stats64 would
> > need code similar to mlx5e_stats_grp_sw_update_stats_qos and then
> > rtnl would account for htb stats.
> 
> Hm, I think you're right. I'm just surprised this could have gone
> unnoticed for so long.
> 
> > That said: since it seems the htb numbers are not included right
> > now, I was proposing adding that in later both to rtnl and
> > netdev-genl together, hoping that would keep the proposed
> > simpler/easier to get accepted.
> 
> SGTM.

Cool, so based on that it seems like I just need to figure out the
correct implementation for base and tx stats that is correct and
that will be accepted.

Hoping to hear back from them soon as I'd personally love to have
this API available on our systems.