Message ID | 20240529031628.324117-1-jdamato@fastly.com (mailing list archive) |
---|---|
Headers | show |
Series | mlx5: Add netdev-genl queue stats | expand |
On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote: > Worth noting that Tariq suggested I also export HTB/QOS stats in > mlx5e_get_base_stats. Why to base, and not report them as queue stats? Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in ../mlx5/core/en/htb.c it seems that the driver will update the real_num_tx_queues accordingly. And from mlx5e_qid_from_qos() it seems like the inverse calculation is: i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params) But really, isn't it enough to use priv->txq2sq[i] for the active queues, and not active ones you've already covered? > I am open to doing this, but I think if I were to do that, HTB/QOS queue > stats should also be exported by rtnl so that the script above will > continue to show that the output is correct. > > I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl > code together at the same time, but a later time, separate from this > change. Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong.
On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote: > On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote: > > Worth noting that Tariq suggested I also export HTB/QOS stats in > > mlx5e_get_base_stats. > > Why to base, and not report them as queue stats? > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in > ../mlx5/core/en/htb.c it seems that the driver will update the > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos() > it seems like the inverse calculation is: > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params) > > But really, isn't it enough to use priv->txq2sq[i] for the active > queues, and not active ones you've already covered? This is what I proposed in the thread for the v2, but Tariq suggested a different approach he liked more, please see this message for more details: https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/ I attempted to implement option 1 as he described in his message. > > I am open to doing this, but I think if I were to do that, HTB/QOS queue > > stats should also be exported by rtnl so that the script above will > > continue to show that the output is correct. > > > > I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl > > code together at the same time, but a later time, separate from this > > change. > > Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong. As far as I can tell (and I could be wrong) I didn't see them included in the rtnl stats, but I'll take another look now.
On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote: > > Why to base, and not report them as queue stats? > > > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in > > ../mlx5/core/en/htb.c it seems that the driver will update the > > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos() > > it seems like the inverse calculation is: > > > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params) > > > > But really, isn't it enough to use priv->txq2sq[i] for the active > > queues, and not active ones you've already covered? > > This is what I proposed in the thread for the v2, but Tariq > suggested a different approach he liked more, please see this > message for more details: > > https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/ > > I attempted to implement option 1 as he described in his message. I see, although it sounds like option 2 would also work. Saeed can you shine any light here? I'm worried Tariq is already AFK for the weekend and we'll make no progress until Monday...
On Thu, May 30, 2024 at 05:39:02PM -0700, Jakub Kicinski wrote: > On Thu, 30 May 2024 17:15:25 -0700 Joe Damato wrote: > > > Why to base, and not report them as queue stats? > > > > > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in > > > ../mlx5/core/en/htb.c it seems that the driver will update the > > > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos() > > > it seems like the inverse calculation is: > > > > > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params) > > > > > > But really, isn't it enough to use priv->txq2sq[i] for the active > > > queues, and not active ones you've already covered? > > > > This is what I proposed in the thread for the v2, but Tariq > > suggested a different approach he liked more, please see this > > message for more details: > > > > https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/ > > > > I attempted to implement option 1 as he described in his message. > > I see, although it sounds like option 2 would also work. I don't really mind either way; from Tariq's message it sounded like he preferred option 1, so I tried to implement that thinking that it would be my best bet at getting this done. If option 2 is easier/preferred for some reason... it seems like (other than the locking I forgot to include) the base implementation in v2 was correct and I could use what I proposed in the thread for the tx stats, which was: mutex_lock(&priv->state_lock); if (priv->channels.num > 0) { sq = priv->txq2sq[i]; stats->packets = sq->stats->packets; stats->bytes = sq->stats->bytes; } mutex_unlock(&priv->state_lock); And I would have implemented option 2... IIUC.
On Thu, May 30, 2024 at 05:15:25PM -0700, Joe Damato wrote: > On Thu, May 30, 2024 at 05:11:28PM -0700, Jakub Kicinski wrote: > > On Wed, 29 May 2024 03:16:25 +0000 Joe Damato wrote: > > > Worth noting that Tariq suggested I also export HTB/QOS stats in > > > mlx5e_get_base_stats. > > > > Why to base, and not report them as queue stats? > > > > Judging by mlx5e_update_tx_netdev_queues() calls sprinkled in > > ../mlx5/core/en/htb.c it seems that the driver will update the > > real_num_tx_queues accordingly. And from mlx5e_qid_from_qos() > > it seems like the inverse calculation is: > > > > i - (chs->params.num_channels + is_ptp)*mlx5e_get_dcb_num_tc(&chs->params) > > > > But really, isn't it enough to use priv->txq2sq[i] for the active > > queues, and not active ones you've already covered? > > This is what I proposed in the thread for the v2, but Tariq > suggested a different approach he liked more, please see this > message for more details: > > https://lore.kernel.org/netdev/68225941-f3c3-4335-8f3d-edee43f59033@gmail.com/ > > I attempted to implement option 1 as he described in his message. > > > > I am open to doing this, but I think if I were to do that, HTB/QOS queue > > > stats should also be exported by rtnl so that the script above will > > > continue to show that the output is correct. > > > > > > I'd like to propose: adding HTB/QOS to both rtnl *and* the netdev-genl > > > code together at the same time, but a later time, separate from this > > > change. > > > > Hm, are HTB queues really not counted in rtnl? That'd be pretty wrong. > > As far as I can tell (and I could be wrong) I didn't see them > included in the rtnl stats, but I'll take another look now. I looked and it seems like the htb stats are included in ethtool's output if you look at mlx5/core/en_stats.c, it appears that mlx5e_stats_grp_sw_update_stats_qos rolls up the htb stats. However, the rtnl stats, seem to be computed in mlx5/core/en_main.c via mlx5e_get_stats calling mlx5e_fold_sw_stats64, which appears to gather stats for rx, tx, and ptp, but it doesn't seem like qos/htb is handled. Unless I am missing something, I think mlx5e_fold_sw_stats64 would need code similar to mlx5e_stats_grp_sw_update_stats_qos and then rtnl would account for htb stats. That said: since it seems the htb numbers are not included right now, I was proposing adding that in later both to rtnl and netdev-genl together, hoping that would keep the proposed simpler/easier to get accepted. LMK what you think.
On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote: > Unless I am missing something, I think mlx5e_fold_sw_stats64 would > need code similar to mlx5e_stats_grp_sw_update_stats_qos and then > rtnl would account for htb stats. Hm, I think you're right. I'm just surprised this could have gone unnoticed for so long. > That said: since it seems the htb numbers are not included right > now, I was proposing adding that in later both to rtnl and > netdev-genl together, hoping that would keep the proposed > simpler/easier to get accepted. SGTM.
On Thu, May 30, 2024 at 06:26:30PM -0700, Jakub Kicinski wrote: > On Thu, 30 May 2024 18:07:31 -0700 Joe Damato wrote: > > Unless I am missing something, I think mlx5e_fold_sw_stats64 would > > need code similar to mlx5e_stats_grp_sw_update_stats_qos and then > > rtnl would account for htb stats. > > Hm, I think you're right. I'm just surprised this could have gone > unnoticed for so long. > > > That said: since it seems the htb numbers are not included right > > now, I was proposing adding that in later both to rtnl and > > netdev-genl together, hoping that would keep the proposed > > simpler/easier to get accepted. > > SGTM. Cool, so based on that it seems like I just need to figure out the correct implementation for base and tx stats that is correct and that will be accepted. Hoping to hear back from them soon as I'd personally love to have this API available on our systems.