diff mbox series

[net-next,10/15] net/mlx5e: Let channels be SD-aware

Message ID 20231221005721.186607-11-saeed@kernel.org (mailing list archive)
State Accepted
Commit e4f9686bdee7b4dd89e0ed63cd03606e4bda4ced
Delegated to: Netdev Maintainers
Headers show
Series [net-next,01/15] net/mlx5e: Use the correct lag ports number when creating TISes | expand

Checks

Context Check Description
netdev/series_format success Pull request is its own cover letter
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1116 this patch: 1116
netdev/cc_maintainers warning 2 maintainers not CCed: przemyslaw.kitszel@intel.com naveenm@marvell.com
netdev/build_clang fail Errors and warnings before: 12 this patch: 12
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1143 this patch: 1143
netdev/checkpatch warning WARNING: line length of 81 exceeds 80 columns WARNING: line length of 89 exceeds 80 columns
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Saeed Mahameed Dec. 21, 2023, 12:57 a.m. UTC
From: Tariq Toukan <tariqt@nvidia.com>

Distribute the channels between the different SD-devices to achieve
local NUMA node performance on multiple NUMA nodes.

Each channel works against one specific mdev, creating all datapath
queues against it.

We distribute channels to mdevs in a round-robin policy.

Example for 2 mdevs and 6 channels:
+-------+---------+
| ch ix | mdev ix |
+-------+---------+
|   0   |    0    |
|   1   |    1    |
|   2   |    0    |
|   3   |    1    |
|   4   |    0    |
|   5   |    1    |
+-------+---------+

This round-robin distribution policy is preferred over another suggested
intuitive distribution, in which we first distribute one half of the
channels to mdev #0 and then the second half to mdev #1.

We prefer round-robin for a reason: it is less influenced by changes in
the number of channels. The mapping between channel index and mdev is
fixed, no matter how many channels the user configures. As channel
stats persist across channel closure, changing the mapping on every
reconfiguration would make the accumulated stats less representative of
the channel's history.

Per-channel objects should stop using the primary mdev (priv->mdev)
directly, and instead move to using their own channel's mdev.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  1 +
 .../ethernet/mellanox/mlx5/core/en/params.c   |  2 +-
 .../net/ethernet/mellanox/mlx5/core/en/qos.c  |  8 ++---
 .../mellanox/mlx5/core/en/reporter_rx.c       |  4 +--
 .../mellanox/mlx5/core/en/reporter_tx.c       |  3 +-
 .../ethernet/mellanox/mlx5/core/en/xsk/pool.c |  6 ++--
 .../mellanox/mlx5/core/en_accel/ktls_rx.c     |  6 ++--
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 32 ++++++++++++-------
 8 files changed, 35 insertions(+), 27 deletions(-)

Comments

Jakub Kicinski Jan. 4, 2024, 10:50 p.m. UTC | #1
On Wed, 20 Dec 2023 16:57:16 -0800 Saeed Mahameed wrote:
> Example for 2 mdevs and 6 channels:
> +-------+---------+
> | ch ix | mdev ix |
> +-------+---------+
> |   0   |    0    |
> |   1   |    1    |
> |   2   |    0    |
> |   3   |    1    |
> |   4   |    0    |
> |   5   |    1    |
> +-------+---------+

Meaning Rx queue 0 goes to PF 0, Rx queue 1 goes to PF 1, etc.?
Is the user then expected to magic pixie dust the XPS or some such
to get to the right queue?

How is this going to get represented in the recently merged Netlink
queue API?
Gal Pressman Jan. 8, 2024, 12:30 p.m. UTC | #2
On 05/01/2024 0:50, Jakub Kicinski wrote:
> On Wed, 20 Dec 2023 16:57:16 -0800 Saeed Mahameed wrote:
>> Example for 2 mdevs and 6 channels:
>> +-------+---------+
>> | ch ix | mdev ix |
>> +-------+---------+
>> |   0   |    0    |
>> |   1   |    1    |
>> |   2   |    0    |
>> |   3   |    1    |
>> |   4   |    0    |
>> |   5   |    1    |
>> +-------+---------+
> 
> Meaning Rx queue 0 goes to PF 0, Rx queue 1 goes to PF 1, etc.?

Correct.

> Is the user then expected to magic pixie dust the XPS or some such
> to get to the right queue?

I'm confused, how are RX queues related to XPS?
XPS shouldn't be affected, we just make sure that whatever queue XPS
chose will go out through the "right" PF.

So for example, XPS will choose a queue according to the CPU, and the
driver will make sure that packets transmitted from this SQ are going
out through the PF closer to that NUMA.

> 
> How is this going to get represented in the recently merged Netlink
> queue API?

Can you share a link please?

All the logic is internal to the driver, so I expect it to be fine, but
I'd like to double check.
Jakub Kicinski Jan. 9, 2024, 3:08 a.m. UTC | #3
On Mon, 8 Jan 2024 14:30:54 +0200 Gal Pressman wrote:
> On 05/01/2024 0:50, Jakub Kicinski wrote:
> > On Wed, 20 Dec 2023 16:57:16 -0800 Saeed Mahameed wrote:  
> >> Example for 2 mdevs and 6 channels:
> >> +-------+---------+
> >> | ch ix | mdev ix |
> >> +-------+---------+
> >> |   0   |    0    |
> >> |   1   |    1    |
> >> |   2   |    0    |
> >> |   3   |    1    |
> >> |   4   |    0    |
> >> |   5   |    1    |
> >> +-------+---------+  
> > 
> > Meaning Rx queue 0 goes to PF 0, Rx queue 1 goes to PF 1, etc.?  
> 
> Correct.
> 
> > Is the user then expected to magic pixie dust the XPS or some such
> > to get to the right queue?  
> 
> I'm confused, how are RX queues related to XPS?

Separate sentence, perhaps I should be more verbose..

> XPS shouldn't be affected, we just make sure that whatever queue XPS
> chose will go out through the "right" PF.

But you said "correct" to queue 0 going to PF 0 and queue 1 to PF 1.
The queue IDs in my question refer to the queue mapping from the stack's
perspective. If user wants to send everything to queue 0 will it use
both PFs?

> So for example, XPS will choose a queue according to the CPU, and the
> driver will make sure that packets transmitted from this SQ are going
> out through the PF closer to that NUMA.

Sounds like queue 0 is duplicated in both PFs, then?

> > How is this going to get represented in the recently merged Netlink
> > queue API?  
> 
> Can you share a link please?

commit a90d56049acc45802f67cd7d4c058ac45b1bc26f
 
> All the logic is internal to the driver, so I expect it to be fine, but
> I'd like to double check.

Herm, "internal to the driver" is a bit of a landmine. It will be fine
for iperf testing but real users will want to configure the NIC.
Gal Pressman Jan. 9, 2024, 2:15 p.m. UTC | #4
On 09/01/2024 5:08, Jakub Kicinski wrote:
> On Mon, 8 Jan 2024 14:30:54 +0200 Gal Pressman wrote:
>> On 05/01/2024 0:50, Jakub Kicinski wrote:
>>> On Wed, 20 Dec 2023 16:57:16 -0800 Saeed Mahameed wrote:  
>>>> Example for 2 mdevs and 6 channels:
>>>> +-------+---------+
>>>> | ch ix | mdev ix |
>>>> +-------+---------+
>>>> |   0   |    0    |
>>>> |   1   |    1    |
>>>> |   2   |    0    |
>>>> |   3   |    1    |
>>>> |   4   |    0    |
>>>> |   5   |    1    |
>>>> +-------+---------+  
>>>
>>> Meaning Rx queue 0 goes to PF 0, Rx queue 1 goes to PF 1, etc.?  
>>
>> Correct.
>>
>>> Is the user then expected to magic pixie dust the XPS or some such
>>> to get to the right queue?  
>>
>> I'm confused, how are RX queues related to XPS?
> 
> Separate sentence, perhaps I should be more verbose..

Sorry, yes, your understanding is correct.
If a packet is received on RQ 0 then it is from PF 0, RQ 1 came from PF
1, etc. Though this is all from the same wire/port.

You can enable arfs for example, which will make sure that packets that
are destined to a certain CPU will be received by the PF that is closer
to it.

>> XPS shouldn't be affected, we just make sure that whatever queue XPS
>> chose will go out through the "right" PF.
> 
> But you said "correct" to queue 0 going to PF 0 and queue 1 to PF 1.
> The queue IDs in my question refer to the queue mapping form the stacks
> perspective. If user wants to send everything to queue 0 will it use
> both PFs?

If all traffic is transmitted through queue 0, it will go out from PF 0
(the PF that is closer to CPU 0 numa).

>> So for example, XPS will choose a queue according to the CPU, and the
>> driver will make sure that packets transmitted from this SQ are going
>> out through the PF closer to that NUMA.
> 
> Sounds like queue 0 is duplicated in both PFs, then?

Depends on how you look at it, each PF has X queues, the netdev has 2X
queues.

>>> How is this going to get represented in the recently merged Netlink
>>> queue API?  
>>
>> Can you share a link please?
> 
> commit a90d56049acc45802f67cd7d4c058ac45b1bc26f

Thanks, will take a look.

>> All the logic is internal to the driver, so I expect it to be fine, but
>> I'd like to double check.
> 
> Herm, "internal to the driver" is a bit of a landmine. It will be fine
> for iperf testing but real users will want to configure the NIC.

What kind of configuration are you thinking of?
Jakub Kicinski Jan. 9, 2024, 4 p.m. UTC | #5
On Tue, 9 Jan 2024 16:15:50 +0200 Gal Pressman wrote:
> >> I'm confused, how are RX queues related to XPS?  
> > 
> > Separate sentence, perhaps I should be more verbose..  
> 
> Sorry, yes, your understanding is correct.
> If a packet is received on RQ 0 then it is from PF 0, RQ 1 came from PF
> 1, etc. Though this is all from the same wire/port.
> 
> You can enable arfs for example, which will make sure that packets that
> are destined to a certain CPU will be received by the PF that is closer
> to it.

Got it.

> >> XPS shouldn't be affected, we just make sure that whatever queue XPS
> >> chose will go out through the "right" PF.  
> > 
> > But you said "correct" to queue 0 going to PF 0 and queue 1 to PF 1.
> > The queue IDs in my question refer to the queue mapping form the stacks
> > perspective. If user wants to send everything to queue 0 will it use
> > both PFs?  
> 
> If all traffic is transmitted through queue 0, it will go out from PF 0
> (the PF that is closer to CPU 0 numa).

Okay, but earlier you said: "whatever queue XPS chose will go out
through the "right" PF." - which I read as PF will be chosen based
on CPU locality regardless of XPS logic.

If queue 0 => PF 0, then user has to set up XPS to make CPUs from NUMA
node which has PF 0 use even number queues, and PF 1 to use odd number
queues. Correct?

> >> So for example, XPS will choose a queue according to the CPU, and the
> >> driver will make sure that packets transmitted from this SQ are going
> >> out through the PF closer to that NUMA.  
> > 
> > Sounds like queue 0 is duplicated in both PFs, then?  
> 
> Depends on how you look at it, each PF has X queues, the netdev has 2X
> queues.

I'm asking how it looks from the user perspective, to be clear.
From above I gather that the answer is no - queue 0 maps directly 
to PF 0 / queue 0, nothing on PF 1 will ever see traffic of queue 0.

> >> Can you share a link please?  
> > 
> > commit a90d56049acc45802f67cd7d4c058ac45b1bc26f  
> 
> Thanks, will take a look.
> 
> >> All the logic is internal to the driver, so I expect it to be fine, but
> >> I'd like to double check.
> > 
> > Herm, "internal to the driver" is a bit of a landmine. It will be fine
> > for iperf testing but real users will want to configure the NIC.
> 
> What kind of configuration are you thinking of?

Well, I was hoping you'd do the legwork and show how user configuration
logic has to be augmented for all relevant stack features to work with
multi-PF devices. I can list the APIs that come to mind while writing
this email, but that won't be exhaustive :(
Gal Pressman Jan. 10, 2024, 2:09 p.m. UTC | #6
On 09/01/2024 18:00, Jakub Kicinski wrote:
> On Tue, 9 Jan 2024 16:15:50 +0200 Gal Pressman wrote:
>>>> I'm confused, how are RX queues related to XPS?  
>>>
>>> Separate sentence, perhaps I should be more verbose..  
>>
>> Sorry, yes, your understanding is correct.
>> If a packet is received on RQ 0 then it is from PF 0, RQ 1 came from PF
>> 1, etc. Though this is all from the same wire/port.
>>
>> You can enable arfs for example, which will make sure that packets that
>> are destined to a certain CPU will be received by the PF that is closer
>> to it.
> 
> Got it.
> 
>>>> XPS shouldn't be affected, we just make sure that whatever queue XPS
>>>> chose will go out through the "right" PF.  
>>>
>>> But you said "correct" to queue 0 going to PF 0 and queue 1 to PF 1.
>>> The queue IDs in my question refer to the queue mapping form the stacks
>>> perspective. If user wants to send everything to queue 0 will it use
>>> both PFs?  
>>
>> If all traffic is transmitted through queue 0, it will go out from PF 0
>> (the PF that is closer to CPU 0 numa).
> 
> Okay, but earlier you said: "whatever queue XPS chose will go out
> through the "right" PF." - which I read as PF will be chosen based
> on CPU locality regardless of XPS logic.
> 
> If queue 0 => PF 0, then user has to set up XPS to make CPUs from NUMA
> node which has PF 0 use even number queues, and PF 1 to use odd number
> queues. Correct?

I think it is based on the default xps configuration, but I don't want
to get the details wrong, checking with Tariq and will reply (he's OOO).

>>>> So for example, XPS will choose a queue according to the CPU, and the
>>>> driver will make sure that packets transmitted from this SQ are going
>>>> out through the PF closer to that NUMA.  
>>>
>>> Sounds like queue 0 is duplicated in both PFs, then?  
>>
>> Depends on how you look at it, each PF has X queues, the netdev has 2X
>> queues.
> 
> I'm asking how it looks from the user perspective, to be clear.

From the user's perspective there is a single netdev, the PFs separation
is internal to the driver and transparent to the user.
The user configures the number of queues, and the driver splits them
between the PFs.

Same for other features, the user configures the netdev like any other
netdev, it is up to the driver to make sure that the netdev model is
working.

> From above I gather than the answer is no - queue 0 maps directly 
> to PF 0 / queue 0, nothing on PF 1 will ever see traffic of queue 0.

Right, traffic received on RQ 0 is traffic that was processed by PF 0.
RQ 1 is in fact (PF 1, RQ 0).

>>>> Can you share a link please?  
>>>
>>> commit a90d56049acc45802f67cd7d4c058ac45b1bc26f  
>>
>> Thanks, will take a look.
>>
>>>> All the logic is internal to the driver, so I expect it to be fine, but
>>>> I'd like to double check.
>>>
>>> Herm, "internal to the driver" is a bit of a landmine. It will be fine
>>> for iperf testing but real users will want to configure the NIC.
>>
>> What kind of configuration are you thinking of?
> 
> Well, I was hoping you'd do the legwork and show how user configuration
> logic has to be augmented for all relevant stack features to work with
> multi-PF devices. I can list the APIs that come to mind while writing
> this email, but that won't be exhaustive :(

We have been working on this feature for a long time, we did think of
the different configurations and potential issues, and backed that up
with our testing.

TLS for example is explicitly blocked in this series for such netdevices
as we identified it as problematic.

There is always potential that we missed things, that's why I was
genuinely curious to hear if you had anything specific in mind.
Tariq Toukan Jan. 25, 2024, 8:01 a.m. UTC | #7
On 10/01/2024 16:09, Gal Pressman wrote:
> On 09/01/2024 18:00, Jakub Kicinski wrote:
>> On Tue, 9 Jan 2024 16:15:50 +0200 Gal Pressman wrote:
>>>>> I'm confused, how are RX queues related to XPS?
>>>>
>>>> Separate sentence, perhaps I should be more verbose..
>>>
>>> Sorry, yes, your understanding is correct.
>>> If a packet is received on RQ 0 then it is from PF 0, RQ 1 came from PF
>>> 1, etc. Though this is all from the same wire/port.
>>>
>>> You can enable arfs for example, which will make sure that packets that
>>> are destined to a certain CPU will be received by the PF that is closer
>>> to it.
>>
>> Got it.
>>
>>>>> XPS shouldn't be affected, we just make sure that whatever queue XPS
>>>>> chose will go out through the "right" PF.
>>>>
>>>> But you said "correct" to queue 0 going to PF 0 and queue 1 to PF 1.
>>>> The queue IDs in my question refer to the queue mapping form the stacks
>>>> perspective. If user wants to send everything to queue 0 will it use
>>>> both PFs?
>>>
>>> If all traffic is transmitted through queue 0, it will go out from PF 0
>>> (the PF that is closer to CPU 0 numa).
>>

Hi,
I'm back from a long vacation. Catching up on emails...

>> Okay, but earlier you said: "whatever queue XPS chose will go out
>> through the "right" PF." - which I read as PF will be chosen based
>> on CPU locality regardless of XPS logic.
>>
>> If queue 0 => PF 0, then user has to set up XPS to make CPUs from NUMA
>> node which has PF 0 use even number queues, and PF 1 to use odd number
>> queues. Correct?

Exactly. That's the desired configuration.
Our driver has the logic to set it up by default.

Here's the default XPS on my setup:

NUMA:
   NUMA node(s):          2
   NUMA node0 CPU(s):     0-11
   NUMA node1 CPU(s):     12-23

PF0 on node0, PF1 on node1.

/sys/class/net/eth2/queues/tx-0/xps_cpus:000001
/sys/class/net/eth2/queues/tx-1/xps_cpus:001000
/sys/class/net/eth2/queues/tx-2/xps_cpus:000002
/sys/class/net/eth2/queues/tx-3/xps_cpus:002000
/sys/class/net/eth2/queues/tx-4/xps_cpus:000004
/sys/class/net/eth2/queues/tx-5/xps_cpus:004000
/sys/class/net/eth2/queues/tx-6/xps_cpus:000008
/sys/class/net/eth2/queues/tx-7/xps_cpus:008000
/sys/class/net/eth2/queues/tx-8/xps_cpus:000010
/sys/class/net/eth2/queues/tx-9/xps_cpus:010000
/sys/class/net/eth2/queues/tx-10/xps_cpus:000020
/sys/class/net/eth2/queues/tx-11/xps_cpus:020000
/sys/class/net/eth2/queues/tx-12/xps_cpus:000040
/sys/class/net/eth2/queues/tx-13/xps_cpus:040000
/sys/class/net/eth2/queues/tx-14/xps_cpus:000080
/sys/class/net/eth2/queues/tx-15/xps_cpus:080000
/sys/class/net/eth2/queues/tx-16/xps_cpus:000100
/sys/class/net/eth2/queues/tx-17/xps_cpus:100000
/sys/class/net/eth2/queues/tx-18/xps_cpus:000200
/sys/class/net/eth2/queues/tx-19/xps_cpus:200000
/sys/class/net/eth2/queues/tx-20/xps_cpus:000400
/sys/class/net/eth2/queues/tx-21/xps_cpus:400000
/sys/class/net/eth2/queues/tx-22/xps_cpus:000800
/sys/class/net/eth2/queues/tx-23/xps_cpus:800000

> 
> I think it is based on the default xps configuration, but I don't want
> to get the details wrong, checking with Tariq and will reply (he's OOO).
>
Jakub Kicinski Jan. 26, 2024, 2:40 a.m. UTC | #8
On Thu, 25 Jan 2024 10:01:05 +0200 Tariq Toukan wrote:
> Exactly. That's the desired configuration.
> Our driver has the logic to set it in default.
> 
> Here's the default XPS on my setup:
> 
> NUMA:
>    NUMA node(s):          2
>    NUMA node0 CPU(s):     0-11
>    NUMA node1 CPU(s):     12-23
> 
> PF0 on node0, PF1 on node1.

Okay, good that you took care of the defaults, but having a queue per
CPU thread is quite inefficient. Most sensible users will reconfigure
your NICs and remap IRQs and XPS. Which is fine, but we need to give
them the necessary info to do this right - documentation and preferably
the PCIe dev mapping in the new netlink queue API.

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 6c143088e247..f6e78c465c7a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -792,6 +792,7 @@  struct mlx5e_channel {
 	struct hwtstamp_config    *tstamp;
 	DECLARE_BITMAP(state, MLX5E_CHANNEL_NUM_STATES);
 	int                        ix;
+	int                        vec_ix;
 	int                        cpu;
 	/* Sync between icosq recovery and XSK enable/disable. */
 	struct mutex               icosq_recovery_lock;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index 284253b79266..18f0cedc8610 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -674,7 +674,7 @@  void mlx5e_build_create_cq_param(struct mlx5e_create_cq_param *ccp, struct mlx5e
 		.napi = &c->napi,
 		.ch_stats = c->stats,
 		.node = cpu_to_node(c->cpu),
-		.ix = c->ix,
+		.ix = c->vec_ix,
 	};
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
index 34adf8c3f81a..e87e26f2c669 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/qos.c
@@ -122,8 +122,8 @@  int mlx5e_open_qos_sq(struct mlx5e_priv *priv, struct mlx5e_channels *chs,
 
 	memset(&param_sq, 0, sizeof(param_sq));
 	memset(&param_cq, 0, sizeof(param_cq));
-	mlx5e_build_sq_param(priv->mdev, params, &param_sq);
-	mlx5e_build_tx_cq_param(priv->mdev, params, &param_cq);
+	mlx5e_build_sq_param(c->mdev, params, &param_sq);
+	mlx5e_build_tx_cq_param(c->mdev, params, &param_cq);
 	err = mlx5e_open_cq(c->mdev, params->tx_cq_moderation, &param_cq, &ccp, &sq->cq);
 	if (err)
 		goto err_free_sq;
@@ -176,7 +176,7 @@  int mlx5e_activate_qos_sq(void *data, u16 node_qid, u32 hw_id)
 	 */
 	smp_wmb();
 
-	qos_dbg(priv->mdev, "Activate QoS SQ qid %u\n", node_qid);
+	qos_dbg(sq->mdev, "Activate QoS SQ qid %u\n", node_qid);
 	mlx5e_activate_txqsq(sq);
 
 	return 0;
@@ -190,7 +190,7 @@  void mlx5e_deactivate_qos_sq(struct mlx5e_priv *priv, u16 qid)
 	if (!sq) /* Handle the case when the SQ failed to open. */
 		return;
 
-	qos_dbg(priv->mdev, "Deactivate QoS SQ qid %u\n", qid);
+	qos_dbg(sq->mdev, "Deactivate QoS SQ qid %u\n", qid);
 	mlx5e_deactivate_txqsq(sq);
 
 	priv->txq2sq[mlx5e_qid_from_qos(&priv->channels, qid)] = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
index 4358798d6ce1..25d751eba99b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c
@@ -294,8 +294,8 @@  static void mlx5e_rx_reporter_diagnose_generic_rq(struct mlx5e_rq *rq,
 
 	params = &priv->channels.params;
 	rq_sz = mlx5e_rqwq_get_size(rq);
-	real_time =  mlx5_is_real_time_rq(priv->mdev);
-	rq_stride = BIT(mlx5e_mpwqe_get_log_stride_size(priv->mdev, params, NULL));
+	real_time =  mlx5_is_real_time_rq(rq->mdev);
+	rq_stride = BIT(mlx5e_mpwqe_get_log_stride_size(rq->mdev, params, NULL));
 
 	mlx5e_health_fmsg_named_obj_nest_start(fmsg, "RQ");
 	devlink_fmsg_u8_pair_put(fmsg, "type", params->rq_wq_type);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
index 6b44ddce14e9..0ab9db319530 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_tx.c
@@ -219,7 +219,6 @@  mlx5e_tx_reporter_build_diagnose_output_sq_common(struct devlink_fmsg *fmsg,
 						  struct mlx5e_txqsq *sq, int tc)
 {
 	bool stopped = netif_xmit_stopped(sq->txq);
-	struct mlx5e_priv *priv = sq->priv;
 	u8 state;
 	int err;
 
@@ -227,7 +226,7 @@  mlx5e_tx_reporter_build_diagnose_output_sq_common(struct devlink_fmsg *fmsg,
 	devlink_fmsg_u32_pair_put(fmsg, "txq ix", sq->txq_ix);
 	devlink_fmsg_u32_pair_put(fmsg, "sqn", sq->sqn);
 
-	err = mlx5_core_query_sq_state(priv->mdev, sq->sqn, &state);
+	err = mlx5_core_query_sq_state(sq->mdev, sq->sqn, &state);
 	if (!err)
 		devlink_fmsg_u8_pair_put(fmsg, "HW state", state);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
index ebada0c5af3c..db776e515b6a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/pool.c
@@ -6,10 +6,10 @@ 
 #include "setup.h"
 #include "en/params.h"
 
-static int mlx5e_xsk_map_pool(struct mlx5e_priv *priv,
+static int mlx5e_xsk_map_pool(struct mlx5_core_dev *mdev,
 			      struct xsk_buff_pool *pool)
 {
-	struct device *dev = mlx5_core_dma_dev(priv->mdev);
+	struct device *dev = mlx5_core_dma_dev(mdev);
 
 	return xsk_pool_dma_map(pool, dev, DMA_ATTR_SKIP_CPU_SYNC);
 }
@@ -89,7 +89,7 @@  static int mlx5e_xsk_enable_locked(struct mlx5e_priv *priv,
 	if (unlikely(!mlx5e_xsk_is_pool_sane(pool)))
 		return -EINVAL;
 
-	err = mlx5e_xsk_map_pool(priv, pool);
+	err = mlx5e_xsk_map_pool(mlx5_sd_ch_ix_get_dev(priv->mdev, ix), pool);
 	if (unlikely(err))
 		return err;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
index 9b597cb24598..65ccb33edafb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_rx.c
@@ -267,7 +267,7 @@  resync_post_get_progress_params(struct mlx5e_icosq *sq,
 		goto err_out;
 	}
 
-	pdev = mlx5_core_dma_dev(sq->channel->priv->mdev);
+	pdev = mlx5_core_dma_dev(sq->channel->mdev);
 	buf->dma_addr = dma_map_single(pdev, &buf->progress,
 				       PROGRESS_PARAMS_PADDED_SIZE, DMA_FROM_DEVICE);
 	if (unlikely(dma_mapping_error(pdev, buf->dma_addr))) {
@@ -425,14 +425,12 @@  void mlx5e_ktls_handle_get_psv_completion(struct mlx5e_icosq_wqe_info *wi,
 {
 	struct mlx5e_ktls_rx_resync_buf *buf = wi->tls_get_params.buf;
 	struct mlx5e_ktls_offload_context_rx *priv_rx;
-	struct mlx5e_ktls_rx_resync_ctx *resync;
 	u8 tracker_state, auth_state, *ctx;
 	struct device *dev;
 	u32 hw_seq;
 
 	priv_rx = buf->priv_rx;
-	resync = &priv_rx->resync;
-	dev = mlx5_core_dma_dev(resync->priv->mdev);
+	dev = mlx5_core_dma_dev(sq->channel->mdev);
 	if (unlikely(test_bit(MLX5E_PRIV_RX_FLAG_DELETING, priv_rx->flags)))
 		goto out;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 90a02fd3357a..8dac57282f1c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2527,14 +2527,20 @@  static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 			      struct xsk_buff_pool *xsk_pool,
 			      struct mlx5e_channel **cp)
 {
-	int cpu = mlx5_comp_vector_get_cpu(priv->mdev, ix);
 	struct net_device *netdev = priv->netdev;
+	struct mlx5_core_dev *mdev;
 	struct mlx5e_xsk_param xsk;
 	struct mlx5e_channel *c;
 	unsigned int irq;
+	int vec_ix;
+	int cpu;
 	int err;
 
-	err = mlx5_comp_irqn_get(priv->mdev, ix, &irq);
+	mdev = mlx5_sd_ch_ix_get_dev(priv->mdev, ix);
+	vec_ix = mlx5_sd_ch_ix_get_vec_ix(mdev, ix);
+	cpu = mlx5_comp_vector_get_cpu(mdev, vec_ix);
+
+	err = mlx5_comp_irqn_get(mdev, vec_ix, &irq);
 	if (err)
 		return err;
 
@@ -2547,18 +2553,19 @@  static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 		return -ENOMEM;
 
 	c->priv     = priv;
-	c->mdev     = priv->mdev;
+	c->mdev     = mdev;
 	c->tstamp   = &priv->tstamp;
 	c->ix       = ix;
+	c->vec_ix   = vec_ix;
 	c->cpu      = cpu;
-	c->pdev     = mlx5_core_dma_dev(priv->mdev);
+	c->pdev     = mlx5_core_dma_dev(mdev);
 	c->netdev   = priv->netdev;
-	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.hw_objs.mkey);
+	c->mkey_be  = cpu_to_be32(mdev->mlx5e_res.hw_objs.mkey);
 	c->num_tc   = mlx5e_get_dcb_num_tc(params);
 	c->xdp      = !!params->xdp_prog;
 	c->stats    = &priv->channel_stats[ix]->ch;
 	c->aff_mask = irq_get_effective_affinity_mask(irq);
-	c->lag_port = mlx5e_enumerate_lag_port(priv->mdev, ix);
+	c->lag_port = mlx5e_enumerate_lag_port(mdev, ix);
 
 	netif_napi_add(netdev, &c->napi, mlx5e_napi_poll);
 
@@ -2936,15 +2943,18 @@  static MLX5E_DEFINE_PREACTIVATE_WRAPPER_CTX(mlx5e_update_netdev_queues);
 static void mlx5e_set_default_xps_cpumasks(struct mlx5e_priv *priv,
 					   struct mlx5e_params *params)
 {
-	struct mlx5_core_dev *mdev = priv->mdev;
-	int num_comp_vectors, ix, irq;
-
-	num_comp_vectors = mlx5_comp_vectors_max(mdev);
+	int ix;
 
 	for (ix = 0; ix < params->num_channels; ix++) {
+		int num_comp_vectors, irq, vec_ix;
+		struct mlx5_core_dev *mdev;
+
+		mdev = mlx5_sd_ch_ix_get_dev(priv->mdev, ix);
+		num_comp_vectors = mlx5_comp_vectors_max(mdev);
 		cpumask_clear(priv->scratchpad.cpumask);
+		vec_ix = mlx5_sd_ch_ix_get_vec_ix(mdev, ix);
 
-		for (irq = ix; irq < num_comp_vectors; irq += params->num_channels) {
+		for (irq = vec_ix; irq < num_comp_vectors; irq += params->num_channels) {
 			int cpu = mlx5_comp_vector_get_cpu(mdev, irq);
 
 			cpumask_set_cpu(cpu, priv->scratchpad.cpumask);