[v6,0/3] sched, net: NUMA-aware CPU spreading interface

Message ID	20221028164959.1367250-1-vschneid@redhat.com (mailing list archive)
Series	sched, net: NUMA-aware CPU spreading interface
	[v6,0/3] sched, net: NUMA-aware CPU spreading interface
	[v6,1/3] sched/topology: Introduce sched_numa_hop_mask()
	[v6,2/3] sched/topology: Introduce for_each_numa_hop_mask()
	[v6,3/3] net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity hints

Headers

From: Valentin Schneider <vschneid@redhat.com>
To: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
        linux-kernel@vger.kernel.org
Cc: Saeed Mahameed <saeedm@nvidia.com>,
        Leon Romanovsky <leon@kernel.org>,
        "David S. Miller" <davem@davemloft.net>,
        Eric Dumazet <edumazet@google.com>,
        Jakub Kicinski <kuba@kernel.org>,
        Paolo Abeni <pabeni@redhat.com>,
        Yury Norov <yury.norov@gmail.com>,
        Andy Shevchenko <andriy.shevchenko@linux.intel.com>,
        Rasmus Villemoes <linux@rasmusvillemoes.dk>,
        Ingo Molnar <mingo@kernel.org>,
        Peter Zijlstra <peterz@infradead.org>,
        Vincent Guittot <vincent.guittot@linaro.org>,
        Dietmar Eggemann <dietmar.eggemann@arm.com>,
        Steven Rostedt <rostedt@goodmis.org>,
        Mel Gorman <mgorman@suse.de>,
        Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
        Heiko Carstens <hca@linux.ibm.com>,
        Tony Luck <tony.luck@intel.com>,
        Jonathan Cameron <Jonathan.Cameron@huawei.com>,
        Gal Pressman <gal@nvidia.com>,
        Tariq Toukan <tariqt@nvidia.com>,
        Jesse Brandeburg <jesse.brandeburg@intel.com>
Subject: [PATCH v6 0/3] sched, net: NUMA-aware CPU spreading interface
Date: Fri, 28 Oct 2022 17:49:56 +0100
Message-Id: <20221028164959.1367250-1-vschneid@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Message

Valentin Schneider Oct. 28, 2022, 4:49 p.m. UTC

Hi folks,

Tariq pointed out in [1] that drivers allocating IRQ vectors would benefit
from having smarter NUMA-awareness (cpumask_local_spread() doesn't quite cut
it).

The proposed interface involved an array of CPUs and a temporary cpumask, and
being my difficult self what I'm proposing here is an interface that doesn't
require any temporary storage other than some stack variables (at the cost of
one wild macro).

[1]: https://lore.kernel.org/all/20220728191203.4055-1-tariqt@nvidia.com/
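
For illustration only, here is a userspace toy model of the iteration pattern
described above. It is not the code from the patches. The names
toy_numa_hop_mask() and toy_spread_cpus(), the three-node topology, and its hop
distances are all made up for this sketch; only sched_numa_hop_mask(),
for_each_numa_hop_mask() and for_each_cpu_andnot() are names taken from this
series. The sketch assumes that each step yields the mask of CPUs within a
given number of hops of a node, and that the caller masks out the previous
step's CPUs to visit only the newly reachable ones, so that a few stack
variables are the only state required.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy topology (an assumption for illustration): 3 NUMA nodes, 4 CPUs each. */
#define TOY_NR_NODES 3
#define TOY_NR_CPUS  12

/* Hypothetical hop distances: the nodes sit in a line, 0 - 1 - 2. */
static const unsigned int toy_hops[TOY_NR_NODES][TOY_NR_NODES] = {
	{ 0, 1, 2 },
	{ 1, 0, 1 },
	{ 2, 1, 0 },
};

/* CPUs belonging to each node, one bit per CPU. */
static const uint32_t toy_node_cpus[TOY_NR_NODES] = { 0x00f, 0x0f0, 0xf00 };

/*
 * Toy stand-in for sched_numa_hop_mask(): the CPUs at most @hops away from
 * @node. Returns 0 once @hops exceeds the largest distance in the topology,
 * which is what terminates the iteration below.
 */
static uint32_t toy_numa_hop_mask(unsigned int node, unsigned int hops)
{
	uint32_t mask = 0;
	unsigned int max_hops = 0;

	assert(node < TOY_NR_NODES);

	for (unsigned int n = 0; n < TOY_NR_NODES; n++) {
		if (toy_hops[node][n] <= hops)
			mask |= toy_node_cpus[n];
		if (toy_hops[node][n] > max_hops)
			max_hops = toy_hops[node][n];
	}

	return hops > max_hops ? 0 : mask;
}

/*
 * Fill @cpus with up to @nr CPU numbers, nearest to @node first, and return
 * how many were written. The only state is a handful of stack variables:
 * the current mask, the previous mask, and two counters.
 */
static size_t toy_spread_cpus(unsigned int node, unsigned int *cpus, size_t nr)
{
	uint32_t prev = 0, mask;
	size_t count = 0;

	for (unsigned int hops = 0;
	     count < nr && (mask = toy_numa_hop_mask(node, hops)) != 0;
	     hops++) {
		/* CPUs first reachable at this hop count: mask AND NOT prev. */
		uint32_t fresh = mask & ~prev;

		for (unsigned int cpu = 0; cpu < TOY_NR_CPUS && count < nr; cpu++)
			if (fresh & (UINT32_C(1) << cpu))
				cpus[count++] = cpu;

		prev = mask;
	}

	return count;
}
```

With this toy topology, asking for six CPUs near node 1 first yields the four
CPUs of node 1 and then two CPUs from the nodes one hop away. In the kernel the
masks would be struct cpumask pointers rather than integers; the integer
bitmask here is only a simplification for the sketch.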

Revisions
=========

v5 -> v6
++++++++

o Simplified iterator macro (Andy)
o Cleaned up sched_numa_hop_mask (Andy, Yury)
o Applied Yury's RB tags 

v4 -> v5
++++++++

o Rebased onto 6.1-rc1
o Ditched the CPU iterator, moved to a cpumask iterator (Yury)

v3 -> v4
++++++++

o Rebased on top of Yury's bitmap-for-next
o Added Tariq's mlx5e patch
o Made sched_numa_hop_mask() return cpu_online_mask for the NUMA_NO_NODE &&
  hops=0 case

v2 -> v3
++++++++

o Added for_each_cpu_and() and for_each_cpu_andnot() tests (Yury)
o New patches to fix issues raised by running the above
o New patch to use for_each_cpu_andnot() in sched/core.c (Yury)

v1 -> v2
++++++++

o Split _find_next_bit() @invert into @invert1 and @invert2 (Yury)
o Rebased onto v6.0-rc1

Cheers,
Valentin

Tariq Toukan (1):
  net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity
    hints

Valentin Schneider (2):
  sched/topology: Introduce sched_numa_hop_mask()
  sched/topology: Introduce for_each_numa_hop_mask()

 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 18 +++++++++--
 include/linux/topology.h                     | 27 +++++++++++++++++
 kernel/sched/topology.c                      | 32 ++++++++++++++++++++
 3 files changed, 75 insertions(+), 2 deletions(-)

--
2.31.1

Comments

Jakub Kicinski Nov. 3, 2022, 2:56 a.m. UTC | #1

On Fri, 28 Oct 2022 17:49:56 +0100 Valentin Schneider wrote:
> Tariq pointed out in [1] that drivers allocating IRQ vectors would benefit
> from having smarter NUMA-awareness (cpumask_local_spread() doesn't quite cut
> it).
> 
> The proposed interface involved an array of CPUs and a temporary cpumask, and
> being my difficult self what I'm proposing here is an interface that doesn't
> require any temporary storage other than some stack variables (at the cost of
> one wild macro).
> 
> [1]: https://lore.kernel.org/all/20220728191203.4055-1-tariqt@nvidia.com/

Not sure who's expected to take these, no preference here so:

Acked-by: Jakub Kicinski <kuba@kernel.org>

Thanks for ironing it out!

Tariq Toukan Nov. 8, 2022, 11:25 a.m. UTC | #2

On 11/3/2022 4:56 AM, Jakub Kicinski wrote:
> On Fri, 28 Oct 2022 17:49:56 +0100 Valentin Schneider wrote:
>> Tariq pointed out in [1] that drivers allocating IRQ vectors would benefit
>> from having smarter NUMA-awareness (cpumask_local_spread() doesn't quite cut
>> it).
>>
>> The proposed interface involved an array of CPUs and a temporary cpumask, and
>> being my difficult self what I'm proposing here is an interface that doesn't
>> require any temporary storage other than some stack variables (at the cost of
>> one wild macro).
>>
>> [1]: https://lore.kernel.org/all/20220728191203.4055-1-tariqt@nvidia.com/
> 
> Not sure who's expected to take these, no preference here so:
> 
> Acked-by: Jakub Kicinski <kuba@kernel.org>
> 
> Thanks for ironing it out!

Thanks Jakub.

Valentin, what do you think?
Shouldn't it go through the sched branch?

Valentin Schneider Nov. 8, 2022, 12:07 p.m. UTC | #3

On 08/11/22 13:25, Tariq Toukan wrote:
> On 11/3/2022 4:56 AM, Jakub Kicinski wrote:
>> On Fri, 28 Oct 2022 17:49:56 +0100 Valentin Schneider wrote:
>>> Tariq pointed out in [1] that drivers allocating IRQ vectors would benefit
>>> from having smarter NUMA-awareness (cpumask_local_spread() doesn't quite cut
>>> it).
>>>
>>> The proposed interface involved an array of CPUs and a temporary cpumask, and
>>> being my difficult self what I'm proposing here is an interface that doesn't
>>> require any temporary storage other than some stack variables (at the cost of
>>> one wild macro).
>>>
>>> [1]: https://lore.kernel.org/all/20220728191203.4055-1-tariqt@nvidia.com/
>>
>> Not sure who's expected to take these, no preference here so:
>>
>> Acked-by: Jakub Kicinski <kuba@kernel.org>
>>
>> Thanks for ironing it out!
>
> Thanks Jakub.
>
> Valentin, what do you think?
> Shouldn't it go through the sched branch?

So yeah, the topology bits should go through tip/sched/core, and given that
the mlx5e patch is the only user of the new interface, it should probably be
bundled with them.