Message ID | 20200923181126.223766-2-nitesh@redhat.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
Series | isolation: limit msix vectors based on housekeeping CPUs | expand |
On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: > Introduce a new API hk_num_online_cpus(), that can be used to > retrieve the number of online housekeeping CPUs that are meant to handle > managed IRQ jobs. > > This API is introduced for the drivers that were previously relying only > on num_online_cpus() to determine the number of MSIX vectors to create. > In an RT environment with large isolated but fewer housekeeping CPUs this > was leading to a situation where an attempt to move all of the vectors > corresponding to isolated CPUs to housekeeping CPUs were failing due to > per CPU vector limit. > > Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> > --- > include/linux/sched/isolation.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > index cc9f393e2a70..2e96b626e02e 100644 > --- a/include/linux/sched/isolation.h > +++ b/include/linux/sched/isolation.h > @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) > return true; > } > > +static inline unsigned int hk_num_online_cpus(void) This breaks with the established naming of that header.
On Thu, Sep 24, 2020 at 10:40:29AM +0200, peterz@infradead.org wrote: > On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: > > Introduce a new API hk_num_online_cpus(), that can be used to > > retrieve the number of online housekeeping CPUs that are meant to handle > > managed IRQ jobs. > > > > This API is introduced for the drivers that were previously relying only > > on num_online_cpus() to determine the number of MSIX vectors to create. > > In an RT environment with large isolated but fewer housekeeping CPUs this > > was leading to a situation where an attempt to move all of the vectors > > corresponding to isolated CPUs to housekeeping CPUs were failing due to > > per CPU vector limit. > > > > Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> > > --- > > include/linux/sched/isolation.h | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > > index cc9f393e2a70..2e96b626e02e 100644 > > --- a/include/linux/sched/isolation.h > > +++ b/include/linux/sched/isolation.h > > @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) > > return true; > > } > > > > +static inline unsigned int hk_num_online_cpus(void) > > This breaks with the established naming of that header. I guess we can make it housekeeping_num_online_cpus() ?
On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: > Introduce a new API hk_num_online_cpus(), that can be used to > retrieve the number of online housekeeping CPUs that are meant to handle > managed IRQ jobs. > > This API is introduced for the drivers that were previously relying only > on num_online_cpus() to determine the number of MSIX vectors to create. > In an RT environment with large isolated but fewer housekeeping CPUs this > was leading to a situation where an attempt to move all of the vectors > corresponding to isolated CPUs to housekeeping CPUs were failing due to > per CPU vector limit. > > Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> > --- > include/linux/sched/isolation.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > index cc9f393e2a70..2e96b626e02e 100644 > --- a/include/linux/sched/isolation.h > +++ b/include/linux/sched/isolation.h > @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) > return true; > } > > +static inline unsigned int hk_num_online_cpus(void) > +{ > +#ifdef CONFIG_CPU_ISOLATION > + const struct cpumask *hk_mask; > + > + if (static_branch_unlikely(&housekeeping_overridden)) { > + hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ); HK_FLAG_MANAGED_IRQ should be pass as an argument to the function: housekeeping_num_online_cpus(HK_FLAG_MANAGED_IRQ) because it's completely arbitrary otherwise. > + return cpumask_weight(hk_mask); > + } > +#endif > + return cpumask_weight(cpu_online_mask); > +} > + > #endif /* _LINUX_SCHED_ISOLATION_H */ > -- > 2.18.2 >
On 9/24/20 8:09 AM, Frederic Weisbecker wrote: > On Thu, Sep 24, 2020 at 10:40:29AM +0200, peterz@infradead.org wrote: >> On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: >>> Introduce a new API hk_num_online_cpus(), that can be used to >>> retrieve the number of online housekeeping CPUs that are meant to handle >>> managed IRQ jobs. >>> >>> This API is introduced for the drivers that were previously relying only >>> on num_online_cpus() to determine the number of MSIX vectors to create. >>> In an RT environment with large isolated but fewer housekeeping CPUs this >>> was leading to a situation where an attempt to move all of the vectors >>> corresponding to isolated CPUs to housekeeping CPUs were failing due to >>> per CPU vector limit. >>> >>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> >>> --- >>> include/linux/sched/isolation.h | 13 +++++++++++++ >>> 1 file changed, 13 insertions(+) >>> >>> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h >>> index cc9f393e2a70..2e96b626e02e 100644 >>> --- a/include/linux/sched/isolation.h >>> +++ b/include/linux/sched/isolation.h >>> @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) >>> return true; >>> } >>> >>> +static inline unsigned int hk_num_online_cpus(void) >> This breaks with the established naming of that header. > I guess we can make it housekeeping_num_online_cpus() ? Right, I can do that. Thanks >
On Thu, Sep 24, 2020 at 02:09:57PM +0200, Frederic Weisbecker wrote: > > > +static inline unsigned int hk_num_online_cpus(void) > > > > This breaks with the established naming of that header. > > I guess we can make it housekeeping_num_online_cpus() ? That would be consistent :-)
On 9/24/20 8:11 AM, Frederic Weisbecker wrote: > On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: >> Introduce a new API hk_num_online_cpus(), that can be used to >> retrieve the number of online housekeeping CPUs that are meant to handle >> managed IRQ jobs. >> >> This API is introduced for the drivers that were previously relying only >> on num_online_cpus() to determine the number of MSIX vectors to create. >> In an RT environment with large isolated but fewer housekeeping CPUs this >> was leading to a situation where an attempt to move all of the vectors >> corresponding to isolated CPUs to housekeeping CPUs were failing due to >> per CPU vector limit. >> >> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> >> --- >> include/linux/sched/isolation.h | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h >> index cc9f393e2a70..2e96b626e02e 100644 >> --- a/include/linux/sched/isolation.h >> +++ b/include/linux/sched/isolation.h >> @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) >> return true; >> } >> >> +static inline unsigned int hk_num_online_cpus(void) >> +{ >> +#ifdef CONFIG_CPU_ISOLATION >> + const struct cpumask *hk_mask; >> + >> + if (static_branch_unlikely(&housekeeping_overridden)) { >> + hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ); > HK_FLAG_MANAGED_IRQ should be pass as an argument to the function: > > housekeeping_num_online_cpus(HK_FLAG_MANAGED_IRQ) because it's > completely arbitrary otherwise. Yeap that is more sensible, I will do that. Do you have any other concerns/suggestions on any other patch? > >> + return cpumask_weight(hk_mask); >> + } >> +#endif >> + return cpumask_weight(cpu_online_mask); >> +} >> + >> #endif /* _LINUX_SCHED_ISOLATION_H */ >> -- >> 2.18.2 >>
FWIW, cross-posting to moderated lists is annoying. I don't know why we allow them in MAINTAINERS :-(
On 9/24/20 8:46 AM, Peter Zijlstra wrote: > > FWIW, cross-posting to moderated lists is annoying. I don't know why we > allow them in MAINTAINERS :-( Yeah, it sends out an acknowledgment for every email. I had to include it because sending the patches to it apparently allows them to get tested by Intel folks. >
On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: > Introduce a new API hk_num_online_cpus(), that can be used to > retrieve the number of online housekeeping CPUs that are meant to handle > managed IRQ jobs. > > This API is introduced for the drivers that were previously relying only > on num_online_cpus() to determine the number of MSIX vectors to create. > In an RT environment with large isolated but fewer housekeeping CPUs this > was leading to a situation where an attempt to move all of the vectors > corresponding to isolated CPUs to housekeeping CPUs were failing due to > per CPU vector limit. > > Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> > --- > include/linux/sched/isolation.h | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h > index cc9f393e2a70..2e96b626e02e 100644 > --- a/include/linux/sched/isolation.h > +++ b/include/linux/sched/isolation.h > @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) > return true; > } > > +static inline unsigned int hk_num_online_cpus(void) > +{ > +#ifdef CONFIG_CPU_ISOLATION > + const struct cpumask *hk_mask; > + > + if (static_branch_unlikely(&housekeeping_overridden)) { > + hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ); > + return cpumask_weight(hk_mask); > + } > +#endif > + return cpumask_weight(cpu_online_mask); Just curious: why is this not #ifdef CONFIG_CPU_ISOLATION ... #endif return num_online_cpus(); > +} > + > #endif /* _LINUX_SCHED_ISOLATION_H */ > -- > 2.18.2 >
On 9/24/20 4:47 PM, Bjorn Helgaas wrote: > On Wed, Sep 23, 2020 at 02:11:23PM -0400, Nitesh Narayan Lal wrote: >> Introduce a new API hk_num_online_cpus(), that can be used to >> retrieve the number of online housekeeping CPUs that are meant to handle >> managed IRQ jobs. >> >> This API is introduced for the drivers that were previously relying only >> on num_online_cpus() to determine the number of MSIX vectors to create. >> In an RT environment with large isolated but fewer housekeeping CPUs this >> was leading to a situation where an attempt to move all of the vectors >> corresponding to isolated CPUs to housekeeping CPUs were failing due to >> per CPU vector limit. >> >> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> >> --- >> include/linux/sched/isolation.h | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h >> index cc9f393e2a70..2e96b626e02e 100644 >> --- a/include/linux/sched/isolation.h >> +++ b/include/linux/sched/isolation.h >> @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) >> return true; >> } >> >> +static inline unsigned int hk_num_online_cpus(void) >> +{ >> +#ifdef CONFIG_CPU_ISOLATION >> + const struct cpumask *hk_mask; >> + >> + if (static_branch_unlikely(&housekeeping_overridden)) { >> + hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ); >> + return cpumask_weight(hk_mask); >> + } >> +#endif >> + return cpumask_weight(cpu_online_mask); > Just curious: why is this not > > #ifdef CONFIG_CPU_ISOLATION > ... > #endif > return num_online_cpus(); I think doing an atomic read is better than a bitmap operation. Thanks for pointing this out. > >> +} >> + >> #endif /* _LINUX_SCHED_ISOLATION_H */ >> -- >> 2.18.2 >>
diff --git a/include/linux/sched/isolation.h b/include/linux/sched/isolation.h index cc9f393e2a70..2e96b626e02e 100644 --- a/include/linux/sched/isolation.h +++ b/include/linux/sched/isolation.h @@ -57,4 +57,17 @@ static inline bool housekeeping_cpu(int cpu, enum hk_flags flags) return true; } +static inline unsigned int hk_num_online_cpus(void) +{ +#ifdef CONFIG_CPU_ISOLATION + const struct cpumask *hk_mask; + + if (static_branch_unlikely(&housekeeping_overridden)) { + hk_mask = housekeeping_cpumask(HK_FLAG_MANAGED_IRQ); + return cpumask_weight(hk_mask); + } +#endif + return cpumask_weight(cpu_online_mask); +} + #endif /* _LINUX_SCHED_ISOLATION_H */
Introduce a new API hk_num_online_cpus(), that can be used to retrieve the number of online housekeeping CPUs that are meant to handle managed IRQ jobs. This API is introduced for the drivers that were previously relying only on num_online_cpus() to determine the number of MSIX vectors to create. In an RT environment with large isolated but fewer housekeeping CPUs this was leading to a situation where an attempt to move all of the vectors corresponding to isolated CPUs to housekeeping CPUs were failing due to per CPU vector limit. Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com> --- include/linux/sched/isolation.h | 13 +++++++++++++ 1 file changed, 13 insertions(+)