| Message ID | 20231107215742.363031-41-ankur.a.arora@oracle.com (mailing list archive) |
|---|---|
| State | New |
| Series | Make the kernel preemptible |
On Tue, Nov 07, 2023 at 01:57:26PM -0800, Ankur Arora wrote:
> While making up its mind about whether to reschedule a target
> runqueue eagerly or lazily, resched_curr() needs to know if the
> target is executing in the kernel or in userspace.
>
> Add ct_state_cpu().
>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>
> ---
> Using context-tracking for this seems like overkill. Is there a better
> way to achieve this? One problem with depending on user_enter() is that
> it happens much too late for our purposes. From the scheduler's
> point-of-view the exit state has effectively transitioned once the
> task exits the exit_to_user_loop() so we will see stale state
> while the task is done with exit_to_user_loop() but has not yet
> executed user_enter().
>
> ---
>  include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
>  kernel/Kconfig.preempt                 |  1 +
>  2 files changed, 22 insertions(+)
>
> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
> index bbff5f7f8803..6a8f1c7ba105 100644
> --- a/include/linux/context_tracking_state.h
> +++ b/include/linux/context_tracking_state.h
> @@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
>  {
>  	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
>  }
> +
> +static __always_inline int __ct_state_cpu(int cpu)
> +{
> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
> +
> +	return atomic_read(&ct->state) & CT_STATE_MASK;
> +}
>  #endif
>
>  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
> @@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
>  	return ret;
>  }
>
> +static __always_inline int ct_state_cpu(int cpu)
> +{
> +	int ret;
> +
> +	if (!context_tracking_enabled_cpu(cpu))
> +		return CONTEXT_DISABLED;
> +
> +	preempt_disable();
> +	ret = __ct_state_cpu(cpu);
> +	preempt_enable();
> +
> +	return ret;
> +}

Those preempt_disable/enable are pointless.

But this patch is problematic, you do *NOT* want to rely on context
tracking. Context tracking adds atomics to the entry path, this is slow
and even with CONFIG_CONTEXT_TRACKING it is disabled until you configure
the NOHZ_FULL nonsense.

This simply cannot be.
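To see why the guards are pointless: preempt_disable() only pins the local CPU, while the state being sampled belongs to a remote CPU and can change the instant after the atomic read either way. A minimal sketch of the helper without the guards (simplified from the patch above, not actual kernel source):

```c
/*
 * Simplified sketch, not verbatim kernel source. The remote CPU's
 * context-tracking state is a single atomic_t, so one atomic_read()
 * is as stable a snapshot as we can ever get: a local
 * preempt_disable()/preempt_enable() pair around it cannot stop the
 * remote CPU from changing state right after the read, so the
 * result is equally stale with or without the guards.
 */
static __always_inline int ct_state_cpu(int cpu)
{
	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);

	if (!context_tracking_enabled_cpu(cpu))
		return CONTEXT_DISABLED;

	return atomic_read(&ct->state) & CT_STATE_MASK;
}
```

Either way the caller only ever gets a racy snapshot, which is tolerable for a reschedule heuristic but is also why the guards add no safety.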
Peter Zijlstra <peterz@infradead.org> writes:

> On Tue, Nov 07, 2023 at 01:57:26PM -0800, Ankur Arora wrote:
>> While making up its mind about whether to reschedule a target
>> runqueue eagerly or lazily, resched_curr() needs to know if the
>> target is executing in the kernel or in userspace.
>>
>> Add ct_state_cpu().
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>
>> ---
>> Using context-tracking for this seems like overkill. Is there a better
>> way to achieve this? One problem with depending on user_enter() is that
>> it happens much too late for our purposes. From the scheduler's
>> point-of-view the exit state has effectively transitioned once the
>> task exits the exit_to_user_loop() so we will see stale state
>> while the task is done with exit_to_user_loop() but has not yet
>> executed user_enter().
>>
>> ---
>>  include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
>>  kernel/Kconfig.preempt                 |  1 +
>>  2 files changed, 22 insertions(+)
>>
>> diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
>> index bbff5f7f8803..6a8f1c7ba105 100644
>> --- a/include/linux/context_tracking_state.h
>> +++ b/include/linux/context_tracking_state.h
>> @@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
>>  {
>>  	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
>>  }
>> +
>> +static __always_inline int __ct_state_cpu(int cpu)
>> +{
>> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>> +
>> +	return atomic_read(&ct->state) & CT_STATE_MASK;
>> +}
>>  #endif
>>
>>  #ifdef CONFIG_CONTEXT_TRACKING_IDLE
>> @@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
>>  	return ret;
>>  }
>>
>> +static __always_inline int ct_state_cpu(int cpu)
>> +{
>> +	int ret;
>> +
>> +	if (!context_tracking_enabled_cpu(cpu))
>> +		return CONTEXT_DISABLED;
>> +
>> +	preempt_disable();
>> +	ret = __ct_state_cpu(cpu);
>> +	preempt_enable();
>> +
>> +	return ret;
>> +}
>
> Those preempt_disable/enable are pointless.
>
> But this patch is problematic, you do *NOT* want to rely on context
> tracking. Context tracking adds atomics to the entry path, this is slow
> and even with CONFIG_CONTEXT_TRACKING it is disabled until you configure
> the NOHZ_FULL nonsense.

Yeah, I had missed the fact that even though ct->state is updated for
both ct->active and !ct->active, the static branch is only enabled with
NOHZ_FULL.

Will drop.

--
ankur
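The static-branch gating Ankur refers to works roughly as follows (a paraphrase of the context_tracking_enabled*() helpers, simplified rather than verbatim kernel source):

```c
/*
 * Rough paraphrase of the gating, not verbatim kernel source:
 * context_tracking_key is a static branch that is only enabled when
 * CPUs are placed in the nohz_full= set. So even though ct->state
 * itself may be kept up to date, on a typical build without
 * NOHZ_FULL every ct_state_cpu() caller bails out with
 * CONTEXT_DISABLED before ever reading ct->state.
 */
DECLARE_STATIC_KEY_FALSE(context_tracking_key);

static __always_inline bool context_tracking_enabled(void)
{
	return static_branch_unlikely(&context_tracking_key);
}

static __always_inline bool context_tracking_enabled_cpu(int cpu)
{
	return context_tracking_enabled() &&
	       per_cpu(context_tracking.active, cpu);
}
```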
```diff
diff --git a/include/linux/context_tracking_state.h b/include/linux/context_tracking_state.h
index bbff5f7f8803..6a8f1c7ba105 100644
--- a/include/linux/context_tracking_state.h
+++ b/include/linux/context_tracking_state.h
@@ -53,6 +53,13 @@ static __always_inline int __ct_state(void)
 {
 	return raw_atomic_read(this_cpu_ptr(&context_tracking.state)) & CT_STATE_MASK;
 }
+
+static __always_inline int __ct_state_cpu(int cpu)
+{
+	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
+
+	return atomic_read(&ct->state) & CT_STATE_MASK;
+}
 #endif
 
 #ifdef CONFIG_CONTEXT_TRACKING_IDLE
@@ -139,6 +146,20 @@ static __always_inline int ct_state(void)
 	return ret;
 }
 
+static __always_inline int ct_state_cpu(int cpu)
+{
+	int ret;
+
+	if (!context_tracking_enabled_cpu(cpu))
+		return CONTEXT_DISABLED;
+
+	preempt_disable();
+	ret = __ct_state_cpu(cpu);
+	preempt_enable();
+
+	return ret;
+}
+
 #else
 static __always_inline bool context_tracking_enabled(void) { return false; }
 static __always_inline bool context_tracking_enabled_cpu(int cpu) { return false; }
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 715e7aebb9d8..aa87b5cd3ecc 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -80,6 +80,7 @@ config PREEMPT_COUNT
 config PREEMPTION
 	bool
 	select PREEMPT_COUNT
+	select CONTEXT_TRACKING_USER
 
 config SCHED_CORE
 	bool "Core Scheduling for SMT"
```
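Note the Kconfig hunk: making PREEMPTION select CONTEXT_TRACKING_USER is exactly what draws Peter's objection above, since it would make every preemptible kernel pay the context-tracking atomics in the entry path rather than only NOHZ_FULL configurations.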
While making up its mind about whether to reschedule a target
runqueue eagerly or lazily, resched_curr() needs to know if the
target is executing in the kernel or in userspace.

Add ct_state_cpu().

Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>

---
Using context-tracking for this seems like overkill. Is there a better
way to achieve this? One problem with depending on user_enter() is that
it happens much too late for our purposes. From the scheduler's
point-of-view the exit state has effectively transitioned once the
task exits the exit_to_user_loop() so we will see stale state
while the task is done with exit_to_user_loop() but has not yet
executed user_enter().

---
 include/linux/context_tracking_state.h | 21 +++++++++++++++++++++
 kernel/Kconfig.preempt                 |  1 +
 2 files changed, 22 insertions(+)
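For illustration, the consumer the commit message describes might have looked something like the following; this is a hypothetical sketch inferred from the cover text, not code from this series (TIF_NEED_RESCHED_LAZY and the eager/lazy split are assumed names here):

```c
/*
 * Hypothetical sketch of the intended use in resched_curr(),
 * inferred from the commit message above; not code from this
 * series. Idea: a target running in userspace is rescheduled
 * eagerly (IPI it now); a target in the kernel only gets a lazy
 * flag (TIF_NEED_RESCHED_LAZY, an assumed name) and reschedules
 * at the next exit-to-user boundary.
 */
static void resched_curr_sketch(struct rq *rq)
{
	int cpu = cpu_of(rq);

	if (ct_state_cpu(cpu) == CONTEXT_KERNEL) {
		/* In the kernel: defer to the next user-exit. */
		set_tsk_thread_flag(rq->curr, TIF_NEED_RESCHED_LAZY);
	} else {
		/* In userspace (or tracking disabled): force it now. */
		set_tsk_need_resched(rq->curr);
		smp_send_reschedule(cpu);
	}
}
```

This also makes the staleness problem from the cover note concrete: between exit_to_user_loop() finishing and user_enter() running, the sketch above would still see CONTEXT_KERNEL and wrongly pick the lazy path.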