entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up

Message ID 20221223211507.84249-1-frederic@kernel.org (mailing list archive)
State Accepted
Commit b059492a647e6ae9283564179f5bea81b0426671
Series entry/rcu: Check TIF_RESCHED _after_ delayed RCU wake-up

Commit Message

Frederic Weisbecker Dec. 23, 2022, 9:15 p.m. UTC
RCU sometimes needs to perform a delayed wake-up for specific kthreads
handling offloaded callbacks (RCU_NOCB). These wake-ups are performed
by timers and upon entry to idle (also to guest and to user on nohz_full).

However, the delayed wake-up on kernel exit is actually performed after
the thread flags are fetched for the fast path check for work to
do on exit to user. As a result, and if there is no other pending work
to do upon that kernel exit, the current task will resume to userspace
with TIF_NEED_RESCHED set and the pending wake-up ignored.

Fix this by fetching the thread flags _after_ the delayed RCU-nocb
kthread wake-up.

Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/entry/common.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Paul E. McKenney Jan. 4, 2023, 8:52 p.m. UTC | #1
On Fri, Dec 23, 2022 at 10:15:07PM +0100, Frederic Weisbecker wrote:
> RCU sometimes needs to perform a delayed wake up for specific kthreads
> handling offloaded callbacks (RCU_NOCB).  These wakeups are performed
> by timers and upon entry to idle (also to guest and to user on nohz_full).
> 
> However the delayed wake-up on kernel exit is actually performed after
> the thread flags are fetched towards the fast path check for work to
> do on exit to user. As a result, and if there is no other pending work
> to do upon that kernel exit, the current task will resume to userspace
> with TIF_RESCHED set and the pending wake up ignored.
> 
> Fix this with fetching the thread flags _after_ the delayed RCU-nocb
> kthread wake-up.
> 
> Fixes: 47b8ff194c1f ("entry: Explicitly flush pending rcuog wakeup before last rescheduling point")
> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>

Queued and pushed, thank you!

							Thanx, Paul

> ---
>  kernel/entry/common.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index 846add8394c4..a134e26b58c6 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -192,13 +192,14 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>  
>  static void exit_to_user_mode_prepare(struct pt_regs *regs)
>  {
> -	unsigned long ti_work = read_thread_flags();
> +	unsigned long ti_work;
>  
>  	lockdep_assert_irqs_disabled();
>  
>  	/* Flush pending rcuog wakeup before the last need_resched() check */
>  	tick_nohz_user_enter_prepare();
>  
> +	ti_work = read_thread_flags();
>  	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
>  		ti_work = exit_to_user_mode_loop(regs, ti_work);
>  
> -- 
> 2.25.1
>
Patch

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 846add8394c4..a134e26b58c6 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -192,13 +192,14 @@  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 
 static void exit_to_user_mode_prepare(struct pt_regs *regs)
 {
-	unsigned long ti_work = read_thread_flags();
+	unsigned long ti_work;
 
 	lockdep_assert_irqs_disabled();
 
 	/* Flush pending rcuog wakeup before the last need_resched() check */
 	tick_nohz_user_enter_prepare();
 
+	ti_work = read_thread_flags();
 	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
 		ti_work = exit_to_user_mode_loop(regs, ti_work);
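
The ordering issue addressed by the patch can be illustrated with a small user-space model. This is only a sketch, not kernel code: every model_* name and MODEL_TIF_NEED_RESCHED is a hypothetical stand-in, and it assumes that flushing the deferred rcuog wake-up sets the resched flag on the current task (i.e. the woken kthread should run next). It shows that sampling the flags before the flush misses the request, while sampling after the flush sees it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of this is the real kernel API. */
#define MODEL_TIF_NEED_RESCHED (1UL << 0)

static unsigned long model_thread_flags;   /* models the current task's thread flags */
static bool model_deferred_wakeup_pending; /* models a pending deferred rcuog wake-up */

/* Start each scenario with no flags set and one deferred wake-up pending. */
static void model_reset(void)
{
	model_thread_flags = 0;
	model_deferred_wakeup_pending = true;
}

/* Models read_thread_flags(): a plain snapshot of the flags. */
static unsigned long model_read_thread_flags(void)
{
	return model_thread_flags;
}

/*
 * Models tick_nohz_user_enter_prepare(). Assumption: performing the
 * deferred wake-up sets the resched flag on the current task.
 */
static void model_flush_deferred_wakeup(void)
{
	if (model_deferred_wakeup_pending) {
		model_deferred_wakeup_pending = false;
		model_thread_flags |= MODEL_TIF_NEED_RESCHED;
	}
}

/* Old ordering: snapshot first, flush second. Returns true if the resched request is noticed. */
static bool model_exit_old(void)
{
	unsigned long ti_work = model_read_thread_flags();

	model_flush_deferred_wakeup();
	return (ti_work & MODEL_TIF_NEED_RESCHED) != 0;
}

/* New ordering: flush first, snapshot second. */
static bool model_exit_new(void)
{
	unsigned long ti_work;

	model_flush_deferred_wakeup();
	ti_work = model_read_thread_flags();
	return (ti_work & MODEL_TIF_NEED_RESCHED) != 0;
}

/* Checks both orderings against the behaviour described in the commit message. */
void model_check_orderings(void)
{
	model_reset();
	assert(!model_exit_old()); /* stale snapshot: resched request missed */

	model_reset();
	assert(model_exit_new());  /* fresh snapshot: resched request seen */
}
```

Calling model_check_orderings() from a test harness exercises both paths. In the old ordering the flag is still set in model_thread_flags after the function returns, which corresponds to resuming to userspace with the flag set but unhandled.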