diff mbox series

[v3] rcu-tasks: Make RCU Tasks Trace checking for userspace execution

Message ID 20220719043900.437443-1-qiang1.zhang@intel.com (mailing list archive)
State New, archived
Headers show
Series [v3] rcu-tasks: Make RCU Tasks Trace checking for userspace execution | expand

Commit Message

Zqiang July 19, 2022, 4:39 a.m. UTC
For RCU tasks trace, the userspace execution is also a valid quiescent.
when the scheduling clock interrupt handler interrupts current task and
check current tasks running in userspace, then invoke rcu_tasks_trace_qs()
to check quiescent state, usually, the current tasks ->trc_reader_nesting
should be zero, if the current tasks ->trc_reader_special.b.need_qs is not
set, set TRC_NEED_QS_CHECKED to ->trc_reader_special.b.need_qs. this cause
grace period kthread remove task from holdout list if current tasks is in
holdout list.

But sometimes, although the scheduling clock interrupt handler check
current tasks running in userspace, but the current tasks
->trc_reader_nesting maybe not zero (if one of the functions in the
scheduling-clock interrupt handler were traced/instrumented), and then
invoke rcu_tasks_trace_qs(), if the current tasks ->trc_reader_nesting
is still not zero, the current tasks will be insert local CPU blocked list.
if starts a new RCU tasks trace grace period and the grace period kthread
scan running tasks on each CPU, find that current tasks is running, will
also insert it to hold out list.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 v1->v2:
 Fix build error due to undeclared rcu_tasks_trace_qs(), note in no-PREEMPT_RCU
 kernel, the RCU Tasks is replaced by RCU, so rcu_note_voluntary_context_switch()
 only include rcu_tasks_trace_qs().
 
 v2->v3:
 Modify commit information.

 kernel/rcu/tree_plugin.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paul E. McKenney July 19, 2022, 9:57 p.m. UTC | #1
On Tue, Jul 19, 2022 at 12:39:00PM +0800, Zqiang wrote:
> For RCU tasks trace, the userspace execution is also a valid quiescent.
> when the scheduling clock interrupt handler interrupts current task and
> check current tasks running in userspace, then invoke rcu_tasks_trace_qs()
> to check quiescent state, usually, the current tasks ->trc_reader_nesting
> should be zero, if the current tasks ->trc_reader_special.b.need_qs is not
> set, set TRC_NEED_QS_CHECKED to ->trc_reader_special.b.need_qs. this cause
> grace period kthread remove task from holdout list if current tasks is in
> holdout list.
> 
> But sometimes, although the scheduling clock interrupt handler check
> current tasks running in userspace, but the current tasks
> ->trc_reader_nesting maybe not zero (if one of the functions in the
> scheduling-clock interrupt handler were traced/instrumented), and then
> invoke rcu_tasks_trace_qs(), if the current tasks ->trc_reader_nesting
> is still not zero, the current tasks will be insert local CPU blocked list.
> if starts a new RCU tasks trace grace period and the grace period kthread
> scan running tasks on each CPU, find that current tasks is running, will
> also insert it to hold out list.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

Thank you!  I queued this with wordsmithing and also with a bit of code
consolidation.  With this consolidation, instead of adding a line of code,
we are removing three of them.

But please check to see whether I messed anything up along the way.

							Thanx, Paul

------------------------------------------------------------------------

commit c3e7fd0449642df8c427b442751fbe593e4673df
Author: Zqiang <qiang1.zhang@intel.com>
Date:   Tue Jul 19 12:39:00 2022 +0800

    rcu-tasks: Make RCU Tasks Trace check for userspace execution
    
    Userspace execution is a valid quiescent state for RCU Tasks Trace,
    but the scheduling-clock interrupt does not currently report such
    quiescent states.
    
    Of course, the scheduling-clock interrupt is not strictly speaking
    userspace execution.  However, the only way that this code is not
    in a quiescent state is if something invoked rcu_read_lock_trace(),
    and that would be reflected in the ->trc_reader_nesting field in
    the task_struct structure.  Furthermore, this field is checked by
    rcu_tasks_trace_qs(), which is invoked by rcu_tasks_qs() which is in
    turn invoked by rcu_note_voluntary_context_switch() in kernels building
    at least one of the RCU Tasks flavors.  It is therefore safe to invoke
    rcu_tasks_trace_qs() from the rcu_sched_clock_irq().
    
    But rcu_tasks_qs() also invokes rcu_tasks_classic_qs() for RCU
    Tasks, which lacks the read-side markers provided by RCU Tasks Trace.
    This raises the possibility that an RCU Tasks grace period could start
    after the interrupt from userspace execution, but before the call to
    rcu_sched_clock_irq().  However, it turns out that this is safe because
    the RCU Tasks grace period waits for an RCU grace period, which will
    wait for the entire scheduling-clock interrupt handler, including any
    RCU Tasks read-side critical section that this handler might contain.
    
    This commit therefore updates the rcu_sched_clock_irq() function's
    check for usermode execution and its call to rcu_tasks_classic_qs()
    to instead check for both usermode execution and interrupt from idle,
    and to instead call rcu_note_voluntary_context_switch().  This
    consolidates code and provides more faster RCU Tasks Trace
    reporting of quiescent states in kernels that do scheduling-clock
    interrupts for userspace execution.
    
    [ paulmck: Consolidate checks into rcu_sched_clock_irq(). ]
    
    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 84d2817766888..2122359f0c862 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2341,8 +2341,8 @@ void rcu_sched_clock_irq(int user)
 	rcu_flavor_sched_clock_irq(user);
 	if (rcu_pending(user))
 		invoke_rcu_core();
-	if (user)
-		rcu_tasks_classic_qs(current, false);
+	if (user || rcu_is_cpu_rrupt_from_idle())
+		rcu_note_voluntary_context_switch(current);
 	lockdep_assert_irqs_disabled();
 
 	trace_rcu_utilization(TPS("End scheduler-tick"));
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 4152816dd29f6..b2219577fbe2d 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -718,9 +718,6 @@ static void rcu_flavor_sched_clock_irq(int user)
 	struct task_struct *t = current;
 
 	lockdep_assert_irqs_disabled();
-	if (user || rcu_is_cpu_rrupt_from_idle()) {
-		rcu_note_voluntary_context_switch(current);
-	}
 	if (rcu_preempt_depth() > 0 ||
 	    (preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK))) {
 		/* No QS, force context switch if deferred. */
@@ -976,7 +973,6 @@ static void rcu_flavor_sched_clock_irq(int user)
 		 * neither access nor modify, at least not while the
 		 * corresponding CPU is online.
 		 */
-
 		rcu_qs();
 	}
 }
Zqiang July 20, 2022, 10:29 a.m. UTC | #2
On Tue, Jul 19, 2022 at 12:39:00PM +0800, Zqiang wrote:
> For RCU tasks trace, the userspace execution is also a valid quiescent.
> when the scheduling clock interrupt handler interrupts current task and
> check current tasks running in userspace, then invoke rcu_tasks_trace_qs()
> to check quiescent state, usually, the current tasks ->trc_reader_nesting
> should be zero, if the current tasks ->trc_reader_special.b.need_qs is not
> set, set TRC_NEED_QS_CHECKED to ->trc_reader_special.b.need_qs. this cause
> grace period kthread remove task from holdout list if current tasks is in
> holdout list.
> 
> But sometimes, although the scheduling clock interrupt handler check
> current tasks running in userspace, but the current tasks
> ->trc_reader_nesting maybe not zero (if one of the functions in the
> scheduling-clock interrupt handler were traced/instrumented), and then
> invoke rcu_tasks_trace_qs(), if the current tasks ->trc_reader_nesting
> is still not zero, the current tasks will be insert local CPU blocked list.
> if starts a new RCU tasks trace grace period and the grace period kthread
> scan running tasks on each CPU, find that current tasks is running, will
> also insert it to hold out list.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
>Thank you!  I queued this with wordsmithing and also with a bit of code
>consolidation.  With this consolidation, instead of adding a line of code,
>we are removing three of them.
>
>But please check to see whether I messed anything up along the way.

Thanks wordsmithing 
diff mbox series

Patch

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 4152816dd29f..5fb0b2dd24fd 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -976,7 +976,7 @@  static void rcu_flavor_sched_clock_irq(int user)
 		 * neither access nor modify, at least not while the
 		 * corresponding CPU is online.
 		 */
-
+		rcu_note_voluntary_context_switch(current);
 		rcu_qs();
 	}
 }