
[RFC,06/10] rcu/hotplug: Make rcutree_dead_cpu() parallel

Message ID 20220822021520.6996-7-kernelfans@gmail.com (mailing list archive)
State New, archived

Commit Message

Pingfan Liu Aug. 22, 2022, 2:15 a.m. UTC
In order to support parallel CPU offlining, rcu_state.n_online_cpus should
be decremented with atomic_dec()

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Price <steven.price@arm.com>
Cc: "Peter Zijlstra
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: linux-kernel@vger.kernel.org
To: rcu@vger.kernel.org
---
 kernel/cpu.c      | 1 +
 kernel/rcu/tree.c | 3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)
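The one-line change in kernel/rcu/tree.c swaps a plain read-modify-write for an atomic decrement. As a rough userspace illustration of why that matters once several CPUs can run rcutree_dead_cpu() concurrently, here is a hedged sketch using C11 atomics and pthreads (hypothetical demo code, not the kernel's atomic_t API):

```c
#include <pthread.h>
#include <stdatomic.h>

/* Shared "online CPU" counter, decremented once per outgoing "CPU". */
static _Atomic int n_online_cpus;

static void *cpu_dead(void *arg)
{
	(void)arg;
	/* atomic_fetch_sub is the C11 analogue of the kernel's atomic_dec(). */
	atomic_fetch_sub_explicit(&n_online_cpus, 1, memory_order_relaxed);
	return NULL;
}

/* Take ncpus (<= 64) offline "in parallel"; returns the final count. */
int demo_concurrent_decrement(int ncpus)
{
	pthread_t tids[64];
	int i;

	if (ncpus < 0 || ncpus > 64)
		return -1;
	atomic_store(&n_online_cpus, ncpus);
	for (i = 0; i < ncpus; i++)
		pthread_create(&tids[i], NULL, cpu_dead, NULL);
	for (i = 0; i < ncpus; i++)
		pthread_join(tids[i], NULL);
	return atomic_load(&n_online_cpus);
}
```

With a plain `n_online_cpus--` in cpu_dead(), two threads can read the same value and one decrement is lost; the atomic fetch-sub makes the counter reliably reach zero, which is the property the patch is after.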

Comments

Paul E. McKenney Aug. 22, 2022, 2:45 a.m. UTC | #1
On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> In order to support parallel, rcu_state.n_online_cpus should be
> atomic_dec()
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>

I have to ask...  What testing have you subjected this patch to?

							Thanx, Paul

> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Price <steven.price@arm.com>
> Cc: "Peter Zijlstra
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> To: linux-kernel@vger.kernel.org
> To: rcu@vger.kernel.org
> ---
>  kernel/cpu.c      | 1 +
>  kernel/rcu/tree.c | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 1261c3f3be51..90debbe28e85 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
>  		.name			= "RCU/tree:prepare",
>  		.startup.single		= rcutree_prepare_cpu,
>  		.teardown.single	= rcutree_dead_cpu,
> +		.support_kexec_parallel	= true,
>  	},
>  	/*
>  	 * On the tear-down path, timers_dead_cpu() must be invoked
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 79aea7df4345..07d31e16c65e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
>  	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
>  		return 0;
>  
> -	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> +	/* Hot remove path allows parallel, while Hot add races against remove on lock */
> +	atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
>  	/* Adjust any no-longer-needed kthreads. */
>  	rcu_boost_kthread_setaffinity(rnp, -1);
>  	// Stop-machine done, so allow nohz_full to disable tick.
> -- 
> 2.31.1
>
Joel Fernandes Aug. 22, 2022, 6:08 p.m. UTC | #2
On Sun, Aug 21, 2022 at 10:16 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> In order to support parallel, rcu_state.n_online_cpus should be
> atomic_dec()

What does Parallel mean? Is that some kexec terminology?

Thanks,

 - Joel

>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: "Paul E. McKenney" <paulmck@kernel.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> Cc: Josh Triplett <josh@joshtriplett.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> Cc: Joel Fernandes <joel@joelfernandes.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Steven Price <steven.price@arm.com>
> Cc: "Peter Zijlstra
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> To: linux-kernel@vger.kernel.org
> To: rcu@vger.kernel.org
> ---
>  kernel/cpu.c      | 1 +
>  kernel/rcu/tree.c | 3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 1261c3f3be51..90debbe28e85 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
>                 .name                   = "RCU/tree:prepare",
>                 .startup.single         = rcutree_prepare_cpu,
>                 .teardown.single        = rcutree_dead_cpu,
> +               .support_kexec_parallel = true,
>         },
>         /*
>          * On the tear-down path, timers_dead_cpu() must be invoked
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 79aea7df4345..07d31e16c65e 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
>         if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
>                 return 0;
>
> -       WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> +       /* Hot remove path allows parallel, while Hot add races against remove on lock */
> +       atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
>         /* Adjust any no-longer-needed kthreads. */
>         rcu_boost_kthread_setaffinity(rnp, -1);
>         // Stop-machine done, so allow nohz_full to disable tick.
> --
> 2.31.1
>
Pingfan Liu Aug. 23, 2022, 1:50 a.m. UTC | #3
On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > In order to support parallel, rcu_state.n_online_cpus should be
> > atomic_dec()
> > 
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> 
> I have to ask...  What testing have you subjected this patch to?
> 

This patch is part of [1]. The series aims to enable kexec-reboot in
parallel on all CPUs. As a result, the involved RCU code is expected to
support parallel CPU offlining.

[1]: https://lore.kernel.org/linux-arm-kernel/20220822021520.6996-3-kernelfans@gmail.com/T/#mf62352138d7b040fdb583ba66f8cd0ed1e145feb

Thanks,

	Pingfan


> 							Thanx, Paul
> 
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: "Peter Zijlstra
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > To: linux-kernel@vger.kernel.org
> > To: rcu@vger.kernel.org
> > ---
> >  kernel/cpu.c      | 1 +
> >  kernel/rcu/tree.c | 3 ++-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 1261c3f3be51..90debbe28e85 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> >  		.name			= "RCU/tree:prepare",
> >  		.startup.single		= rcutree_prepare_cpu,
> >  		.teardown.single	= rcutree_dead_cpu,
> > +		.support_kexec_parallel	= true,
> >  	},
> >  	/*
> >  	 * On the tear-down path, timers_dead_cpu() must be invoked
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 79aea7df4345..07d31e16c65e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> >  	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> >  		return 0;
> >  
> > -	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > +	/* Hot remove path allows parallel, while Hot add races against remove on lock */
> > +	atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
> >  	/* Adjust any no-longer-needed kthreads. */
> >  	rcu_boost_kthread_setaffinity(rnp, -1);
> >  	// Stop-machine done, so allow nohz_full to disable tick.
> > -- 
> > 2.31.1
> >
Pingfan Liu Aug. 23, 2022, 1:56 a.m. UTC | #4
On Mon, Aug 22, 2022 at 02:08:38PM -0400, Joel Fernandes wrote:
> On Sun, Aug 21, 2022 at 10:16 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > In order to support parallel, rcu_state.n_online_cpus should be
> > atomic_dec()
> 
> What does Parallel mean? Is that some kexec terminology?
> 

'Parallel' means concurrent. It is not kexec terminology; rather, it is
SMP terminology.

Thanks,

	Pingfan


> Thanks,
> 
>  - Joel
> 
> >
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > Cc: Josh Triplett <josh@joshtriplett.org>
> > Cc: Steven Rostedt <rostedt@goodmis.org>
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > Cc: Joel Fernandes <joel@joelfernandes.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Steven Price <steven.price@arm.com>
> > Cc: "Peter Zijlstra
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > To: linux-kernel@vger.kernel.org
> > To: rcu@vger.kernel.org
> > ---
> >  kernel/cpu.c      | 1 +
> >  kernel/rcu/tree.c | 3 ++-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > index 1261c3f3be51..90debbe28e85 100644
> > --- a/kernel/cpu.c
> > +++ b/kernel/cpu.c
> > @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> >                 .name                   = "RCU/tree:prepare",
> >                 .startup.single         = rcutree_prepare_cpu,
> >                 .teardown.single        = rcutree_dead_cpu,
> > +               .support_kexec_parallel = true,
> >         },
> >         /*
> >          * On the tear-down path, timers_dead_cpu() must be invoked
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 79aea7df4345..07d31e16c65e 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> >         if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> >                 return 0;
> >
> > -       WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > +       /* Hot remove path allows parallel, while Hot add races against remove on lock */
> > +       atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
> >         /* Adjust any no-longer-needed kthreads. */
> >         rcu_boost_kthread_setaffinity(rnp, -1);
> >         // Stop-machine done, so allow nohz_full to disable tick.
> > --
> > 2.31.1
> >
Paul E. McKenney Aug. 23, 2022, 3:01 a.m. UTC | #5
On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > In order to support parallel, rcu_state.n_online_cpus should be
> > > atomic_dec()
> > > 
> > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > 
> > I have to ask...  What testing have you subjected this patch to?
> > 
> 
> This patch subjects to [1]. The series aims to enable kexec-reboot in
> parallel on all cpu. As a result, the involved RCU part is expected to
> support parallel.

I understand (and even sympathize with) the expectation.  But results
sometimes diverge from expectations.  There have been implicit assumptions
in RCU about only one CPU going offline at a time, and I am not sure
that all of them have been addressed.  Concurrent CPU onlining has
been looked at recently here:

https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing

You did use atomic_dec() to make the decrement of rcu_state.n_online_cpus
atomic, which is good.  Did you look through the rest of RCU's CPU-offline
code paths and related code paths?

> [1]: https://lore.kernel.org/linux-arm-kernel/20220822021520.6996-3-kernelfans@gmail.com/T/#mf62352138d7b040fdb583ba66f8cd0ed1e145feb

Perhaps I am more blind than usual today, but I am not seeing anything
in this patch describing the testing.  At this point, I am thinking in
terms of making rcutorture test concurrent CPU offlining.

Thoughts?

							Thanx, Paul

> Thanks,
> 
> 	Pingfan
> 
> 
> > 							Thanx, Paul
> > 
> > > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > > Cc: Frederic Weisbecker <frederic@kernel.org>
> > > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > > Cc: Josh Triplett <josh@joshtriplett.org>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Steven Price <steven.price@arm.com>
> > > Cc: "Peter Zijlstra
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > > To: linux-kernel@vger.kernel.org
> > > To: rcu@vger.kernel.org
> > > ---
> > >  kernel/cpu.c      | 1 +
> > >  kernel/rcu/tree.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > index 1261c3f3be51..90debbe28e85 100644
> > > --- a/kernel/cpu.c
> > > +++ b/kernel/cpu.c
> > > @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> > >  		.name			= "RCU/tree:prepare",
> > >  		.startup.single		= rcutree_prepare_cpu,
> > >  		.teardown.single	= rcutree_dead_cpu,
> > > +		.support_kexec_parallel	= true,
> > >  	},
> > >  	/*
> > >  	 * On the tear-down path, timers_dead_cpu() must be invoked
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 79aea7df4345..07d31e16c65e 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> > >  	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > >  		return 0;
> > >  
> > > -	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > > +	/* Hot remove path allows parallel, while Hot add races against remove on lock */
> > > +	atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
> > >  	/* Adjust any no-longer-needed kthreads. */
> > >  	rcu_boost_kthread_setaffinity(rnp, -1);
> > >  	// Stop-machine done, so allow nohz_full to disable tick.
> > > -- 
> > > 2.31.1
> > >
Joel Fernandes Aug. 23, 2022, 3:14 a.m. UTC | #6
On Mon, Aug 22, 2022 at 9:56 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> On Mon, Aug 22, 2022 at 02:08:38PM -0400, Joel Fernandes wrote:
> > On Sun, Aug 21, 2022 at 10:16 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> > >
> > > In order to support parallel, rcu_state.n_online_cpus should be
> > > atomic_dec()
> >
> > What does Parallel mean? Is that some kexec terminology?
> >
>
> 'Parallel' means concurrent. It is not a kexec terminology, instead,
> should be SMP.

Ah ok! Makes sense. Apologies for being the word police here, but you
could probably reword it to "In order to support parallel offlining"
or some such.

 - Joel



>
> Thanks,
>
>         Pingfan
>
>
> > Thanks,
> >
> >  - Joel
> >
> > >
> > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > > Cc: Frederic Weisbecker <frederic@kernel.org>
> > > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > > Cc: Josh Triplett <josh@joshtriplett.org>
> > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > Cc: Steven Price <steven.price@arm.com>
> > > Cc: "Peter Zijlstra
> > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > > To: linux-kernel@vger.kernel.org
> > > To: rcu@vger.kernel.org
> > > ---
> > >  kernel/cpu.c      | 1 +
> > >  kernel/rcu/tree.c | 3 ++-
> > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > index 1261c3f3be51..90debbe28e85 100644
> > > --- a/kernel/cpu.c
> > > +++ b/kernel/cpu.c
> > > @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> > >                 .name                   = "RCU/tree:prepare",
> > >                 .startup.single         = rcutree_prepare_cpu,
> > >                 .teardown.single        = rcutree_dead_cpu,
> > > +               .support_kexec_parallel = true,
> > >         },
> > >         /*
> > >          * On the tear-down path, timers_dead_cpu() must be invoked
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 79aea7df4345..07d31e16c65e 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> > >         if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > >                 return 0;
> > >
> > > -       WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > > +       /* Hot remove path allows parallel, while Hot add races against remove on lock */
> > > +       atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
> > >         /* Adjust any no-longer-needed kthreads. */
> > >         rcu_boost_kthread_setaffinity(rnp, -1);
> > >         // Stop-machine done, so allow nohz_full to disable tick.
> > > --
> > > 2.31.1
> > >
Pingfan Liu Aug. 24, 2022, 1:38 p.m. UTC | #7
On Tue, Aug 23, 2022 at 11:14 AM Joel Fernandes <joel@joelfernandes.org> wrote:
>
> On Mon, Aug 22, 2022 at 9:56 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > On Mon, Aug 22, 2022 at 02:08:38PM -0400, Joel Fernandes wrote:
> > > On Sun, Aug 21, 2022 at 10:16 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> > > >
> > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > atomic_dec()
> > >
> > > What does Parallel mean? Is that some kexec terminology?
> > >
> >
> > 'Parallel' means concurrent. It is not a kexec terminology, instead,
> > should be SMP.
>
> Ah ok! Makes sense. Apologies to be the word-police here, but you
> probably could reword it to "In order to support parallel offlining"
> or some such.
>

Thanks for your advice. It is a good English lesson, and clearer wording
helps productivity in the community.


Thanks,

    Pingfan


>  - Joel
>
>
>
> >
> > Thanks,
> >
> >         Pingfan
> >
> >
> > > Thanks,
> > >
> > >  - Joel
> > >
> > > >
> > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > > Cc: "Paul E. McKenney" <paulmck@kernel.org>
> > > > Cc: Frederic Weisbecker <frederic@kernel.org>
> > > > Cc: Neeraj Upadhyay <quic_neeraju@quicinc.com>
> > > > Cc: Josh Triplett <josh@joshtriplett.org>
> > > > Cc: Steven Rostedt <rostedt@goodmis.org>
> > > > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > > > Cc: Lai Jiangshan <jiangshanlai@gmail.com>
> > > > Cc: Joel Fernandes <joel@joelfernandes.org>
> > > > Cc: Thomas Gleixner <tglx@linutronix.de>
> > > > Cc: Steven Price <steven.price@arm.com>
> > > > Cc: "Peter Zijlstra
> > > > Cc: Mark Rutland <mark.rutland@arm.com>
> > > > Cc: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
> > > > Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
> > > > To: linux-kernel@vger.kernel.org
> > > > To: rcu@vger.kernel.org
> > > > ---
> > > >  kernel/cpu.c      | 1 +
> > > >  kernel/rcu/tree.c | 3 ++-
> > > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/cpu.c b/kernel/cpu.c
> > > > index 1261c3f3be51..90debbe28e85 100644
> > > > --- a/kernel/cpu.c
> > > > +++ b/kernel/cpu.c
> > > > @@ -1872,6 +1872,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
> > > >                 .name                   = "RCU/tree:prepare",
> > > >                 .startup.single         = rcutree_prepare_cpu,
> > > >                 .teardown.single        = rcutree_dead_cpu,
> > > > +               .support_kexec_parallel = true,
> > > >         },
> > > >         /*
> > > >          * On the tear-down path, timers_dead_cpu() must be invoked
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 79aea7df4345..07d31e16c65e 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2168,7 +2168,8 @@ int rcutree_dead_cpu(unsigned int cpu)
> > > >         if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
> > > >                 return 0;
> > > >
> > > > -       WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
> > > > +       /* Hot remove path allows parallel, while Hot add races against remove on lock */
> > > > +       atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
> > > >         /* Adjust any no-longer-needed kthreads. */
> > > >         rcu_boost_kthread_setaffinity(rnp, -1);
> > > >         // Stop-machine done, so allow nohz_full to disable tick.
> > > > --
> > > > 2.31.1
> > > >
Jason A. Donenfeld Aug. 24, 2022, 1:44 p.m. UTC | #8
On 8/22/22, Pingfan Liu <kernelfans@gmail.com> wrote:
> On Mon, Aug 22, 2022 at 02:08:38PM -0400, Joel Fernandes wrote:
>> On Sun, Aug 21, 2022 at 10:16 PM Pingfan Liu <kernelfans@gmail.com>
>> wrote:
>> >
>> > In order to support parallel, rcu_state.n_online_cpus should be
>> > atomic_dec()
>>
>> What does Parallel mean? Is that some kexec terminology?
>>
>
> 'Parallel' means concurrent. It is not a kexec terminology, instead,
> should be SMP.

Only sort of. See Section A.6 of Paul's book:
http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html

Jason
Pingfan Liu Aug. 24, 2022, 1:53 p.m. UTC | #9
On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > atomic_dec()
> > > >
> > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > >
> > > I have to ask...  What testing have you subjected this patch to?
> > >
> >
> > This patch subjects to [1]. The series aims to enable kexec-reboot in
> > parallel on all cpu. As a result, the involved RCU part is expected to
> > support parallel.
>
> I understand (and even sympathize with) the expectation.  But results
> sometimes diverge from expectations.  There have been implicit assumptions
> in RCU about only one CPU going offline at a time, and I am not sure
> that all of them have been addressed.  Concurrent CPU onlining has
> been looked at recently here:
>
> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
>
> You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> code paths and related code paths?
>

I went through that code at a shallow level, especially each
cpuhp_step hook in the RCU system.

But as you pointed out, there are implicit assumptions that only one
CPU goes offline at a time, so I will digest the Google doc which you
shared.  Then I can come to a conclusion.

> > [1]: https://lore.kernel.org/linux-arm-kernel/20220822021520.6996-3-kernelfans@gmail.com/T/#mf62352138d7b040fdb583ba66f8cd0ed1e145feb
>
> Perhaps I am more blind than usual today, but I am not seeing anything
> in this patch describing the testing.  At this point, I am thinking in
> terms of making rcutorture test concurrent CPU offlining.
>

Yes, testing results are more convincing in this area.

After making the implicit assumptions clear, I will write some code to
bridge my series and the rcutorture test, since the series is a little
different from parallel CPU offlining: it happens after all devices
are torn down, and there is no way to roll back.

> Thoughts?
>

I need a deeper dive into this area, and hope to come up with something soon.


Thanks,

    Pingfan
Paul E. McKenney Aug. 24, 2022, 4:20 p.m. UTC | #10
On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > > On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > > atomic_dec()
> > > > >
> > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > >
> > > > I have to ask...  What testing have you subjected this patch to?
> > > >
> > >
> > > This patch subjects to [1]. The series aims to enable kexec-reboot in
> > > parallel on all cpu. As a result, the involved RCU part is expected to
> > > support parallel.
> >
> > I understand (and even sympathize with) the expectation.  But results
> > sometimes diverge from expectations.  There have been implicit assumptions
> > in RCU about only one CPU going offline at a time, and I am not sure
> > that all of them have been addressed.  Concurrent CPU onlining has
> > been looked at recently here:
> >
> > https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> >
> > You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > code paths and related code paths?
> 
> I went through those codes at a shallow level, especially at each
> cpuhp_step hook in the RCU system.

And that is fine, at least as a first step.

> But as you pointed out, there are implicit assumptions about only one
> CPU going offline at a time, I will chew the google doc which you
> share.  Then I can come to a final result.

Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
and rcu_boost_kthread_setaffinity() seems to need some help.  As it
stands, it appears that concurrent invocations of this function from the
CPU-offline path will cause all but the last outgoing CPU's bit to be
(incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().

This should not be difficult to fix, for example, by maintaining a
separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
CPUs for that rcu_node structure.  (Similar in structure to the
->qsmask field.)

There are probably more where that one came from.  ;-)

> > > [1]: https://lore.kernel.org/linux-arm-kernel/20220822021520.6996-3-kernelfans@gmail.com/T/#mf62352138d7b040fdb583ba66f8cd0ed1e145feb
> >
> > Perhaps I am more blind than usual today, but I am not seeing anything
> > in this patch describing the testing.  At this point, I am thinking in
> > terms of making rcutorture test concurrent CPU offlining.
> 
> Yes, testing results are more convincing in this area.
> 
> After making clear the implicit assumptions, I will write some code to
> bridge my code and rcutorture test. Since the series is a little
> different from parallel cpu offlining. It happens after all devices
> are torn down, and there is no way to rollback.

Very good, looking forward to seeing what you come up with!

> > Thoughts?
> 
> Need a deeper dive into this field. Hope to bring out something soon.

Again, looking forward to seeing what you find!

							Thanx, Paul
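[Editorial aside: Paul's suggested fix above, a separate per-leaf-rcu_node bitmask of concurrently outgoing CPUs, could be modeled roughly as follows. This is a toy single-file sketch with hypothetical field and function names (`offlinemask`, `mark_outgoing`), not actual kernel code; in the kernel, updates to such a mask would happen under the leaf's rnp->lock.]

```c
#include <stdint.h>

/* Toy model of one leaf rcu_node covering up to 64 CPUs, in the style of
 * the ->qsmask field.  'offlinemask' is a hypothetical field accumulating
 * every CPU currently on its way offline, so the affinity computation
 * excludes all in-flight outgoing CPUs, not just the caller's CPU. */
struct leaf_node {
	uint64_t cpumask;     /* CPUs belonging to this leaf */
	uint64_t offlinemask; /* CPUs concurrently going offline */
};

/* Mark 'cpu' as outgoing; in the kernel this would run under rnp->lock. */
void mark_outgoing(struct leaf_node *rnp, int cpu)
{
	rnp->offlinemask |= 1ULL << cpu;
}

/* Affinity mask for the boost kthread: leaf CPUs minus every outgoing CPU.
 * With only a per-call "exclude this CPU" argument (as in the current
 * rcu_boost_kthread_setaffinity(rnp, -1) path), concurrent callers would
 * each leave the *other* outgoing CPUs set, which is the bug Paul notes. */
uint64_t boost_affinity_mask(const struct leaf_node *rnp)
{
	return rnp->cpumask & ~rnp->offlinemask;
}
```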
Joel Fernandes Aug. 24, 2022, 5:26 p.m. UTC | #11
On 8/24/2022 12:20 PM, Paul E. McKenney wrote:
> On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
>> On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>>>
>>> On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
>>>> On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
>>>>> On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
>>>>>> In order to support parallel, rcu_state.n_online_cpus should be
>>>>>> atomic_dec()
>>>>>>
>>>>>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
>>>>>
>>>>> I have to ask...  What testing have you subjected this patch to?
>>>>>
>>>>
>>>> This patch subjects to [1]. The series aims to enable kexec-reboot in
>>>> parallel on all cpu. As a result, the involved RCU part is expected to
>>>> support parallel.
>>>
>>> I understand (and even sympathize with) the expectation.  But results
>>> sometimes diverge from expectations.  There have been implicit assumptions
>>> in RCU about only one CPU going offline at a time, and I am not sure
>>> that all of them have been addressed.  Concurrent CPU onlining has
>>> been looked at recently here:
>>>
>>> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
>>>
>>> You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
>>> atomic, which is good.  Did you look through the rest of RCU's CPU-offline
>>> code paths and related code paths?
>>
>> I went through those codes at a shallow level, especially at each
>> cpuhp_step hook in the RCU system.
> 
> And that is fine, at least as a first step.
> 
>> But as you pointed out, there are implicit assumptions about only one
>> CPU going offline at a time, I will chew the google doc which you
>> share.  Then I can come to a final result.
> 
> Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> stands, it appears that concurrent invocations of this function from the
> CPU-offline path will cause all but the last outgoing CPU's bit to be
> (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> 
> This should not be difficult to fix, for example, by maintaining a
> separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> CPUs for that rcu_node structure.  (Similar in structure to the
> ->qsmask field.)
> 
> There are probably more where that one came from.  ;-)

Should rcutree_dying_cpu()'s access to rnp->qsmask have a READ_ONCE()? I was
thinking of grace-period initialization or QS-reporting paths racing with that.
It's just tracing, still :)

Thanks,

- Joel
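[Editorial aside: the READ_ONCE() idiom Joel raises can be approximated in userspace as below. This is a hedged sketch: the kernel's READ_ONCE() is more elaborate, but the core idea is a volatile access that keeps the compiler from tearing, fusing, or refetching the load.]

```c
/* Userspace approximation of the kernel's READ_ONCE()/WRITE_ONCE():
 * a volatile access forces a single, untorn load or store, so a tracing
 * read of a racy field like rnp->qsmask is sampled exactly once. */
#define READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Toy stand-in for the leaf rcu_node structure. */
struct toy_rnp { unsigned long qsmask; };

/* What a trace point in rcutree_dying_cpu() might do: one marked load,
 * even if a grace-period kthread updates qsmask concurrently. */
unsigned long sample_qsmask(struct toy_rnp *rnp)
{
	return READ_ONCE(rnp->qsmask);
}
```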
Paul E. McKenney Aug. 24, 2022, 7:21 p.m. UTC | #12
On Wed, Aug 24, 2022 at 01:26:01PM -0400, Joel Fernandes wrote:
> 
> 
> On 8/24/2022 12:20 PM, Paul E. McKenney wrote:
> > On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> >> On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >>>
> >>> On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> >>>> On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> >>>>> On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> >>>>>> In order to support parallel, rcu_state.n_online_cpus should be
> >>>>>> atomic_dec()
> >>>>>>
> >>>>>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> >>>>>
> >>>>> I have to ask...  What testing have you subjected this patch to?
> >>>>>
> >>>>
> >>>> This patch subjects to [1]. The series aims to enable kexec-reboot in
> >>>> parallel on all cpu. As a result, the involved RCU part is expected to
> >>>> support parallel.
> >>>
> >>> I understand (and even sympathize with) the expectation.  But results
> >>> sometimes diverge from expectations.  There have been implicit assumptions
> >>> in RCU about only one CPU going offline at a time, and I am not sure
> >>> that all of them have been addressed.  Concurrent CPU onlining has
> >>> been looked at recently here:
> >>>
> >>> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> >>>
> >>> You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> >>> atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> >>> code paths and related code paths?
> >>
> >> I went through those codes at a shallow level, especially at each
> >> cpuhp_step hook in the RCU system.
> > 
> > And that is fine, at least as a first step.
> > 
> >> But as you pointed out, there are implicit assumptions about only one
> >> CPU going offline at a time, I will chew the google doc which you
> >> share.  Then I can come to a final result.
> > 
> > Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> > and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> > stands, it appears that concurrent invocations of this function from the
> > CPU-offline path will cause all but the last outgoing CPU's bit to be
> > (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> > 
> > This should not be difficult to fix, for example, by maintaining a
> > separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> > CPUs for that rcu_node structure.  (Similar in structure to the
> > ->qsmask field.)
> > 
> > There are probably more where that one came from.  ;-)
> 
> Should rcutree_dying_cpu() access to rnp->qsmask have a READ_ONCE() ? I was
> thinking grace period initialization or qs reporting paths racing with that. Its
> just tracing, still :)

Looks like it should, regardless of Pingfan's patches, given that
the grace-period kthread might report a quiescent state concurrently.

Good catch!

							Thanx, Paul
Joel Fernandes Aug. 24, 2022, 10:54 p.m. UTC | #13
On Wed, Aug 24, 2022 at 3:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Aug 24, 2022 at 01:26:01PM -0400, Joel Fernandes wrote:
> >
> >
> > On 8/24/2022 12:20 PM, Paul E. McKenney wrote:
> > > On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> > >> On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >>>
> > >>> On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > >>>> On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > >>>>> On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > >>>>>> In order to support parallel, rcu_state.n_online_cpus should be
> > >>>>>> atomic_dec()
> > >>>>>>
> > >>>>>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > >>>>>
> > >>>>> I have to ask...  What testing have you subjected this patch to?
> > >>>>>
> > >>>>
> > >>>> This patch subjects to [1]. The series aims to enable kexec-reboot in
> > >>>> parallel on all cpu. As a result, the involved RCU part is expected to
> > >>>> support parallel.
> > >>>
> > >>> I understand (and even sympathize with) the expectation.  But results
> > >>> sometimes diverge from expectations.  There have been implicit assumptions
> > >>> in RCU about only one CPU going offline at a time, and I am not sure
> > >>> that all of them have been addressed.  Concurrent CPU onlining has
> > >>> been looked at recently here:
> > >>>
> > >>> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> > >>>
> > >>> You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > >>> atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > >>> code paths and related code paths?
> > >>
> > >> I went through those codes at a shallow level, especially at each
> > >> cpuhp_step hook in the RCU system.
> > >
> > > And that is fine, at least as a first step.
> > >
> > >> But as you pointed out, there are implicit assumptions about only one
> > >> CPU going offline at a time, I will chew the google doc which you
> > >> share.  Then I can come to a final result.
> > >
> > > Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> > > and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> > > stands, it appears that concurrent invocations of this function from the
> > > CPU-offline path will cause all but the last outgoing CPU's bit to be
> > > (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> > >
> > > This should not be difficult to fix, for example, by maintaining a
> > > separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> > > CPUs for that rcu_node structure.  (Similar in structure to the
> > > ->qsmask field.)
> > >
> > > There are probably more where that one came from.  ;-)
> >
> > Should rcutree_dying_cpu() access to rnp->qsmask have a READ_ONCE() ? I was
> > thinking grace period initialization or qs reporting paths racing with that. Its
> > just tracing, still :)
>
> Looks like it should be regardless of Pingfan's patches, given that
> the grace-period kthread might report a quiescent state concurrently.

Thanks for confirming, I'll queue it into my next revision of the series.

 - Joel
Paul E. McKenney Aug. 24, 2022, 11:01 p.m. UTC | #14
On Wed, Aug 24, 2022 at 06:54:01PM -0400, Joel Fernandes wrote:
> On Wed, Aug 24, 2022 at 3:21 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Aug 24, 2022 at 01:26:01PM -0400, Joel Fernandes wrote:
> > >
> > >
> > > On 8/24/2022 12:20 PM, Paul E. McKenney wrote:
> > > > On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> > > >> On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >>>
> > > >>> On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > > >>>> On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > >>>>> On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > >>>>>> In order to support parallel, rcu_state.n_online_cpus should be
> > > >>>>>> atomic_dec()
> > > >>>>>>
> > > >>>>>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > >>>>>
> > > >>>>> I have to ask...  What testing have you subjected this patch to?
> > > >>>>>
> > > >>>>
> > > >>>> This patch subjects to [1]. The series aims to enable kexec-reboot in
> > > >>>> parallel on all cpu. As a result, the involved RCU part is expected to
> > > >>>> support parallel.
> > > >>>
> > > >>> I understand (and even sympathize with) the expectation.  But results
> > > >>> sometimes diverge from expectations.  There have been implicit assumptions
> > > >>> in RCU about only one CPU going offline at a time, and I am not sure
> > > >>> that all of them have been addressed.  Concurrent CPU onlining has
> > > >>> been looked at recently here:
> > > >>>
> > > >>> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> > > >>>
> > > >>> You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > > >>> atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > > >>> code paths and related code paths?
> > > >>
> > > >> I went through those codes at a shallow level, especially at each
> > > >> cpuhp_step hook in the RCU system.
> > > >
> > > > And that is fine, at least as a first step.
> > > >
> > > >> But as you pointed out, there are implicit assumptions about only one
> > > >> CPU going offline at a time, I will chew the google doc which you
> > > >> share.  Then I can come to a final result.
> > > >
> > > > Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> > > > and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> > > > stands, it appears that concurrent invocations of this function from the
> > > > CPU-offline path will cause all but the last outgoing CPU's bit to be
> > > > (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> > > >
> > > > This should not be difficult to fix, for example, by maintaining a
> > > > separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> > > > CPUs for that rcu_node structure.  (Similar in structure to the
> > > > ->qsmask field.)
> > > >
> > > > There are probably more where that one came from.  ;-)
> > >
> > > Should rcutree_dying_cpu() access to rnp->qsmask have a READ_ONCE() ? I was
> > > thinking grace period initialization or qs reporting paths racing with that. Its
> > > just tracing, still :)
> >
> > Looks like it should be regardless of Pingfan's patches, given that
> > the grace-period kthread might report a quiescent state concurrently.
> 
> Thanks for confirming, I'll queue it into my next revision of the series.

Sounds good!

							Thanx, Paul
Paul E. McKenney Aug. 31, 2022, 4:15 p.m. UTC | #15
On Wed, Aug 24, 2022 at 09:20:50AM -0700, Paul E. McKenney wrote:
> On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> > On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > > > On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > > > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > > > atomic_dec()
> > > > > >
> > > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > > >
> > > > > I have to ask...  What testing have you subjected this patch to?
> > > > >
> > > >
> > > > This patch subjects to [1]. The series aims to enable kexec-reboot in
> > > > parallel on all cpu. As a result, the involved RCU part is expected to
> > > > support parallel.
> > >
> > > I understand (and even sympathize with) the expectation.  But results
> > > sometimes diverge from expectations.  There have been implicit assumptions
> > > in RCU about only one CPU going offline at a time, and I am not sure
> > > that all of them have been addressed.  Concurrent CPU onlining has
> > > been looked at recently here:
> > >
> > > https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> > >
> > > You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > > atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > > code paths and related code paths?
> > 
> > I went through those codes at a shallow level, especially at each
> > cpuhp_step hook in the RCU system.
> 
> And that is fine, at least as a first step.
> 
> > But as you pointed out, there are implicit assumptions about only one
> > CPU going offline at a time, I will chew the google doc which you
> > share.  Then I can come to a final result.
> 
> Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> stands, it appears that concurrent invocations of this function from the
> CPU-offline path will cause all but the last outgoing CPU's bit to be
> (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> 
> This should not be difficult to fix, for example, by maintaining a
> separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> CPUs for that rcu_node structure.  (Similar in structure to the
> ->qsmask field.)
> 
> There are probably more where that one came from.  ;-)

And here is one more from this week's session.

The calls to tick_dep_set() and tick_dep_clear() use atomic operations,
but they operate on a global variable.  This means that the first call
to rcutree_offline_cpu() would enable the tick and the first call to
rcutree_dead_cpu() would disable the tick.  This might be OK, but it
is at the very least bad practice.  There needs to be a counter
mediating these calls.

For more detail, please see the Google document:

https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing

							Thanx, Paul

> > > > [1]: https://lore.kernel.org/linux-arm-kernel/20220822021520.6996-3-kernelfans@gmail.com/T/#mf62352138d7b040fdb583ba66f8cd0ed1e145feb
> > >
> > > Perhaps I am more blind than usual today, but I am not seeing anything
> > > in this patch describing the testing.  At this point, I am thinking in
> > > terms of making rcutorture test concurrent CPU offlining parallel
> > 
> > Yes, testing results are more convincing in this area.
> > 
> > After making clear the implicit assumptions, I will write some code to
> > bridge my code and rcutorture test. Since the series is a little
> > different from parallel cpu offlining. It happens after all devices
> > are torn down, and there is no way to rollback.
> 
> Very good, looking forward to seeing what you come up with!
> 
> > > Thoughts?
> > 
> > Need a deeper dive into this field. Hope to bring out something soon.
> 
> Again, looking forward to seeing what you find!
> 
> 							Thanx, Paul
Pingfan Liu Sept. 5, 2022, 3:53 a.m. UTC | #16
On Thu, Sep 1, 2022 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Wed, Aug 24, 2022 at 09:20:50AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> > > On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > > > > On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > > > > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > > > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > > > > atomic_dec()
> > > > > > >
> > > > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > > > >
> > > > > > I have to ask...  What testing have you subjected this patch to?
> > > > > >
> > > > >
> > > > > This patch subjects to [1]. The series aims to enable kexec-reboot in
> > > > > parallel on all cpu. As a result, the involved RCU part is expected to
> > > > > support parallel.
> > > >
> > > > I understand (and even sympathize with) the expectation.  But results
> > > > sometimes diverge from expectations.  There have been implicit assumptions
> > > > in RCU about only one CPU going offline at a time, and I am not sure
> > > > that all of them have been addressed.  Concurrent CPU onlining has
> > > > been looked at recently here:
> > > >
> > > > https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> > > >
> > > > You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > > > atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > > > code paths and related code paths?
> > >
> > > I went through those codes at a shallow level, especially at each
> > > cpuhp_step hook in the RCU system.
> >
> > And that is fine, at least as a first step.
> >
> > > But as you pointed out, there are implicit assumptions about only one
> > > CPU going offline at a time, I will chew the google doc which you
> > > share.  Then I can come to a final result.
> >
> > Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> > and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> > stands, it appears that concurrent invocations of this function from the
> > CPU-offline path will cause all but the last outgoing CPU's bit to be
> > (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> >
> > This should not be difficult to fix, for example, by maintaining a
> > separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> > CPUs for that rcu_node structure.  (Similar in structure to the
> > ->qsmask field.)
> >

Sorry for the late reply; I was interrupted by some other things.
I have taken a different approach and posted a series ([PATCH 1/3] rcu:
remove redundant cpu affinity setting during teardown) for it at
https://lore.kernel.org/rcu/20220905033852.18988-1-kernelfans@gmail.com/T/#t

Besides, I have begun integrating concurrent CPU hot-removal into the
rcutorture test.

> > There are probably more where that one came from.  ;-)
>
> And here is one more from this week's session.
>

Thanks for the update.

> The calls to tick_dep_set() and tick_dep_clear() use atomic operations,
> but they operate on a global variable.  This means that the first call
> to rcutree_offline_cpu() would enable the tick and the first call to
> rcutree_dead_cpu() would disable the tick.  This might be OK, but it
> is at the very least bad practice.  There needs to be a counter
> mediating these calls.
>

I will see what I can do here.

> For more detail, please see the Google document:
>
> https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
>

I have read it and hope that both online and offline concurrency can
come true in the near future.

Thanks,

    Pingfan
Paul E. McKenney Sept. 6, 2022, 6:45 p.m. UTC | #17
On Mon, Sep 05, 2022 at 11:53:52AM +0800, Pingfan Liu wrote:
> On Thu, Sep 1, 2022 at 12:15 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Wed, Aug 24, 2022 at 09:20:50AM -0700, Paul E. McKenney wrote:
> > > On Wed, Aug 24, 2022 at 09:53:11PM +0800, Pingfan Liu wrote:
> > > > On Tue, Aug 23, 2022 at 11:01 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > On Tue, Aug 23, 2022 at 09:50:56AM +0800, Pingfan Liu wrote:
> > > > > > On Sun, Aug 21, 2022 at 07:45:28PM -0700, Paul E. McKenney wrote:
> > > > > > > On Mon, Aug 22, 2022 at 10:15:16AM +0800, Pingfan Liu wrote:
> > > > > > > > In order to support parallel, rcu_state.n_online_cpus should be
> > > > > > > > atomic_dec()
> > > > > > > >
> > > > > > > > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > > > > > >
> > > > > > > I have to ask...  What testing have you subjected this patch to?
> > > > > > >
> > > > > >
> > > > > > This patch subjects to [1]. The series aims to enable kexec-reboot in
> > > > > > parallel on all cpu. As a result, the involved RCU part is expected to
> > > > > > support parallel.
> > > > >
> > > > > I understand (and even sympathize with) the expectation.  But results
> > > > > sometimes diverge from expectations.  There have been implicit assumptions
> > > > > in RCU about only one CPU going offline at a time, and I am not sure
> > > > > that all of them have been addressed.  Concurrent CPU onlining has
> > > > > been looked at recently here:
> > > > >
> > > > > https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> > > > >
> > > > > You did us atomic_dec() to make rcu_state.n_online_cpus decrementing be
> > > > > atomic, which is good.  Did you look through the rest of RCU's CPU-offline
> > > > > code paths and related code paths?
> > > >
> > > > I went through those codes at a shallow level, especially at each
> > > > cpuhp_step hook in the RCU system.
> > >
> > > And that is fine, at least as a first step.
> > >
> > > > But as you pointed out, there are implicit assumptions about only one
> > > > CPU going offline at a time, I will chew the google doc which you
> > > > share.  Then I can come to a final result.
> > >
> > > Boqun Feng, Neeraj Upadhyay, Uladzislau Rezki, and I took a quick look,
> > > and rcu_boost_kthread_setaffinity() seems to need some help.  As it
> > > stands, it appears that concurrent invocations of this function from the
> > > CPU-offline path will cause all but the last outgoing CPU's bit to be
> > > (incorrectly) set in the cpumask_var_t passed to set_cpus_allowed_ptr().
> > >
> > > This should not be difficult to fix, for example, by maintaining a
> > > separate per-leaf-rcu_node-structure bitmask of the concurrently outgoing
> > > CPUs for that rcu_node structure.  (Similar in structure to the
> > > ->qsmask field.)
> > >
> 
> Sorry to reply late, since I am interrupted by some other things.
> I have took a different way and posted a series ([PATCH 1/3] rcu:
> remove redundant cpu affinity setting during teardown) for that on
> https://lore.kernel.org/rcu/20220905033852.18988-1-kernelfans@gmail.com/T/#t

And I took patch #3, thank you!

#1 allows the kthread to run on the outgoing CPU, which is to be
avoided, and #2 depends on #1.

> Besides, for the integration of the concurrency cpu hot-removing into
> the rcu torture test, I begin to do it.

Very good!  I am looking forward to seeing what you come up with.

> > > There are probably more where that one came from.  ;-)
> >
> > And here is one more from this week's session.
> 
> Thanks for the update.
> 
> > The calls to tick_dep_set() and tick_dep_clear() use atomic operations,
> > but they operate on a global variable.  This means that the first call
> > to rcutree_offline_cpu() would enable the tick and the first call to
> > rcutree_dead_cpu() would disable the tick.  This might be OK, but it
> > is at the very least bad practice.  There needs to be a counter
> > mediating these calls.
> 
> I will see what I can do here.
> 
> > For more detail, please see the Google document:
> >
> > https://docs.google.com/document/d/1jymsaCPQ1PUDcfjIKm0UIbVdrJAaGX-6cXrmcfm0PRU/edit?usp=sharing
> >
> 
> Have read it and hope that both online and offline concurrency can
> come to true in near future.

Indeed, I suspect that a lot of people would like to see faster kexec!

							Thanx, Paul
diff mbox series

Patch

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 1261c3f3be51..90debbe28e85 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1872,6 +1872,7 @@  static struct cpuhp_step cpuhp_hp_states[] = {
 		.name			= "RCU/tree:prepare",
 		.startup.single		= rcutree_prepare_cpu,
 		.teardown.single	= rcutree_dead_cpu,
+		.support_kexec_parallel	= true,
 	},
 	/*
 	 * On the tear-down path, timers_dead_cpu() must be invoked
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 79aea7df4345..07d31e16c65e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2168,7 +2168,8 @@  int rcutree_dead_cpu(unsigned int cpu)
 	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU))
 		return 0;
 
-	WRITE_ONCE(rcu_state.n_online_cpus, rcu_state.n_online_cpus - 1);
+	/* The hot-remove path may run in parallel; hot-add serializes against remove via the hotplug lock */
+	atomic_dec((atomic_t *)&rcu_state.n_online_cpus);
 	/* Adjust any no-longer-needed kthreads. */
 	rcu_boost_kthread_setaffinity(rnp, -1);
 	// Stop-machine done, so allow nohz_full to disable tick.