diff mbox series

[v3] ring-buffer: Prevent inconsistent operation on cpu_buffer->resize_disabled

Message ID 20230410073512.13362-1-Tze-nan.Wu@mediatek.com (mailing list archive)
State Superseded
Headers show
Series [v3] ring-buffer: Prevent inconsistent operation on cpu_buffer->resize_disabled | expand

Commit Message

Tze-nan Wu April 10, 2023, 7:35 a.m. UTC
Write to buffer_size_kb can permanently fail, due to cpu_online_mask may
changed between two for_each_online_buffer_cpu loops.
The number of increasing and decreasing on cpu_buffer->resize_disable
may be inconsistent, leading that the resize_disabled in some CPUs
becoming none zero after ring_buffer_reset_online_cpus return.

This issue can be reproduced by "echo 0 > trace" while hotplugging cpu.
After reproducing success, we can find out buffer_size_kb will not be
functional anymore.

Prevent the two "loops" in this function from iterating through different
online cpus by copying cpu_online_mask at the entry of the function.

Changes from v1 to v3:
  Declare the cpumask variable statically rather than dynamically.

Changes from v2 to v3:
  Considering holding cpu_hotplug_lock too long because of the
  synchronize_rcu(), maybe it's better to prevent the issue by copying
  cpu_online_mask at the entry of the function as V1 does, instead of
  using cpus_read_lock().

Link: https://lore.kernel.org/lkml/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
Link: https://lore.kernel.org/oe-kbuild-all/202304082051.Dp50upfS-lkp@intel.com/
Link: https://lore.kernel.org/oe-kbuild-all/202304081615.eiaqpbV8-lkp@intel.com/

Cc: stable@vger.kernel.org
Cc: npiggin@gmail.com
Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
---
 kernel/trace/ring_buffer.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

Comments

Steven Rostedt April 11, 2023, 4:44 p.m. UTC | #1
Please have each new patch be a new thread, and not a Cc to the previous
version of the patch. As it makes it hard to find in INBOXs.

On Mon, 10 Apr 2023 15:35:08 +0800
Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:

> Write to buffer_size_kb can permanently fail, due to cpu_online_mask may
> changed between two for_each_online_buffer_cpu loops.
> The number of increasing and decreasing on cpu_buffer->resize_disable
> may be inconsistent, leading that the resize_disabled in some CPUs
> becoming none zero after ring_buffer_reset_online_cpus return.
> 
> This issue can be reproduced by "echo 0 > trace" while hotplugging cpu.
> After reproducing success, we can find out buffer_size_kb will not be
> functional anymore.
> 
> Prevent the two "loops" in this function from iterating through different
> online cpus by copying cpu_online_mask at the entry of the function.
> 

The "Changes from" need to go below  the '---', otherwise they are added to
the git commit (we don't want it there).

> Changes from v1 to v3:
>   Declare the cpumask variable statically rather than dynamically.
> 
> Changes from v2 to v3:
>   Considering holding cpu_hotplug_lock too long because of the
>   synchronize_rcu(), maybe it's better to prevent the issue by copying
>   cpu_online_mask at the entry of the function as V1 does, instead of
>   using cpus_read_lock().
> 
> Link: https://lore.kernel.org/lkml/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
> Link: https://lore.kernel.org/oe-kbuild-all/202304082051.Dp50upfS-lkp@intel.com/
> Link: https://lore.kernel.org/oe-kbuild-all/202304081615.eiaqpbV8-lkp@intel.com/
> 
> Cc: stable@vger.kernel.org
> Cc: npiggin@gmail.com
> Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
> Reported-by: kernel test robot <lkp@intel.com>
> Reviewed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> ---

This is where the "Changes from" go. And since this patch is not suppose to
be a Cc. But since it's still good to have a link to it. You could do:

Changes from v2 to v3: https://lore.kernel.org/linux-trace-kernel/20230409024616.31099-1-Tze-nan.Wu@mediatek.com/
  Considering holding cpu_hotplug_lock too long because of the
  synchronize_rcu(), maybe it's better to prevent the issue by copying
  cpu_online_mask at the entry of the function as V1 does, instead of
  using cpus_read_lock().


Where the previous version changes has the lore link to the previous patch,
in case someone wants to look at it.


>  kernel/trace/ring_buffer.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> index 76a2d91eecad..dc758930dacb 100644
> --- a/kernel/trace/ring_buffer.c
> +++ b/kernel/trace/ring_buffer.c
> @@ -288,9 +288,6 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
>  #define for_each_buffer_cpu(buffer, cpu)		\
>  	for_each_cpu(cpu, buffer->cpumask)
>  
> -#define for_each_online_buffer_cpu(buffer, cpu)		\
> -	for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask)
> -
>  #define TS_SHIFT	27
>  #define TS_MASK		((1ULL << TS_SHIFT) - 1)
>  #define TS_DELTA_TEST	(~TS_MASK)
> @@ -5353,12 +5350,19 @@ EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
>  void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
>  {
>  	struct ring_buffer_per_cpu *cpu_buffer;
> +	cpumask_t reset_online_cpumask;

It's usually considered bad form to put a cpumask on the stack. As it can
be 128 bytes for a machine with 1024 CPUs (and yes they do exist). Also,
the mask size is set to NR_CPUS not the actual size, so you do not even
need to have it that big.


>  	int cpu;
>  
> +	/*
> +	 * Record cpu_online_mask here to make sure we iterate through the same
> +	 * online CPUs in the following two loops.
> +	 */
> +	cpumask_copy(&reset_online_cpumask, cpu_online_mask);
> +
>  	/* prevent another thread from changing buffer sizes */
>  	mutex_lock(&buffer->mutex);
>  
> -	for_each_online_buffer_cpu(buffer, cpu) {
> +	for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask) {
>  		cpu_buffer = buffer->buffers[cpu];
>  
>  		atomic_inc(&cpu_buffer->resize_disabled);

Anyway, we don't need to modify any of the above, and just do the following
instead of atomic_inc():

#define RESET_BIT	(1 << 30)

		atomic_add(&cpu_buffer->resize_disabled, RESET_BIT);


> @@ -5368,7 +5372,7 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
>  	/* Make sure all commits have finished */
>  	synchronize_rcu();
>  
> -	for_each_online_buffer_cpu(buffer, cpu) {
> +	for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask) {
>  		cpu_buffer = buffer->buffers[cpu];

Then here we can do:

		/*
		 * If a CPU came online during the synchronize_rcu(), then
		 * ignore it.
		 */
		if (!atomic_read(&cpu_buffer->resize_disabled) & RESET_BIT))
			continue;

		atomic_sub(&cpu_buffer->resize_disabled, RESET_BIT);


As the resize_disabled only needs to be set to something to make it
disabled.

-- Steve

>  
>  		reset_disabled_cpu_buffer(cpu_buffer);
Steven Rostedt April 11, 2023, 4:48 p.m. UTC | #2
On Tue, 11 Apr 2023 12:44:03 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> Please have each new patch be a new thread, and not a Cc to the previous

Should have been "reply to the previous" :-p

-- Steve

> version of the patch. As it makes it hard to find in INBOXs.
Tze-nan Wu April 12, 2023, 2:27 a.m. UTC | #3
Hi Steve,

On Tue, 2023-04-11 at 12:44 -0400, Steven Rostedt wrote:
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> 
> 
> Please have each new patch be a new thread, and not a Cc to the
> previous
> version of the patch. As it makes it hard to find in INBOXs.
> 

No problem, got it.

> On Mon, 10 Apr 2023 15:35:08 +0800
> Tze-nan Wu <Tze-nan.Wu@mediatek.com> wrote:
> 
> > Write to buffer_size_kb can permanently fail, due to
> > cpu_online_mask may
> > changed between two for_each_online_buffer_cpu loops.
> > The number of increasing and decreasing on cpu_buffer-
> > >resize_disable
> > may be inconsistent, leading that the resize_disabled in some CPUs
> > becoming none zero after ring_buffer_reset_online_cpus return.
> > 
> > This issue can be reproduced by "echo 0 > trace" while hotplugging
> > cpu.
> > After reproducing success, we can find out buffer_size_kb will not
> > be
> > functional anymore.
> > 
> > Prevent the two "loops" in this function from iterating through
> > different
> > online cpus by copying cpu_online_mask at the entry of the
> > function.
> > 
> 
> The "Changes from" need to go below  the '---', otherwise they are
> added to
> the git commit (we don't want it there).
> 

Will remember this, won't happened next time :)

> > Changes from v1 to v3:
> >   Declare the cpumask variable statically rather than dynamically.
> > 
> > Changes from v2 to v3:
> >   Considering holding cpu_hotplug_lock too long because of the
> >   synchronize_rcu(), maybe it's better to prevent the issue by
> > copying
> >   cpu_online_mask at the entry of the function as V1 does, instead
> > of
> >   using cpus_read_lock().
> > 
> > Link: 
> > https://lore.kernel.org/lkml/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
> > Link: 
> > https://lore.kernel.org/oe-kbuild-all/202304082051.Dp50upfS-lkp@intel.com/
> > Link: 
> > https://lore.kernel.org/oe-kbuild-all/202304081615.eiaqpbV8-lkp@intel.com/
> > 
> > Cc: stable@vger.kernel.org
> > Cc: npiggin@gmail.com
> > Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by
> > avoiding synchronize_rcu for each CPU")
> > Reported-by: kernel test robot <lkp@intel.com>
> > Reviewed-by: Cheng-Jui Wang <cheng-jui.wang@mediatek.com>
> > Signed-off-by: Tze-nan Wu <Tze-nan.Wu@mediatek.com>
> > ---
> 
> This is where the "Changes from" go. And since this patch is not
> suppose to
> be a Cc. But since it's still good to have a link to it. You could
> do:
> 
> Changes from v2 to v3: 
> https://lore.kernel.org/linux-trace-kernel/20230409024616.31099-1-Tze-nan.Wu@mediatek.com/
>   Considering holding cpu_hotplug_lock too long because of the
>   synchronize_rcu(), maybe it's better to prevent the issue by
> copying
>   cpu_online_mask at the entry of the function as V1 does, instead of
>   using cpus_read_lock().
> 
> 
> Where the previous version changes has the lore link to the previous
> patch,
> in case someone wants to look at it.
> 

Sure, a link here is really helpful.
Will follow this format in the future.

> 
> >  kernel/trace/ring_buffer.c | 14 +++++++++-----
> >  1 file changed, 9 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/trace/ring_buffer.c
> > b/kernel/trace/ring_buffer.c
> > index 76a2d91eecad..dc758930dacb 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -288,9 +288,6 @@ EXPORT_SYMBOL_GPL(ring_buffer_event_data);
> >  #define for_each_buffer_cpu(buffer, cpu)             \
> >       for_each_cpu(cpu, buffer->cpumask)
> > 
> > -#define for_each_online_buffer_cpu(buffer, cpu)              \
> > -     for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask)
> > -
> >  #define TS_SHIFT     27
> >  #define TS_MASK              ((1ULL << TS_SHIFT) - 1)
> >  #define TS_DELTA_TEST        (~TS_MASK)
> > @@ -5353,12 +5350,19 @@ EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
> >  void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
> >  {
> >       struct ring_buffer_per_cpu *cpu_buffer;
> > +     cpumask_t reset_online_cpumask;
> 
> It's usually considered bad form to put a cpumask on the stack. As it
> can
> be 128 bytes for a machine with 1024 CPUs (and yes they do exist).
> Also,
> the mask size is set to NR_CPUS not the actual size, so you do not
> even
> need to have it that big.
> 

Never thought about that until you told me,
I will keep it in mind before declare a cpumask next time.

> 
> >       int cpu;
> > 
> > +     /*
> > +      * Record cpu_online_mask here to make sure we iterate
> > through the same
> > +      * online CPUs in the following two loops.
> > +      */
> > +     cpumask_copy(&reset_online_cpumask, cpu_online_mask);
> > +
> >       /* prevent another thread from changing buffer sizes */
> >       mutex_lock(&buffer->mutex);
> > 
> > -     for_each_online_buffer_cpu(buffer, cpu) {
> > +     for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask)
> > {
> >               cpu_buffer = buffer->buffers[cpu];
> > 
> >               atomic_inc(&cpu_buffer->resize_disabled);
> 
> Anyway, we don't need to modify any of the above, and just do the
> following
> instead of atomic_inc():
> 
> #define RESET_BIT       (1 << 30)
> 
>                 atomic_add(&cpu_buffer->resize_disabled, RESET_BIT);
> 
> 
> > @@ -5368,7 +5372,7 @@ void ring_buffer_reset_online_cpus(struct
> > trace_buffer *buffer)
> >       /* Make sure all commits have finished */
> >       synchronize_rcu();
> > 
> > -     for_each_online_buffer_cpu(buffer, cpu) {
> > +     for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask)

Maybe we should use for_each_buffer_cpu(buffer, cpu) here?
since a CPU may also came offline during synchronize_rcu().

> > {
> >               cpu_buffer = buffer->buffers[cpu];
> 
> Then here we can do:
> 
>                 /*
>                  * If a CPU came online during the synchronize_rcu(),
> then
>                  * ignore it.
>                  */
>                 if (!atomic_read(&cpu_buffer->resize_disabled) &
> RESET_BIT))
>                         continue;
> 
>                 atomic_sub(&cpu_buffer->resize_disabled, RESET_BIT);
> 
> 
> As the resize_disabled only needs to be set to something to make it
> disabled.
> 
> -- Steve
> 

Thanks for all your suggestions, learn a lot from here, really
appriciate :).
I will upload a v4 patch in new thread as soon as the new patch pass my
test.

-- Tzenan


> > 
> >               reset_disabled_cpu_buffer(cpu_buffer);
> 
>
Steven Rostedt April 12, 2023, 2:32 a.m. UTC | #4
On Wed, 12 Apr 2023 02:27:53 +0000
Tze-nan Wu (吳澤南) <Tze-nan.Wu@mediatek.com> wrote:

> > > @@ -5368,7 +5372,7 @@ void ring_buffer_reset_online_cpus(struct
> > > trace_buffer *buffer)
> > >       /* Make sure all commits have finished */
> > >       synchronize_rcu();
> > > 
> > > -     for_each_online_buffer_cpu(buffer, cpu) {
> > > +     for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask)  
> 
> Maybe we should use for_each_buffer_cpu(buffer, cpu) here?
> since a CPU may also came offline during synchronize_rcu().

Yeah, I guess that works too (not looking at the code at the moment though).

-- Steve

> 
> > > {
> > >               cpu_buffer = buffer->buffers[cpu];  
> >
Tze-nan Wu April 14, 2023, 1:16 a.m. UTC | #5
In case those who are following this thread but are not included in the
cc list, the v4 patch cannot be easily found by them due to the tile of
the newer patch had been changed.

Let my put a link here:
  
https://lore.kernel.org/lkml/20230412112401.25081-1-Tze-nan.Wu@mediatek.com/
diff mbox series

Patch

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 76a2d91eecad..dc758930dacb 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -288,9 +288,6 @@  EXPORT_SYMBOL_GPL(ring_buffer_event_data);
 #define for_each_buffer_cpu(buffer, cpu)		\
 	for_each_cpu(cpu, buffer->cpumask)
 
-#define for_each_online_buffer_cpu(buffer, cpu)		\
-	for_each_cpu_and(cpu, buffer->cpumask, cpu_online_mask)
-
 #define TS_SHIFT	27
 #define TS_MASK		((1ULL << TS_SHIFT) - 1)
 #define TS_DELTA_TEST	(~TS_MASK)
@@ -5353,12 +5350,19 @@  EXPORT_SYMBOL_GPL(ring_buffer_reset_cpu);
 void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 {
 	struct ring_buffer_per_cpu *cpu_buffer;
+	cpumask_t reset_online_cpumask;
 	int cpu;
 
+	/*
+	 * Record cpu_online_mask here to make sure we iterate through the same
+	 * online CPUs in the following two loops.
+	 */
+	cpumask_copy(&reset_online_cpumask, cpu_online_mask);
+
 	/* prevent another thread from changing buffer sizes */
 	mutex_lock(&buffer->mutex);
 
-	for_each_online_buffer_cpu(buffer, cpu) {
+	for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask) {
 		cpu_buffer = buffer->buffers[cpu];
 
 		atomic_inc(&cpu_buffer->resize_disabled);
@@ -5368,7 +5372,7 @@  void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 	/* Make sure all commits have finished */
 	synchronize_rcu();
 
-	for_each_online_buffer_cpu(buffer, cpu) {
+	for_each_cpu_and(cpu, buffer->cpumask, &reset_online_cpumask) {
 		cpu_buffer = buffer->buffers[cpu];
 
 		reset_disabled_cpu_buffer(cpu_buffer);