| Message ID | 20241220134234.3809621-1-koichiro.den@canonical.com (mailing list archive) |
|---|---|
| State | New |
| Series | vmstat: disable vmstat_work on vmstat_cpu_down_prep() |
On Fri, Dec 20, 2024 at 10:42:34PM +0900, Koichiro Den wrote:
> Even after mm/vmstat:online teardown, shepherd may still queue work for
> the dying cpu until the cpu is removed from online mask. While it's
> quite rare, this means that after unbind_workers() unbinds a per-cpu
> kworker, it potentially runs vmstat_update for the dying CPU on an
> irrelevant cpu before entering STARTING section.
> When CONFIG_DEBUG_PREEMPT=y, it results in the following error with the
> backtrace.
>
> BUG: using smp_processor_id() in preemptible [00000000] code: \
> kworker/7:3/1702
> caller is refresh_cpu_vm_stats+0x235/0x5f0
> CPU: 0 UID: 0 PID: 1702 Comm: kworker/7:3 Tainted: G
> Tainted: [N]=TEST
> Workqueue: mm_percpu_wq vmstat_update
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x8d/0xb0
>  check_preemption_disabled+0xce/0xe0
>  refresh_cpu_vm_stats+0x235/0x5f0
>  vmstat_update+0x17/0xa0
>  process_one_work+0x869/0x1aa0
>  worker_thread+0x5e5/0x1100
>  kthread+0x29e/0x380
>  ret_from_fork+0x2d/0x70
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
>
> So, disable vmstat_work reliably on vmstat_cpu_down_prep().
>
> Signed-off-by: Koichiro Den <koichiro.den@canonical.com>
> ---
>  mm/vmstat.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 4d016314a56c..44e1d87dcf01 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -2154,7 +2154,7 @@ static int vmstat_cpu_online(unsigned int cpu)
>
>  static int vmstat_cpu_down_prep(unsigned int cpu)
>  {
> -	cancel_delayed_work_sync(&per_cpu(vmstat_work, cpu));
> +	disable_delayed_work_sync(&per_cpu(vmstat_work, cpu));
>  	return 0;
>  }
>
> --

Andrew, I just noticed my silly mistake - I needed to enable the work in
the opposite direction. It looks like you've already queued this (v1) to
mm-hotfixes-unstable. Could you please replace it with v2?
(https://lore.kernel.org/all/20241221033321.4154409-1-koichiro.den@canonical.com/)
Sorry to bother you.

Let me know if submitting a separate follow-up patch would be more
appropriate. Thank you.

-Koichiro Den
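[Editor's note] The distinction driving both the fix and the v2 follow-up: cancel_delayed_work_sync() only removes the currently queued instance, so the shepherd can immediately requeue the work for the dying CPU, whereas disable_delayed_work_sync() additionally rejects future queue attempts until a matching enable call is made (which is why the online path must re-enable the work, per the author's note above). A toy userspace sketch of those semantics in Python, with made-up class and method names that only mimic the kernel API behavior (this is not the workqueue implementation):

```python
import threading

class DelayedWork:
    """Toy model of a kernel delayed work item (illustration only)."""
    def __init__(self, fn):
        self.fn = fn
        self.pending = False
        self.disabled = False
        self.lock = threading.Lock()

    def queue(self):
        # Mimics queue_delayed_work(): a disabled or already-pending
        # item is not queued again.
        with self.lock:
            if self.disabled or self.pending:
                return False
            self.pending = True
            return True

    def cancel_sync(self):
        # Mimics cancel_delayed_work_sync(): drops the pending instance,
        # but does nothing to stop a later queue() from succeeding.
        with self.lock:
            self.pending = False

    def disable_sync(self):
        # Mimics disable_delayed_work_sync(): also marks the item
        # disabled, so queueing becomes a no-op until enable().
        with self.lock:
            self.pending = False
            self.disabled = True

    def enable(self):
        # Mimics the enable that v2 adds on the online path.
        with self.lock:
            self.disabled = False

work = DelayedWork(lambda: None)
work.queue()
work.cancel_sync()
print(work.queue())   # True: the shepherd can still requeue after a cancel

work.disable_sync()
print(work.queue())   # False: requeue attempts are rejected while disabled

work.enable()
print(work.queue())   # True: works again once re-enabled on cpu-online
```

This is why the one-line change closes the race: after vmstat_cpu_down_prep() disables the item, a late shepherd tick can no longer arm vmstat_update for the dying CPU.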