diff mbox series

acpi/ghes: load balance timers

Message ID 20240320052724.41099-1-lirongqing@baidu.com (mailing list archive)
State Changes Requested, archived

Commit Message

lirongqing March 20, 2024, 5:27 a.m. UTC
The kernel needs to set up a timer for each poll-type notification
source. On some systems there are tens of thousands of these timers,
which expire periodically and preempt the one CPU that called
ghes_probe() at boot.

Spread the timers evenly across all online CPUs to reduce task jitter.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 drivers/acpi/apei/ghes.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
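
For readers following the commit message, the round-robin placement it describes can be modeled in user space roughly as below. This is only a sketch, not kernel code: `NR_CPUS_MODEL`, `mask_next()`, `mask_first()` and `pick_next_cpu()` are hypothetical stand-ins for `nr_cpu_ids`, `cpumask_next()`, `cpumask_first()` and the cursor logic in the patch, and the online mask is assumed to be a plain bool array.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 8	/* hypothetical CPU count for this model */

/* Model of cpumask_next(): first online CPU strictly after 'cpu',
 * or NR_CPUS_MODEL if there is none. */
static int mask_next(int cpu, const bool online[NR_CPUS_MODEL])
{
	for (int i = cpu + 1; i < NR_CPUS_MODEL; i++)
		if (online[i])
			return i;
	return NR_CPUS_MODEL;
}

/* Model of cpumask_first(): lowest-numbered online CPU. */
static int mask_first(const bool online[NR_CPUS_MODEL])
{
	return mask_next(-1, online);
}

/* Advance the cursor to the next online CPU, wrapping to the first
 * online CPU when the end of the mask is reached. */
static int pick_next_cpu(int *cursor, const bool online[NR_CPUS_MODEL])
{
	int cpu = mask_next(*cursor, online);

	if (cpu >= NR_CPUS_MODEL)
		cpu = mask_first(online);

	/* Assumption: at least one CPU is online. */
	assert(cpu < NR_CPUS_MODEL);
	*cursor = cpu;
	return cpu;
}
```

Because the cursor starts at 0 and the search is for the CPU strictly after it, the first timer in this model goes to the next online CPU after CPU 0, and CPU 0 is only reached after the first wrap.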

Comments

Luck, Tony March 20, 2024, 3:27 p.m. UTC | #1
> Kernel needs to set up a timer for each poll type notification, On
> some system, these are tens of thousands of timers, which expires
> periodically and preempt one CPU which calls ghes_probe at boot stage
>
> so load balance evenly timers to all online cpus, reduce task jitter

Tens of thousands of timers still sounds bad. Spreading the pain across
all CPUs just moves the pain around.

Question: do these all have the same, or similar poll interval?

Could these be both spread out and batched? E.g. have a kernel thread
on each CPU that runs periodically. Assign bunches of these ghes
poll items to each CPU to be handled by the kernel thread?

-Tony
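
The batching idea raised in the comment above could look roughly like the following user-space model. It is a sketch under stated assumptions, not a proposed implementation: it assumes all sources share a similar poll interval (the open question in the review), and `MODEL_CPUS`, `struct poll_source`, `struct cpu_batch`, `batch_assign()` and `batch_poll()` are hypothetical names that do not exist in ghes.c.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_CPUS 4	/* hypothetical number of online CPUs */

/* One polled error source; 'polled' counts how often it was handled. */
struct poll_source {
	unsigned int polled;
	struct poll_source *next;
};

/* The bunch of sources handled by one CPU's periodic worker. */
struct cpu_batch {
	struct poll_source *head;
	size_t count;
};

/* Distribute n sources across the per-CPU batches in round-robin order. */
static void batch_assign(struct cpu_batch batches[MODEL_CPUS],
			 struct poll_source *src, size_t n)
{
	assert(batches && src);

	for (size_t i = 0; i < n; i++) {
		struct cpu_batch *b = &batches[i % MODEL_CPUS];

		src[i].next = b->head;
		b->head = &src[i];
		b->count++;
	}
}

/* One periodic wakeup on one CPU: handle every source in its batch.
 * Returns the number of sources handled. */
static size_t batch_poll(struct cpu_batch *b)
{
	size_t handled = 0;

	for (struct poll_source *s = b->head; s; s = s->next) {
		s->polled++;	/* stand-in for the real per-source work */
		handled++;
	}
	return handled;
}
```

In this model the number of periodic wakeups scales with the number of CPUs rather than with the number of sources: ten sources on four CPUs need four wakeups per interval instead of ten timers.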
Patch

diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index ab2a82c..7bc7053 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -934,9 +934,10 @@  static int ghes_proc(struct ghes *ghes)
 	return rc;
 }
 
-static void ghes_add_timer(struct ghes *ghes)
+static void ghes_add_timer(struct ghes *ghes, bool probe)
 {
 	struct acpi_hest_generic *g = ghes->generic;
+	static int cpu_running_timer;
 	unsigned long expire;
 
 	if (!g->notify.poll_interval) {
@@ -946,7 +947,16 @@  static void ghes_add_timer(struct ghes *ghes)
 	}
 	expire = jiffies + msecs_to_jiffies(g->notify.poll_interval);
 	ghes->timer.expires = round_jiffies_relative(expire);
-	add_timer(&ghes->timer);
+
+	if (probe) {
+		cpu_running_timer = cpumask_next(cpu_running_timer, cpu_online_mask);
+		if (cpu_running_timer >= nr_cpu_ids)
+			cpu_running_timer = cpumask_first(cpu_online_mask);
+
+		add_timer_on(&ghes->timer, cpu_running_timer);
+	} else {
+		add_timer_on(&ghes->timer, raw_smp_processor_id());
+	}
 }
 
 static void ghes_poll_func(struct timer_list *t)
@@ -958,7 +968,7 @@  static void ghes_poll_func(struct timer_list *t)
 	ghes_proc(ghes);
 	spin_unlock_irqrestore(&ghes_notify_lock_irq, flags);
 	if (!(ghes->flags & GHES_EXITING))
-		ghes_add_timer(ghes);
+		ghes_add_timer(ghes, false);
 }
 
 static irqreturn_t ghes_irq_func(int irq, void *data)
@@ -1388,7 +1398,7 @@  static int ghes_probe(struct platform_device *ghes_dev)
 	switch (generic->notify.type) {
 	case ACPI_HEST_NOTIFY_POLLED:
 		timer_setup(&ghes->timer, ghes_poll_func, 0);
-		ghes_add_timer(ghes);
+		ghes_add_timer(ghes, true);
 		break;
 	case ACPI_HEST_NOTIFY_EXTERNAL:
 		/* External interrupt vector is GSI */