softirq: let userspace tune the SOFTIRQ_NOW_MASK with sysctl

Message ID 20230410023041.49857-1-kerneljasonxing@gmail.com (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Series softirq: let userspace tune the SOFTIRQ_NOW_MASK with sysctl

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Jason Xing April 10, 2023, 2:30 a.m. UTC
From: Jason Xing <kernelxing@tencent.com>

Currently there are two exceptions that bypass ksoftirqd when
softirqs are invoked: HI_SOFTIRQ and TASKLET_SOFTIRQ. They were
introduced in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
synchronous"), which notes that deferring them to ksoftirqd causes
excessive latencies in some cases.

It also mentioned that the timer softirqs may deserve the same treatment:
"We should probably also consider the timer softirqs to be synchronous
and not be delayed to ksoftirqd."

The same reasoning applies here. In production workloads, we found that
some latency-sensitive applications complain about high latency on the
networking tx/rx path, because some packets have to wait in the
ksoftirqd kthread, which can be blocked in the runqueue for quite a
while (say, 10-70 ms), especially in a guest OS. Marking the tx/rx
softirqs synchronous, for instance NET_RX_SOFTIRQ, solves this issue.

We tested this in a real workload and counted how often the rx-path
latency exceeded 50 ms:
without masking: the threshold was hit over 100 times per hour
with masking: fewer than 10 times over a whole day

As we all know, no default config can satisfy everyone's requirements.
With this patch exporting the softirq mask to userspace, different use
cases can be served by tuning it with sysctl.
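
For illustration (not part of the patch), a minimal userspace sketch of
how a mask value could be computed before writing it to the new knob,
assuming the softirq bit positions from include/linux/interrupt.h
(HI_SOFTIRQ=0, NET_TX_SOFTIRQ=2, NET_RX_SOFTIRQ=3, TASKLET_SOFTIRQ=6):

	/* Compute a kernel.softirq_mask value that keeps the default
	 * synchronous softirqs (HI, TASKLET) and adds the networking
	 * softirqs (NET_TX, NET_RX).
	 */
	#include <stdio.h>

	int main(void)
	{
		unsigned int mask = (1u << 0) | (1u << 6); /* HI, TASKLET */

		mask |= (1u << 2) | (1u << 3); /* NET_TX, NET_RX */

		printf("%u\n", mask); /* prints 77 (0x4d) */
		return 0;
	}

The printed value could then be applied with, e.g.,
"sysctl -w kernel.softirq_mask=77".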

Signed-off-by: Jason Xing <kernelxing@tencent.com>
---
 kernel/softirq.c | 29 +++++++++++++++++++++++++++--
 1 file changed, 27 insertions(+), 2 deletions(-)

Comments

Thomas Gleixner May 9, 2023, 1:05 p.m. UTC | #1
On Mon, Apr 10 2023 at 10:30, Jason Xing wrote:
> From: Jason Xing <kernelxing@tencent.com>
>
> Currently there are two exceptions that bypass ksoftirqd when
> softirqs are invoked: HI_SOFTIRQ and TASKLET_SOFTIRQ. They were
> introduced in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"), which notes that deferring them to ksoftirqd causes
> excessive latencies in some cases.

As we are ripping this out, I'll ignore this patch.
Jason Xing May 9, 2023, 1:25 p.m. UTC | #2
On Tue, May 9, 2023 at 9:05 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Apr 10 2023 at 10:30, Jason Xing wrote:
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > Currently there are two exceptions that bypass ksoftirqd when
> > softirqs are invoked: HI_SOFTIRQ and TASKLET_SOFTIRQ. They were
> > introduced in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> > synchronous"), which notes that deferring them to ksoftirqd causes
> > excessive latencies in some cases.
>
> As we are ripping this out, I'll ignore this patch.

Sure, please ignore this heuristic patch. Paolo and I have already
tested that revert patch in the production environment and verified
its usefulness; please take that one if you can.

Thanks,
Jason

Patch

diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..aa6e52ca2c55 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -65,6 +65,8 @@  const char * const softirq_to_name[NR_SOFTIRQS] = {
 	"TASKLET", "SCHED", "HRTIMER", "RCU"
 };
 
+unsigned int sysctl_softirq_mask = (1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ);
+
 /*
  * we cannot loop indefinitely here to avoid userspace starvation,
  * but we also don't want to introduce a worst case 1/HZ latency
@@ -80,17 +82,23 @@  static void wakeup_softirqd(void)
 		wake_up_process(tsk);
 }
 
+static bool softirq_now_mask(unsigned long pending)
+{
+	if (pending & sysctl_softirq_mask)
+		return true;
+	return false;
+}
+
 /*
  * If ksoftirqd is scheduled, we do not want to process pending softirqs
  * right now. Let ksoftirqd handle this at its own rate, to get fairness,
  * unless we're doing some of the synchronous softirqs.
  */
-#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
 static bool ksoftirqd_running(unsigned long pending)
 {
 	struct task_struct *tsk = __this_cpu_read(ksoftirqd);
 
-	if (pending & SOFTIRQ_NOW_MASK)
+	if (softirq_now_mask(pending))
 		return false;
 	return tsk && task_is_running(tsk) && !__kthread_should_park(tsk);
 }
@@ -903,6 +911,22 @@  void tasklet_unlock_wait(struct tasklet_struct *t)
 EXPORT_SYMBOL_GPL(tasklet_unlock_wait);
 #endif
 
+static struct ctl_table softirq_sysctls[] = {
+	{
+		.procname	= "softirq_mask",
+		.data		= &sysctl_softirq_mask,
+		.maxlen		= sizeof(unsigned int),
+		.mode		= 0644,
+		.proc_handler	= proc_douintvec,
+	},
+	{}
+};
+
+static void __init softirq_mask_sysctl_init(void)
+{
+	register_sysctl_init("kernel", softirq_sysctls);
+}
+
 void __init softirq_init(void)
 {
 	int cpu;
@@ -916,6 +940,7 @@  void __init softirq_init(void)
 
 	open_softirq(TASKLET_SOFTIRQ, tasklet_action);
 	open_softirq(HI_SOFTIRQ, tasklet_hi_action);
+	softirq_mask_sysctl_init();
 }
 
 static int ksoftirqd_should_run(unsigned int cpu)
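
A usage sketch (hypothetical, not part of the patch): a userspace
read-modify-write of the new knob at runtime. It assumes the file
/proc/sys/kernel/softirq_mask created by this patch and that
NET_RX_SOFTIRQ is bit 3, as in include/linux/interrupt.h:

	#include <stdio.h>

	int main(void)
	{
		const char *path = "/proc/sys/kernel/softirq_mask";
		unsigned int mask;
		FILE *f;

		/* Read the current mask. */
		f = fopen(path, "r");
		if (!f || fscanf(f, "%u", &mask) != 1) {
			perror(path);
			return 1;
		}
		fclose(f);

		mask |= 1u << 3; /* add NET_RX_SOFTIRQ */

		/* Write the updated mask back. */
		f = fopen(path, "w");
		if (!f || fprintf(f, "%u\n", mask) < 0) {
			perror(path);
			return 1;
		}
		fclose(f);
		return 0;
	}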