[RFC] hung_task: add millisecond timeout for detecting tasks in D state

Message ID 1593698893-6371-1-git-send-email-chey84736@gmail.com
State New

Commit Message

yang che July 2, 2020, 2:08 p.m. UTC
Currently, hung_task_check_interval_secs and hung_task_timeout_secs
only support a granularity of seconds. In some cases a task stays in the
TASK_UNINTERRUPTIBLE state for less than 1 second. For a task that
drives the graphical interface, an uninterruptible state lasting
hundreds of milliseconds is enough to make the interface freeze.

After
  echo 1 > /proc/sys/kernel/hung_task_milliseconds
the values of hung_task_check_interval_secs and hung_task_timeout_secs
are interpreted as milliseconds.
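
As a rough illustration of the unit switch, the user-space sketch below mirrors the `(timeout * HZ) / 1000` conversion used in the patch. The helper name `ms_timeout_to_jiffies` and the `hz` parameter (standing in for the kernel's compile-time HZ) are assumptions made for this example and do not appear in the patch.

```c
#include <assert.h>

/*
 * Hypothetical user-space helper: converts a timeout given in
 * milliseconds to a tick count with the same integer arithmetic the
 * patch uses, (timeout * HZ) / 1000. `hz` stands in for the kernel HZ.
 */
static unsigned long ms_timeout_to_jiffies(unsigned long timeout_ms,
					   unsigned long hz)
{
	assert(hz > 0);
	/* Integer division truncates: anything below one tick becomes 0. */
	return (timeout_ms * hz) / 1000;
}
```

With `hz = 250`, a 500 ms timeout becomes 125 ticks and a 5 ms timeout becomes 1 tick. With `hz = 100`, a 5 ms timeout truncates to 0 ticks, so the usable resolution is bounded by the tick length rather than by the millisecond unit.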

Signed-off-by: yang che <chey84736@gmail.com>
---
 include/linux/sched/sysctl.h |  1 +
 kernel/hung_task.c           | 33 +++++++++++++++++++++++++++------
 kernel/sysctl.c              |  9 +++++++++
 3 files changed, 37 insertions(+), 6 deletions(-)

Comments

Matthew Wilcox July 2, 2020, 5:43 p.m. UTC | #1
On Thu, Jul 02, 2020 at 10:08:13PM +0800, yang che wrote:
> Currently, hung_task_check_interval_secs and hung_task_timeout_secs
> only support a granularity of seconds. In some cases a task stays in the
> TASK_UNINTERRUPTIBLE state for less than 1 second. For a task that
> drives the graphical interface, an uninterruptible state lasting
> hundreds of milliseconds is enough to make the interface freeze.

The primary problem I see with this patch is that writing to the
millisecond file silently overrides the setting in the seconds file.
If you end up redoing this patch, there needs to be one variable which
is scaled when reading/writing the seconds file.
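
One possible reading of the "one variable" suggestion, as a user-space sketch only: the timeout is stored once, in milliseconds, and the seconds interface scales on the way in and out. All names here are hypothetical, the 120-second initial value is merely illustrative, and a real implementation would live in sysctl handlers rather than plain functions.

```c
#include <limits.h>

/* Single source of truth, kept in milliseconds (hypothetical name). */
static unsigned long hung_task_timeout_ms = 120UL * 1000;

/* Seconds view: scale down when reading (truncates sub-second values). */
static unsigned long timeout_read_secs(void)
{
	return hung_task_timeout_ms / 1000;
}

/* Seconds view: scale up when writing; reject values that would overflow. */
static int timeout_write_secs(unsigned long secs)
{
	if (secs > ULONG_MAX / 1000)
		return -1;
	hung_task_timeout_ms = secs * 1000;
	return 0;
}

/* Milliseconds view: reads and writes the stored value directly. */
static unsigned long timeout_read_ms(void)
{
	return hung_task_timeout_ms;
}

static void timeout_write_ms(unsigned long ms)
{
	hung_task_timeout_ms = ms;
}
```

Because both views share one variable, a write through either interface is visible through the other and neither silently overrides a separate setting. Writing 300 through the milliseconds view makes the seconds view read 0, which is a truncation any real design would need to handle explicitly.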

Taking a step back though, I think this is the wrong tool for the job.
I'm pretty sure KernelShark will do what you want without any kernel
modifications.
yang che July 3, 2020, 3:18 a.m. UTC | #2
My understanding is that KernelShark cannot trigger a panic, while
hung_task can. In my use case, I sometimes need to trigger a panic to
capture a ramdump for analyzing lock and memory problems.
That is why I want to add this millisecond support.


Matthew Wilcox July 5, 2020, 5:18 p.m. UTC | #3
On Fri, Jul 03, 2020 at 11:18:28AM +0800, yang che wrote:
> My understanding is that KernelShark cannot trigger a panic, while
> hung_task can. In my use case, I sometimes need to trigger a panic to
> capture a ramdump for analyzing lock and memory problems.

You shouldn't need to trigger a panic to analyse locking or memory
problems.  KernelShark is supposed to be able to help you do that without
bringing down the system.  Give it a try, and if it doesn't work, Steven
Rostedt is very interested in making it work for your case.
yang che July 6, 2020, 9:43 a.m. UTC | #4
I will learn how to use KernelShark and try to solve my problem with it.
Thanks for your suggestion.
Here is a problem I solved with the millisecond hung task detection:
   A process tried to take the anon_vma read lock during direct memory
reclaim, but another process had taken (down) the anon_vma write lock,
so the reader waited a long time for the write lock to be released (up).
Since anonymous pages can be inherited from the parent process, I needed
to analyze whether the anonymous page was inherited from the parent.
Once I found that it was, I used the anon_vma red-black tree and the
anon_vma_chain to find all child processes that had inherited this
anonymous page from the parent, and analyzed the mapping file
corresponding to the current anonymous page in the vma to find which
file caused the problem.
  I used crash + ramdump to analyze this problem before; I will try to
analyze it with KernelShark.

I would like to ask whether hung task detection can be extended to
support millisecond settings. In theory there is no harm, and the
detection time can be more accurate.
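
On the accuracy point, one detail worth checking is how a tick delta is turned into milliseconds for reporting, because the order of the integer operations matters. The sketch below is a user-space illustration under stated assumptions: both helper names are hypothetical and `hz` stands in for the kernel's HZ.

```c
#include <assert.h>

/* Divide first: whole seconds are computed, then scaled to ms. */
static unsigned long elapsed_ms_divide_first(unsigned long delta_ticks,
					     unsigned long hz)
{
	assert(hz > 0);
	return delta_ticks / hz * 1000;
}

/* Multiply first: sub-second precision survives the division. */
static unsigned long elapsed_ms_multiply_first(unsigned long delta_ticks,
					       unsigned long hz)
{
	assert(hz > 0);
	return delta_ticks * 1000 / hz;
}
```

For a delta of 75 ticks at `hz = 250` (300 ms), dividing first yields 0 while multiplying first yields 300. Multiplying first can overflow for very large deltas on 32-bit systems, so this is a trade-off and not a universal rule.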


Patch

diff --git a/include/linux/sched/sysctl.h b/include/linux/sched/sysctl.h
index 660ac49..e5e5de2 100644
--- a/include/linux/sched/sysctl.h
+++ b/include/linux/sched/sysctl.h
@@ -16,6 +16,7 @@  extern unsigned int sysctl_hung_task_all_cpu_backtrace;
 
 extern int	     sysctl_hung_task_check_count;
 extern unsigned int  sysctl_hung_task_panic;
+extern unsigned int  sysctl_hung_task_millisecond;
 extern unsigned long sysctl_hung_task_timeout_secs;
 extern unsigned long sysctl_hung_task_check_interval_secs;
 extern int sysctl_hung_task_warnings;
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index ce76f49..7f34912 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -44,6 +44,14 @@  int __read_mostly sysctl_hung_task_check_count = PID_MAX_LIMIT;
 unsigned long __read_mostly sysctl_hung_task_timeout_secs = CONFIG_DEFAULT_HUNG_TASK_TIMEOUT;
 
 /*
+ * sysctl_hung_task_millisecond enables millisecond units:
+ *
+ * If set to 1, hung_task_timeout_secs and hung_task_check_interval_secs are
+ * interpreted as milliseconds, e.g. hung_task_timeout_secs = 5 means 5 ms.
+ */
+unsigned int __read_mostly sysctl_hung_task_millisecond;
+
+/*
  * Zero (default value) means use sysctl_hung_task_timeout_secs:
  */
 unsigned long __read_mostly sysctl_hung_task_check_interval_secs;
@@ -108,8 +116,13 @@  static void check_hung_task(struct task_struct *t, unsigned long timeout)
 		t->last_switch_time = jiffies;
 		return;
 	}
-	if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
-		return;
+	if (sysctl_hung_task_millisecond) {
+		if (time_is_after_jiffies(t->last_switch_time + (timeout * HZ) / 1000))
+			return;
+	} else {
+		if (time_is_after_jiffies(t->last_switch_time + timeout * HZ))
+			return;
+	}
 
 	trace_sched_process_hang(t);
 
@@ -126,8 +139,12 @@  static void check_hung_task(struct task_struct *t, unsigned long timeout)
 	if (sysctl_hung_task_warnings) {
 		if (sysctl_hung_task_warnings > 0)
 			sysctl_hung_task_warnings--;
-		pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
-		       t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
+		if (sysctl_hung_task_millisecond)
+			pr_err("INFO: task %s:%d blocked for more than %ld milliseconds.\n",
+				t->comm, t->pid, (jiffies - t->last_switch_time) * 1000 / HZ);
+		else
+			pr_err("INFO: task %s:%d blocked for more than %ld seconds.\n",
+				t->comm, t->pid, (jiffies - t->last_switch_time) / HZ);
 		pr_err("      %s %s %.*s\n",
 			print_tainted(), init_utsname()->release,
 			(int)strcspn(init_utsname()->version, " "),
@@ -217,8 +234,12 @@  static long hung_timeout_jiffies(unsigned long last_checked,
 				 unsigned long timeout)
 {
 	/* timeout of 0 will disable the watchdog */
-	return timeout ? last_checked - jiffies + timeout * HZ :
-		MAX_SCHEDULE_TIMEOUT;
+	if (sysctl_hung_task_millisecond)
+		return timeout ? last_checked - jiffies + (timeout * HZ) / 1000 :
+			MAX_SCHEDULE_TIMEOUT;
+	else
+		return timeout ? last_checked - jiffies + timeout * HZ :
+			MAX_SCHEDULE_TIMEOUT;
 }
 
 /*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index db1ce7a..0bdcd66 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -2476,6 +2476,15 @@  static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ZERO,
 	},
 	{
+		.procname       = "hung_task_milliseconds",
+		.data           = &sysctl_hung_task_millisecond,
+		.maxlen         = sizeof(int),
+		.mode           = 0644,
+		.proc_handler   = proc_dointvec_minmax,
+		.extra1         = SYSCTL_ZERO,
+		.extra2         = SYSCTL_ONE,
+	},
+	{
 		.procname	= "hung_task_timeout_secs",
 		.data		= &sysctl_hung_task_timeout_secs,
 		.maxlen		= sizeof(unsigned long),