[v1] Ftrace: make sched_wakeup can focus on the target process

Message ID 20231009153714.10743-1-tangjinyu@tinylab.org (mailing list archive)
State Rejected

Commit Message

Jinyu Tang Oct. 9, 2023, 3:37 p.m. UTC
When our app shows more latency than expected, we want to know what
happened in the kernel. However, the largest latency of our app may
still be lower than that of another process in the system, so after
waiting a long time we only get the sched_wakeup trace of that other
process.

This patch lets us trace the sched_wakeup latency of the target process
only. Wakeups of other processes are dropped and do not change
tracing_max_latency.

The patch is tested by the following commands:

$ mount -t tracefs none /sys/kernel/tracing
$ echo wakeup_rt > /sys/kernel/tracing/current_tracer
# some other stress-ng options are also tested 
$ stress-ng --cpu 4 &
$ cyclictest --mlockall --smp --priority=99 &
$ cyclictest_pid=$!
# child thread of cyclictest main process
$ thread_pid=$((cyclictest_pid + 1))

$ echo ${thread_pid} > /sys/kernel/tracing/set_wakeup_pid

$ echo 1 > /sys/kernel/tracing/tracing_on
$ echo 0 > /sys/kernel/tracing/tracing_max_latency
$ wait ${cyclictest_pid}
$ echo 0 > /sys/kernel/tracing/tracing_on
$ cat /sys/kernel/tracing/trace

The maximum latency and backtrace recorded in the trace file are then
generated only by the target process. This eliminates interference from
other programs and makes it easier to identify the cause of the latency.
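
One caveat about the test commands above: thread_pid=$((cyclictest_pid + 1))
assumes that consecutive PIDs are handed out, which does not hold if any
other task forks in between. A minimal sketch of a more robust lookup,
assuming a Linux /proc filesystem; the helper name list_threads is
hypothetical and not part of the patch:

```sh
# List the thread IDs (TIDs) of a process, one per line, by reading
# /proc/<pid>/task. Returns 1 if the process does not exist.
# NOTE: list_threads is an illustrative helper, not an existing tool.
list_threads() {
    lt_pid=$1
    if [ ! -d "/proc/$lt_pid/task" ]; then
        return 1
    fi
    for lt_dir in "/proc/$lt_pid/task"/*; do
        if [ -d "$lt_dir" ]; then
            # Strip the directory part, leaving only the numeric TID.
            echo "${lt_dir##*/}"
        fi
    done
    return 0
}

# Example: print every thread of cyclictest, including the main thread.
#   list_threads "$cyclictest_pid"
```

The output includes the main thread itself, so the measurement thread
still has to be picked from the list before it is written to the filter
file.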

Tested-by: Jiexun Wang <wangjiexun@tinylab.org>
Signed-off-by: Jinyu Tang <tangjinyu@tinylab.org>
---
 kernel/trace/trace.h              |   3 +
 kernel/trace/trace_sched_wakeup.c | 179 ++++++++++++++++++++++++++++++
 2 files changed, 182 insertions(+)

Comments

Steven Rostedt Oct. 9, 2023, 4:25 p.m. UTC | #1
On Mon,  9 Oct 2023 23:37:14 +0800
Jinyu Tang <tangjinyu@tinylab.org> wrote:

> When our app shows more latency than expected, we want to know what
> happened in the kernel. However, the largest latency of our app may
> still be lower than that of another process in the system, so after
> waiting a long time we only get the sched_wakeup trace of that other
> process.
> 
> This patch lets us trace the sched_wakeup latency of the target process
> only. Wakeups of other processes are dropped and do not change
> tracing_max_latency.
> 
> The patch is tested by the following commands:
> 
> $ mount -t tracefs none /sys/kernel/tracing
> $ echo wakeup_rt > /sys/kernel/tracing/current_tracer
> # some other stress-ng options are also tested 
> $ stress-ng --cpu 4 &
> $ cyclictest --mlockall --smp --priority=99 &
> $ cyclictest_pid=$!
> # child thread of cyclictest main process
> $ thread_pid=$((cyclictest_pid + 1))
> 
> $ echo ${thread_pid} > /sys/kernel/tracing/set_wakeup_pid
> 
> $ echo 1 > /sys/kernel/tracing/tracing_on
> $ echo 0 > /sys/kernel/tracing/tracing_max_latency
> $ wait ${cyclictest_pid}
> $ echo 0 > /sys/kernel/tracing/tracing_on
> $ cat /sys/kernel/tracing/trace
> 
> The maximum latency and backtrace recorded in the trace file are then
> generated only by the target process. This eliminates interference from
> other programs and makes it easier to identify the cause of the latency.
> 
> Tested-by: Jiexun Wang <wangjiexun@tinylab.org>
> Signed-off-by: Jinyu Tang <tangjinyu@tinylab.org>
> ---


Honestly, the wakeup tracers are obsolete. I haven't used them in years. I
use synthetic events instead:

 # cd /sys/kernel/tracing
 # echo 'wakeup_lat pid_t pid; u64 delay;' > synthetic_events
 # echo 'hist:keys=pid:ts=common_timestamp.usecs' if pid==$thread_pid > events/sched/sched_waking/trigger
 # echo 'hist:keys=next_pid:delta=common_timestamp.usecs-$ts:onmax($delta).trace(wakeup_lat,next_pid,$delta)' > events/sched/sched_switch/trigger
 # echo 1 > events/synthetic/wakeup_lat/enable
 # cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 3/3   #P:8
#
#                                _-----=> irqs-off/BH-disabled
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| / _-=> migrate-disable
#                              |||| /     delay
#           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
#              | |         |   |||||     |         |
          <idle>-0       [000] d..4. 350799.423428: wakeup_lat: pid=59921 delay=1281
          <idle>-0       [000] d..4. 350800.423441: wakeup_lat: pid=59921 delay=1317
          <idle>-0       [000] d..4. 350801.423445: wakeup_lat: pid=59921 delay=1331

I could also make it record stack traces, disable tracing, and all sorts of
other nifty things.

-- Steve
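
The wakeup_lat events above can also be reduced to a single number. As a
minimal sketch (not part of the original reply), the hypothetical helper
max_wakeup_delay below scans lines of the form "wakeup_lat: pid=N delay=D",
as printed by the synthetic event defined above, and reports the largest
delay. It assumes a POSIX awk and the field names used in that example:

```sh
# Read ftrace output on stdin and print the largest "delay=" value
# found on wakeup_lat event lines (0 if there are none).
# NOTE: max_wakeup_delay is an illustrative helper, not an ftrace tool.
max_wakeup_delay() {
    awk '
        /wakeup_lat:/ {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^delay=/) {
                    # "delay=" is 6 characters, the number starts at 7.
                    d = substr($i, 7) + 0
                    if (d > max)
                        max = d
                }
            }
        }
        END { print max + 0 }
    '
}

# Example (as root):
#   max_wakeup_delay < /sys/kernel/tracing/trace
```

For the three events shown above this prints 1331.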
Daniel Bristot de Oliveira Oct. 10, 2023, 6:53 a.m. UTC | #2
On 10/9/23 17:37, Jinyu Tang wrote:
> $ cyclictest --mlockall --smp --priority=99 &

rtla timerlat -a <the amount of latency you tolerate>

will give you a structured analysis of your latency...

https://bristot.me/linux-scheduling-latency-debug-and-analysis/

-- Daniel
kernel test robot Oct. 11, 2023, 1:13 a.m. UTC | #3
Hi Jinyu,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on rostedt-trace/for-next v6.6-rc5 next-20231010]
[cannot apply to rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Jinyu-Tang/Ftrace-make-sched_wakeup-can-focus-on-the-target-process/20231009-234127
base:   linus/master
patch link:    https://lore.kernel.org/r/20231009153714.10743-1-tangjinyu%40tinylab.org
patch subject: [PATCH v1] Ftrace: make sched_wakeup can focus on the target process
config: i386-randconfig-062-20231010 (https://download.01.org/0day-ci/archive/20231011/202310110813.FxuaTrH0-lkp@intel.com/config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231011/202310110813.FxuaTrH0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310110813.FxuaTrH0-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> kernel/trace/trace_sched_wakeup.c:368:1: sparse: sparse: symbol 'sched_wakeup_mutex' was not declared. Should it be static?
kernel test robot Oct. 20, 2023, 7:59 a.m. UTC | #4
Hello,

kernel test robot noticed "BUG:kernel_NULL_pointer_dereference,address" on:

commit: e70b12f847f6d1b5db838c0eefa9d1d00c1591bd ("[PATCH v1] Ftrace: make sched_wakeup can focus on the target process")
url: https://github.com/intel-lab-lkp/linux/commits/Jinyu-Tang/Ftrace-make-sched_wakeup-can-focus-on-the-target-process/20231009-234127
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 94f6f0550c625fab1f373bb86a6669b45e9748b3
patch link: https://lore.kernel.org/all/20231009153714.10743-1-tangjinyu@tinylab.org/
patch subject: [PATCH v1] Ftrace: make sched_wakeup can focus on the target process

in testcase: boot

compiler: gcc-7
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+---------------------------------------------+----------+------------+
|                                             | v6.6-rc5 | e70b12f847 |
+---------------------------------------------+----------+------------+
| boot_successes                              | 33       | 0          |
| boot_failures                               | 0        | 18         |
| BUG:kernel_NULL_pointer_dereference,address | 0        | 18         |
| Oops:#[##]                                  | 0        | 18         |
| EIP:__kmem_cache_alloc_lru                  | 0        | 18         |
| Kernel_panic-not_syncing:Fatal_exception    | 0        | 18         |
+---------------------------------------------+----------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202310201530.46065346-oliver.sang@intel.com


[    6.082261][    T1] BUG: kernel NULL pointer dereference, address: 0000000c
[    6.083178][    T1] #PF: supervisor read access in kernel mode
[    6.083178][    T1] #PF: error_code(0x0000) - not-present page
[    6.083178][    T1] *pde = 00000000
[    6.083178][    T1] Oops: 0000 [#1]
[    6.083178][    T1] CPU: 0 PID: 1 Comm: swapper Not tainted 6.6.0-rc5-00001-ge70b12f847f6 #1
[ 6.083178][ T1] EIP: __kmem_cache_alloc_lru+0x1d/0x88 
[ 6.083178][ T1] Code: 04 eb a7 31 db e9 a5 fe ff ff 8d 76 00 3e 8d 74 26 00 55 b9 ff ff ff ff 89 e5 83 ec 18 89 5d f4 89 c3 89 75 f8 89 7d fc 89 d7 <8b> 40 0c 89 04 24 89 d8 e8 ca fd ff ff 8b 55 04 89 c6 a1 e4 38 25
All code
========
   0:	04 eb                	add    $0xeb,%al
   2:	a7                   	cmpsl  %es:(%rdi),%ds:(%rsi)
   3:	31 db                	xor    %ebx,%ebx
   5:	e9 a5 fe ff ff       	jmpq   0xfffffffffffffeaf
   a:	8d 76 00             	lea    0x0(%rsi),%esi
   d:	3e 8d 74 26 00       	lea    %ds:0x0(%rsi,%riz,1),%esi
  12:	55                   	push   %rbp
  13:	b9 ff ff ff ff       	mov    $0xffffffff,%ecx
  18:	89 e5                	mov    %esp,%ebp
  1a:	83 ec 18             	sub    $0x18,%esp
  1d:	89 5d f4             	mov    %ebx,-0xc(%rbp)
  20:	89 c3                	mov    %eax,%ebx
  22:	89 75 f8             	mov    %esi,-0x8(%rbp)
  25:	89 7d fc             	mov    %edi,-0x4(%rbp)
  28:	89 d7                	mov    %edx,%edi
  2a:*	8b 40 0c             	mov    0xc(%rax),%eax		<-- trapping instruction
  2d:	89 04 24             	mov    %eax,(%rsp)
  30:	89 d8                	mov    %ebx,%eax
  32:	e8 ca fd ff ff       	callq  0xfffffffffffffe01
  37:	8b 55 04             	mov    0x4(%rbp),%edx
  3a:	89 c6                	mov    %eax,%esi
  3c:	a1                   	.byte 0xa1
  3d:	e4 38                	in     $0x38,%al
  3f:	25                   	.byte 0x25

Code starting with the faulting instruction
===========================================
   0:	8b 40 0c             	mov    0xc(%rax),%eax
   3:	89 04 24             	mov    %eax,(%rsp)
   6:	89 d8                	mov    %ebx,%eax
   8:	e8 ca fd ff ff       	callq  0xfffffffffffffdd7
   d:	8b 55 04             	mov    0x4(%rbp),%edx
  10:	89 c6                	mov    %eax,%esi
  12:	a1                   	.byte 0xa1
  13:	e4 38                	in     $0x38,%al
  15:	25                   	.byte 0x25
[    6.083178][    T1] EAX: 00000000 EBX: 00000000 ECX: ffffffff EDX: 00000cc0
[    6.083178][    T1] ESI: c3213800 EDI: 00000cc0 EBP: c3217d78 ESP: c3217d60
[    6.083178][    T1] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010286
[    6.083178][    T1] CR0: 80050033 CR2: 0000000c CR3: 02364000 CR4: 000406d0
[    6.083178][    T1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    6.083178][    T1] DR6: fffe0ff0 DR7: 00000400
[    6.083178][    T1] Call Trace:
[ 6.083178][ T1] ? show_regs (arch/x86/kernel/dumpstack.c:479) 
[ 6.083178][ T1] ? __die_body (arch/x86/kernel/dumpstack.c:421) 
[ 6.083178][ T1] ? __die (arch/x86/kernel/dumpstack.c:435) 
[ 6.083178][ T1] ? page_fault_oops (arch/x86/mm/fault.c:702) 
[ 6.083178][ T1] ? kernelmode_fixup_or_oops+0x94/0xf4 
[ 6.083178][ T1] ? __bad_area_nosemaphore+0x12f/0x1e4 
[ 6.083178][ T1] ? bad_area_nosemaphore (arch/x86/mm/fault.c:867) 
[ 6.083178][ T1] ? exc_page_fault (arch/x86/mm/fault.c:1472 arch/x86/mm/fault.c:1505 arch/x86/mm/fault.c:1561) 
[ 6.083178][ T1] ? _raw_spin_unlock (kernel/locking/spinlock.c:187) 
[ 6.083178][ T1] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1518) 
[ 6.083178][ T1] ? handle_exception (arch/x86/entry/entry_32.S:1049) 
[ 6.083178][ T1] ? lookup_open (fs/namei.c:3121 fs/namei.c:3153 fs/namei.c:3456) 
[ 6.083178][ T1] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1518) 
[ 6.083178][ T1] ? __kmem_cache_alloc_lru+0x1d/0x88 
[ 6.083178][ T1] ? pvclock_clocksource_read_nowd (arch/x86/mm/fault.c:1518) 
[ 6.083178][ T1] ? __kmem_cache_alloc_lru+0x1d/0x88 
[ 6.083178][ T1] kmem_cache_alloc (mm/slub.c:3503) 
[ 6.083178][ T1] tracefs_alloc_inode (fs/tracefs/inode.c:38) 
[ 6.083178][ T1] alloc_inode (fs/inode.c:259) 
[ 6.083178][ T1] new_inode_pseudo (fs/inode.c:1006) 
[ 6.083178][ T1] new_inode (fs/inode.c:1031) 
[ 6.083178][ T1] tracefs_get_inode (fs/tracefs/inode.c:153) 
[ 6.083178][ T1] ? tracefs_start_creating (fs/tracefs/inode.c:470) 
[ 6.083178][ T1] tracefs_create_file (fs/tracefs/inode.c:616) 
[ 6.083178][ T1] ? set_tracer_flag (kernel/trace/trace.c:5441) 
[ 6.083178][ T1] __wakeup_tracer_init (kernel/trace/trace_sched_wakeup.c:858) 
[ 6.083178][ T1] wakeup_tracer_init (kernel/trace/trace_sched_wakeup.c:873) 
[ 6.083178][ T1] trace_selftest_startup_wakeup (kernel/trace/trace_selftest.c:1216 (discriminator 1)) 
[ 6.083178][ T1] ? trace_selftest_ops (kernel/trace/trace_selftest.c:1155) 
[ 6.083178][ T1] ? wait_for_completion (kernel/sched/completion.c:97 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:148) 
[ 6.083178][ T1] run_tracer_selftest (kernel/trace/trace.c:2026) 
[ 6.083178][ T1] register_tracer (kernel/trace/trace.c:2063 kernel/trace/trace.c:2187) 
[ 6.083178][ T1] init_wakeup_tracer (kernel/trace/trace_sched_wakeup.c:985) 
[ 6.083178][ T1] ? init_function_trace (kernel/trace/trace_sched_wakeup.c:982) 
[ 6.083178][ T1] do_one_initcall (init/main.c:1232) 
[ 6.083178][ T1] ? parameq (kernel/params.c:89 kernel/params.c:98) 
[ 6.083178][ T1] ? parse_args (kernel/params.c:184) 
[ 6.083178][ T1] ? kernel_init_freeable (init/main.c:1304 init/main.c:1329 init/main.c:1547) 
[ 6.083178][ T1] kernel_init_freeable (init/main.c:1293 init/main.c:1310 init/main.c:1329 init/main.c:1547) 
[ 6.083178][ T1] ? rdinit_setup (init/main.c:1278) 
[ 6.083178][ T1] ? rest_init (init/main.c:1429) 
[ 6.083178][ T1] kernel_init (init/main.c:1439) 
[ 6.083178][ T1] ? schedule_tail (kernel/sched/core.c:5318) 
[ 6.083178][ T1] ret_from_fork (arch/x86/kernel/process.c:153) 
[ 6.083178][ T1] ? rest_init (init/main.c:1429) 
[ 6.083178][ T1] ret_from_fork_asm (arch/x86/entry/entry_32.S:741) 
[ 6.083178][ T1] entry_INT80_32 (arch/x86/entry/entry_32.S:947) 
[    6.083178][    T1] Modules linked in:
[    6.083178][    T1] CR2: 000000000000000c
[    6.083178][    T1] ---[ end trace 0000000000000000 ]---
[ 6.083178][ T1] EIP: __kmem_cache_alloc_lru+0x1d/0x88 
[ 6.083178][ T1] Code: 04 eb a7 31 db e9 a5 fe ff ff 8d 76 00 3e 8d 74 26 00 55 b9 ff ff ff ff 89 e5 83 ec 18 89 5d f4 89 c3 89 75 f8 89 7d fc 89 d7 <8b> 40 0c 89 04 24 89 d8 e8 ca fd ff ff 8b 55 04 89 c6 a1 e4 38 25
All code
========
   0:	04 eb                	add    $0xeb,%al
   2:	a7                   	cmpsl  %es:(%rdi),%ds:(%rsi)
   3:	31 db                	xor    %ebx,%ebx
   5:	e9 a5 fe ff ff       	jmpq   0xfffffffffffffeaf
   a:	8d 76 00             	lea    0x0(%rsi),%esi
   d:	3e 8d 74 26 00       	lea    %ds:0x0(%rsi,%riz,1),%esi
  12:	55                   	push   %rbp
  13:	b9 ff ff ff ff       	mov    $0xffffffff,%ecx
  18:	89 e5                	mov    %esp,%ebp
  1a:	83 ec 18             	sub    $0x18,%esp
  1d:	89 5d f4             	mov    %ebx,-0xc(%rbp)
  20:	89 c3                	mov    %eax,%ebx
  22:	89 75 f8             	mov    %esi,-0x8(%rbp)
  25:	89 7d fc             	mov    %edi,-0x4(%rbp)
  28:	89 d7                	mov    %edx,%edi
  2a:*	8b 40 0c             	mov    0xc(%rax),%eax		<-- trapping instruction
  2d:	89 04 24             	mov    %eax,(%rsp)
  30:	89 d8                	mov    %ebx,%eax
  32:	e8 ca fd ff ff       	callq  0xfffffffffffffe01
  37:	8b 55 04             	mov    0x4(%rbp),%edx
  3a:	89 c6                	mov    %eax,%esi
  3c:	a1                   	.byte 0xa1
  3d:	e4 38                	in     $0x38,%al
  3f:	25                   	.byte 0x25

Code starting with the faulting instruction
===========================================
   0:	8b 40 0c             	mov    0xc(%rax),%eax
   3:	89 04 24             	mov    %eax,(%rsp)
   6:	89 d8                	mov    %ebx,%eax
   8:	e8 ca fd ff ff       	callq  0xfffffffffffffdd7
   d:	8b 55 04             	mov    0x4(%rbp),%edx
  10:	89 c6                	mov    %eax,%esi
  12:	a1                   	.byte 0xa1
  13:	e4 38                	in     $0x38,%al
  15:	25                   	.byte 0x25


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310201530.46065346-oliver.sang@intel.com

Patch

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 77debe53f07c..c6f212e8bfd2 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -403,6 +403,9 @@  struct trace_array {
 #endif
 	/* function tracing enabled */
 	int			function_enabled;
+#endif
+#ifdef CONFIG_SCHED_TRACER
+	struct trace_pid_list	__rcu *wakeup_pids;
 #endif
 	int			no_filter_buffering_ref;
 	struct list_head	hist_vars;
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index 0469a04a355f..b6cb9391e120 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -10,6 +10,9 @@ 
  *  Copyright (C) 2004-2006 Ingo Molnar
  *  Copyright (C) 2004 Nadia Yvette Chambers
  */
+#include <linux/fs.h>
+#include <linux/workqueue.h>
+#include <linux/spinlock.h>
 #include <linux/module.h>
 #include <linux/kallsyms.h>
 #include <linux/uaccess.h>
@@ -17,6 +20,7 @@ 
 #include <linux/sched/rt.h>
 #include <linux/sched/deadline.h>
 #include <trace/events/sched.h>
+#include <linux/tracefs.h>
 #include "trace.h"
 
 static struct trace_array	*wakeup_trace;
@@ -361,6 +365,169 @@  static bool report_latency(struct trace_array *tr, u64 delta)
 	return true;
 }
 
+static DEFINE_MUTEX(sched_wakeup_mutex);
+static void *
+p_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct trace_array *tr = m->private;
+	struct trace_pid_list *pid_list = rcu_dereference_sched(tr->wakeup_pids);
+
+	return trace_pid_next(pid_list, v, pos);
+}
+
+static void *p_start(struct seq_file *m, loff_t *pos)
+	__acquires(RCU)
+{
+	struct trace_pid_list *pid_list;
+	struct trace_array *tr = m->private;
+
+	/*
+	 * Grab the mutex, to keep calls to p_next() having the same
+	 * tr->wakeup_pids as p_start() has.
+	 * If we just passed the tr->wakeup_pids around, then RCU would
+	 * have been enough, but doing that makes things more complex.
+	 */
+	mutex_lock(&sched_wakeup_mutex);
+	rcu_read_lock_sched();
+
+	pid_list = rcu_dereference_sched(tr->wakeup_pids);
+
+	if (!pid_list)
+		return NULL;
+
+	return trace_pid_start(pid_list, pos);
+}
+
+static void p_stop(struct seq_file *m, void *p)
+	__releases(RCU)
+{
+	rcu_read_unlock_sched();
+	mutex_unlock(&sched_wakeup_mutex);
+}
+
+static const struct seq_operations show_set_pid_seq_ops = {
+	.start = p_start,
+	.next = p_next,
+	.show = trace_pid_show,
+	.stop = p_stop,
+};
+
+static int
+ftrace_wakeup_open(struct inode *inode, struct file *file,
+		  const struct seq_operations *seq_ops)
+{
+	struct seq_file *m;
+	int ret;
+
+	ret = seq_open(file, seq_ops);
+	if (ret < 0)
+		return ret;
+	m = file->private_data;
+	/* copy tr over to seq ops */
+	m->private = inode->i_private;
+
+	return ret;
+}
+
+static void __clear_wakeup_pids(struct trace_array *tr)
+{
+	struct trace_pid_list *pid_list;
+
+	pid_list = rcu_dereference_protected(tr->wakeup_pids,
+					     lockdep_is_held(&sched_wakeup_mutex));
+	if (!pid_list)
+		return;
+
+	rcu_assign_pointer(tr->wakeup_pids, NULL);
+
+	synchronize_rcu();
+	trace_pid_list_free(pid_list);
+}
+
+static void clear_wakeup_pids(struct trace_array *tr)
+{
+	mutex_lock(&sched_wakeup_mutex);
+	__clear_wakeup_pids(tr);
+	mutex_unlock(&sched_wakeup_mutex);
+
+}
+
+static int
+ftrace_set_wakeup_pid_open(struct inode *inode, struct file *file)
+{
+	const struct seq_operations *seq_ops = &show_set_pid_seq_ops;
+	struct trace_array *tr = inode->i_private;
+	int ret;
+
+	if (trace_array_get(tr) < 0)
+		return -ENODEV;
+
+	if ((file->f_mode & FMODE_WRITE) &&
+	    (file->f_flags & O_TRUNC))
+		clear_wakeup_pids(tr);
+
+	ret = ftrace_wakeup_open(inode, file, seq_ops);
+
+	if (ret < 0)
+		trace_array_put(tr);
+	return ret;
+}
+
+static ssize_t
+ftrace_set_wakeup_pid_write(struct file *filp, const char __user *ubuf,
+		       size_t cnt, loff_t *ppos)
+{
+	struct seq_file *m = filp->private_data;
+	struct trace_array *tr = m->private;
+	struct trace_pid_list *filtered_pids = NULL;
+	struct trace_pid_list *pid_list;
+	ssize_t ret;
+
+	if (!cnt)
+		return 0;
+
+	mutex_lock(&sched_wakeup_mutex);
+
+	filtered_pids = rcu_dereference_protected(tr->wakeup_pids,
+					     lockdep_is_held(&sched_wakeup_mutex));
+
+	ret = trace_pid_write(filtered_pids, &pid_list, ubuf, cnt);
+	if (ret < 0)
+		goto out;
+
+	rcu_assign_pointer(tr->wakeup_pids, pid_list);
+
+	if (filtered_pids) {
+		synchronize_rcu();
+		trace_pid_list_free(filtered_pids);
+	}
+
+ out:
+	mutex_unlock(&sched_wakeup_mutex);
+
+	if (ret > 0)
+		*ppos += ret;
+
+	return ret;
+}
+
+static int ftrace_set_wakeup_pid_release(struct inode *inode, struct file *file)
+{
+	struct trace_array *tr = inode->i_private;
+
+	trace_array_put(tr);
+
+	return seq_release(inode, file);
+}
+
+static const struct file_operations ftrace_set_wakeup_pid_fops = {
+	.open = ftrace_set_wakeup_pid_open,
+	.read = seq_read,
+	.write = ftrace_set_wakeup_pid_write,
+	.llseek = seq_lseek,
+	.release = ftrace_set_wakeup_pid_release,
+};
+
 static void
 probe_wakeup_migrate_task(void *ignore, struct task_struct *task, int cpu)
 {
@@ -437,6 +604,7 @@  probe_wakeup_sched_switch(void *ignore, bool preempt,
 	long disabled;
 	int cpu;
 	unsigned int trace_ctx;
+	struct trace_pid_list *pid_list;
 
 	tracing_record_cmdline(prev);
 
@@ -466,6 +634,14 @@  probe_wakeup_sched_switch(void *ignore, bool preempt,
 
 	arch_spin_lock(&wakeup_lock);
 
+	rcu_read_lock_sched();
+	pid_list = rcu_dereference_sched(wakeup_trace->wakeup_pids);
+	rcu_read_unlock_sched();
+
+	/* Skip tasks that are not listed in set_wakeup_pid */
+	if (likely(trace_ignore_this_task(pid_list, NULL, next)))
+		goto out_unlock;
+
 	/* We could race with grabbing wakeup_lock */
 	if (unlikely(!tracer_enabled || next != wakeup_task))
 		goto out_unlock;
@@ -674,6 +850,9 @@  static int __wakeup_tracer_init(struct trace_array *tr)
 	set_tracer_flag(tr, TRACE_ITER_OVERWRITE, 1);
 	set_tracer_flag(tr, TRACE_ITER_LATENCY_FMT, 1);
 
+	tracefs_create_file("set_wakeup_pid", 0644, NULL,
+				    tr, &ftrace_set_wakeup_pid_fops);
+
 	tr->max_latency = 0;
 	wakeup_trace = tr;
 	ftrace_init_array_ops(tr, wakeup_tracer_call);