[V2,3/3] tracing/timerlat: Add user-space interface

Message ID a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot@kernel.org (mailing list archive)
State Superseded
Series osnoise/timerlat improvements

Commit Message

Daniel Bristot de Oliveira May 23, 2023, 5:22 p.m. UTC
Going a step further, we propose a way to use any user-space
workload as the task waiting for the timerlat timer. This is done
via a per-CPU file named osnoise/per_cpu/cpu$id/timerlat_fd.

The timerlat_fd allows only one task to open it at a time. When a task
reads the file, the timerlat timer is armed to fire
osnoise/timerlat_period_us in the future. When the timer fires, it prints
the IRQ latency and wakes up the user-space thread waiting on the
timerlat_fd.

The thread then starts to run, executes the timerlat measurement, prints
the thread scheduling latency and returns to user space.

When the thread rereads the timerlat_fd, the tracer will print the
user-ret(urn) latency, which is an additional metric.

This additional metric is also traced by the tracer and can be used, for
example, to measure the context switch overhead from kernel-to-user and
user-to-kernel, or the response time of an arbitrary execution in
user-space.

The tracer supports one thread per CPU. The thread must be pinned to
that CPU, and it cannot migrate while holding the timerlat_fd. The reason
is that the tracer is per CPU (nothing prohibits the tracer from
allowing migrations in the future). The tracer monitors the thread for
migration and disables itself if a migration is detected.
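
The pinning requirement above can be checked from user space before
opening the file. The following is a minimal sketch, not part of the patch:
the helper names mask_is_only_cpu() and pin_to_single_cpu() are
hypothetical, and the check assumes the tracer only accepts a task whose
affinity mask holds exactly the target CPU.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for the CPU_*() macros and sched_setaffinity() */
#endif
#include <sched.h>

/* Return 1 if 'set' contains exactly one CPU and that CPU is 'cpu'. */
int mask_is_only_cpu(const cpu_set_t *set, int cpu)
{
	return CPU_COUNT(set) == 1 && CPU_ISSET(cpu, set);
}

/*
 * Pin the calling thread to 'cpu' and read the mask back to confirm it.
 * Returns 0 on success, -1 on failure.
 */
int pin_to_single_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* a pid of 0 means "the calling thread" */
	if (sched_setaffinity(0, sizeof(set), &set) == -1)
		return -1;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) == -1)
		return -1;

	return mask_is_only_cpu(&set, cpu) ? 0 : -1;
}
```

A caller would run pin_to_single_cpu(cpu) first and open the per-CPU
timerlat_fd for the same cpu only if it returns 0.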

The timerlat_fd is only available for opening/reading when the timerlat
tracer is enabled and NO_OSNOISE_WORKLOAD is set.
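
The two preconditions can also be set up programmatically. This is a
sketch under assumptions rather than a required procedure: it assumes
tracefs is mounted at /sys/kernel/tracing, that the option is written to
osnoise/options and the tracer name to current_tracer, and that the caller
has the privileges to write both. The helper names tracefs_write() and
timerlat_enable_user_workload() are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write a string to a control file. Returns 0 on success, -1 on error. */
int tracefs_write(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t written;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	written = write(fd, value, len);
	close(fd);

	return written == (ssize_t)len ? 0 : -1;
}

/*
 * Assumed order: turn off the in-kernel workload first, then select the
 * timerlat tracer, so that no kernel thread is dispatched in between.
 */
int timerlat_enable_user_workload(void)
{
	if (tracefs_write("/sys/kernel/tracing/osnoise/options",
			  "NO_OSNOISE_WORKLOAD"))
		return -1;

	return tracefs_write("/sys/kernel/tracing/current_tracer", "timerlat");
}
```

If timerlat_enable_user_workload() returns 0, the per-CPU timerlat_fd
files should then be ready to open.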

The simplest way to activate this feature from user-space is:

 -------------------------------- %< -----------------------------------
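 #define _GNU_SOURCE	/* for sched_setaffinity(), gettid() and CPU_SET() */
 #include <errno.h>
 #include <fcntl.h>
 #include <sched.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>

 int main(void)
 {
	char buffer[1024];
	int timerlat_fd;
	int retval;
	long cpu = 0;	/* place in CPU 0 */
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	/* the tracer requires the thread to be pinned to the target CPU */
	if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
		return 1;

	snprintf(buffer, sizeof(buffer),
		"/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
		cpu);

	timerlat_fd = open(buffer, O_RDONLY);
	if (timerlat_fd < 0) {
		printf("error opening %s: %s\n", buffer, strerror(errno));
		exit(1);
	}

	/* each read arms the timer and blocks until it fires */
	for (;;) {
		retval = read(timerlat_fd, buffer, sizeof(buffer));
		if (retval < 0)
			break;
	}

	close(timerlat_fd);
	exit(0);
 }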
 -------------------------------- >% -----------------------------------
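
To use the user-ret latency as the response time of an arbitrary
user-space execution, the work can be placed between two reads. The
following is a minimal sketch of that idea, assuming the fd was opened as
in the example above; timerlat_loop() and sample_work() are hypothetical
names, and sample_work() only counts its calls as a stand-in for a real
workload.

```c
#include <unistd.h>

/* Number of times the sample workload ran (illustration only). */
int sample_work_calls;

/* Stand-in workload: a real user would place the code under test here. */
void sample_work(void)
{
	sample_work_calls++;
}

/*
 * Read the fd 'iterations' times, running 'work' after each wakeup.
 * Returns the number of completed iterations, or -1 if a read fails.
 */
int timerlat_loop(int timerlat_fd, void (*work)(void), int iterations)
{
	char buffer[1024];
	int done;

	for (done = 0; done < iterations; done++) {
		/* blocks until the timer fires and the thread is woken up */
		if (read(timerlat_fd, buffer, sizeof(buffer)) < 0)
			return -1;

		/*
		 * Runs before the next read, so its duration is part of the
		 * user-ret latency reported when the fd is read again.
		 */
		work();
	}

	return done;
}
```

Calling timerlat_loop(timerlat_fd, sample_work, n) replaces the bare
read loop; a read error, for instance after the tracer is disabled, ends
the loop with -1.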

If a workload is still holding the timerlat_fd when timerlat is disabled,
SIGKILL is sent to that thread.

Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
---
 Documentation/trace/timerlat-tracer.rst |  78 ++++++
 kernel/trace/trace_osnoise.c            | 312 +++++++++++++++++++++++-
 kernel/trace/trace_output.c             |   4 +-
 3 files changed, 391 insertions(+), 3 deletions(-)

Comments

kernel test robot May 23, 2023, 8:59 p.m. UTC | #1
Hi Daniel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.4-rc3 next-20230523]
[cannot apply to rostedt-trace/for-next rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
base:   linus/master
patch link:    https://lore.kernel.org/r/a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot%40kernel.org
patch subject: [PATCH V2 3/3] tracing/timerlat: Add user-space interface
config: loongarch-allyesconfig
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/89216b54eaf490480bc1929f5780f95a688a91bb
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
        git checkout 89216b54eaf490480bc1929f5780f95a688a91bb
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash kernel/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202305240459.z43SP4XU-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> kernel/trace/trace_osnoise.c:2645:5: warning: no previous prototype for 'osnoise_create_cpu_timerlat_fd' [-Wmissing-prototypes]
    2645 | int osnoise_create_cpu_timerlat_fd(struct dentry *top_dir)
         |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
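
A warning of this kind is usually addressed in one of two ways: give the
function internal linkage with static, or make a prototype visible before
the definition. The sketch below shows both on made-up functions; the
demo_* names are hypothetical and unrelated to the patch, and which option
fits osnoise_create_cpu_timerlat_fd() depends on whether it is called from
outside its file.

```c
/* Option 1: internal linkage. A static function needs no prior prototype. */
static int demo_create_cpu_entries(int nr_cpus)
{
	int cpu, created = 0;

	for (cpu = 0; cpu < nr_cpus; cpu++)
		created++;	/* stand-in for creating one per-CPU entry */

	return created;
}

/*
 * Option 2: external linkage with a visible prototype. In a real project
 * this declaration would normally live in a shared header.
 */
int demo_init_interface(int nr_cpus);

int demo_init_interface(int nr_cpus)
{
	return demo_create_cpu_entries(nr_cpus) == nr_cpus ? 0 : -1;
}
```

With either form, a W=1 style build using -Wmissing-prototypes has a
declaration or internal linkage to rely on and stays quiet.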


vim +/osnoise_create_cpu_timerlat_fd +2645 kernel/trace/trace_osnoise.c

  2644	
> 2645	int osnoise_create_cpu_timerlat_fd(struct dentry *top_dir)
  2646	{
  2647		struct dentry *timerlat_fd;
  2648		struct dentry *per_cpu;
  2649		struct dentry *cpu_dir;
  2650		char cpu_str[30]; /* see trace.c: tracing_init_tracefs_percpu() */
  2651		long cpu;
  2652	
  2653		/*
  2654		 * Why not using tracing instance per_cpu/ dir?
  2655		 *
  2656		 * Because osnoise/timerlat have a single workload, having
  2657		 * multiple files like these are a waste of memory.
  2658		 */
  2659		per_cpu = tracefs_create_dir("per_cpu", top_dir);
  2660		if (!per_cpu)
  2661			return -ENOMEM;
  2662	
  2663		for_each_possible_cpu(cpu) {
  2664			snprintf(cpu_str, 30, "cpu%ld", cpu);
  2665			cpu_dir = tracefs_create_dir(cpu_str, per_cpu);
  2666			if (!cpu_dir)
  2667				goto out_clean;
  2668	
  2669			timerlat_fd = trace_create_file("timerlat_fd", TRACE_MODE_READ,
  2670							cpu_dir, NULL, &timerlat_fd_fops);
  2671			if (!timerlat_fd)
  2672				goto out_clean;
  2673	
  2674			/* Record the CPU */
  2675			d_inode(timerlat_fd)->i_cdev = (void *)(cpu);
  2676		}
  2677	
  2678		return 0;
  2679	
  2680	out_clean:
  2681		tracefs_remove(per_cpu);
  2682		return -ENOMEM;
  2683	}
  2684

kernel test robot May 24, 2023, 8:04 a.m. UTC | #2
Hi Daniel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.4-rc3 next-20230523]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
base:   linus/master
patch link:    https://lore.kernel.org/r/a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot%40kernel.org
patch subject: [PATCH V2 3/3] tracing/timerlat: Add user-space interface
config: x86_64-randconfig-s022
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.4-39-gce1a6720-dirty
        # https://github.com/intel-lab-lkp/linux/commit/89216b54eaf490480bc1929f5780f95a688a91bb
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
        git checkout 89216b54eaf490480bc1929f5780f95a688a91bb
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 olddefconfig
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202305241531.5ItU2LLm-lkp@intel.com/

sparse warnings: (new ones prefixed by >>)
>> kernel/trace/trace_osnoise.c:2645:5: sparse: sparse: symbol 'osnoise_create_cpu_timerlat_fd' was not declared. Should it be static?

kernel test robot May 24, 2023, 8:04 a.m. UTC | #3
Hi Daniel,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.4-rc3 next-20230524]
[cannot apply to rostedt-trace/for-next rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
base:   linus/master
patch link:    https://lore.kernel.org/r/a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot%40kernel.org
patch subject: [PATCH V2 3/3] tracing/timerlat: Add user-space interface
config: x86_64-randconfig-x064-20230522
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/89216b54eaf490480bc1929f5780f95a688a91bb
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
        git checkout 89216b54eaf490480bc1929f5780f95a688a91bb
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202305241530.Pn2xj2vs-lkp@intel.com/

All warnings (new ones prefixed by >>):

   kernel/trace/trace_osnoise.c:2364:9: error: implicit declaration of function 'this_cpu_tmr_var' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
           tlat = this_cpu_tmr_var();
                  ^
   kernel/trace/trace_osnoise.c:2364:9: note: did you mean 'this_cpu_osn_var'?
   kernel/trace/trace_osnoise.c:226:41: note: 'this_cpu_osn_var' declared here
   static inline struct osnoise_variables *this_cpu_osn_var(void)
                                           ^
>> kernel/trace/trace_osnoise.c:2364:7: warning: incompatible integer to pointer conversion assigning to 'struct timerlat_variables *' from 'int' [-Wint-conversion]
           tlat = this_cpu_tmr_var();
                ^ ~~~~~~~~~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2365:6: error: incomplete definition of type 'struct timerlat_variables'
           tlat->count = 0;
           ~~~~^
   kernel/trace/trace_osnoise.c:2309:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2387:25: error: variable has incomplete type 'struct timerlat_sample'
           struct timerlat_sample s;
                                  ^
   kernel/trace/trace_osnoise.c:2387:9: note: forward declaration of 'struct timerlat_sample'
           struct timerlat_sample s;
                  ^
   kernel/trace/trace_osnoise.c:2393:9: error: implicit declaration of function 'this_cpu_tmr_var' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
           tlat = this_cpu_tmr_var();
                  ^
   kernel/trace/trace_osnoise.c:2393:7: warning: incompatible integer to pointer conversion assigning to 'struct timerlat_variables *' from 'int' [-Wint-conversion]
           tlat = this_cpu_tmr_var();
                ^ ~~~~~~~~~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2401:11: error: incomplete definition of type 'struct timerlat_variables'
                   if (tlat->uthread_migrate) {
                       ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2406:16: error: use of undeclared identifier 'per_cpu_timerlat_var'
                   per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
                                ^
   kernel/trace/trace_osnoise.c:2406:16: error: use of undeclared identifier 'per_cpu_timerlat_var'
   kernel/trace/trace_osnoise.c:2406:16: error: use of undeclared identifier 'per_cpu_timerlat_var'
   kernel/trace/trace_osnoise.c:2406:44: error: member reference type 'void' is not a pointer
                   per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~  ^
   kernel/trace/trace_osnoise.c:2424:46: error: incomplete definition of type 'struct timerlat_variables'
                   now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
                                                          ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2425:20: error: incomplete definition of type 'struct timerlat_variables'
                   diff = now - tlat->abs_period;
                                ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2433:18: error: incomplete definition of type 'struct timerlat_variables'
                   s.seqnum = tlat->count;
                              ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2437:3: error: implicit declaration of function 'trace_timerlat_sample' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
                   trace_timerlat_sample(&s);
                   ^
   kernel/trace/trace_osnoise.c:2441:7: error: incomplete definition of type 'struct timerlat_variables'
                   tlat->tracing_thread = false;
                   ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2446:7: error: incomplete definition of type 'struct timerlat_variables'
                   tlat->tracing_thread = false;
                   ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2447:7: error: incomplete definition of type 'struct timerlat_variables'
                   tlat->kthread = current;
                   ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2449:21: error: incomplete definition of type 'struct timerlat_variables'
                   hrtimer_init(&tlat->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
                                 ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2450:7: error: incomplete definition of type 'struct timerlat_variables'
                   tlat->timer.function = timerlat_irq;
                   ~~~~^
   kernel/trace/trace_osnoise.c:2386:9: note: forward declaration of 'struct timerlat_variables'
           struct timerlat_variables *tlat;
                  ^
   kernel/trace/trace_osnoise.c:2450:26: error: use of undeclared identifier 'timerlat_irq'; did you mean 'timerlat_main'?
                   tlat->timer.function = timerlat_irq;
                                          ^~~~~~~~~~~~
                                          timerlat_main
   kernel/trace/trace_osnoise.c:1856:12: note: 'timerlat_main' declared here
   static int timerlat_main(void *data)
              ^
   fatal error: too many errors emitted, stopping now [-ferror-limit=]
   2 warnings and 20 errors generated.
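
The errors above are consistent with the new code being built in a
configuration where the timerlat definitions (struct timerlat_variables,
this_cpu_tmr_var(), timerlat_irq()) are compiled out. That is an inference
from the log, not something the report states. A common pattern for this
situation is to keep the users of a type under the same guard as its
definition and to provide a stub otherwise. The sketch below illustrates
the pattern with a hypothetical DEMO_TIMERLAT macro standing in for a
Kconfig option and hypothetical demo_* names.

```c
/* DEMO_TIMERLAT is a hypothetical stand-in for a Kconfig option. */
#ifdef DEMO_TIMERLAT

struct demo_timerlat_variables {
	unsigned long count;
};

static struct demo_timerlat_variables demo_tlat;

/* Code that touches the struct sits under the same guard as the struct. */
int demo_timerlat_fd_init(void)
{
	demo_tlat.count = 0;
	return 1;	/* interface set up */
}

#else

/* Stub for builds without the feature: nothing to set up. */
int demo_timerlat_fd_init(void)
{
	return 0;
}

#endif
```

Callers can then use demo_timerlat_fd_init() unconditionally, and both
configurations compile.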


vim +2364 kernel/trace/trace_osnoise.c

  2305	
  2306	static int timerlat_fd_open(struct inode *inode, struct file *file)
  2307	{
  2308		struct osnoise_variables *osn_var;
  2309		struct timerlat_variables *tlat;
  2310		long cpu = (long) inode->i_cdev;
  2311	
  2312		mutex_lock(&interface_lock);
  2313	
  2314		/*
  2315		 * This file is accessible only if timerlat is enabled, and
  2316		 * NO_OSNOISE_WORKLOAD is set.
  2317		 */
  2318		if (!timerlat_enabled() || test_bit(OSN_WORKLOAD, &osnoise_options)) {
  2319			mutex_unlock(&interface_lock);
  2320			return -EINVAL;
  2321		}
  2322	
  2323		migrate_disable();
  2324	
  2325		osn_var = this_cpu_osn_var();
  2326	
  2327		/*
  2328		 * The osn_var->pid holds the single access to this file.
  2329		 */
  2330		if (osn_var->pid) {
  2331			mutex_unlock(&interface_lock);
  2332			migrate_enable();
  2333			return -EBUSY;
  2334		}
  2335	
  2336		/*
  2337		 * timerlat tracer is a per-cpu tracer. Check if the user-space too
  2338		 * is pinned to a single CPU. The tracer later monitors if the task
  2339		 * migrates and then disables tracer if it does. However, it is
  2340		 * worth doing this basic acceptance test to avoid obviously wrong
  2341		 * setup.
  2342		 */
  2343		if (current->nr_cpus_allowed > 1 ||  cpu != smp_processor_id()) {
  2344			mutex_unlock(&interface_lock);
  2345			migrate_enable();
  2346			return -EPERM;
  2347		}
  2348	
  2349		/*
  2350		 * From now on, it is good to go.
  2351		 */
  2352		file->private_data = inode->i_cdev;
  2353	
  2354		get_task_struct(current);
  2355	
  2356		osn_var->kthread = current;
  2357		osn_var->pid = current->pid;
  2358	
  2359		/*
  2360		 * Setup is done.
  2361		 */
  2362		mutex_unlock(&interface_lock);
  2363	
> 2364		tlat = this_cpu_tmr_var();
  2365		tlat->count = 0;
  2366	
  2367		migrate_enable();
  2368		return 0;
  2369	};
  2370
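
From the user-space side, the checks in the timerlat_fd_open() listing
above map to three errno values. The helper below is a small sketch of how
a tool might turn them into hints; timerlat_open_hint() is a hypothetical
name, the mapping reflects only this version of the patch, and other
errors can still come from the VFS layer before this function runs.

```c
#include <errno.h>

/* Translate an errno from open() on a timerlat_fd into a short hint. */
const char *timerlat_open_hint(int err)
{
	switch (err) {
	case EINVAL:
		return "timerlat is not enabled, or OSNOISE_WORKLOAD is still set";
	case EBUSY:
		return "another task already holds the timerlat_fd of this CPU";
	case EPERM:
		return "caller is not pinned to exactly the CPU of this file";
	default:
		return "unexpected error, see strerror()";
	}
}
```

A tool could print timerlat_open_hint(errno) next to strerror(errno)
when open() fails.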

kernel test robot May 24, 2023, 12:48 p.m. UTC | #4
Hi Daniel,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.4-rc3 next-20230524]
[cannot apply to rostedt-trace/for-next rostedt-trace/for-next-urgent]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
base:   linus/master
patch link:    https://lore.kernel.org/r/a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot%40kernel.org
patch subject: [PATCH V2 3/3] tracing/timerlat: Add user-space interface
config: i386-randconfig-i014-20230523
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/89216b54eaf490480bc1929f5780f95a688a91bb
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
        git checkout 89216b54eaf490480bc1929f5780f95a688a91bb
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/trace/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202305242020.VlsOXEMn-lkp@intel.com/

All error/warnings (new ones prefixed by >>):

   kernel/trace/trace_osnoise.c: In function 'timerlat_fd_open':
>> kernel/trace/trace_osnoise.c:2364:16: error: implicit declaration of function 'this_cpu_tmr_var'; did you mean 'this_cpu_osn_var'? [-Werror=implicit-function-declaration]
    2364 |         tlat = this_cpu_tmr_var();
         |                ^~~~~~~~~~~~~~~~
         |                this_cpu_osn_var
>> kernel/trace/trace_osnoise.c:2364:14: warning: assignment to 'struct timerlat_variables *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    2364 |         tlat = this_cpu_tmr_var();
         |              ^
>> kernel/trace/trace_osnoise.c:2365:13: error: invalid use of undefined type 'struct timerlat_variables'
    2365 |         tlat->count = 0;
         |             ^~
   kernel/trace/trace_osnoise.c: In function 'timerlat_fd_read':
>> kernel/trace/trace_osnoise.c:2387:32: error: storage size of 's' isn't known
    2387 |         struct timerlat_sample s;
         |                                ^
   kernel/trace/trace_osnoise.c:2393:14: warning: assignment to 'struct timerlat_variables *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    2393 |         tlat = this_cpu_tmr_var();
         |              ^
   kernel/trace/trace_osnoise.c:2401:25: error: invalid use of undefined type 'struct timerlat_variables'
    2401 |                 if (tlat->uthread_migrate) {
         |                         ^~
   In file included from include/asm-generic/percpu.h:7,
                    from arch/x86/include/asm/percpu.h:390,
                    from arch/x86/include/asm/current.h:10,
                    from include/linux/sched.h:12,
                    from include/linux/kthread.h:6,
                    from kernel/trace/trace_osnoise.c:19:
>> kernel/trace/trace_osnoise.c:2406:30: error: 'per_cpu_timerlat_var' undeclared (first use in this function)
    2406 |                 per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                              ^~~~~~~~~~~~~~~~~~~~
   include/linux/percpu-defs.h:219:54: note: in definition of macro '__verify_pcpu_ptr'
     219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
         |                                                      ^~~
   kernel/trace/trace_osnoise.c:2406:17: note: in expansion of macro 'per_cpu_ptr'
    2406 |                 per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                 ^~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2406:30: note: each undeclared identifier is reported only once for each function it appears in
    2406 |                 per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                              ^~~~~~~~~~~~~~~~~~~~
   include/linux/percpu-defs.h:219:54: note: in definition of macro '__verify_pcpu_ptr'
     219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
         |                                                      ^~~
   kernel/trace/trace_osnoise.c:2406:17: note: in expansion of macro 'per_cpu_ptr'
    2406 |                 per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                 ^~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2424:60: error: invalid use of undefined type 'struct timerlat_variables'
    2424 |                 now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
         |                                                            ^~
   kernel/trace/trace_osnoise.c:2425:34: error: invalid use of undefined type 'struct timerlat_variables'
    2425 |                 diff = now - tlat->abs_period;
         |                                  ^~
   kernel/trace/trace_osnoise.c:2433:32: error: invalid use of undefined type 'struct timerlat_variables'
    2433 |                 s.seqnum = tlat->count;
         |                                ^~
>> kernel/trace/trace_osnoise.c:2437:17: error: implicit declaration of function 'trace_timerlat_sample' [-Werror=implicit-function-declaration]
    2437 |                 trace_timerlat_sample(&s);
         |                 ^~~~~~~~~~~~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2441:21: error: invalid use of undefined type 'struct timerlat_variables'
    2441 |                 tlat->tracing_thread = false;
         |                     ^~
   kernel/trace/trace_osnoise.c:2446:21: error: invalid use of undefined type 'struct timerlat_variables'
    2446 |                 tlat->tracing_thread = false;
         |                     ^~
   kernel/trace/trace_osnoise.c:2447:21: error: invalid use of undefined type 'struct timerlat_variables'
    2447 |                 tlat->kthread = current;
         |                     ^~
   kernel/trace/trace_osnoise.c:2449:35: error: invalid use of undefined type 'struct timerlat_variables'
    2449 |                 hrtimer_init(&tlat->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
         |                                   ^~
   kernel/trace/trace_osnoise.c:2450:21: error: invalid use of undefined type 'struct timerlat_variables'
    2450 |                 tlat->timer.function = timerlat_irq;
         |                     ^~
>> kernel/trace/trace_osnoise.c:2450:40: error: 'timerlat_irq' undeclared (first use in this function); did you mean 'timerlat_main'?
    2450 |                 tlat->timer.function = timerlat_irq;
         |                                        ^~~~~~~~~~~~
         |                                        timerlat_main
   kernel/trace/trace_osnoise.c:2453:21: error: invalid use of undefined type 'struct timerlat_variables'
    2453 |                 tlat->abs_period = hrtimer_cb_get_time(&tlat->timer);
         |                     ^~
   kernel/trace/trace_osnoise.c:2453:61: error: invalid use of undefined type 'struct timerlat_variables'
    2453 |                 tlat->abs_period = hrtimer_cb_get_time(&tlat->timer);
         |                                                             ^~
>> kernel/trace/trace_osnoise.c:2459:9: error: implicit declaration of function 'wait_next_period' [-Werror=implicit-function-declaration]
    2459 |         wait_next_period(tlat);
         |         ^~~~~~~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2462:52: error: invalid use of undefined type 'struct timerlat_variables'
    2462 |         now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
         |                                                    ^~
   kernel/trace/trace_osnoise.c:2463:26: error: invalid use of undefined type 'struct timerlat_variables'
    2463 |         diff = now - tlat->abs_period;
         |                          ^~
   kernel/trace/trace_osnoise.c:2471:24: error: invalid use of undefined type 'struct timerlat_variables'
    2471 |         s.seqnum = tlat->count;
         |                        ^~
>> kernel/trace/trace_osnoise.c:2479:25: error: implicit declaration of function 'timerlat_dump_stack'; did you mean 'trace_dump_stack'? [-Werror=implicit-function-declaration]
    2479 |                         timerlat_dump_stack(time_to_us(diff));
         |                         ^~~~~~~~~~~~~~~~~~~
         |                         trace_dump_stack
>> kernel/trace/trace_osnoise.c:2387:32: warning: unused variable 's' [-Wunused-variable]
    2387 |         struct timerlat_sample s;
         |                                ^
   In file included from include/asm-generic/percpu.h:7,
                    from arch/x86/include/asm/percpu.h:390,
                    from arch/x86/include/asm/current.h:10,
                    from include/linux/sched.h:12,
                    from include/linux/kthread.h:6,
                    from kernel/trace/trace_osnoise.c:19:
   kernel/trace/trace_osnoise.c: In function 'timerlat_fd_release':
   kernel/trace/trace_osnoise.c:2500:33: error: 'per_cpu_timerlat_var' undeclared (first use in this function)
    2500 |         tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
         |                                 ^~~~~~~~~~~~~~~~~~~~
   include/linux/percpu-defs.h:219:54: note: in definition of macro '__verify_pcpu_ptr'
     219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
         |                                                      ^~~
   kernel/trace/trace_osnoise.c:2500:20: note: in expansion of macro 'per_cpu_ptr'
    2500 |         tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
         |                    ^~~~~~~~~~~
   kernel/trace/trace_osnoise.c:2502:33: error: invalid use of undefined type 'struct timerlat_variables'
    2502 |         hrtimer_cancel(&tlat_var->timer);
         |                                 ^~
   In file included from include/linux/string.h:254,
                    from arch/x86/include/asm/page_32.h:18,
                    from arch/x86/include/asm/page.h:14,
                    from arch/x86/include/asm/thread_info.h:12,
                    from include/linux/thread_info.h:60,
                    from arch/x86/include/asm/preempt.h:9,
                    from include/linux/preempt.h:78,
                    from include/linux/rcupdate.h:27,
                    from include/linux/rculist.h:11,
                    from include/linux/pid.h:5,
                    from include/linux/sched.h:14,
                    from include/linux/kthread.h:6,
                    from kernel/trace/trace_osnoise.c:19:
>> kernel/trace/trace_osnoise.c:2503:35: error: invalid application of 'sizeof' to incomplete type 'struct timerlat_variables'
    2503 |         memset(tlat_var, 0, sizeof(*tlat_var));
         |                                   ^
   include/linux/fortify-string.h:451:42: note: in definition of macro '__fortify_memset_chk'
     451 |         size_t __fortify_size = (size_t)(size);                         \
         |                                          ^~~~
   kernel/trace/trace_osnoise.c:2503:9: note: in expansion of macro 'memset'
    2503 |         memset(tlat_var, 0, sizeof(*tlat_var));
         |         ^~~~~~
   In file included from include/asm-generic/percpu.h:7,
                    from arch/x86/include/asm/percpu.h:390,
                    from arch/x86/include/asm/current.h:10,
                    from include/linux/sched.h:12,
                    from include/linux/kthread.h:6,
                    from kernel/trace/trace_osnoise.c:19:
   kernel/trace/trace_osnoise.c: In function 'check_timerlat_user_migration':
   kernel/trace/trace_osnoise.c:2529:38: error: 'per_cpu_timerlat_var' undeclared (first use in this function)
    2529 |                         per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                                      ^~~~~~~~~~~~~~~~~~~~
   include/linux/percpu-defs.h:219:54: note: in definition of macro '__verify_pcpu_ptr'
     219 |         const void __percpu *__vpp_verify = (typeof((ptr) + 0))NULL;    \
         |                                                      ^~~
   kernel/trace/trace_osnoise.c:2529:25: note: in expansion of macro 'per_cpu_ptr'
    2529 |                         per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
         |                         ^~~~~~~~~~~
   At top level:
>> kernel/trace/trace_osnoise.c:2618:37: warning: 'timerlat_fd_fops' defined but not used [-Wunused-const-variable=]
    2618 | static const struct file_operations timerlat_fd_fops = {
         |                                     ^~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +2364 kernel/trace/trace_osnoise.c

  2305	
  2306	static int timerlat_fd_open(struct inode *inode, struct file *file)
  2307	{
  2308		struct osnoise_variables *osn_var;
  2309		struct timerlat_variables *tlat;
  2310		long cpu = (long) inode->i_cdev;
  2311	
  2312		mutex_lock(&interface_lock);
  2313	
  2314		/*
  2315		 * This file is accessible only if timerlat is enabled, and
  2316		 * NO_OSNOISE_WORKLOAD is set.
  2317		 */
  2318		if (!timerlat_enabled() || test_bit(OSN_WORKLOAD, &osnoise_options)) {
  2319			mutex_unlock(&interface_lock);
  2320			return -EINVAL;
  2321		}
  2322	
  2323		migrate_disable();
  2324	
  2325		osn_var = this_cpu_osn_var();
  2326	
  2327		/*
  2328		 * The osn_var->pid holds the single access to this file.
  2329		 */
  2330		if (osn_var->pid) {
  2331			mutex_unlock(&interface_lock);
  2332			migrate_enable();
  2333			return -EBUSY;
  2334		}
  2335	
  2336		/*
  2337		 * timerlat tracer is a per-cpu tracer. Check if the user-space too
  2338		 * is pinned to a single CPU. The tracer laters monitor if the task
  2339		 * migrates and then disables tracer if it does. However, it is
  2340		 * worth doing this basic acceptance test to avoid obviusly wrong
  2341		 * setup.
  2342		 */
  2343		if (current->nr_cpus_allowed > 1 ||  cpu != smp_processor_id()) {
  2344			mutex_unlock(&interface_lock);
  2345			migrate_enable();
  2346			return -EPERM;
  2347		}
  2348	
  2349		/*
  2350		 * From now on, it is good to go.
  2351		 */
  2352		file->private_data = inode->i_cdev;
  2353	
  2354		get_task_struct(current);
  2355	
  2356		osn_var->kthread = current;
  2357		osn_var->pid = current->pid;
  2358	
  2359		/*
  2360		 * Setup is done.
  2361		 */
  2362		mutex_unlock(&interface_lock);
  2363	
> 2364		tlat = this_cpu_tmr_var();
> 2365		tlat->count = 0;
  2366	
  2367		migrate_enable();
  2368		return 0;
  2369	};
  2370	
  2371	/*
  2372	 * timerlat_fd_read - Read function for "timerlat_fd" file
  2373	 * @file: The active open file structure
  2374	 * @ubuf: The userspace provided buffer to read value into
  2375	 * @cnt: The maximum number of bytes to read
  2376	 * @ppos: The current "file" position
  2377	 *
  2378	 * Prints 1 on timerlat, the number of interferences on osnoise, -1 on error.
  2379	 */
  2380	static ssize_t
  2381	timerlat_fd_read(struct file *file, char __user *ubuf, size_t count,
  2382			  loff_t *ppos)
  2383	{
  2384		long cpu = (long) file->private_data;
  2385		struct osnoise_variables *osn_var;
  2386		struct timerlat_variables *tlat;
> 2387		struct timerlat_sample s;
  2388		s64 diff;
  2389		u64 now;
  2390	
  2391		migrate_disable();
  2392	
  2393		tlat = this_cpu_tmr_var();
  2394	
  2395		/*
  2396		 * While in user-space, the thread is migratable. There is nothing
  2397		 * we can do about it.
  2398		 * So, if the thread is running on another CPU, stop the machinery.
  2399		 */
  2400		if (cpu == smp_processor_id()) {
  2401			if (tlat->uthread_migrate) {
  2402				migrate_enable();
  2403				return -EINVAL;
  2404			}
  2405		} else {
> 2406			per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
  2407			osnoise_taint("timerlat user thread migrate\n");
  2408			osnoise_stop_tracing();
  2409			migrate_enable();
  2410			return -EINVAL;
  2411		}
  2412	
  2413		osn_var = this_cpu_osn_var();
  2414	
  2415		/*
  2416		 * The timerlat in user-space runs in a different order:
  2417		 * the read() starts from the execution of the previous occurrence,
  2418		 * sleeping for the next occurrence.
  2419		 *
  2420		 * So, skip if we are entering on read() before the first wakeup
  2421		 * from timerlat IRQ:
  2422		 */
  2423		if (likely(osn_var->sampling)) {
  2424			now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
  2425			diff = now - tlat->abs_period;
  2426	
  2427			/*
  2428			 * it was not a timer firing, but some other signal?
  2429			 */
  2430			if (diff < 0)
  2431				goto out;
  2432	
  2433			s.seqnum = tlat->count;
  2434			s.timer_latency = diff;
  2435			s.context = THREAD_URET;
  2436	
> 2437			trace_timerlat_sample(&s);
  2438	
  2439			notify_new_max_latency(diff);
  2440	
  2441			tlat->tracing_thread = false;
  2442			if (osnoise_data.stop_tracing_total)
  2443				if (time_to_us(diff) >= osnoise_data.stop_tracing_total)
  2444					osnoise_stop_tracing();
  2445		} else {
  2446			tlat->tracing_thread = false;
  2447			tlat->kthread = current;
  2448	
  2449			hrtimer_init(&tlat->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
> 2450			tlat->timer.function = timerlat_irq;
  2451	
  2452			/* Annotate now to drift new period */
  2453			tlat->abs_period = hrtimer_cb_get_time(&tlat->timer);
  2454	
  2455			osn_var->sampling = 1;
  2456		}
  2457	
  2458		/* wait for the next period */
> 2459		wait_next_period(tlat);
  2460	
  2461		/* This is the wakeup from this cycle */
  2462		now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
  2463		diff = now - tlat->abs_period;
  2464	
  2465		/*
  2466		 * it was not a timer firing, but some other signal?
  2467		 */
  2468		if (diff < 0)
  2469			goto out;
  2470	
  2471		s.seqnum = tlat->count;
  2472		s.timer_latency = diff;
  2473		s.context = THREAD_CONTEXT;
  2474	
  2475		trace_timerlat_sample(&s);
  2476	
  2477		if (osnoise_data.stop_tracing_total) {
  2478			if (time_to_us(diff) >= osnoise_data.stop_tracing_total) {
> 2479				timerlat_dump_stack(time_to_us(diff));
  2480				notify_new_max_latency(diff);
  2481				osnoise_stop_tracing();
  2482			}
  2483		}
  2484	
  2485	out:
  2486		migrate_enable();
  2487		return 0;
  2488	}
  2489	
  2490	static int timerlat_fd_release(struct inode *inode, struct file *file)
  2491	{
  2492		struct osnoise_variables *osn_var;
  2493		struct timerlat_variables *tlat_var;
  2494		long cpu = (long) file->private_data;
  2495	
  2496		migrate_disable();
  2497		mutex_lock(&interface_lock);
  2498	
  2499		osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
  2500		tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
  2501	
  2502		hrtimer_cancel(&tlat_var->timer);
> 2503		memset(tlat_var, 0, sizeof(*tlat_var));
  2504	
  2505		osn_var->sampling = 0;
  2506		osn_var->pid = 0;
  2507	
  2508		/*
  2509		 * We are leaving, not being stopped... see stop_kthread();
  2510		 */
  2511		if (osn_var->kthread) {
  2512			put_task_struct(osn_var->kthread);
  2513			osn_var->kthread = NULL;
  2514		}
  2515	
  2516		mutex_unlock(&interface_lock);
  2517		migrate_enable();
  2518		return 0;
  2519	}
  2520
Daniel Bristot de Oliveira May 25, 2023, 2:16 p.m. UTC | #5
On 5/24/23 14:48, kernel test robot wrote:
> Hi Daniel,
> 
> kernel test robot noticed the following build errors:
> 
> [auto build test ERROR on linus/master]
> [also build test ERROR on v6.4-rc3 next-20230524]
> [cannot apply to rostedt-trace/for-next rostedt-trace/for-next-urgent]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
> base:   linus/master
> patch link:    https://lore.kernel.org/r/a7b2c215c763e95a56fa1258743332b570c81c9d.1684860626.git.bristot%40kernel.org
> patch subject: [PATCH V2 3/3] tracing/timerlat: Add user-space interface
> config: i386-randconfig-i014-20230523
> compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
> reproduce (this is a W=1 build):
>         # https://github.com/intel-lab-lkp/linux/commit/89216b54eaf490480bc1929f5780f95a688a91bb
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Daniel-Bristot-de-Oliveira/tracing-osnoise-Switch-from-PF_NO_SETAFFINITY-to-migrate_disable/20230524-012512
>         git checkout 89216b54eaf490480bc1929f5780f95a688a91bb
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         make W=1 O=build_dir ARCH=i386 olddefconfig
>         make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash kernel/trace/
> 
> If you fix the issue, kindly add following tag where applicable
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202305242020.VlsOXEMn-lkp@intel.com/
> 
> All error/warnings (new ones prefixed by >>):
> 
>    kernel/trace/trace_osnoise.c: In function 'timerlat_fd_open':
>>> kernel/trace/trace_osnoise.c:2364:16: error: implicit declaration of function 'this_cpu_tmr_var'; did you mean 'this_cpu_osn_var'? [-Werror=implicit-function-declaration]
>     2364 |         tlat = this_cpu_tmr_var();
>          |                ^~~~~~~~~~~~~~~~
>          |                this_cpu_osn_var


Thanks robot, I forgot to test without timerlat enabled. Easy to fix, an ifdef
here and there.

-- Daniel
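
The build errors above come from a config without the timerlat tracer, where
struct timerlat_variables and this_cpu_tmr_var() are not defined. The actual
fix is not shown in this thread. The following is only a user-space model of
the usual guard pattern, assuming a single instance in place of per-CPU data;
timerlat_fd_reset(), timerlat_guard_example() and the -ENODEV stub are
hypothetical names and choices made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Stand-in for the kernel config symbol (an assumption for this sketch).
 * Remove this define to model a build without the timerlat tracer.
 */
#define CONFIG_TIMERLAT_TRACER 1

#ifdef CONFIG_TIMERLAT_TRACER
/* Only the two fields this sketch needs; the kernel struct has more. */
struct timerlat_variables {
	unsigned long long count;
	int uthread_migrate;
};

/* One instance instead of real per-CPU data. */
static struct timerlat_variables timerlat_var;

static struct timerlat_variables *this_cpu_tmr_var(void)
{
	return &timerlat_var;
}

/* Code that touches the gated type stays inside the same guard. */
int timerlat_fd_reset(void)
{
	struct timerlat_variables *tlat = this_cpu_tmr_var();

	memset(tlat, 0, sizeof(*tlat));
	return 0;
}
#else
/* Hypothetical stub so callers still build when timerlat is compiled out. */
int timerlat_fd_reset(void)
{
	return -ENODEV;
}
#endif

/* Usage: with the symbol defined, the reset succeeds. */
void timerlat_guard_example(void)
{
	assert(timerlat_fd_reset() == 0);
}
```

Keeping every user of the gated type inside the guard also avoids the
"defined but not used" warning reported for timerlat_fd_fops, since nothing
outside the guard is left without a consumer.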
Daniel Bristot de Oliveira May 25, 2023, 5 p.m. UTC | #6
[...]

>  
> +static void check_timerlat_user_migration(pid_t pid, long dest_cpu);
>  /*
>   * trace_sched_switch - sched:sched_switch trace event handler
>   *
> @@ -1196,6 +1199,9 @@ trace_sched_switch_callback(void *data, bool preempt,
>  	struct osnoise_variables *osn_var = this_cpu_osn_var();
>  	int workload = test_bit(OSN_WORKLOAD, &osnoise_options);
>  
> +	if (unlikely(workload && timerlat_enabled()))
> +		check_timerlat_user_migration(n->pid, smp_processor_id());
> +

It should be !workload. Anyway, I will move this check to a sched_migrate_task
tracepoint because it runs less frequently and...

[...]

the tracepoint also informs the origin CPU, so it can be passed here:

> +
> +static void check_timerlat_user_migration(pid_t pid, long dest_cpu)
> +{
> +	struct osnoise_variables *osn_var;
> +	long cpu;
> +

and we can avoid this ugly loop.

> +	for_each_possible_cpu(cpu) {
> +		osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
> +		if (osn_var->pid == pid && dest_cpu != cpu) {
> +			per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
> +			osnoise_taint("timerlat user thread migrate\n");
> +			osnoise_stop_tracing();
> +			break;
> +		}
> +	}

Lazy daniel should have had a look first, but at least now I know I also need
to do some ifdeffery.

-- Daniel
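
Once the origin CPU is known, the check can index the per-CPU data directly
instead of scanning every possible CPU. The following is a user-space model
of that idea, not the code from a later revision: plain arrays stand in for
per-CPU variables, NR_CPUS and the three-argument signature are assumptions,
and the kernel version would also call osnoise_taint() and
osnoise_stop_tracing() where the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

/* Reduced to the fields the check needs. */
struct osnoise_variables { pid_t pid; };
struct timerlat_variables { bool uthread_migrate; };

/* Arrays indexed by CPU stand in for per-CPU data. */
static struct osnoise_variables osnoise_var[NR_CPUS];
static struct timerlat_variables timerlat_var[NR_CPUS];

/*
 * Return true if the migrating task is the timerlat user thread
 * registered on orig_cpu and it is moving to a different CPU.
 */
bool check_timerlat_user_migration(pid_t pid, int orig_cpu, int dest_cpu)
{
	if (orig_cpu < 0 || orig_cpu >= NR_CPUS)
		return false;

	/* pid 0 means that no user thread holds this CPU's timerlat_fd. */
	if (pid == 0 || osnoise_var[orig_cpu].pid != pid)
		return false;

	if (dest_cpu == orig_cpu)
		return false;

	/* Kernel code would taint the trace and stop tracing here. */
	timerlat_var[orig_cpu].uthread_migrate = true;
	return true;
}

/* Usage: pid 42 holds CPU 1 and migrates to CPU 2. */
void migration_example(void)
{
	osnoise_var[1].pid = 42;
	assert(check_timerlat_user_migration(42, 1, 2));
	assert(timerlat_var[1].uthread_migrate);
}
```

The lookup is a single array access per migration event, so its cost does
not grow with the number of possible CPUs.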
Patch

diff --git a/Documentation/trace/timerlat-tracer.rst b/Documentation/trace/timerlat-tracer.rst
index db17df312bc8..53a56823e903 100644
--- a/Documentation/trace/timerlat-tracer.rst
+++ b/Documentation/trace/timerlat-tracer.rst
@@ -180,3 +180,81 @@  dummy_load_1ms_pd_init, which had the following code (on purpose)::
 		return 0;
 
 	}
+
+User-space interface
+---------------------------
+
+Timerlat allows user-space threads to use the timerlat infrastructure to
+measure scheduling latency. This interface is accessible via a per-CPU
+file descriptor inside $tracing_dir/osnoise/per_cpu/cpu$ID/timerlat_fd.
+
+This interface is accessible under the following conditions:
+
+ - The timerlat tracer is enabled
+ - The osnoise workload option is set to NO_OSNOISE_WORKLOAD
+ - The user-space thread is affined to a single processor
+ - The thread opens the file associated with its single processor
+ - Only one thread can access the file at a time
+
+The open() syscall will fail if any of these conditions are not met.
+After opening the file descriptor, the user-space thread can read from it.
+
+The read() system call runs the timerlat code that arms the timer for a
+future time and waits for it, as the regular kernel thread does.
+
+When the timer IRQ fires, the timerlat IRQ will execute, report the
+IRQ latency and wake up the thread waiting in the read. The thread will be
+scheduled and report the thread latency via the tracer, as the kernel
+thread does.
+
+The difference from the in-kernel timerlat is that, instead of re-arming
+the timer, timerlat will return to the read() system call. At this point,
+the user can run any code.
+
+If the application rereads the timerlat file descriptor, the tracer
+will report the return from user-space latency, which is the total
+latency. If this is the end of the work, it can be interpreted as the
+response time for the request.
+
+After reporting the total latency, timerlat will restart the cycle, arm
+a timer, and go to sleep for the following activation.
+
+If at any time one of the conditions is broken, e.g., the thread migrates
+while in user space, or the timerlat tracer is disabled, the SIGKILL
+signal will be sent to the user-space thread.
+
+Here is a basic example of user-space code for timerlat::
+
+ int main(void)
+ {
+	char buffer[1024];
+	int timerlat_fd;
+	int retval;
+	long cpu = 0;   /* place in CPU 0 */
+	cpu_set_t set;
+
+	CPU_ZERO(&set);
+	CPU_SET(cpu, &set);
+
+	if (sched_setaffinity(gettid(), sizeof(set), &set) == -1)
+		return 1;
+
+	snprintf(buffer, sizeof(buffer),
+		"/sys/kernel/tracing/osnoise/per_cpu/cpu%ld/timerlat_fd",
+		cpu);
+
+	timerlat_fd = open(buffer, O_RDONLY);
+	if (timerlat_fd < 0) {
+		printf("error opening %s: %s\n", buffer, strerror(errno));
+		exit(1);
+	}
+
+	for (;;) {
+		retval = read(timerlat_fd, buffer, 1024);
+		if (retval < 0)
+			break;
+	}
+
+	close(timerlat_fd);
+	exit(0);
+ }
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 220172cb874d..ec7576763704 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -181,6 +181,7 @@  struct osn_irq {
 
 #define IRQ_CONTEXT	0
 #define THREAD_CONTEXT	1
+#define THREAD_URET	2
 /*
  * sofirq runtime info.
  */
@@ -238,6 +239,7 @@  struct timerlat_variables {
 	u64			abs_period;
 	bool			tracing_thread;
 	u64			count;
+	bool			uthread_migrate;
 };
 
 static DEFINE_PER_CPU(struct timerlat_variables, per_cpu_timerlat_var);
@@ -1181,6 +1183,7 @@  thread_exit(struct osnoise_variables *osn_var, struct task_struct *t)
 	osn_var->thread.arrival_time = 0;
 }
 
+static void check_timerlat_user_migration(pid_t pid, long dest_cpu);
 /*
  * trace_sched_switch - sched:sched_switch trace event handler
  *
@@ -1196,6 +1199,9 @@  trace_sched_switch_callback(void *data, bool preempt,
 	struct osnoise_variables *osn_var = this_cpu_osn_var();
 	int workload = test_bit(OSN_WORKLOAD, &osnoise_options);
 
+	if (unlikely(workload && timerlat_enabled()))
+		check_timerlat_user_migration(n->pid, smp_processor_id());
+
 	if ((p->pid != osn_var->pid) || !workload)
 		thread_exit(osn_var, p);
 
@@ -1864,10 +1870,24 @@  static void stop_kthread(unsigned int cpu)
 
 	kthread = per_cpu(per_cpu_osnoise_var, cpu).kthread;
 	if (kthread) {
-		kthread_stop(kthread);
+		if (test_bit(OSN_WORKLOAD, &osnoise_options)) {
+			kthread_stop(kthread);
+		} else {
+			/*
+			 * This is a user thread waiting on the timerlat_fd. We need
+			 * to close all users, and the best way to guarantee this is
+			 * by killing the thread. NOTE: this is a purpose specific file.
+			 */
+			kill_pid(kthread->thread_pid, SIGKILL, 1);
+			put_task_struct(kthread);
+		}
 		per_cpu(per_cpu_osnoise_var, cpu).kthread = NULL;
 	} else {
+		/* if no workload, just return */
 		if (!test_bit(OSN_WORKLOAD, &osnoise_options)) {
+			/*
+			 * This is set in the osnoise tracer case.
+			 */
 			per_cpu(per_cpu_osnoise_var, cpu).sampling = false;
 			barrier();
 			return;
@@ -1912,7 +1932,6 @@  static int start_kthread(unsigned int cpu)
 			barrier();
 			return 0;
 		}
-
 		snprintf(comm, 24, "osnoise/%d", cpu);
 	}
 
@@ -1941,6 +1960,11 @@  static int start_per_cpu_kthreads(void)
 	int retval = 0;
 	int cpu;
 
+	if (!test_bit(OSN_WORKLOAD, &osnoise_options)) {
+		if (timerlat_enabled())
+			return 0;
+	}
+
 	cpus_read_lock();
 	/*
 	 * Run only on online CPUs in which osnoise is allowed to run.
@@ -2281,6 +2305,238 @@  osnoise_cpus_write(struct file *filp, const char __user *ubuf, size_t count,
 	return err;
 }
 
+static int timerlat_fd_open(struct inode *inode, struct file *file)
+{
+	struct osnoise_variables *osn_var;
+	struct timerlat_variables *tlat;
+	long cpu = (long) inode->i_cdev;
+
+	mutex_lock(&interface_lock);
+
+	/*
+	 * This file is accessible only if timerlat is enabled, and
+	 * NO_OSNOISE_WORKLOAD is set.
+	 */
+	if (!timerlat_enabled() || test_bit(OSN_WORKLOAD, &osnoise_options)) {
+		mutex_unlock(&interface_lock);
+		return -EINVAL;
+	}
+
+	migrate_disable();
+
+	osn_var = this_cpu_osn_var();
+
+	/*
+	 * The osn_var->pid holds the single access to this file.
+	 */
+	if (osn_var->pid) {
+		mutex_unlock(&interface_lock);
+		migrate_enable();
+		return -EBUSY;
+	}
+
+	/*
+	 * timerlat tracer is a per-cpu tracer. Check that the user-space task
+	 * is also pinned to a single CPU. The tracer later monitors if the
+	 * task migrates and disables the tracer if it does. However, it is
+	 * worth doing this basic acceptance test to avoid an obviously wrong
+	 * setup.
+	 */
+	if (current->nr_cpus_allowed > 1 ||  cpu != smp_processor_id()) {
+		mutex_unlock(&interface_lock);
+		migrate_enable();
+		return -EPERM;
+	}
+
+	/*
+	 * From now on, it is good to go.
+	 */
+	file->private_data = inode->i_cdev;
+
+	get_task_struct(current);
+
+	osn_var->kthread = current;
+	osn_var->pid = current->pid;
+
+	/*
+	 * Setup is done.
+	 */
+	mutex_unlock(&interface_lock);
+
+	tlat = this_cpu_tmr_var();
+	tlat->count = 0;
+
+	migrate_enable();
+	return 0;
+}
+
+/*
+ * timerlat_fd_read - Read function for "timerlat_fd" file
+ * @file: The active open file structure
+ * @ubuf: The userspace provided buffer to read value into
+ * @count: The maximum number of bytes to read
+ * @ppos: The current "file" position
+ *
+ * Returns 0 after a timerlat cycle, or -EINVAL if the thread migrated.
+ */
+static ssize_t
+timerlat_fd_read(struct file *file, char __user *ubuf, size_t count,
+		  loff_t *ppos)
+{
+	long cpu = (long) file->private_data;
+	struct osnoise_variables *osn_var;
+	struct timerlat_variables *tlat;
+	struct timerlat_sample s;
+	s64 diff;
+	u64 now;
+
+	migrate_disable();
+
+	tlat = this_cpu_tmr_var();
+
+	/*
+	 * While in user-space, the thread is migratable. There is nothing
+	 * we can do about it.
+	 * So, if the thread is running on another CPU, stop the machinery.
+	 */
+	if (cpu == smp_processor_id()) {
+		if (tlat->uthread_migrate) {
+			migrate_enable();
+			return -EINVAL;
+		}
+	} else {
+		per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
+		osnoise_taint("timerlat user thread migrate\n");
+		osnoise_stop_tracing();
+		migrate_enable();
+		return -EINVAL;
+	}
+
+	osn_var = this_cpu_osn_var();
+
+	/*
+	 * The timerlat in user-space runs in a different order:
+	 * the read() starts from the execution of the previous occurrence,
+	 * sleeping for the next occurrence.
+	 *
+	 * So, skip if we are entering on read() before the first wakeup
+	 * from timerlat IRQ:
+	 */
+	if (likely(osn_var->sampling)) {
+		now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
+		diff = now - tlat->abs_period;
+
+		/*
+		 * it was not a timer firing, but some other signal?
+		 */
+		if (diff < 0)
+			goto out;
+
+		s.seqnum = tlat->count;
+		s.timer_latency = diff;
+		s.context = THREAD_URET;
+
+		trace_timerlat_sample(&s);
+
+		notify_new_max_latency(diff);
+
+		tlat->tracing_thread = false;
+		if (osnoise_data.stop_tracing_total)
+			if (time_to_us(diff) >= osnoise_data.stop_tracing_total)
+				osnoise_stop_tracing();
+	} else {
+		tlat->tracing_thread = false;
+		tlat->kthread = current;
+
+		hrtimer_init(&tlat->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
+		tlat->timer.function = timerlat_irq;
+
+		/* Annotate now to drift new period */
+		tlat->abs_period = hrtimer_cb_get_time(&tlat->timer);
+
+		osn_var->sampling = 1;
+	}
+
+	/* wait for the next period */
+	wait_next_period(tlat);
+
+	/* This is the wakeup from this cycle */
+	now = ktime_to_ns(hrtimer_cb_get_time(&tlat->timer));
+	diff = now - tlat->abs_period;
+
+	/*
+	 * it was not a timer firing, but some other signal?
+	 */
+	if (diff < 0)
+		goto out;
+
+	s.seqnum = tlat->count;
+	s.timer_latency = diff;
+	s.context = THREAD_CONTEXT;
+
+	trace_timerlat_sample(&s);
+
+	if (osnoise_data.stop_tracing_total) {
+		if (time_to_us(diff) >= osnoise_data.stop_tracing_total) {
+			timerlat_dump_stack(time_to_us(diff));
+			notify_new_max_latency(diff);
+			osnoise_stop_tracing();
+		}
+	}
+
+out:
+	migrate_enable();
+	return 0;
+}
+
+static int timerlat_fd_release(struct inode *inode, struct file *file)
+{
+	struct osnoise_variables *osn_var;
+	struct timerlat_variables *tlat_var;
+	long cpu = (long) file->private_data;
+
+	migrate_disable();
+	mutex_lock(&interface_lock);
+
+	osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
+	tlat_var = per_cpu_ptr(&per_cpu_timerlat_var, cpu);
+
+	hrtimer_cancel(&tlat_var->timer);
+	memset(tlat_var, 0, sizeof(*tlat_var));
+
+	osn_var->sampling = 0;
+	osn_var->pid = 0;
+
+	/*
+	 * We are leaving, not being stopped... see stop_kthread();
+	 */
+	if (osn_var->kthread) {
+		put_task_struct(osn_var->kthread);
+		osn_var->kthread = NULL;
+	}
+
+	mutex_unlock(&interface_lock);
+	migrate_enable();
+	return 0;
+}
+
+static void check_timerlat_user_migration(pid_t pid, long dest_cpu)
+{
+	struct osnoise_variables *osn_var;
+	long cpu;
+
+	for_each_possible_cpu(cpu) {
+		osn_var = per_cpu_ptr(&per_cpu_osnoise_var, cpu);
+		if (osn_var->pid == pid && dest_cpu != cpu) {
+			per_cpu_ptr(&per_cpu_timerlat_var, cpu)->uthread_migrate = 1;
+			osnoise_taint("timerlat user thread migrate\n");
+			osnoise_stop_tracing();
+			break;
+		}
+	}
+}
+
+
 /*
  * osnoise/runtime_us: cannot be greater than the period.
  */
@@ -2361,6 +2617,13 @@  static const struct file_operations osnoise_options_fops = {
 	.write		= osnoise_options_write
 };
 
+static const struct file_operations timerlat_fd_fops = {
+	.open		= timerlat_fd_open,
+	.read		= timerlat_fd_read,
+	.release	= timerlat_fd_release,
+	.llseek		= generic_file_llseek,
+};
+
 #ifdef CONFIG_TIMERLAT_TRACER
 #ifdef CONFIG_STACKTRACE
 static int init_timerlat_stack_tracefs(struct dentry *top_dir)
@@ -2381,18 +2644,63 @@  static int init_timerlat_stack_tracefs(struct dentry *top_dir)
 }
 #endif /* CONFIG_STACKTRACE */
 
+int osnoise_create_cpu_timerlat_fd(struct dentry *top_dir)
+{
+	struct dentry *timerlat_fd;
+	struct dentry *per_cpu;
+	struct dentry *cpu_dir;
+	char cpu_str[30]; /* see trace.c: tracing_init_tracefs_percpu() */
+	long cpu;
+
+	/*
+	 * Why not use the tracing instance per_cpu/ dir?
+	 *
+	 * Because osnoise/timerlat have a single workload, having
+	 * multiple files like these is a waste of memory.
+	 */
+	per_cpu = tracefs_create_dir("per_cpu", top_dir);
+	if (!per_cpu)
+		return -ENOMEM;
+
+	for_each_possible_cpu(cpu) {
+		snprintf(cpu_str, 30, "cpu%ld", cpu);
+		cpu_dir = tracefs_create_dir(cpu_str, per_cpu);
+		if (!cpu_dir)
+			goto out_clean;
+
+		timerlat_fd = trace_create_file("timerlat_fd", TRACE_MODE_READ,
+						cpu_dir, NULL, &timerlat_fd_fops);
+		if (!timerlat_fd)
+			goto out_clean;
+
+		/* Record the CPU */
+		d_inode(timerlat_fd)->i_cdev = (void *)(cpu);
+	}
+
+	return 0;
+
+out_clean:
+	tracefs_remove(per_cpu);
+	return -ENOMEM;
+}
+
 /*
  * init_timerlat_tracefs - A function to initialize the timerlat interface files
  */
 static int init_timerlat_tracefs(struct dentry *top_dir)
 {
 	struct dentry *tmp;
+	int retval;
 
 	tmp = tracefs_create_file("timerlat_period_us", TRACE_MODE_WRITE, top_dir,
 				  &timerlat_period, &trace_min_max_fops);
 	if (!tmp)
 		return -ENOMEM;
 
+	retval = osnoise_create_cpu_timerlat_fd(top_dir);
+	if (retval)
+		return retval;
+
 	return init_timerlat_stack_tracefs(top_dir);
 }
 #else /* CONFIG_TIMERLAT_TRACER */
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 15f05faaae44..9f10c0071c4f 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -1446,6 +1446,8 @@  static struct trace_event trace_osnoise_event = {
 };
 
 /* TRACE_TIMERLAT */
+
+static char *timerlat_lat_context[] = {"irq", "thread", "user-ret"};
 static enum print_line_t
 trace_timerlat_print(struct trace_iterator *iter, int flags,
 		     struct trace_event *event)
@@ -1458,7 +1460,7 @@  trace_timerlat_print(struct trace_iterator *iter, int flags,
 
 	trace_seq_printf(s, "#%-5u context %6s timer_latency %9llu ns\n",
 			 field->seqnum,
-			 field->context ? "thread" : "irq",
+			 timerlat_lat_context[field->context],
 			 field->timer_latency);
 
 	return trace_handle_return(s);