tracing/markers: make markers select tracepoints

Message ID	499edf47.1818d00a.060b.2b8d@mx.google.com (mailing list archive)
State	New, archived
Headers	show Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by demeter.kernel.org (8.14.2/8.14.2) with ESMTP id n1KGoQlX003678 for <patchwork-kbuild@patchwork.kernel.org>; Fri, 20 Feb 2009 16:50:26 GMT DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:from:to:cc:date:subject; b=kQvQk1zHrTIzheUU33oOkiCBvKuzARBFJLtXGMMrewUC6fGuHi8Bx8QiuuMkdC9F/J aKrv+5xRCYzRdeDmbaGdME6K5C3V8Sewc0iTu3Fr5vMfeqXOrl+iST+0aTp8WGFXadWl iq47quxN6Ks/0OjNEhzZc/itXofoKmtMc2SUw= Message-ID: <499edf47.1818d00a.060b.2b8d@mx.google.com> From: Frederic Weisbecker <fweisbec@gmail.com> To: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org>, linux-kernel@vger.kernel.org, zippel@linux-m68k.org, linux-kbuild@vger.kernel.org Date: Fri, 20 Feb 2009 17:34:06 +0100 Subject: [PATCH] tracing/markers: make markers select tracepoints Sender: linux-kbuild-owner@vger.kernel.org Precedence: bulk

Frederic Weisbecker Feb. 20, 2009, 4:34 p.m. UTC

Sometimes it happens that KConfig dependencies are not handled
like in the following scenario:

- config A
   bool

- config B
   bool
   depends on A

-config C
   bool
   select B

If one selects C, then it will select B without checking its dependency to A, if A
hasn't been selected elsewhere, it will result in a build crash.

This is what happens on the following build error:

kernel/built-in.o: In function `marker_update_probe_range':
(.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
kernel/built-in.o: In function `marker_update_probe_range':
(.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
kernel/built-in.o: In function `marker_update_probe_range':
(.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
kernel/built-in.o: In function `marker_update_probes':
marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'

CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter depends on CONFIG_TRACEPOINTS
which will not be selected.

A temporary fix is to make CONFIG_MARKER select CONFIG_TRACEPOINTS, though it doesn't
fix the source KConfig dependency handling problem.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
---
 init/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Randy Dunlap Feb. 20, 2009, 4:59 p.m. UTC | #1

Frederic Weisbecker wrote:
> Sometimes it happens that KConfig dependencies are not handled
> like in the following scenario:
> 
> - config A
>    bool
> 
> - config B
>    bool
>    depends on A
> 
> -config C
>    bool
>    select B
> 
> If one selects C, then it will select B without checking its dependency to A, if A
> hasn't been selected elsewhere, it will result in a build crash.

as documented in Documentation/kbuild/kconfig-language.txt (and yes, it is
considered to be a shortcoming/problem by most people AFAIK):

  Note:
	select should be used with care. select will force
	a symbol to a value without visiting the dependencies.
	By abusing select you are able to select a symbol FOO even
	if FOO depends on BAR that is not set.
	In general use select only for non-visible symbols
	(no prompts anywhere) and for symbols with no dependencies.
	That will limit the usefulness but on the other hand avoid
	the illegal configurations all over.
	kconfig should one day warn about such things.

> This is what happens on the following build error:
> 
> kernel/built-in.o: In function `marker_update_probe_range':
> (.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
> kernel/built-in.o: In function `marker_update_probe_range':
> (.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
> kernel/built-in.o: In function `marker_update_probe_range':
> (.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
> kernel/built-in.o: In function `marker_update_probes':
> marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'
> 
> CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter depends on CONFIG_TRACEPOINTS
> which will not be selected.
> 
> A temporary fix is to make CONFIG_MARKER select CONFIG_TRACEPOINTS, though it doesn't
> fix the source KConfig dependency handling problem.
> 
> Reported-by: Ingo Molnar <mingo@elte.hu>
> Cc: Roman Zippel <zippel@linux-m68k.org>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> ---
>  init/Kconfig |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index b6400a5..a93f957 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -975,7 +975,7 @@ config TRACEPOINTS
>  
>  config MARKERS
>  	bool "Activate markers"
> -	depends on TRACEPOINTS
> +	select TRACEPOINTS
>  	help
>  	  Place an empty function call at each marker site. Can be
>  	  dynamically changed for a probe function.

but using "select" instead of "depends on" just causes the kind of problem
that you described, whereas using "depends on" does follow dependency
chains.

Frederic Weisbecker Feb. 20, 2009, 5:20 p.m. UTC | #2

On Fri, Feb 20, 2009 at 08:59:14AM -0800, Randy Dunlap wrote:
> Frederic Weisbecker wrote:
> > Sometimes it happens that KConfig dependencies are not handled
> > like in the following scenario:
> > 
> > - config A
> >    bool
> > 
> > - config B
> >    bool
> >    depends on A
> > 
> > -config C
> >    bool
> >    select B
> > 
> > If one selects C, then it will select B without checking its dependency to A, if A
> > hasn't been selected elsewhere, it will result in a build crash.
> 
> as documented in Documentation/kbuild/kconfig-language.txt (and yes, it is
> considered to be a shortcoming/problem by most people AFAIK):
> 
>   Note:
> 	select should be used with care. select will force
> 	a symbol to a value without visiting the dependencies.
> 	By abusing select you are able to select a symbol FOO even
> 	if FOO depends on BAR that is not set.
> 	In general use select only for non-visible symbols
> 	(no prompts anywhere) and for symbols with no dependencies.
> 	That will limit the usefulness but on the other hand avoid
> 	the illegal configurations all over.
> 	kconfig should one day warn about such things.


Ah thanks!


 
> > This is what happens on the following build error:
> > 
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
> > kernel/built-in.o: In function `marker_update_probes':
> > marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'
> > 
> > CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter depends on CONFIG_TRACEPOINTS
> > which will not be selected.
> > 
> > A temporary fix is to make CONFIG_MARKER select CONFIG_TRACEPOINTS, though it doesn't
> > fix the source KConfig dependency handling problem.
> > 
> > Reported-by: Ingo Molnar <mingo@elte.hu>
> > Cc: Roman Zippel <zippel@linux-m68k.org>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> >  init/Kconfig |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/init/Kconfig b/init/Kconfig
> > index b6400a5..a93f957 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -975,7 +975,7 @@ config TRACEPOINTS
> >  
> >  config MARKERS
> >  	bool "Activate markers"
> > -	depends on TRACEPOINTS
> > +	select TRACEPOINTS
> >  	help
> >  	  Place an empty function call at each marker site. Can be
> >  	  dynamically changed for a probe function.
> 
> but using "select" instead of "depends on" just causes the kind of problem
> that you described, whereas using "depends on" does follow dependency
> chains.


Ok, but here we are in the case described in the above KConfig documentation
which tells that select is mostly useful for config that are not prompted.

This is the case for TRACEPOINTS. So I think this fix is right.

MARKERS/TRACEPOINTS could be likely described as library which can be used
by debugging facilities anywhere in the kernel.

Using only a chain of dependency would hide a lot of those debugging options.
So IMHO, I think such "library" things should be selected directly from other
options and not be enabled by hand, hence only show the debugging options anywhere
in the kernel if those two are selected.

 
> -- 
> ~Randy

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar Feb. 20, 2009, 5:22 p.m. UTC | #3

* Randy Dunlap <randy.dunlap@oracle.com> wrote:

> > This is what happens on the following build error:
> > 
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
> > kernel/built-in.o: In function `marker_update_probe_range':
> > (.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
> > kernel/built-in.o: In function `marker_update_probes':
> > marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'
> > 
> > CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter 
> > depends on CONFIG_TRACEPOINTS which will not be selected.
> > 
> > A temporary fix is to make CONFIG_MARKER select 
> > CONFIG_TRACEPOINTS, though it doesn't fix the source KConfig 
> > dependency handling problem.
> > 
> > Reported-by: Ingo Molnar <mingo@elte.hu>
> > Cc: Roman Zippel <zippel@linux-m68k.org>
> > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > ---
> >  init/Kconfig |    2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> > 
> > diff --git a/init/Kconfig b/init/Kconfig
> > index b6400a5..a93f957 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
> > @@ -975,7 +975,7 @@ config TRACEPOINTS
> >  
> >  config MARKERS
> >  	bool "Activate markers"
> > -	depends on TRACEPOINTS
> > +	select TRACEPOINTS
> >  	help
> >  	  Place an empty function call at each marker site. Can be
> >  	  dynamically changed for a probe function.
> 
> but using "select" instead of "depends on" just causes the 
> kind of problem that you described, whereas using "depends on" 
> does follow dependency chains.

Well, as long as the secondary selects are 'expanded' along the 
line of dependencies, it should still be fine. With an 
increasing number of dependencies it quickly becomes ugly 
though. This might be one of the cases where it works.

Eventually we should eliminate markers, their uses can either be 
converted to new-style tracepoints, or to ftrace_printk().

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Randy Dunlap Feb. 20, 2009, 5:29 p.m. UTC | #4

Ingo Molnar wrote:
> * Randy Dunlap <randy.dunlap@oracle.com> wrote:
> 
>>> This is what happens on the following build error:
>>>
>>> kernel/built-in.o: In function `marker_update_probe_range':
>>> (.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
>>> kernel/built-in.o: In function `marker_update_probe_range':
>>> (.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
>>> kernel/built-in.o: In function `marker_update_probe_range':
>>> (.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
>>> kernel/built-in.o: In function `marker_update_probes':
>>> marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'
>>>
>>> CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter 
>>> depends on CONFIG_TRACEPOINTS which will not be selected.
>>>
>>> A temporary fix is to make CONFIG_MARKER select 
>>> CONFIG_TRACEPOINTS, though it doesn't fix the source KConfig 
>>> dependency handling problem.
>>>
>>> Reported-by: Ingo Molnar <mingo@elte.hu>
>>> Cc: Roman Zippel <zippel@linux-m68k.org>
>>> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
>>> ---
>>>  init/Kconfig |    2 +-
>>>  1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/init/Kconfig b/init/Kconfig
>>> index b6400a5..a93f957 100644
>>> --- a/init/Kconfig
>>> +++ b/init/Kconfig
>>> @@ -975,7 +975,7 @@ config TRACEPOINTS
>>>  
>>>  config MARKERS
>>>  	bool "Activate markers"
>>> -	depends on TRACEPOINTS
>>> +	select TRACEPOINTS
>>>  	help
>>>  	  Place an empty function call at each marker site. Can be
>>>  	  dynamically changed for a probe function.
>> but using "select" instead of "depends on" just causes the 
>> kind of problem that you described, whereas using "depends on" 
>> does follow dependency chains.
> 
> Well, as long as the secondary selects are 'expanded' along the 
> line of dependencies, it should still be fine. With an 

Agreed.

> increasing number of dependencies it quickly becomes ugly 
> though. This might be one of the cases where it works.

Agreed.

> Eventually we should eliminate markers, their uses can either be 
> converted to new-style tracepoints, or to ftrace_printk().


Thanks,

Frederic Weisbecker Feb. 20, 2009, 5:31 p.m. UTC | #5

On Fri, Feb 20, 2009 at 06:22:41PM +0100, Ingo Molnar wrote:
> 
> * Randy Dunlap <randy.dunlap@oracle.com> wrote:
> 
> > > This is what happens on the following build error:
> > > 
> > > kernel/built-in.o: In function `marker_update_probe_range':
> > > (.text+0x52f64): undefined reference to `tracepoint_probe_register_noupdate'
> > > kernel/built-in.o: In function `marker_update_probe_range':
> > > (.text+0x52f74): undefined reference to `tracepoint_probe_unregister_noupdate'
> > > kernel/built-in.o: In function `marker_update_probe_range':
> > > (.text+0x52fb9): undefined reference to `tracepoint_probe_unregister_noupdate'
> > > kernel/built-in.o: In function `marker_update_probes':
> > > marker.c:(.text+0x530ba): undefined reference to `tracepoint_probe_update_all'
> > > 
> > > CONFIG_KVM_TRACE will select CONFIG_MARKER, but the latter 
> > > depends on CONFIG_TRACEPOINTS which will not be selected.
> > > 
> > > A temporary fix is to make CONFIG_MARKER select 
> > > CONFIG_TRACEPOINTS, though it doesn't fix the source KConfig 
> > > dependency handling problem.
> > > 
> > > Reported-by: Ingo Molnar <mingo@elte.hu>
> > > Cc: Roman Zippel <zippel@linux-m68k.org>
> > > Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> > > ---
> > >  init/Kconfig |    2 +-
> > >  1 files changed, 1 insertions(+), 1 deletions(-)
> > > 
> > > diff --git a/init/Kconfig b/init/Kconfig
> > > index b6400a5..a93f957 100644
> > > --- a/init/Kconfig
> > > +++ b/init/Kconfig
> > > @@ -975,7 +975,7 @@ config TRACEPOINTS
> > >  
> > >  config MARKERS
> > >  	bool "Activate markers"
> > > -	depends on TRACEPOINTS
> > > +	select TRACEPOINTS
> > >  	help
> > >  	  Place an empty function call at each marker site. Can be
> > >  	  dynamically changed for a probe function.
> > 
> > but using "select" instead of "depends on" just causes the 
> > kind of problem that you described, whereas using "depends on" 
> > does follow dependency chains.
> 
> Well, as long as the secondary selects are 'expanded' along the 
> line of dependencies, it should still be fine. With an 
> increasing number of dependencies it quickly becomes ugly 
> though. This might be one of the cases where it works.
> 
> Eventually we should eliminate markers, their uses can either be 
> converted to new-style tracepoints, or to ftrace_printk().
> 
> 	Ingo

ftrace_printk adds more overhead if not used since it inconditionally
send the trace, unless the related TRACE_ITER_PRINTK flag on ftrace is unset.

I don't know how markers work, but the documentation describes that a single branch
check is done in case the probe is disabled.

With ftrace_printk, even if TRACE_ITER_PRINTK is unset, this is still
one call and one branch check. So for hot callsite it's unappropriate.

IMHO, tracepoints are more suited to replace markers if they have to.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar Feb. 20, 2009, 5:48 p.m. UTC | #6

* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> > > >  config MARKERS
> > > >  	bool "Activate markers"
> > > > -	depends on TRACEPOINTS
> > > > +	select TRACEPOINTS
> > > >  	help
> > > >  	  Place an empty function call at each marker site. Can be
> > > >  	  dynamically changed for a probe function.
> > > 
> > > but using "select" instead of "depends on" just causes the 
> > > kind of problem that you described, whereas using "depends on" 
> > > does follow dependency chains.
> > 
> > Well, as long as the secondary selects are 'expanded' along the 
> > line of dependencies, it should still be fine. With an 
> > increasing number of dependencies it quickly becomes ugly 
> > though. This might be one of the cases where it works.
> > 
> > Eventually we should eliminate markers, their uses can either be 
> > converted to new-style tracepoints, or to ftrace_printk().
> > 
> > 	Ingo
> 
> ftrace_printk adds more overhead if not used since it 
> inconditionally send the trace, unless the related 
> TRACE_ITER_PRINTK flag on ftrace is unset.
> 
> I don't know how markers work, but the documentation describes 
> that a single branch check is done in case the probe is 
> disabled.
> 
> With ftrace_printk, even if TRACE_ITER_PRINTK is unset, this 
> is still one call and one branch check. So for hot callsite 
> it's unappropriate.
> 
> IMHO, tracepoints are more suited to replace markers if they 
> have to.

there's i think the KVM usecase where markers are used 
essentially a printk()-alike flexible tracing facility.

This is how KVMTRACE looks like at the moment:

./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
./vmx.c:	KVMTRACE_3D(VMEXIT, vcpu, exit_reason, (u32)kvm_rip_read(vcpu),
./vmx.c:		KVMTRACE_0D(NMI, vcpu, handler);
./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
./svm.c:	KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
./svm.c:		KVMTRACE_3D(PAGE_FAULT, &svm->vcpu, error_code,
./svm.c:		KVMTRACE_3D(TDP_FAULT, &svm->vcpu, error_code,
./svm.c:	KVMTRACE_0D(NMI, &svm->vcpu, handler);

I think this could easily be converted to a wrapper around 
ftrace_printk() plus a "kvmtrace" ftrace plugin which activates 
those wrapped ftrace_printk() sites via a single global 
__read_mostly flag.

This means KVM tracing is activated via:

 echo kvmtrace > /debug/tracing/current_tracer

that's all.

Basically the conversion would look like this:

before:

        KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
                    (u32)(data >> 32), handler);

after:

	kvm_trace("MSR_READ: %p, %08lx, %016Lx\n", &svm->vcpu, ecx, data);

As a result all these traces would become a lot more readable 
(and a lot more flexible) both in the source code, and in the 
trace output stage.

And any ad-hoc tracepoint can be added, without worrying about 
the name of the macro or the number of type of arguments. Note 
that in this specific example we didnt need to split up the u64 
'data' into two 32-bit values, nor do we have to pass in the 
'handler' name, nor do we have to provide a MSR_READ 
enumeration.

The tracing-disabled case would still be as fast - a single 
branch check.

Avi, what do you think, any objections against an RFC patchset 
that shows this off?

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Jason Baron Feb. 20, 2009, 6:56 p.m. UTC | #7

On Fri, Feb 20, 2009 at 06:48:11PM +0100, Ingo Molnar wrote:
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > > > >  config MARKERS
> > > > >  	bool "Activate markers"
> > > > > -	depends on TRACEPOINTS
> > > > > +	select TRACEPOINTS
> > > > >  	help
> > > > >  	  Place an empty function call at each marker site. Can be
> > > > >  	  dynamically changed for a probe function.
> > > > 
> > > > but using "select" instead of "depends on" just causes the 
> > > > kind of problem that you described, whereas using "depends on" 
> > > > does follow dependency chains.
> > > 
> > > Well, as long as the secondary selects are 'expanded' along the 
> > > line of dependencies, it should still be fine. With an 
> > > increasing number of dependencies it quickly becomes ugly 
> > > though. This might be one of the cases where it works.
> > > 
> > > Eventually we should eliminate markers, their uses can either be 
> > > converted to new-style tracepoints, or to ftrace_printk().
> > > 
> > > 	Ingo
> > 
> > ftrace_printk adds more overhead if not used since it 
> > inconditionally send the trace, unless the related 
> > TRACE_ITER_PRINTK flag on ftrace is unset.
> > 
> > I don't know how markers work, but the documentation describes 
> > that a single branch check is done in case the probe is 
> > disabled.
> > 
> > With ftrace_printk, even if TRACE_ITER_PRINTK is unset, this 
> > is still one call and one branch check. So for hot callsite 
> > it's unappropriate.
> > 
> > IMHO, tracepoints are more suited to replace markers if they 
> > have to.
> 
> there's i think the KVM usecase where markers are used 
> essentially a printk()-alike flexible tracing facility.
> 
> This is how KVMTRACE looks like at the moment:
> 
> ./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
> ./vmx.c:	KVMTRACE_3D(VMEXIT, vcpu, exit_reason, (u32)kvm_rip_read(vcpu),
> ./vmx.c:		KVMTRACE_0D(NMI, vcpu, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./svm.c:	KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
> ./svm.c:		KVMTRACE_3D(PAGE_FAULT, &svm->vcpu, error_code,
> ./svm.c:		KVMTRACE_3D(TDP_FAULT, &svm->vcpu, error_code,
> ./svm.c:	KVMTRACE_0D(NMI, &svm->vcpu, handler);
> 
> I think this could easily be converted to a wrapper around 
> ftrace_printk() plus a "kvmtrace" ftrace plugin which activates 
> those wrapped ftrace_printk() sites via a single global 
> __read_mostly flag.
> 
> This means KVM tracing is activated via:
> 
>  echo kvmtrace > /debug/tracing/current_tracer
> 
> that's all.
> 
> Basically the conversion would look like this:
> 
> before:
> 
>         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
>                     (u32)(data >> 32), handler);
> 
> after:
> 
> 	kvm_trace("MSR_READ: %p, %08lx, %016Lx\n", &svm->vcpu, ecx, data);
> 
> As a result all these traces would become a lot more readable 
> (and a lot more flexible) both in the source code, and in the 
> trace output stage.
> 
> And any ad-hoc tracepoint can be added, without worrying about 
> the name of the macro or the number of type of arguments. Note 
> that in this specific example we didnt need to split up the u64 
> 'data' into two 32-bit values, nor do we have to pass in the 
> 'handler' name, nor do we have to provide a MSR_READ 
> enumeration.
> 
> The tracing-disabled case would still be as fast - a single 
> branch check.
> 
> Avi, what do you think, any objections against an RFC patchset 
> that shows this off?
> 

hi,

this is the kind of case that i've designed 'dynamic debug' for. It
implicitly allows fined grained control via - source file and line
number, function name, module name, and/or fomat string. If we are going
to have a bunch of these type of interfaces sprinkled throughout the
kernel, I think this fine grained control becomes important. this is all
controlled and visible via: <debugfs>/dynamic_debug/control file. In the
above case i believe matching 'kvm' module would turn all these
statements on/off. We can always add 'tags' to tie specific debug
statements together as well.

In terms of usage, currently, if you set 'CONFIG_DYNAMIC_DEBUG' all
'pr_debug' and 'dev_dbg' statements are picked up. Thus, simply
converting the above statements to use 'pr_debug()' will make them
run-time tunable if CONFIG_DYNAMIC_DEBUG is set. That's all that needs
to be done...we don't need to write an ftracer debug statements.

We could also make subsystem specific macros as you mentioned
conditional on subsystem specific CONFIG parameters. For example, if 
'CONFIG_KVM_DEBUG' is set we tie it into this infrastructure otherwise its a
no-op.

In terms of the 'backend', currently we're using 'printk'. However, i've
posted patches in to tie into ftrace_printk(). This could easily be made
toggle switch.

That said, markers do allow for callbacks, so that different parts of
the kernel can register and get called when the marker is hit. This
behavior might be more appropriate if its not simple 'printk' style
debugging.

thanks,

-Jason


--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frederic Weisbecker Feb. 21, 2009, 3:15 a.m. UTC | #8

On Fri, Feb 20, 2009 at 06:48:11PM +0100, Ingo Molnar wrote:
> 
> * Frederic Weisbecker <fweisbec@gmail.com> wrote:
> 
> > > > >  config MARKERS
> > > > >  	bool "Activate markers"
> > > > > -	depends on TRACEPOINTS
> > > > > +	select TRACEPOINTS
> > > > >  	help
> > > > >  	  Place an empty function call at each marker site. Can be
> > > > >  	  dynamically changed for a probe function.
> > > > 
> > > > but using "select" instead of "depends on" just causes the 
> > > > kind of problem that you described, whereas using "depends on" 
> > > > does follow dependency chains.
> > > 
> > > Well, as long as the secondary selects are 'expanded' along the 
> > > line of dependencies, it should still be fine. With an 
> > > increasing number of dependencies it quickly becomes ugly 
> > > though. This might be one of the cases where it works.
> > > 
> > > Eventually we should eliminate markers, their uses can either be 
> > > converted to new-style tracepoints, or to ftrace_printk().
> > > 
> > > 	Ingo
> > 
> > ftrace_printk adds more overhead if not used since it 
> > inconditionally send the trace, unless the related 
> > TRACE_ITER_PRINTK flag on ftrace is unset.
> > 
> > I don't know how markers work, but the documentation describes 
> > that a single branch check is done in case the probe is 
> > disabled.
> > 
> > With ftrace_printk, even if TRACE_ITER_PRINTK is unset, this 
> > is still one call and one branch check. So for hot callsite 
> > it's unappropriate.
> > 
> > IMHO, tracepoints are more suited to replace markers if they 
> > have to.
> 
> there's i think the KVM usecase where markers are used 
> essentially a printk()-alike flexible tracing facility.
> 
> This is how KVMTRACE looks like at the moment:
> 
> ./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
> ./vmx.c:	KVMTRACE_3D(VMEXIT, vcpu, exit_reason, (u32)kvm_rip_read(vcpu),
> ./vmx.c:		KVMTRACE_0D(NMI, vcpu, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./lapic.c:	KVMTRACE_1D(APIC_ACCESS, apic->vcpu, (u32)offset, handler);
> ./svm.c:	KVMTRACE_2D(DR_READ, vcpu, (u32)dr, (u32)val, handler);
> ./svm.c:		KVMTRACE_3D(PAGE_FAULT, &svm->vcpu, error_code,
> ./svm.c:		KVMTRACE_3D(TDP_FAULT, &svm->vcpu, error_code,
> ./svm.c:	KVMTRACE_0D(NMI, &svm->vcpu, handler);
> 
> I think this could easily be converted to a wrapper around 
> ftrace_printk() plus a "kvmtrace" ftrace plugin which activates 
> those wrapped ftrace_printk() sites via a single global 
> __read_mostly flag.
> 
> This means KVM tracing is activated via:
> 
>  echo kvmtrace > /debug/tracing/current_tracer
> 
> that's all.


Note that I haven't looked at kvm tracing in detail, so I could be wrong.
But when I look of the kind of events above which are traced through KVM,
I guess the frequency rate of traces can be high.

The use of ftrace_printk seems to me an overhead here for ring buffer space consuming
and tracing overhead itself.

ftrace_printk inserts the events on the ring buffer as already formatted, so
it consumes more space that traditional binary insertion. But I skip that since we have
16384 entries by default (large enough), though the notion of entry size is dynamic
with the ring buffer.

But doing the vsnprintf on tracing context seems to me inappropriate since
tracing should be as much transparent as possible from the traced site point
of view. I think it should be better on output time, unless the traces rate is not
as high as I imagine.

 
> Basically the conversion would look like this:
> 
> before:
> 
>         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
>                     (u32)(data >> 32), handler);
> 
> after:
> 
> 	kvm_trace("MSR_READ: %p, %08lx, %016Lx\n", &svm->vcpu, ecx, data);
> 
> As a result all these traces would become a lot more readable 
> (and a lot more flexible) both in the source code, and in the 
> trace output stage.


Right.
Anyway, it needs to be integrated into the common tracing part
of the kernel. virt/kvm_trace.c is a whole tracer.


> And any ad-hoc tracepoint can be added, without worrying about 
> the name of the macro or the number of type of arguments. Note 
> that in this specific example we didnt need to split up the u64 
> 'data' into two 32-bit values, nor do we have to pass in the 
> 'handler' name, nor do we have to provide a MSR_READ 
> enumeration.
> 
> The tracing-disabled case would still be as fast - a single 
> branch check.
> 
> Avi, what do you think, any objections against an RFC patchset 
> that shows this off?
> 
> 	Ingo

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 21, 2009, 10:04 p.m. UTC | #9

Ingo Molnar <mingo@elte.hu> writes:

> [...]
> there's i think the KVM usecase where markers are used 
> essentially a printk()-alike flexible tracing facility.
>
> [...]
> ./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
> ./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
>
> I think this could easily be converted to a wrapper around 
> ftrace_printk() plus a "kvmtrace" ftrace plugin [...]

It would be even easier converted to the markers API directly,
without the KVMTRACE* macro intermediary:

before:
        KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
                    (u32)(data >> 32), handler);
after:
	trace_mark(kvmtrace, "MSR_READ: %p, %08lx, %016Lx\n",
                             &svm->vcpu, ecx, data);

All this already "just works".

The only part that's missing, from ftrace's point of view, is an
interface from markers to the ftrace printing.  And Lai Jiangshan
kindly posted exactly such generic code back in December:

http://lkml.org/lkml/2008/12/30/297

Please let's get this merged.

> This means KVM tracing is activated via:
>  echo kvmtrace > /debug/tracing/current_tracer
> that's all.

Lai's interface is almost identical, and automatically works for
any/all markers.


> And any ad-hoc tracepoint can be added, without worrying about 
> the name of the macro or the number of type of arguments. [...]

(This KVMTRACE stuff was always unnecessary with respect to markers,
and must have been explained by some historical baggage.)


- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Motohiro KOSAKI Feb. 22, 2009, 3:23 a.m. UTC | #10

Hi

(cc to Frank and Mathieu)

> > >  config MARKERS
> > >  	bool "Activate markers"
> > > -	depends on TRACEPOINTS
> > > +	select TRACEPOINTS
> > >  	help
> > >  	  Place an empty function call at each marker site. Can be
> > >  	  dynamically changed for a probe function.
> > 
> > but using "select" instead of "depends on" just causes the 
> > kind of problem that you described, whereas using "depends on" 
> > does follow dependency chains.
> 
> Well, as long as the secondary selects are 'expanded' along the 
> line of dependencies, it should still be fine. With an 
> increasing number of dependencies it quickly becomes ugly 
> though. This might be one of the cases where it works.
> 
> Eventually we should eliminate markers, their uses can either be 
> converted to new-style tracepoints, or to ftrace_printk().

IIRC, this suggestion still don't get agreement of all tracing feature stakeholder.
We need to definitely discuss more lot and deep.
so I wonder why don't we create linux-tracing new mailing list.

tracer feature perfectly have new ML creation condition
  - very active development
  - many stakeholder and developer
  - use many common parts (eg, marker, kprobe, unified buffer)




--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 22, 2009, 11:37 a.m. UTC | #11

On Sun, 2009-02-22 at 12:23 +0900, KOSAKI Motohiro wrote:

> IIRC, this suggestion still don't get agreement of all tracing feature stakeholder.
> We need to definitely discuss more lot and deep.
> so I wonder why don't we create linux-tracing new mailing list.

Yes, lets hide it from general view, sounds like a brilliant plan.



--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 22, 2009, 11:43 a.m. UTC | #12

On Fri, 2009-02-20 at 18:22 +0100, Ingo Molnar wrote:
> Eventually we should eliminate markers, their uses can either be 
> converted to new-style tracepoints, or to ftrace_printk().

I would like to never merge an ftrace_printk() user... just as I'd like
to get rid of every marker.

Then again, some people appear happy to commit their printk debug spree
too :/

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 22, 2009, 12:08 p.m. UTC | #13

Peter Zijlstra <peterz@infradead.org> writes:

> I would like to never merge an ftrace_printk() user... just as I'd like
> to get rid of every marker.

But why?  They solve a problem well enough that Ingo had in effect
reinvented them on Friday.

- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 22, 2009, 12:14 p.m. UTC | #14

On Sun, 2009-02-22 at 07:08 -0500, Frank Ch. Eigler wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
> > I would like to never merge an ftrace_printk() user... just as I'd like
> > to get rid of every marker.
> 
> But why?  They solve a problem well enough that Ingo had in effect
> reinvented them on Friday.

Because after a printk() debug spree, I don't commit them, I toss them
out and keep the fix.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 22, 2009, 12:24 p.m. UTC | #15

Hi -

On Sun, Feb 22, 2009 at 01:14:36PM +0100, Peter Zijlstra wrote:
> > > I would like to never merge an ftrace_printk() user... just as I'd like
> > > to get rid of every marker.
> > 
> > But why?  They solve a problem well enough that Ingo had in effect
> > reinvented them on Friday.
> 
> Because after a printk() debug spree, I don't commit them, I toss them
> out and keep the fix.

Markers solve a problem closer to tracepoints than to debugging
printk's.

In this context, the main difference between tracepoints is that
markers need almost no hand-written glue code of the sort that make up
ftrace engines that just trace simple values.  Simpler & smaller code
for the same output seems like a win.

- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Mathieu Desnoyers Feb. 22, 2009, 4:04 p.m. UTC | #16

* Peter Zijlstra (peterz@infradead.org) wrote:
> On Sun, 2009-02-22 at 12:23 +0900, KOSAKI Motohiro wrote:
> 
> > IIRC, this suggestion still don't get agreement of all tracing feature stakeholder.
> > We need to definitely discuss more lot and deep.
> > so I wonder why don't we create linux-tracing new mailing list.
> 
> Yes, lets hide it from general view, sounds like a brilliant plan.
> 
> 
> 

I guess Kosaki's point is that there may be a disconnect between the
work that is done on the LTTng side and the ftrace side. LTTng uses the
markers as a core infrastructure part (and tracepoints for
instrumentation). I guess his idea is to improve communication, not to
stop it. So by your reaction, I guess his proposal might not be ideal,
but I hear his concerns. I'll try to post the core lttng patchset
shortly.

Mathieu

Ingo Molnar Feb. 22, 2009, 5:13 p.m. UTC | #17

* Frank Ch. Eigler <fche@redhat.com> wrote:

> Ingo Molnar <mingo@elte.hu> writes:
> 
> > [...]
> > there's i think the KVM usecase where markers are used 
> > essentially a printk()-alike flexible tracing facility.
> >
> > [...]
> > ./vmx.c:	KVMTRACE_3D(MSR_READ, vcpu, ecx, (u32)data, (u32)(data >> 32),
> > ./vmx.c:	KVMTRACE_3D(MSR_WRITE, vcpu, ecx, (u32)data, (u32)(data >> 32),
> > ./vmx.c:	KVMTRACE_0D(PEND_INTR, vcpu, handler);
> >
> > I think this could easily be converted to a wrapper around 
> > ftrace_printk() plus a "kvmtrace" ftrace plugin [...]
> 
> It would be even easier converted to the markers API directly,
> without the KVMTRACE* macro intermediary:
> 
> before:
>         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
>                     (u32)(data >> 32), handler);
> after:
> 	trace_mark(kvmtrace, "MSR_READ: %p, %08lx, %016Lx\n",
>                              &svm->vcpu, ecx, data);
> 
> All this already "just works".

except that we are removing markers.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 22, 2009, 5:38 p.m. UTC | #18

Hi -

On Sun, Feb 22, 2009 at 06:13:44PM +0100, Ingo Molnar wrote:
> [...]
> > It would be even easier converted to the markers API directly,
> > without the KVMTRACE* macro intermediary:
> > 
> > before:
> >         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
> >                     (u32)(data >> 32), handler);
> > after:
> > 	trace_mark(kvmtrace, "MSR_READ: %p, %08lx, %016Lx\n",
> >                              &svm->vcpu, ecx, data);
> > 
> > All this already "just works".
> except that we are removing markers.

Perhaps that intention should be justified and reexamined, especially
considering the KVMTRACE* macro-replacement solution you suggested is
almost the same.


- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steven Rostedt Feb. 23, 2009, 12:23 a.m. UTC | #19

On Sun, 22 Feb 2009, Peter Zijlstra wrote:

> On Fri, 2009-02-20 at 18:22 +0100, Ingo Molnar wrote:
> > Eventually we should eliminate markers, their uses can either be 
> > converted to new-style tracepoints, or to ftrace_printk().
> 
> I would like to never merge an ftrace_printk() user... just as I'd like
> to get rid of every marker.
> 
> Then again, some people appear happy to commit their printk debug spree
> too :/

/me looks at arch/powerpc/kernel/ftrace.c and realizes that he's guilty as 
charged :-p

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Avi Kivity Feb. 23, 2009, 10:13 a.m. UTC | #20

Ingo Molnar wrote:
>         KVMTRACE_3D(MSR_READ, &svm->vcpu, ecx, (u32)data,
>                     (u32)(data >> 32), handler);
>
> after:
>
> 	kvm_trace("MSR_READ: %p, %08lx, %016Lx\n", &svm->vcpu, ecx, data);
>
> As a result all these traces would become a lot more readable 
> (and a lot more flexible) both in the source code, and in the 
> trace output stage.
>
> And any ad-hoc tracepoint can be added, without worrying about 
> the name of the macro or the number of type of arguments. Note 
> that in this specific example we didnt need to split up the u64 
> 'data' into two 32-bit values, nor do we have to pass in the 
> 'handler' name, nor do we have to provide a MSR_READ 
> enumeration.
>
> The tracing-disabled case would still be as fast - a single 
> branch check.
>
> Avi, what do you think, any objections against an RFC patchset 
> that shows this off?
>
>   

Definitely, as long as formatting is performed after the data is 
gathered (say, in userspace).  kvmtrace can generate around 1M 
events/sec/cpu, so we need truly low overhead.

Peter Zijlstra Feb. 23, 2009, 11:11 a.m. UTC | #21

On Sun, 2009-02-22 at 07:24 -0500, Frank Ch. Eigler wrote:
> Hi -
> 
> On Sun, Feb 22, 2009 at 01:14:36PM +0100, Peter Zijlstra wrote:
> > > > I would like to never merge an ftrace_printk() user... just as I'd like
> > > > to get rid of every marker.
> > > 
> > > But why?  They solve a problem well enough that Ingo had in effect
> > > reinvented them on Friday.
> > 
> > Because after a printk() debug spree, I don't commit them, I toss them
> > out and keep the fix.
> 
> Markers solve a problem closer to tracepoints than to debugging
> printk's.

Not so. In both cases the regular stuff (NMI trace, OOPS,
function/graph/sched trace, etc) is not enough and you wish to augment
its output.

> In this context, the main difference between tracepoints is that
> markers need almost no hand-written glue code of the sort that make up
> ftrace engines that just trace simple values.  Simpler & smaller code
> for the same output seems like a win.

Right, for dumb tracers that's true I suppose, however any
high-bandwidth tracer will try to avoid putting silly ASCII strings in
and will therefore need to write more glue code.

Which reduces these default thingies to printk() level debugging.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 23, 2009, 3:44 p.m. UTC | #22

Hi -

On Mon, Feb 23, 2009 at 12:11:27PM +0100, Peter Zijlstra wrote:
> [...]
> > > Because after a printk() debug spree, I don't commit them, I toss them
> > > out and keep the fix.
> > 
> > Markers solve a problem closer to tracepoints than to debugging
> > printk's.

> Not so. In both cases the regular stuff (NMI trace, OOPS,
> function/graph/sched trace, etc) is not enough and you wish to
> augment its output.

Sorry, I don't see how that relates.  If the general function tracing
widgetry is insufficient for some subsystem/purpose, some sort of
static instrumentation is needed.  Whether that instrumentation is
done by markers (with a thin glue to ftrace) or by tracepoints (with a
thick glue to ftrace) doesn't change the need for "augmentation".

> > In this context, the main difference between tracepoints is that
> > markers need almost no hand-written glue code of the sort that make up
> > ftrace engines that just trace simple values.  Simpler & smaller code
> > for the same output seems like a win.
> 
> Right, for dumb tracers that's true I suppose, however any
> high-bandwidth tracer will try to avoid putting silly ASCII strings in
> and will therefore need to write more glue code.

So let's leave it up to the wisdom of each subsystem maintainer to
decide whether any particular trace event is high enough throughput
that direct ASCII rendering is not favourable.  (Considering the
finite size of tracing buffers, high-throughput trace events would
need to be continually drained with ASCII conversion anyway, so the
overall benefit of using packed binary as an intermediate copy may not
be large after all.)

I see all this as an argument to keep the subsystem's options open.
Sometimes the extra complexity of tracepoints is worth it, sometimes
it isn't.

- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 23, 2009, 4:22 p.m. UTC | #23

On Mon, 2009-02-23 at 10:44 -0500, Frank Ch. Eigler wrote:
> Hi -
> 
> On Mon, Feb 23, 2009 at 12:11:27PM +0100, Peter Zijlstra wrote:
> > [...]
> > > > Because after a printk() debug spree, I don't commit them, I toss them
> > > > out and keep the fix.
> > > 
> > > Markers solve a problem closer to tracepoints than to debugging
> > > printk's.
> 
> > Not so. In both cases the regular stuff (NMI trace, OOPS,
> > function/graph/sched trace, etc) is not enough and you wish to
> > augment its output.
> 
> Sorry, I don't see how that relates.  If the general function tracing
> widgetry is insufficient for some subsystem/purpose, some sort of
> static instrumentation is needed.  Whether that instrumentation is
> done by markers (with a thin glue to ftrace) or by tracepoints (with a
> thick glue to ftrace) doesn't change the need for "augmentation".

I'm not arguing against static instrumentation per-se (although
expanding the coverage of dynamic/automatic instrumentation is much more
profitable IMHO).

What I'm arguing is that trace_mark()s one distinguishing feature over
tracepoints is only suited for quick debug like work.

Furthermore, trace_mark() exposes that crap like an ABI, now suppose
some distro goes and declares that stable for some daft reason, imagine
the poor sod having to fix something littered with trace_mark().

Look at fs/ext4/ for a particularly horrid example.

[ quite apart from why I hate trace_mark() most -- which is that it
makes the code look like someone committed their tree after a debugging
session ]

> > > In this context, the main difference between tracepoints is that
> > > markers need almost no hand-written glue code of the sort that make up
> > > ftrace engines that just trace simple values.  Simpler & smaller code
> > > for the same output seems like a win.
> > 
> > Right, for dumb tracers that's true I suppose, however any
> > high-bandwidth tracer will try to avoid putting silly ASCII strings in
> > and will therefore need to write more glue code.
> 
> So let's leave it up to the wisdom of each subsystem maintainer to
> decide whether any particular trace event is high enough throughput
> that direct ASCII rendering is not favourable.  (Considering the
> finite size of tracing buffers, high-throughput trace events would
> need to be continually drained with ASCII conversion anyway, so the
> overall benefit of using packed binary as an intermediate copy may not
> be large after all.)
> 
> I see all this as an argument to keep the subsystem's options open.
> Sometimes the extra complexity of tracepoints is worth it, sometimes
> it isn't.

It always is, either the information is useful, or it isn't. If it isn't
we shouldn't have it in tree. If it is, you never know how people will
want to use it.

Big but low-rate events will take space that could have been properly
used to capture longer - giving a better view of subsystem interaction.

Tracing transcends sub-systems in that sense, its not about
per-subsystem tracers, its about full overview, and whilst subsystem
maintainers can help in determining what information is useful,
presenting that information in big bloated blobs is beyond that scope.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 23, 2009, 5:10 p.m. UTC | #24

On Mon, Feb 23, 2009 at 05:22:39PM +0100, Peter Zijlstra wrote:
> [...]
> > > Not so. In both cases the regular stuff (NMI trace, OOPS,
> > > function/graph/sched trace, etc) is not enough and you wish to
> > > augment its output.
> > 
> > Sorry, I don't see how that relates.  If the general function tracing
> > widgetry is insufficient for some subsystem/purpose, some sort of
> > static instrumentation is needed.  Whether that instrumentation is
> > done by markers (with a thin glue to ftrace) or by tracepoints (with a
> > thick glue to ftrace) doesn't change the need for "augmentation".
> 
> I'm not arguing against static instrumentation per-se (although
> expanding the coverage of dynamic/automatic instrumentation is much more
> profitable IMHO).

Much prior discussion (incl. at the kernel summit) indicates that we
need both.

> What I'm arguing is that trace_mark()s one distinguishing feature over
> tracepoints is only suited for quick debug like work.

I see where you're coming from, but one may also caricaturize the
other alternative as requiring make-work glue code to pack & unpack
all the same inforation.


> Furthermore, trace_mark() exposes that crap like an ABI, now suppose
> some distro goes and declares that stable for some daft reason,
> imagine the poor sod having to fix something littered with
> trace_mark().

The impression that this is somehow different with tracepoints is
mistaken.  Tracepoints are *exactly* as "ABI-like" as markers.


> [...]  presenting that information in big bloated blobs is beyond
> that scope.

Do you have some specific bloated blobs in mind?  It's not as if the
rendered text is necessarily much bigger than a struct containing all
the same parameters.  Consider all the fields rounded up to 4 or 8
bytes each.


- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Ingo Molnar Feb. 23, 2009, 5:23 p.m. UTC | #25

* Frank Ch. Eigler <fche@redhat.com> wrote:

> > Furthermore, trace_mark() exposes that crap like an ABI, now 
> > suppose some distro goes and declares that stable for some 
> > daft reason, imagine the poor sod having to fix something 
> > littered with trace_mark().
> 
> The impression that this is somehow different with tracepoints 
> is mistaken.  Tracepoints are *exactly* as "ABI-like" as 
> markers.

they arent. To the kernel they look like function calls, with no 
ABI properties at all. They can and will change without notice.

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Steven Rostedt Feb. 23, 2009, 5:31 p.m. UTC | #26

On Mon, 23 Feb 2009, Frank Ch. Eigler wrote:
> 
> > Furthermore, trace_mark() exposes that crap like an ABI, now suppose
> > some distro goes and declares that stable for some daft reason,
> > imagine the poor sod having to fix something littered with
> > trace_mark().
> 
> The impression that this is somehow different with tracepoints is
> mistaken.  Tracepoints are *exactly* as "ABI-like" as markers.
> 

I disagree with this point. markers are text strings that will eventually 
appear to userspace. trace points need translation. A trace point can be 
modified at any time and should never mess up user apps.

You may have a hook to a trace point that provides user apps a text based 
output. If the trace point changes, this hook may break. But the tracer 
maintainer of that hook will be responsible for that change, not the 
maintainer of the code the tracepoint existed in.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Theodore Ts'o Feb. 23, 2009, 6:32 p.m. UTC | #27

On Mon, Feb 23, 2009 at 12:31:31PM -0500, Steven Rostedt wrote:
> > The impression that this is somehow different with tracepoints is
> > mistaken.  Tracepoints are *exactly* as "ABI-like" as markers.
> 
> I disagree with this point. markers are text strings that will eventually 
> appear to userspace. trace points need translation. A trace point can be 
> modified at any time and should never mess up user apps.
> 
> You may have a hook to a trace point that provides user apps a text based 
> output. If the trace point changes, this hook may break. But the tracer 
> maintainer of that hook will be responsible for that change, not the 
> maintainer of the code the tracepoint existed in.

So stupid question time --- exactly who is supposed to write and
maintain trnaslation code; the "hook"?  What trace points have done is
added an extra layer of indirection, but in order for someone to
actually make *use* of the have the trace point, someone has to
maintain the "hook".

I'm sorry I've offended Peter with the ext4 trace_mark() hooks, but
it's what I could forsee needing if someone wants reports a wierd sh*t
bug in ext4 and I wanted to be able to be able to extract debugging
information without forcing them to patch and recompile the kernel,
and in the ideal world, without even needing to reboot the kernel.
(If we had usable and reliable debuginfo information, in most cases
I'd be able simply use access to function parameters as replacements
for trace_mark().)

I've had to debug problems in the field on customer machines where
installing a new kernel was a big deal (as in, you get a window to
reboot the machine every 24 hours, and the problem is so complex you
can't replicate it anywhere *but* the production environment).  It's
also been the case that more than once I've seen wierd behaviour on my
laptop, and being able to peer inside it to see what is going on
easily and conveniently is a major win.

My big complaint with tracepoints is specifically *that* I need to
write a lot of hook code each time I want to investigate something.
So to say it's more maintabile because it doesn't present an ABI is a
bit of sophistry, I think.  Yes, there's no ABI because each time you
want to do something different, you have to roll your own kernel
module code to write the hook from scratch --- taking something that
was a "gee, I wonder...." 30 second experiment to something which
takes 5-10 times longer, at minimum.

Finally, whether or not the text string is an ABI really depends on
the tools.  Given that the string is self describing, it's only an ABI
if the tools are stupid.  It doesn't take that much extra parsing
smarts to be able to under stand a format string of:

	dev %s dir %lu mode %d

and DTRT if the kernel instead hands back something which was
constructed with the format string:

	dev %s dir %lu mode %d grp %u

or even:

	dev %s dir %lu

if the mode parameter is unavailable for whatever reason in future
kernel versions.  This really isn't rocket science....

					- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 23, 2009, 10:16 p.m. UTC | #28

On Mon, 2009-02-23 at 13:32 -0500, Theodore Tso wrote:
> On Mon, Feb 23, 2009 at 12:31:31PM -0500, Steven Rostedt wrote:
> > > The impression that this is somehow different with tracepoints is
> > > mistaken.  Tracepoints are *exactly* as "ABI-like" as markers.
> > 
> > I disagree with this point. markers are text strings that will eventually 
> > appear to userspace. trace points need translation. A trace point can be 
> > modified at any time and should never mess up user apps.
> > 
> > You may have a hook to a trace point that provides user apps a text based 
> > output. If the trace point changes, this hook may break. But the tracer 
> > maintainer of that hook will be responsible for that change, not the 
> > maintainer of the code the tracepoint existed in.
> 
> So stupid question time --- exactly who is supposed to write and
> maintain trnaslation code; the "hook"?  What trace points have done is
> added an extra layer of indirection, but in order for someone to
> actually make *use* of the have the trace point, someone has to
> maintain the "hook".

That just means we have to make it easier to write/generate this glue,
no? If we had function argument debug data (see below) we could generate
a generic tracepoint hook.

> I'm sorry I've offended Peter with the ext4 trace_mark() hooks, but
> it's what I could forsee needing if someone wants reports a wierd sh*t
> bug in ext4 and I wanted to be able to be able to extract debugging
> information without forcing them to patch and recompile the kernel,
> and in the ideal world, without even needing to reboot the kernel.
> (If we had usable and reliable debuginfo information, in most cases
> I'd be able simply use access to function parameters as replacements
> for trace_mark().)

We're working on adding arguments to the function/graph tracer, it would
fit all your above requirements and doesn't need any source modification
to boot.

Furthermore, most of it is upstream.

> I've had to debug problems in the field on customer machines where
> installing a new kernel was a big deal (as in, you get a window to
> reboot the machine every 24 hours, and the problem is so complex you
> can't replicate it anywhere *but* the production environment).  It's
> also been the case that more than once I've seen wierd behaviour on my
> laptop, and being able to peer inside it to see what is going on
> easily and conveniently is a major win.

Yeah, I know, the function graph tracer is brilliant that way. It even
provides information on the rest of the system and requires no
additional patches or big lumps of userspace.

> Finally, whether or not the text string is an ABI really depends on
> the tools.  Given that the string is self describing, it's only an ABI
> if the tools are stupid.  

>  This really isn't rocket science....

It isn't, yet how many scripts/programs have you seen that failed at the
above?


--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Theodore Ts'o Feb. 23, 2009, 10:41 p.m. UTC | #29

On Mon, Feb 23, 2009 at 11:16:59PM +0100, Peter Zijlstra wrote:
> 
> We're working on adding arguments to the function/graph tracer, it would
> fit all your above requirements and doesn't need any source modification
> to boot.

*Excellent*.  I would love to have that funtionality.  How do you plan
to make available complex data structures (i.e., suppose I want
inode->i_ino printed out)?  I assume this will require writing some
"easy to generate" glue code that would presumably be some kind of
kernel module?  That doesn't bother me (after all that's what
SystemTap does), as long as generation of the glue code can be largely
automated ---- so that you can take something approximately like a
DTrace or SystemTap script, and with some perl or python helper,
translate it into glue code that gets compiled into a kernel module.
Is something like that what you have in mind?

   	        	     	       	    	     	- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Peter Zijlstra Feb. 24, 2009, 8:55 a.m. UTC | #30

On Mon, 2009-02-23 at 17:41 -0500, Theodore Tso wrote:
> On Mon, Feb 23, 2009 at 11:16:59PM +0100, Peter Zijlstra wrote:
> > 
> > We're working on adding arguments to the function/graph tracer, it would
> > fit all your above requirements and doesn't need any source modification
> > to boot.
> 
> *Excellent*.  I would love to have that funtionality.  How do you plan
> to make available complex data structures (i.e., suppose I want
> inode->i_ino printed out)?  I assume this will require writing some
> "easy to generate" glue code that would presumably be some kind of
> kernel module?  That doesn't bother me (after all that's what
> SystemTap does), as long as generation of the glue code can be largely
> automated ---- so that you can take something approximately like a
> DTrace or SystemTap script, and with some perl or python helper,
> translate it into glue code that gets compiled into a kernel module.
> Is something like that what you have in mind?

Yeah, we were also looking at using sparse and term rewrite systems on
top of the regular C parse tree to generate stuff.

--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Frank Ch. Eigler Feb. 24, 2009, 1:01 p.m. UTC | #31

Hi -

> > The impression that this is somehow different with tracepoints 
> > is mistaken.  Tracepoints are *exactly* as "ABI-like" as 
> > markers.
> 
> they arent. To the kernel they look like function calls, with no 
> ABI properties at all. They can and will change without notice.

Markers also "look like" function calls because they are - the
callbacks just happen to take a format string and varargs.  What did
you think they were?


srostedt argues that separating the tracepoint from its rendering is
necessary to free the instrumentation site from backward-compatibility
stagnation.  

But consider - it has been manifold stated that tracepoints are not
welcome in the tree without a "tracer" (hand-written glue), so the
same person ends up writing both.  Since they are tightly coupled.  a
change on one will affect the other.  A moved tracepoint will produce
events at different times; a lost tracepoint parameter will lose data
at trace rendering time.  The tracing engine can only pretend to
produce the same data by guessing.

Similarly, if a marker site were to change, and if someone deemed the
old trace data important to preserve, a hand-written marker callback
function could replace a generic one, The new one could make the same
heuristic adaptation.

Try working out some examples.  You'll probably see how similar they
really turn out, with respect to the hypothetical "preserve ABI" idea.

- FChE
--
To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

tracing/markers: make markers select tracepoints

Commit Message

Comments

Patch