diff mbox series

[v2] ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function

Message ID 20220525180553.419eac77@gandalf.local.home (mailing list archive)
State Superseded
Series [v2] ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Steven Rostedt May 25, 2022, 10:05 p.m. UTC
From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

If an unused weak function was traced, its call to fentry will still
exist and gets added into the __mcount_loc table. Ftrace uses kallsyms
to retrieve the name for each location in __mcount_loc, both to display
it in available_filter_functions and to enable functions via name
matching in set_ftrace_filter/set_ftrace_notrace. Enabling these
functions does nothing but enable an unused call to ftrace_caller. If a
traced weak function is overridden, the symbol of the preceding
function is used for it, which either creates duplicate names or, if
the preceding function was not traced, incorrectly lists it in
available_filter_functions as a function that can be traced.

This became an issue with BPF[1], as there is tooling that enables the
direct callers via ftrace but then checks to see if the functions were
actually enabled. In one case, a function that was marked notrace was
followed by an unused weak function that was traced. The unused
function's call to fentry was added to the __mcount_loc section, and
kallsyms resolved that location to the untraced function's symbol
because the weak function was overridden. Since the untraced function
would not get traced, the BPF check detected this and failed.

The real fix would be for kallsyms to not report addresses inside an
overridden weak function as belonging to the function before it. But
that would require the build to record function sizes in kallsyms, so
that kallsyms knows where a function ends instead of assuming it
extends to the start of the next known symbol.

In the meantime, this is a workaround. Add a FTRACE_MCOUNT_MAX_OFFSET
macro: if it is defined, ftrace will ignore any function whose call to
fentry/mcount is at an offset from the symbol greater than
FTRACE_MCOUNT_MAX_OFFSET.

If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
to zero, which will have ftrace ignore all locations that are not at the
start of the function.

[1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
Changes since v1: https://lore.kernel.org/all/20220503150410.2d9e88aa@rorschach.local.home/

 - Changed MAX_OFFSET to 4 on x86 if KERNEL_IBT is enabled
   (Reminded by Peter Zijlstra)

 arch/x86/include/asm/ftrace.h | 10 +++++++
 kernel/trace/ftrace.c         | 50 +++++++++++++++++++++++++++++++++--
 2 files changed, 58 insertions(+), 2 deletions(-)

Comments

Ingo Molnar May 26, 2022, 2:50 a.m. UTC | #1
* Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> If an unused weak function was traced, its call to fentry will still
> exist and gets added into the __mcount_loc table. Ftrace uses kallsyms
> to retrieve the name for each location in __mcount_loc, both to display
> it in available_filter_functions and to enable functions via name
> matching in set_ftrace_filter/set_ftrace_notrace. Enabling these
> functions does nothing but enable an unused call to ftrace_caller. If a
> traced weak function is overridden, the symbol of the preceding
> function is used for it, which either creates duplicate names or, if
> the preceding function was not traced, incorrectly lists it in
> available_filter_functions as a function that can be traced.
> 
> This became an issue with BPF[1], as there is tooling that enables the
> direct callers via ftrace but then checks to see if the functions were
> actually enabled. In one case, a function that was marked notrace was
> followed by an unused weak function that was traced. The unused
> function's call to fentry was added to the __mcount_loc section, and
> kallsyms resolved that location to the untraced function's symbol
> because the weak function was overridden. Since the untraced function
> would not get traced, the BPF check detected this and failed.
> 
> The real fix would be for kallsyms to not report addresses inside an
> overridden weak function as belonging to the function before it. But
> that would require the build to record function sizes in kallsyms, so
> that kallsyms knows where a function ends instead of assuming it
> extends to the start of the next known symbol.

Yeah, so I actually have a (prototype...) objtool based kallsyms 
implementation in the (way too large) fast-headers tree that is both faster 
& allows such details in principle:

  431bca135cf8 kbuild/linker: Gather all the __kallsyms sections into a single table
  6bc7af02e402 objtool/kallsyms: Copy the symbol name and offset to the new __kallsyms ELF section
  e1e85b2fab9e objtool/kallsyms: Increase section size dynamically
  3528d607641b objtool/kallsyms: Use zero entry size for section
  2555dd62348a objtool/kallsyms: Output variable length strings
  c114b71f8547 objtool/kallsyms: Add kallsyms_offsets[] table
  cfcfce6cb51f objtool/kallsyms: Change name to __kallsyms_strs
  134160bb2de1 objtool/kallsyms: Split out process_kallsyms_symbols()
  33347a4b46e0 objtool/kallsyms: Add relocations
  86626e9e6603 objtool/kallsyms: Skip discarded sections
  7dd9fef6fbb0 kallsyms/objtool: Print out the new symbol table on the kernel side
  c82d94b33a1f objtool/kallsyms: Use struct kallsyms_entry
  a66ee5034008 objtool/kallsyms, kallsyms/objtool: Share 'struct kallsyms_entry' definition
  e496159e5282 kallsyms/objtool: Process entries
  47fc63ef3fa8 objtool/kallsyms: Switch to 64-bit absolute relocations
  c54fcc03cd64 objtool/kallsyms: Fix initial offset
  eac3c107aa6e kallsyms/objtool: Emit System.map-alike symbol list
  ebfb7e14b8ca objtool/kallsyms: Skip undefined symbols
  25b69ef5666b objtool/kallsyms: Update symbol filter
  3b26a82c733f objtool/kallsyms: Add the CONFIG_KALLSYMS_FAST Kconfig option & its related Kconfig switches
  9350c25942f8 kallsyms/objtool: Make the kallsyms work even if the generic tables are not there
  4a0a120bde05 objtool/kallsyms, x86/relocs: Detect & ignore __kallsyms_offset section relocations
  87c5244f1fa8 kallsyms/objtool: Introduce linear table of symbol structures: kallsyms_syms[]
  51dafdefc61f kallsyms/objtool: Split fast vs. generic functions
  e4123a40125f kallsyms/objtool: Sort symbols by address and deduplicate them
  2d738c69965a kallsyms/objtool: Utilize the kallsyms_syms[] table in kallsyms_expand_symbol() and kallsyms_sym_address()
  997ffe217a34 kallsyms/objtool: Port kallsyms_relative_base functionality to the kallsyms_syms[] offsets

> In the meantime, this is a workaround. Add a FTRACE_MCOUNT_MAX_OFFSET
> macro: if it is defined, ftrace will ignore any function whose call to
> fentry/mcount is at an offset from the symbol greater than
> FTRACE_MCOUNT_MAX_OFFSET.
> 
> If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
> to zero, which will have ftrace ignore all locations that are not at the
> start of the function.
> 
> [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

LGTM.

I suppose you'd like to merge this via the tracing tree? If so:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks,

	Ingo
Peter Zijlstra May 26, 2022, 7:39 a.m. UTC | #2
On Wed, May 25, 2022 at 06:05:53PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> If an unused weak function was traced, its call to fentry will still
> exist and gets added into the __mcount_loc table. Ftrace uses kallsyms
> to retrieve the name for each location in __mcount_loc, both to display
> it in available_filter_functions and to enable functions via name
> matching in set_ftrace_filter/set_ftrace_notrace. Enabling these
> functions does nothing but enable an unused call to ftrace_caller. If a
> traced weak function is overridden, the symbol of the preceding
> function is used for it, which either creates duplicate names or, if
> the preceding function was not traced, incorrectly lists it in
> available_filter_functions as a function that can be traced.
> 
> This became an issue with BPF[1], as there is tooling that enables the
> direct callers via ftrace but then checks to see if the functions were
> actually enabled. In one case, a function that was marked notrace was
> followed by an unused weak function that was traced. The unused
> function's call to fentry was added to the __mcount_loc section, and
> kallsyms resolved that location to the untraced function's symbol
> because the weak function was overridden. Since the untraced function
> would not get traced, the BPF check detected this and failed.
> 
> The real fix would be for kallsyms to not report addresses inside an
> overridden weak function as belonging to the function before it. But
> that would require the build to record function sizes in kallsyms, so
> that kallsyms knows where a function ends instead of assuming it
> extends to the start of the next known symbol.
> 
> In the meantime, this is a workaround. Add a FTRACE_MCOUNT_MAX_OFFSET
> macro: if it is defined, ftrace will ignore any function whose call to
> fentry/mcount is at an offset from the symbol greater than
> FTRACE_MCOUNT_MAX_OFFSET.
> 
> If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
> to zero, which will have ftrace ignore all locations that are not at the
> start of the function.

^^^ that paragraph is obsoleted by your own change below :-)

> [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/
> 
> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> ---
> Changes since v1: https://lore.kernel.org/all/20220503150410.2d9e88aa@rorschach.local.home/
> 
>  - Changed MAX_OFFSET to 4 on x86 if KERNEL_IBT is enabled
>    (Reminded by Peter Zijlstra)
> 
>  arch/x86/include/asm/ftrace.h | 10 +++++++
>  kernel/trace/ftrace.c         | 50 +++++++++++++++++++++++++++++++++--
>  2 files changed, 58 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index 024d9797646e..53675fe2d847 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -9,6 +9,16 @@
>  # define MCOUNT_ADDR		((unsigned long)(__fentry__))
>  #define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */
>  
> +/* Ignore unused weak functions which will have non zero offsets */
> +#ifdef CONFIG_HAVE_FENTRY
> +# ifdef CONFIG_X86_KERNEL_IBT
> +/* endbr64 is 4 bytes in front of the fentry */
> +#  define FTRACE_MCOUNT_MAX_OFFSET	4
> +# else
> +#  define FTRACE_MCOUNT_MAX_OFFSET	0
> +# endif
> +#endif

#define FTRACE_MCOUNT_MAX_OFFSET ENDBR_INSN_SIZE

Should do the same I think, fewer lines etc.
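For reference, a sketch of how that collapses the #ifdef block. The ENDBR_INSN_SIZE definition shown here paraphrases arch/x86/include/asm/ibt.h and may not match it verbatim:

```c
/* From ibt.h (paraphrased): the macro already encodes both cases */
#ifdef CONFIG_X86_KERNEL_IBT
# define ENDBR_INSN_SIZE	4	/* sizeof(endbr64) */
#else
# define ENDBR_INSN_SIZE	0
#endif

/* Peter's suggested one-liner for ftrace.h */
#define FTRACE_MCOUNT_MAX_OFFSET	ENDBR_INSN_SIZE
```

Since ENDBR_INSN_SIZE is 4 with IBT and 0 without, this matches the hard-coded values in the patch.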
Steven Rostedt May 26, 2022, 1:11 p.m. UTC | #3
On Thu, 26 May 2022 04:50:17 +0200
Ingo Molnar <mingo@kernel.org> wrote:


> > The real fix would be for kallsyms to not report addresses inside an
> > overridden weak function as belonging to the function before it. But
> > that would require the build to record function sizes in kallsyms, so
> > that kallsyms knows where a function ends instead of assuming it
> > extends to the start of the next known symbol.
> 
> Yeah, so I actually have a (prototype...) objtool based kallsyms 
> implementation in the (way too large) fast-headers tree that is both faster 
> & allows such details in principle:

Nice.

Will this work for other architectures too?

> > If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
> > to zero, which will have ftrace ignore all locations that are not at the
> > start of the function.
> > 
> > [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/
> > 
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>  
> 
> LGTM.
> 
> I suppose you'd like to merge this via the tracing tree? If so:

Yeah, I'll pull it through my tree.

> 
>   Acked-by: Ingo Molnar <mingo@kernel.org>

Thanks a lot Ingo for reviewing it. I really appreciate it!

-- Steve
Steven Rostedt May 26, 2022, 1:14 p.m. UTC | #4
On Thu, 26 May 2022 09:39:49 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

> > If CONFIG_HAVE_FENTRY is defined for x86, define FTRACE_MCOUNT_MAX_OFFSET
> > to zero, which will have ftrace ignore all locations that are not at the
> > start of the function.  
> 
> ^^^ that paragraph is obsoleted by your own change below :-)

Well, it is partially correct (if X86_KERNEL_IBT is disabled).

I'll update the change log to reflect the update.

> 
> > [1] https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/
> > 
> > Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
> > ---
> > Changes since v1: https://lore.kernel.org/all/20220503150410.2d9e88aa@rorschach.local.home/
> > 
> >  - Changed MAX_OFFSET to 4 on x86 if KERNEL_IBT is enabled
> >    (Reminded by Peter Zijlstra)
> > 
> >  arch/x86/include/asm/ftrace.h | 10 +++++++
> >  kernel/trace/ftrace.c         | 50 +++++++++++++++++++++++++++++++++--
> >  2 files changed, 58 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> > index 024d9797646e..53675fe2d847 100644
> > --- a/arch/x86/include/asm/ftrace.h
> > +++ b/arch/x86/include/asm/ftrace.h
> > @@ -9,6 +9,16 @@
> >  # define MCOUNT_ADDR		((unsigned long)(__fentry__))
> >  #define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */
> >  
> > +/* Ignore unused weak functions which will have non zero offsets */
> > +#ifdef CONFIG_HAVE_FENTRY
> > +# ifdef CONFIG_X86_KERNEL_IBT
> > +/* endbr64 is 4 bytes in front of the fentry */
> > +#  define FTRACE_MCOUNT_MAX_OFFSET	4
> > +# else
> > +#  define FTRACE_MCOUNT_MAX_OFFSET	0
> > +# endif
> > +#endif  
> 
> #define FTRACE_MCOUNT_MAX_OFFSET ENDBR_INSN_SIZE
> 
> Should do the same I think, fewer lines etc.

Thanks, I figured there was something else I could use besides
hardcoding it. But I may add that as a separate patch after this gets
merged, because my branch doesn't have that code in it yet, so it won't
compile.

-- Steve
Ingo Molnar May 27, 2022, 10:04 a.m. UTC | #5
* Steven Rostedt <rostedt@goodmis.org> wrote:

> On Thu, 26 May 2022 04:50:17 +0200
> Ingo Molnar <mingo@kernel.org> wrote:
> 
> 
> > > The real fix would be for kallsyms to not report addresses inside an
> > > overridden weak function as belonging to the function before it. But
> > > that would require the build to record function sizes in kallsyms, so
> > > that kallsyms knows where a function ends instead of assuming it
> > > extends to the start of the next known symbol.
> > 
> > Yeah, so I actually have a (prototype...) objtool based kallsyms 
> > implementation in the (way too large) fast-headers tree that is both faster 
> > & allows such details in principle:
> 
> Nice.
> 
> Will this work for other architectures too?

For those which implement objtool, it certainly should: as we parse through 
each object file during the build, generating kallsyms data structures is 
relatively straightforward.

Objtool availability is a big gating condition though. :-/

[ ... and still Acked-by on -v4 too. ]

Thanks,

	Ingo
Steven Rostedt May 27, 2022, 8:36 p.m. UTC | #6
On Fri, 27 May 2022 12:04:14 +0200
Ingo Molnar <mingo@kernel.org> wrote:

> For those which implement objtool, it certainly should: as we parse through 
> each object file during the build, generating kallsyms data structures is 
> relatively straightforward.
> 
> Objtool availability is a big gating condition though. :-/
> 
> [ ... and still Acked-by on -v4 too. ]

I just sent out a v5 and removed your Acked-by, because the changes in
v5 are non-trivial, unlike the previous changes in the other versions.

The big difference is that I needed placeholders for the invalid
functions in the available_filter_functions file, as I forgot that
libtracefs uses the line number of these functions as a way to enable
them in the set_ftrace_filter and set_ftrace_notrace files. Removing
them threw the indexing out of sync and broke trace-cmd.

I also added a work queue at boot to run through all the records and
mark any that fail the kallsyms check as DISABLED.

If you want, feel free to review and ack that change too.

  https://lore.kernel.org/all/20220527163205.421c7828@gandalf.local.home/

I need to add a selftest to test the indexing code as well. The only
reason I found the problem was that I was writing my presentation for
Embedded Recipes and was using this as an example. When the filtering
wasn't working, I had to figure out why.

-- Steve

Patch

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 024d9797646e..53675fe2d847 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -9,6 +9,16 @@ 
 # define MCOUNT_ADDR		((unsigned long)(__fentry__))
 #define MCOUNT_INSN_SIZE	5 /* sizeof mcount call */
 
+/* Ignore unused weak functions which will have non zero offsets */
+#ifdef CONFIG_HAVE_FENTRY
+# ifdef CONFIG_X86_KERNEL_IBT
+/* endbr64 is 4 bytes in front of the fentry */
+#  define FTRACE_MCOUNT_MAX_OFFSET	4
+# else
+#  define FTRACE_MCOUNT_MAX_OFFSET	0
+# endif
+#endif
+
 #ifdef CONFIG_DYNAMIC_FTRACE
 #define ARCH_SUPPORTS_FTRACE_OPS 1
 #endif
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 1fa8cfc81ab2..8063a5ef82af 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -3654,6 +3654,31 @@  static void add_trampoline_func(struct seq_file *m, struct ftrace_ops *ops,
 		seq_printf(m, " ->%pS", ptr);
 }
 
+#ifdef FTRACE_MCOUNT_MAX_OFFSET
+static int print_rec(struct seq_file *m, unsigned long ip)
+{
+	unsigned long offset;
+	char str[KSYM_SYMBOL_LEN];
+	char *modname;
+	int ret;
+
+	ret = kallsyms_lookup(ip, NULL, &offset, &modname, str);
+	if (!ret || offset > FTRACE_MCOUNT_MAX_OFFSET)
+		return -1;
+
+	seq_puts(m, str);
+	if (modname)
+		seq_printf(m, " [%s]", modname);
+	return 0;
+}
+#else
+static int print_rec(struct seq_file *m, unsigned long ip)
+{
+	seq_printf(m, "%ps", (void *)ip);
+	return 0;
+}
+#endif
+
 static int t_show(struct seq_file *m, void *v)
 {
 	struct ftrace_iterator *iter = m->private;
@@ -3678,7 +3703,9 @@  static int t_show(struct seq_file *m, void *v)
 	if (!rec)
 		return 0;
 
-	seq_printf(m, "%ps", (void *)rec->ip);
+	if (print_rec(m, rec->ip))
+		return 0;
+
 	if (iter->flags & FTRACE_ITER_ENABLED) {
 		struct ftrace_ops *ops;
 
@@ -3996,6 +4023,24 @@  add_rec_by_index(struct ftrace_hash *hash, struct ftrace_glob *func_g,
 	return 0;
 }
 
+#ifdef FTRACE_MCOUNT_MAX_OFFSET
+static int lookup_ip(unsigned long ip, char **modname, char *str)
+{
+	unsigned long offset;
+
+	kallsyms_lookup(ip, NULL, &offset, modname, str);
+	if (offset > FTRACE_MCOUNT_MAX_OFFSET)
+		return -1;
+	return 0;
+}
+#else
+static int lookup_ip(unsigned long ip, char **modname, char *str)
+{
+	kallsyms_lookup(ip, NULL, NULL, modname, str);
+	return 0;
+}
+#endif
+
 static int
 ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g,
 		struct ftrace_glob *mod_g, int exclude_mod)
@@ -4003,7 +4048,8 @@  ftrace_match_record(struct dyn_ftrace *rec, struct ftrace_glob *func_g,
 	char str[KSYM_SYMBOL_LEN];
 	char *modname;
 
-	kallsyms_lookup(rec->ip, NULL, NULL, &modname, str);
+	if (lookup_ip(rec->ip, &modname, str))
+		return 0;
 
 	if (mod_g) {
 		int mod_matches = (modname) ? ftrace_match(modname, mod_g) : 0;