Message ID | 20240802210836.2210140-3-song@kernel.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Fix kallsyms with CONFIG_LTO_CLANG | expand |
On Fri, 2 Aug 2024 14:08:34 -0700 Song Liu <song@kernel.org> wrote: > With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names > to avoid duplication. This causes confusion with users of kallsyms. > On one hand, users like livepatch are required to match the symbols > exactly. On the other hand, users like kprobe would like to match to > original function names. > > Solve this by splitting kallsyms APIs. Specifically, existing APIs now > should match the symbols exactly. Add two APIs that match only the part > without .XXXX suffix. Specifically, the following two APIs are added. > > 1. kallsyms_lookup_name_without_suffix() > 2. kallsyms_on_each_match_symbol_without_suffix() > > These APIs will be used by kprobe. > > Also cleanup some code and update kallsyms_selftests accordingly. > > Signed-off-by: Song Liu <song@kernel.org> Looks good to me, but I have a nitpick. > --- a/kernel/kallsyms.c > +++ b/kernel/kallsyms.c > @@ -164,30 +164,27 @@ static void cleanup_symbol_name(char *s) > { > char *res; > > - if (!IS_ENABLED(CONFIG_LTO_CLANG)) > - return; > - > /* > * LLVM appends various suffixes for local functions and variables that > * must be promoted to global scope as part of LTO. This can break > * hooking of static functions with kprobes. '.' is not a valid > - * character in an identifier in C. Suffixes only in LLVM LTO observed: > - * - foo.llvm.[0-9a-f]+ > + * character in an identifier in C, so we can just remove the > + * suffix. > */ > - res = strstr(s, ".llvm."); > + res = strstr(s, "."); nit: "strchr(s, '.')" ? Anyway, Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Thank you,
Hi Masami, On Mon, Aug 5, 2024 at 6:45 AM Masami Hiramatsu <mhiramat@kernel.org> wrote: > > On Fri, 2 Aug 2024 14:08:34 -0700 > Song Liu <song@kernel.org> wrote: > > > With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names > > to avoid duplication. This causes confusion with users of kallsyms. > > On one hand, users like livepatch are required to match the symbols > > exactly. On the other hand, users like kprobe would like to match to > > original function names. > > > > Solve this by splitting kallsyms APIs. Specifically, existing APIs now > > should match the symbols exactly. Add two APIs that match only the part > > without .XXXX suffix. Specifically, the following two APIs are added. > > > > 1. kallsyms_lookup_name_without_suffix() > > 2. kallsyms_on_each_match_symbol_without_suffix() > > > > These APIs will be used by kprobe. > > > > Also cleanup some code and update kallsyms_selftests accordingly. > > > > Signed-off-by: Song Liu <song@kernel.org> > > Looks good to me, but I have a nitpick. > > > > --- a/kernel/kallsyms.c > > +++ b/kernel/kallsyms.c > > @@ -164,30 +164,27 @@ static void cleanup_symbol_name(char *s) > > { > > char *res; > > > > - if (!IS_ENABLED(CONFIG_LTO_CLANG)) > > - return; > > - > > /* > > * LLVM appends various suffixes for local functions and variables that > > * must be promoted to global scope as part of LTO. This can break > > * hooking of static functions with kprobes. '.' is not a valid > > - * character in an identifier in C. Suffixes only in LLVM LTO observed: > > - * - foo.llvm.[0-9a-f]+ > > + * character in an identifier in C, so we can just remove the > > + * suffix. > > */ > > - res = strstr(s, ".llvm."); > > + res = strstr(s, "."); > > nit: "strchr(s, '.')" ? > > Anyway, > > Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Thanks for your kind review! If we have another version, I will fold this change (and the one in kallsyms_selftest.c) in. Song
On Fri 2024-08-02 14:08:34, Song Liu wrote: > With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names > to avoid duplication. This causes confusion with users of kallsyms. > On one hand, users like livepatch are required to match the symbols > exactly. On the other hand, users like kprobe would like to match to > original function names. > > Solve this by splitting kallsyms APIs. Specifically, existing APIs now > should match the symbols exactly. Add two APIs that match only the part > without .XXXX suffix. Specifically, the following two APIs are added. > > 1. kallsyms_lookup_name_without_suffix() > 2. kallsyms_on_each_match_symbol_without_suffix() > > These APIs will be used by kprobe. > > Also cleanup some code and update kallsyms_selftests accordingly. > > --- a/kernel/kallsyms.c > +++ b/kernel/kallsyms.c > @@ -164,30 +164,27 @@ static void cleanup_symbol_name(char *s) > { > char *res; > > - if (!IS_ENABLED(CONFIG_LTO_CLANG)) > - return; > - > /* > * LLVM appends various suffixes for local functions and variables that > * must be promoted to global scope as part of LTO. This can break > * hooking of static functions with kprobes. '.' is not a valid > - * character in an identifier in C. Suffixes only in LLVM LTO observed: > - * - foo.llvm.[0-9a-f]+ > + * character in an identifier in C, so we can just remove the > + * suffix. > */ > - res = strstr(s, ".llvm."); > + res = strstr(s, "."); IMHO, we should not remove the suffixes like .constprop*, .part*, .isra*. They implement a special optimized variant of the function. It is not longer the original full-featured one. > if (res) > *res = '\0'; > > return; > } > > -static int compare_symbol_name(const char *name, char *namebuf) > +static int compare_symbol_name(const char *name, char *namebuf, bool exact_match) > { > - /* The kallsyms_seqs_of_names is sorted based on names after > - * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled. > - * To ensure correct bisection in kallsyms_lookup_names(), do > - * cleanup_symbol_name(namebuf) before comparing name and namebuf. > - */ > + int ret = strcmp(name, namebuf); > + > + if (exact_match || !ret) > + return ret; > + > cleanup_symbol_name(namebuf); > return strcmp(name, namebuf); > } > @@ -204,13 +201,17 @@ static unsigned int get_symbol_seq(int index) > > static int kallsyms_lookup_names(const char *name, > unsigned int *start, > - unsigned int *end) > + unsigned int *end, > + bool exact_match) > { > int ret; > int low, mid, high; > unsigned int seq, off; > char namebuf[KSYM_NAME_LEN]; > > + if (!IS_ENABLED(CONFIG_LTO_CLANG)) > + exact_match = true; IMHO, this is very a bad design. It causes that kallsyms_on_each_match_symbol_without_suffix(,,, false); does not longer work as expected. It creates a hard to maintain code. The code does not do what it looks like. The caller should decide whether it wants to ignore the suffix or no. And this function should always respect the @exact_match parameter value. > + > low = 0; > high = kallsyms_num_syms - 1; > Best Regards, Petr
> On Aug 8, 2024, at 3:20 AM, Petr Mladek <pmladek@suse.com> wrote: > > On Fri 2024-08-02 14:08:34, Song Liu wrote: >> With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names >> to avoid duplication. This causes confusion with users of kallsyms. >> On one hand, users like livepatch are required to match the symbols >> exactly. On the other hand, users like kprobe would like to match to >> original function names. >> >> Solve this by splitting kallsyms APIs. Specifically, existing APIs now >> should match the symbols exactly. Add two APIs that match only the part >> without .XXXX suffix. Specifically, the following two APIs are added. >> >> 1. kallsyms_lookup_name_without_suffix() >> 2. kallsyms_on_each_match_symbol_without_suffix() >> >> These APIs will be used by kprobe. >> >> Also cleanup some code and update kallsyms_selftests accordingly. >> >> --- a/kernel/kallsyms.c >> +++ b/kernel/kallsyms.c >> @@ -164,30 +164,27 @@ static void cleanup_symbol_name(char *s) >> { >> char *res; >> >> - if (!IS_ENABLED(CONFIG_LTO_CLANG)) >> - return; >> - >> /* >> * LLVM appends various suffixes for local functions and variables that >> * must be promoted to global scope as part of LTO. This can break >> * hooking of static functions with kprobes. '.' is not a valid >> - * character in an identifier in C. Suffixes only in LLVM LTO observed: >> - * - foo.llvm.[0-9a-f]+ >> + * character in an identifier in C, so we can just remove the >> + * suffix. >> */ >> - res = strstr(s, ".llvm."); >> + res = strstr(s, "."); > > IMHO, we should not remove the suffixes like .constprop*, .part*, > .isra*. They implement a special optimized variant of the function. > It is not longer the original full-featured one. > >> if (res) >> *res = '\0'; >> >> return; >> } >> >> -static int compare_symbol_name(const char *name, char *namebuf) >> +static int compare_symbol_name(const char *name, char *namebuf, bool exact_match) >> { >> - /* The kallsyms_seqs_of_names is sorted based on names after >> - * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled. >> - * To ensure correct bisection in kallsyms_lookup_names(), do >> - * cleanup_symbol_name(namebuf) before comparing name and namebuf. >> - */ >> + int ret = strcmp(name, namebuf); >> + >> + if (exact_match || !ret) >> + return ret; >> + >> cleanup_symbol_name(namebuf); >> return strcmp(name, namebuf); >> } >> @@ -204,13 +201,17 @@ static unsigned int get_symbol_seq(int index) >> >> static int kallsyms_lookup_names(const char *name, >> unsigned int *start, >> - unsigned int *end) >> + unsigned int *end, >> + bool exact_match) >> { >> int ret; >> int low, mid, high; >> unsigned int seq, off; >> char namebuf[KSYM_NAME_LEN]; >> >> + if (!IS_ENABLED(CONFIG_LTO_CLANG)) >> + exact_match = true; > > IMHO, this is very a bad design. It causes that > > kallsyms_on_each_match_symbol_without_suffix(,,, false); > > does not longer work as expected. It creates a hard to maintain > code. The code does not do what it looks like. Indeed. It actually caused issue with GCC-built kernel in my tests after submitting v2. Thanks, Song
diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h index c3f075e8f60c..9ef2a944a993 100644 --- a/include/linux/kallsyms.h +++ b/include/linux/kallsyms.h @@ -74,9 +74,12 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, unsigned long), void *data); int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long), const char *name, void *data); +int kallsyms_on_each_match_symbol_without_suffix(int (*fn)(void *, unsigned long), + const char *name, void *data); /* Lookup the address for a symbol. Returns 0 if not found. */ unsigned long kallsyms_lookup_name(const char *name); +unsigned long kallsyms_lookup_name_without_suffix(const char *name); extern int kallsyms_lookup_size_offset(unsigned long addr, unsigned long *symbolsize, @@ -104,6 +107,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name) return 0; } +static inline unsigned long kallsyms_lookup_name_without_suffix(const char *name) +{ + return 0; +} + static inline int kallsyms_lookup_size_offset(unsigned long addr, unsigned long *symbolsize, unsigned long *offset) @@ -165,6 +173,12 @@ static inline int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long) { return -EOPNOTSUPP; } + +static inline int kallsyms_on_each_match_symbol_without_suffix(int (*fn)(void *, unsigned long), + const char *name, void *data) +{ + return -EOPNOTSUPP; +} #endif /*CONFIG_KALLSYMS*/ static inline void print_ip_sym(const char *loglvl, unsigned long ip) diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c index fb2c77368d18..64fdff6cde85 100644 --- a/kernel/kallsyms.c +++ b/kernel/kallsyms.c @@ -164,30 +164,27 @@ static void cleanup_symbol_name(char *s) { char *res; - if (!IS_ENABLED(CONFIG_LTO_CLANG)) - return; - /* * LLVM appends various suffixes for local functions and variables that * must be promoted to global scope as part of LTO. This can break * hooking of static functions with kprobes. '.' is not a valid - * character in an identifier in C. Suffixes only in LLVM LTO observed: - * - foo.llvm.[0-9a-f]+ + * character in an identifier in C, so we can just remove the + * suffix. */ - res = strstr(s, ".llvm."); + res = strstr(s, "."); if (res) *res = '\0'; return; } -static int compare_symbol_name(const char *name, char *namebuf) +static int compare_symbol_name(const char *name, char *namebuf, bool exact_match) { - /* The kallsyms_seqs_of_names is sorted based on names after - * cleanup_symbol_name() (see scripts/kallsyms.c) if clang lto is enabled. - * To ensure correct bisection in kallsyms_lookup_names(), do - * cleanup_symbol_name(namebuf) before comparing name and namebuf. - */ + int ret = strcmp(name, namebuf); + + if (exact_match || !ret) + return ret; + cleanup_symbol_name(namebuf); return strcmp(name, namebuf); } @@ -204,13 +201,17 @@ static unsigned int get_symbol_seq(int index) static int kallsyms_lookup_names(const char *name, unsigned int *start, - unsigned int *end) + unsigned int *end, + bool exact_match) { int ret; int low, mid, high; unsigned int seq, off; char namebuf[KSYM_NAME_LEN]; + if (!IS_ENABLED(CONFIG_LTO_CLANG)) + exact_match = true; + low = 0; high = kallsyms_num_syms - 1; @@ -219,7 +220,7 @@ static int kallsyms_lookup_names(const char *name, seq = get_symbol_seq(mid); off = get_symbol_offset(seq); kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf)); - ret = compare_symbol_name(name, namebuf); + ret = compare_symbol_name(name, namebuf, exact_match); if (ret > 0) low = mid + 1; else if (ret < 0) @@ -236,7 +237,7 @@ static int kallsyms_lookup_names(const char *name, seq = get_symbol_seq(low - 1); off = get_symbol_offset(seq); kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf)); - if (compare_symbol_name(name, namebuf)) + if (compare_symbol_name(name, namebuf, exact_match)) break; low--; } @@ -248,7 +249,7 @@ static int kallsyms_lookup_names(const char *name, seq = get_symbol_seq(high + 1); off = get_symbol_offset(seq); kallsyms_expand_symbol(off, namebuf, ARRAY_SIZE(namebuf)); - if (compare_symbol_name(name, namebuf)) + if (compare_symbol_name(name, namebuf, exact_match)) break; high++; } @@ -268,7 +269,28 @@ unsigned long kallsyms_lookup_name(const char *name) if (!*name) return 0; - ret = kallsyms_lookup_names(name, &i, NULL); + ret = kallsyms_lookup_names(name, &i, NULL, true); + if (!ret) + return kallsyms_sym_address(get_symbol_seq(i)); + + return module_kallsyms_lookup_name(name); +} + +/* + * Lookup the address for this symbol. + * Remove .XXX suffix from the symbole before comparing against + * the name to lookup. + */ +unsigned long kallsyms_lookup_name_without_suffix(const char *name) +{ + int ret; + unsigned int i; + + /* Skip the search for empty string. */ + if (!*name) + return 0; + + ret = kallsyms_lookup_names(name, &i, NULL, false); if (!ret) return kallsyms_sym_address(get_symbol_seq(i)); @@ -303,7 +325,25 @@ int kallsyms_on_each_match_symbol(int (*fn)(void *, unsigned long), int ret; unsigned int i, start, end; - ret = kallsyms_lookup_names(name, &start, &end); + ret = kallsyms_lookup_names(name, &start, &end, true); + if (ret) + return 0; + + for (i = start; !ret && i <= end; i++) { + ret = fn(data, kallsyms_sym_address(get_symbol_seq(i))); + cond_resched(); + } + + return ret; +} + +int kallsyms_on_each_match_symbol_without_suffix(int (*fn)(void *, unsigned long), + const char *name, void *data) +{ + int ret; + unsigned int i, start, end; + + ret = kallsyms_lookup_names(name, &start, &end, false); if (ret) return 0; @@ -450,8 +490,6 @@ const char *kallsyms_lookup(unsigned long addr, int lookup_symbol_name(unsigned long addr, char *symname) { - int res; - symname[0] = '\0'; symname[KSYM_NAME_LEN - 1] = '\0'; @@ -462,16 +500,10 @@ int lookup_symbol_name(unsigned long addr, char *symname) /* Grab name */ kallsyms_expand_symbol(get_symbol_offset(pos), symname, KSYM_NAME_LEN); - goto found; + return 0; } /* See if it's in a module. */ - res = lookup_module_symbol_name(addr, symname); - if (res) - return res; - -found: - cleanup_symbol_name(symname); - return 0; + return lookup_module_symbol_name(addr, symname); } /* Look up a kernel symbol and return it in a text buffer. */ diff --git a/kernel/kallsyms_selftest.c b/kernel/kallsyms_selftest.c index 2f84896a7bcb..929270f4ed55 100644 --- a/kernel/kallsyms_selftest.c +++ b/kernel/kallsyms_selftest.c @@ -187,31 +187,11 @@ static void test_perf_kallsyms_lookup_name(void) stat.min, stat.max, div_u64(stat.sum, stat.real_cnt)); } -static bool match_cleanup_name(const char *s, const char *name) -{ - char *p; - int len; - - if (!IS_ENABLED(CONFIG_LTO_CLANG)) - return false; - - p = strstr(s, ".llvm."); - if (!p) - return false; - - len = strlen(name); - if (p - s != len) - return false; - - return !strncmp(s, name, len); -} - static int find_symbol(void *data, const char *name, unsigned long addr) { struct test_stat *stat = (struct test_stat *)data; - if (strcmp(name, stat->name) == 0 || - (!stat->perf && match_cleanup_name(name, stat->name))) { + if (!strcmp(name, stat->name)) { stat->real_cnt++; stat->addr = addr; @@ -327,12 +307,28 @@ static int test_kallsyms_basic_function(void) } } + prefix = "kallsyms_on_each_match_symbol() for"; + for (i = 0; i < ARRAY_SIZE(test_items); i++) { + memset(stat, 0, sizeof(*stat)); + stat->max = INT_MAX; + stat->name = test_items[i].name; + kallsyms_on_each_match_symbol(match_symbol, test_items[i].name, stat); + if (stat->addr != test_items[i].addr || stat->real_cnt != 1) { + nr_failed++; + pr_info("%s %s failed: count=%d, addr=%lx, expect %lx\n", + prefix, test_items[i].name, + stat->real_cnt, stat->addr, test_items[i].addr); + } + } + if (nr_failed) { kfree(stat); return -ESRCH; } for (i = 0; i < kallsyms_num_syms; i++) { + char *p; + addr = kallsyms_sym_address(i); if (!is_ksym_addr(addr)) continue; @@ -415,6 +411,43 @@ static int test_kallsyms_basic_function(void) goto failed; } } + + if (IS_ENABLED(CONFIG_LTO_CLANG)) + continue; + + p = strstr(namebuf, "."); + + /* + * If the symbol contains a "." and it not started with + * ".", test the no suffix lookup. + */ + if (!p || namebuf[0] != '.') + continue; + + *p = '\0'; + + /* + * kallsyms_lookup_name_without_suffix should return a + * value, but it may not match the address of this symbol, + * as the name without suffix may not be unique. + */ + addr = kallsyms_lookup_name_without_suffix(namebuf); + if (!addr) { + pr_info("%s: kallsyms_lookup_name_without_suffix cannot find the symbol\n", + namebuf); + goto failed; + } + + memset(stat, 0, sizeof(*stat)); + stat->max = INT_MAX; + stat->name = namebuf; + kallsyms_on_each_match_symbol_without_suffix( + match_symbol, namebuf, stat); + if (!stat->real_cnt) { + pr_info("%s: kallsyms_on_each_match_symbol_without_suffix cannot find the symbol\n", + namebuf); + goto failed; + } } kfree(stat);
With CONFIG_LTO_CLANG=y, the compiler may add suffix to function names to avoid duplication. This causes confusion with users of kallsyms. On one hand, users like livepatch are required to match the symbols exactly. On the other hand, users like kprobe would like to match to original function names. Solve this by splitting kallsyms APIs. Specifically, existing APIs now should match the symbols exactly. Add two APIs that match only the part without .XXXX suffix. Specifically, the following two APIs are added. 1. kallsyms_lookup_name_without_suffix() 2. kallsyms_on_each_match_symbol_without_suffix() These APIs will be used by kprobe. Also cleanup some code and update kallsyms_selftests accordingly. Signed-off-by: Song Liu <song@kernel.org> --- include/linux/kallsyms.h | 14 ++++++ kernel/kallsyms.c | 88 ++++++++++++++++++++++++++------------ kernel/kallsyms_selftest.c | 75 +++++++++++++++++++++++--------- 3 files changed, 128 insertions(+), 49 deletions(-)