Message ID | 20250102185845.928488650@goodmis.org (mailing list archive) |
---|---|
Headers | show |
Series | scripts/sorttable: ftrace: Remove place holders for weak functions in available_filter_functions | expand |
On Thu, 02 Jan 2025 13:58:45 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > The last patch adds the option "-s <file>" to sorttable.c. Now this code > is called by: > > ${NM} -S vmlinux > .tmp_vmlinux.nm-sort > ${objtree}/scripts/sorttable -s .tmp_vmlinux.nm-sort ${1} > > Where the file created by "nm -S" is read, recording the address > and the associated sizes of each function. It then is sorted, and > before sorting the mcount_loc table, it is scanned to make sure > all symbols in the mcounc_loc are within the boundaries of the functions > defined by nm. If they are not, they are zeroed out, as they are most > likely weak functions (I don't know what else they would be). > > Then on boot up, when creating the ftrace tables from the mcount_loc > table, it will ignore any function that matches the kaslr_offset() > value. As KASLR will still shift the values even if they are zero. > But by skipping over entries in mcount_loc that match kaslr_offset() > all weak functions are removed from the ftrace table as well as the > available_filter_functions file that is derived from it. > I mentioned this in the last patch, but forgot to mention it here. Even if kallsyms is "fixed" where it doesn't return the name of a function if the address is outside its size, that doesn't fix the place holder issue with available_filter_functions. That would just make it easier to know a function doesn't have a name, but a place holder is still required. This patch set removes those place holders regardless of kallsyms being fixed or not. -- Steve
On Thu, 2 Jan 2025 at 10:59, Steven Rostedt <rostedt@goodmis.org> wrote: > > Where the file created by "nm -S" is read, recording the address > and the associated sizes of each function. It then is sorted, and > before sorting the mcount_loc table, it is scanned to make sure > all symbols in the mcounc_loc are within the boundaries of the functions > defined by nm. If they are not, they are zeroed out, as they are most > likely weak functions (I don't know what else they would be). Please just do this by sorting non-existent functions at the end, instead of just zeroing them out. That makes the mcount_loc table dense in valid entries. We could then just rewrite the size of the table (or just add a variable containing the size, if you don't want to change ELF metadata - but you're already sorting the table, so why not?) Because: > Then on boot up, when creating the ftrace tables from the mcount_loc > table, it will ignore any function that matches the kaslr_offset() > value. Why even do that? Why not just make the mcount_loc table be proper in the first place. Linus
On Thu, 2 Jan 2025 11:30:12 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > Please just do this by sorting non-existent functions at the end, > instead of just zeroing them out. > > That makes the mcount_loc table dense in valid entries. We could then > just rewrite the size of the table (or just add a variable containing > the size, if you don't want to change ELF metadata - but you're > already sorting the table, so why not?) > > Because: > > > Then on boot up, when creating the ftrace tables from the mcount_loc > > table, it will ignore any function that matches the kaslr_offset() > > value. > > Why even do that? Why not just make the mcount_loc table be proper in > the first place. I was a bit nervous about changing the stop_mcount_loc value. I thought of doing that first, but then I noticed that the value is found by looking at the System.map file and not from the object itself. Changing it in the object will require some more elf parsing. Just zeroing out didn't require that. I'm fine with adding that, but it will take some more elf foo magic, and my time to work on this is coming near its end. To do this, I believe the symbol table will need to be searched for the __stop_mcount_loc. This could be a clean up as well, as I don't really like that the code does a search of System.map, and reading it from the object file may be more robust. Then when we have the values from the object file, we should also be able to modify it. -- Steve
On Thu, 2 Jan 2025 14:45:53 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > > Why even do that? Why not just make the mcount_loc table be proper in > > the first place. > > I was a bit nervous about changing the stop_mcount_loc value. I thought of > doing that first, but then I noticed that the value is found by looking at > the System.map file and not from the object itself. Changing it in the > object will require some more elf parsing. Just zeroing out didn't require > that. I could still zero it out, but then change the start_mcount_loc to the first valid function after the sort. As all the zeros will be at the start. -- Steve
On Thu, 2 Jan 2025 11:30:12 -0800 Linus Torvalds <torvalds@linux-foundation.org> wrote: > Please just do this by sorting non-existent functions at the end, > instead of just zeroing them out. > > That makes the mcount_loc table dense in valid entries. We could then > just rewrite the size of the table (or just add a variable containing > the size, if you don't want to change ELF metadata - but you're > already sorting the table, so why not?) Well, I tried to move the __start_mcount_loc, but it appears that changing the symbol value *after* the linking phase does nothing :-p The references to it have already been resolved. The Elf_Rel* will do the updates from then on, and to read those, becomes architecture dependent. I guess the next thing I could do is to create a "skip" variable that can be modified, and we can skip X entries in the start_mcount_loc. As the start_mcount_loc and stop_mcount_loc (which determines the size of the table) cannot be modified in an architecture independent way. -- Steve
On Thu, 2 Jan 2025 16:44:54 -0500 Steven Rostedt <rostedt@goodmis.org> wrote: > I guess the next thing I could do is to create a "skip" variable that can > be modified, and we can skip X entries in the start_mcount_loc. As the > start_mcount_loc and stop_mcount_loc (which determines the size of the > table) cannot be modified in an architecture independent way. This is on top of the last patch, but I'll likely restructure it and remove the kaslr_offset() part totally. I'd still break it up into two patches. One to remove the reading of the System.map, and then the other to add the skip variable. This version, I add a "ftrace_mcount_skip" that is default to zero, but the sorttable.c code will update it to skip over the functions that it zeroed out. -- Steve diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c index 5963ae76b31a..3193148102e0 100644 --- a/kernel/trace/ftrace.c +++ b/kernel/trace/ftrace.c @@ -7077,20 +7077,6 @@ static int ftrace_process_locs(struct module *mod, continue; } - /* - * At build time, a check is made against: nm -S vmlinux - * to make sure all functions are found within the - * size range of symbols listed by nm. If not, it's likely - * a weak function that was overridden. We do not want those. - * The script will zero them out, but kaslr will still - * update them. If the address is the same as the kaslr_offset() - * then skip the record. - */ - if (addr == kaslr_offset()) { - skipped++; - continue; - } - end_offset = (pg->index+1) * sizeof(pg->records[0]); if (end_offset > PAGE_SIZE << pg->order) { /* We should have allocated enough */ @@ -7761,10 +7747,13 @@ int __init __weak ftrace_dyn_arch_init(void) return 0; } +int ftrace_mcount_skip __initdata; + void __init ftrace_init(void) { extern unsigned long __start_mcount_loc[]; extern unsigned long __stop_mcount_loc[]; + unsigned long *start_addr = __start_mcount_loc; unsigned long count, flags; int ret; @@ -7775,6 +7764,14 @@ void __init ftrace_init(void) goto failed; count = __stop_mcount_loc - __start_mcount_loc; + + /* + * scripts/sorttable.c will set ftrace_mcount_skip to state + * how many functions to skip in the __mcount_loc section. + * These would have been weak functions. + */ + count -= ftrace_mcount_skip; + start_addr += ftrace_mcount_skip; if (!count) { pr_info("ftrace: No functions to be traced?\n"); goto failed; @@ -7783,9 +7780,7 @@ void __init ftrace_init(void) pr_info("ftrace: allocating %ld entries in %ld pages\n", count, DIV_ROUND_UP(count, ENTRIES_PER_PAGE)); - ret = ftrace_process_locs(NULL, - __start_mcount_loc, - __stop_mcount_loc); + ret = ftrace_process_locs(NULL, start_addr, __stop_mcount_loc); if (ret) { pr_warn("ftrace: failed to allocate entries for functions\n"); goto failed; diff --git a/scripts/sorttable.c b/scripts/sorttable.c index d1d52bd12adb..506172898fd8 100644 --- a/scripts/sorttable.c +++ b/scripts/sorttable.c @@ -543,6 +543,7 @@ static pthread_t mcount_sort_thread; struct elf_mcount_loc { Elf_Ehdr *ehdr; Elf_Shdr *init_data_sec; + Elf_Sym *ftrace_skip_sym; uint64_t start_mcount_loc; uint64_t stop_mcount_loc; }; @@ -554,12 +555,19 @@ static void *sort_mcount_loc(void *arg) uint64_t offset = emloc->start_mcount_loc - shdr_addr(emloc->init_data_sec) + shdr_offset(emloc->init_data_sec); uint64_t count = emloc->stop_mcount_loc - emloc->start_mcount_loc; + uint32_t *skip_addr = NULL; unsigned char *start_loc = (void *)emloc->ehdr + offset; + void *end_loc = start_loc + count; + + if (emloc->ftrace_skip_sym) { + offset = sym_value(emloc->ftrace_skip_sym) - + shdr_addr(emloc->init_data_sec) + + shdr_offset(emloc->init_data_sec); + skip_addr = (void *)emloc->ehdr + offset; + } /* zero out any locations not found by function list */ if (function_list_size) { - void *end_loc = start_loc + count; - for (void *ptr = start_loc; ptr < end_loc; ptr += long_size) { uint64_t key; @@ -574,44 +582,65 @@ static void *sort_mcount_loc(void *arg) } qsort(start_loc, count/long_size, long_size, compare_extable); + + /* Now set how many functions need to be skipped */ + for (void *ptr = start_loc; skip_addr && ptr < end_loc; ptr += long_size) { + uint64_t val; + + if (long_size == 4) + val = *(uint32_t *)ptr; + else + val = *(uint64_t *)ptr; + if (val) { + uint32_t skip; + + /* Update the ftrace_mcount_skip to skip these functions */ + val = ptr - (void *)start_loc; + skip = val / long_size; + w(r(&skip), skip_addr); + break; + } + } return NULL; } /* Get the address of __start_mcount_loc and __stop_mcount_loc in System.map */ -static void get_mcount_loc(uint64_t *_start, uint64_t *_stop) +static void get_mcount_loc(struct elf_mcount_loc *emloc, Elf_Shdr *symtab_sec, + const char *strtab) { - FILE *file_start, *file_stop; - char start_buff[20]; - char stop_buff[20]; - int len = 0; + Elf_Sym *sym, *end_sym; + int symentsize = shdr_entsize(symtab_sec); + int found = 0; + + sym = (void *)emloc->ehdr + shdr_offset(symtab_sec); + end_sym = (void *)sym + shdr_size(symtab_sec); + + while (sym < end_sym) { + if (!strcmp(strtab + sym_name(sym), "__start_mcount_loc")) { + emloc->start_mcount_loc = sym_value(sym); + if (++found == 3) + break; + } else if (!strcmp(strtab + sym_name(sym), "__stop_mcount_loc")) { + emloc->stop_mcount_loc = sym_value(sym); + if (++found == 3) + break; + } else if (!strcmp(strtab + sym_name(sym), "ftrace_mcount_skip")) { + emloc->ftrace_skip_sym = sym; + if (++found == 3) + break; + } + sym = (void *)sym + symentsize; + } - file_start = popen(" grep start_mcount System.map | awk '{print $1}' ", "r"); - if (!file_start) { + if (!emloc->start_mcount_loc) { fprintf(stderr, "get start_mcount_loc error!"); return; } - file_stop = popen(" grep stop_mcount System.map | awk '{print $1}' ", "r"); - if (!file_stop) { + if (!emloc->stop_mcount_loc) { fprintf(stderr, "get stop_mcount_loc error!"); - pclose(file_start); return; } - - while (fgets(start_buff, sizeof(start_buff), file_start) != NULL) { - len = strlen(start_buff); - start_buff[len - 1] = '\0'; - } - *_start = strtoul(start_buff, NULL, 16); - - while (fgets(stop_buff, sizeof(stop_buff), file_stop) != NULL) { - len = strlen(stop_buff); - stop_buff[len - 1] = '\0'; - } - *_stop = strtoul(stop_buff, NULL, 16); - - pclose(file_start); - pclose(file_stop); } #else /* MCOUNT_SORT_ENABLED */ static inline int parse_symbols(const char *fname) { return 0; } @@ -647,8 +676,6 @@ static int do_sort(Elf_Ehdr *ehdr, unsigned int shstrndx; #ifdef MCOUNT_SORT_ENABLED struct elf_mcount_loc mstruct = {0}; - uint64_t _start_mcount_loc = 0; - uint64_t _stop_mcount_loc = 0; #endif #if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED) unsigned int orc_ip_size = 0; @@ -686,15 +713,9 @@ static int do_sort(Elf_Ehdr *ehdr, #ifdef MCOUNT_SORT_ENABLED /* locate the .init.data section in vmlinux */ - if (!strcmp(secstrings + idx, ".init.data")) { - get_mcount_loc(&_start_mcount_loc, &_stop_mcount_loc); - mstruct.ehdr = ehdr; + if (!strcmp(secstrings + idx, ".init.data")) mstruct.init_data_sec = shdr; - mstruct.start_mcount_loc = _start_mcount_loc; - mstruct.stop_mcount_loc = _stop_mcount_loc; - } #endif - #if defined(SORTTABLE_64) && defined(UNWINDER_ORC_ENABLED) /* locate the ORC unwind tables */ if (!strcmp(secstrings + idx, ".orc_unwind_ip")) { @@ -736,23 +757,6 @@ static int do_sort(Elf_Ehdr *ehdr, goto out; } #endif - -#ifdef MCOUNT_SORT_ENABLED - if (!mstruct.init_data_sec || !_start_mcount_loc || !_stop_mcount_loc) { - fprintf(stderr, - "incomplete mcount's sort in file: %s\n", - fname); - goto out; - } - - /* create thread to sort mcount_loc concurrently */ - if (pthread_create(&mcount_sort_thread, NULL, &sort_mcount_loc, &mstruct)) { - fprintf(stderr, - "pthread_create mcount_sort_thread failed '%s': %s\n", - strerror(errno), fname); - goto out; - } -#endif if (!extab_sec) { fprintf(stderr, "no __ex_table in file: %s\n", fname); goto out; @@ -772,6 +776,26 @@ static int do_sort(Elf_Ehdr *ehdr, strtab = (const char *)ehdr + shdr_offset(strtab_sec); symtab = (const Elf_Sym *)((const char *)ehdr + shdr_offset(symtab_sec)); +#ifdef MCOUNT_SORT_ENABLED + mstruct.ehdr = ehdr; + get_mcount_loc(&mstruct, symtab_sec, strtab); + + if (!mstruct.init_data_sec || !mstruct.start_mcount_loc || !mstruct.stop_mcount_loc) { + fprintf(stderr, + "incomplete mcount's sort in file: %s\n", + fname); + goto out; + } + + /* create thread to sort mcount_loc concurrently */ + if (pthread_create(&mcount_sort_thread, NULL, &sort_mcount_loc, &mstruct)) { + fprintf(stderr, + "pthread_create mcount_sort_thread failed '%s': %s\n", + strerror(errno), fname); + goto out; + } +#endif + if (custom_sort) { custom_sort(extab_image, shdr_size(extab_sec)); } else {