diff mbox series

[POC,RFC] build: Make weak functions visible in kallsyms

Message ID 20241226164957.5cab9f2d@gandalf.local.home (mailing list archive)
State New
Headers show
Series [POC,RFC] build: Make weak functions visible in kallsyms | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Steven Rostedt Dec. 26, 2024, 9:49 p.m. UTC
From: Steven Rostedt <rostedt@goodmis.org>

This is a proof of concept, so the implementation may need some work. The
point here is to see if this is something we want to have. I only made it
work for x86, but there's nothing arch specific except knowing if it's for
32bit or 64bit. I also have to add accommodation for big vs little endian
cross compiling.

Weak functions have been playing havoc with ftrace users for some time.
That's because if a weak function can be traced, it gets an "fentry" or
"mcount" inserted inside of it. This gets added to the the list in the
__mcount_loc section where ftrace will find them all.

But then, when the linker removes these functions because they were
overridden, the code does not disappear, leaving the pointers in the
__mcount_loc locations.

This also played havoc with kallsyms, as it generates the size of functions
by looking at the next function. If the next function was a weak function
that was overridden, the content of that function is joined with the
previous function.

Combining this with ftrace, it use to show the previous function twice (if
it was traced), or worse, if the previous function was notrace, it would
still show up in available_filter_functions list. This made the BPF test
suite fail:

  https://lore.kernel.org/all/20220412094923.0abe90955e5db486b7bca279@kernel.org/

The workaround to that was to check the addresses of functions in this
file, and if it was too far from the start of the function, it was
considered invalid (as fentry and mcount are at the start of functions):

  commit b39181f7c6907 ("ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid
  adding weak function")

But this still an issue with live kernel patching, as it checks the length
of the function, and if the does not match to what it sees before the weak
function disappears it fails to get added:

  https://patchwork.kernel.org/project/linux-trace-kernel/cover/20240723063258.2240610-1-zhengyejian@huaweicloud.com/

  which took an approach to fix this.

This is similar to that approach but also a bit simpler. I simply added a
weakfuncs.c program that gets run on all object files right after they are
built, and appends to the symbol table copies of the weak functions, but
the are not weak. The names are:

  __weak__##function_name##__weak__

This now shows up in kallsyms as:

ffffffffa8e03f20 t rootfs_init_fs_context
ffffffffa8e03f40 T __pfx_wait_for_initramfs
ffffffffa8e03f50 T wait_for_initramfs
ffffffffa8e03fa0 t __pfx_show_mem
ffffffffa8e03fb0 t show_mem
ffffffffa8e03fc0 T __pfx___weak__calibrate_delay_is_known__weak__
ffffffffa8e03fd0 T __weak__calibrate_delay_is_known__weak__
ffffffffa8e03ff0 T __pfx___weak__calibration_delay_done__weak__
ffffffffa8e03ff0 W __pfx_calibration_delay_done
ffffffffa8e04000 T __weak__calibration_delay_done__weak__
ffffffffa8e04000 W calibration_delay_done
ffffffffa8e04010 T __pfx_calibrate_delay
ffffffffa8e04020 T calibrate_delay
ffffffffa8e04630 t __pfx_calibrate_delay_

As well as in the available_filter_functions:

__probestub_initcall_finish
do_one_initcall
__weak__free_initmem__weak__
run_init_process
trace_initcall_start_cb
trace_initcall_finish_cb
rootfs_init_fs_context
wait_for_initramfs
__weak__calibrate_delay_is_known__weak__
__weak__calibration_delay_done__weak__
calibrate_delay

This makes kallsyms work properly now, as it does see the weak functions
and will not consider their code as part of the previous function.

Thoughts?

Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 scripts/Makefile       |   1 +
 scripts/Makefile.build |  13 ++
 scripts/Makefile.lib   |   1 +
 scripts/weakfuncs.c    | 489 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 504 insertions(+)
 create mode 100644 scripts/weakfuncs.c

Comments

Linus Torvalds Dec. 26, 2024, 11:01 p.m. UTC | #1
On Thu, 26 Dec 2024 at 13:49, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> But then, when the linker removes these functions because they were
> overridden, the code does not disappear, leaving the pointers in the
> __mcount_loc locations.

This seems entirely unrelated to weak functions, and will be true for
any other "linker removed it" (which can happen for other reasons
too).

So your "fix" seems to be hacking around a symptom.

And honestly, the kallsyms argument seems bogus too. The problem with
kallsyms is that it looks up the size the wrong way. Making up new
function names doesn't fix the problem, it - once again - just hacks
around the symptom of doing something wrong.

Christ, kallsyms looking at nm output and going by "next symbol" was
always bogus, but I think that's how the old a.out format worked
originally.

But "nm" literally takes a "-S" argument. We just don't use it.

So I think the fix is literally to just make kallsysms have the size
data. Of course, very annoyingly out /proc/kallsyms file format also
tracks the (legacy) nm output that doesn't have size information.

But I do think that if you hit real problems, you need to fix the
*source* of the issue, not add another ugly hack around things.

             Linus
Steven Rostedt Dec. 27, 2024, 2:19 a.m. UTC | #2
On Thu, 26 Dec 2024 15:01:07 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 26 Dec 2024 at 13:49, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > But then, when the linker removes these functions because they were
> > overridden, the code does not disappear, leaving the pointers in the
> > __mcount_loc locations.  
> 
> This seems entirely unrelated to weak functions, and will be true for
> any other "linker removed it" (which can happen for other reasons
> too).
> 
> So your "fix" seems to be hacking around a symptom.

Yeah, that's why this was just a POC.

> 
> And honestly, the kallsyms argument seems bogus too. The problem with
> kallsyms is that it looks up the size the wrong way. Making up new
> function names doesn't fix the problem, it - once again - just hacks
> around the symptom of doing something wrong.
> 
> Christ, kallsyms looking at nm output and going by "next symbol" was
> always bogus, but I think that's how the old a.out format worked
> originally.
> 
> But "nm" literally takes a "-S" argument. We just don't use it.
> 
> So I think the fix is literally to just make kallsysms have the size
> data. Of course, very annoyingly out /proc/kallsyms file format also
> tracks the (legacy) nm output that doesn't have size information.
> 
> But I do think that if you hit real problems, you need to fix the
> *source* of the issue, not add another ugly hack around things.

So basically the real solution is to fix kallsyms to know about the end
of functions. Peter Zijlstra mentioned that before, but it would take a
bit more work and understanding of kallsyms to fix it properly.

I'm fine not doing the hack and hopefully one day someone will have the
time to fix kallsyms. This was just something I could do quickly,
knowing it is mostly keeping with the status quo and not actually
fixing the root of the issue. I also needed to refresh my ELF parsing
abilities ;-)

I may take a look at kallsyms internals. I have some spare time before
the new year to try and work on things I don't normally get time to
work on.

-- Steve
Masahiro Yamada Dec. 27, 2024, 2:52 a.m. UTC | #3
On Fri, Dec 27, 2024 at 11:19 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 26 Dec 2024 15:01:07 -0800
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> > On Thu, 26 Dec 2024 at 13:49, Steven Rostedt <rostedt@goodmis.org> wrote:
> > >
> > > But then, when the linker removes these functions because they were
> > > overridden, the code does not disappear, leaving the pointers in the
> > > __mcount_loc locations.
> >
> > This seems entirely unrelated to weak functions, and will be true for
> > any other "linker removed it" (which can happen for other reasons
> > too).
> >
> > So your "fix" seems to be hacking around a symptom.
>
> Yeah, that's why this was just a POC.
>
> >
> > And honestly, the kallsyms argument seems bogus too. The problem with
> > kallsyms is that it looks up the size the wrong way. Making up new
> > function names doesn't fix the problem, it - once again - just hacks
> > around the symptom of doing something wrong.
> >
> > Christ, kallsyms looking at nm output and going by "next symbol" was
> > always bogus, but I think that's how the old a.out format worked
> > originally.
> >
> > But "nm" literally takes a "-S" argument. We just don't use it.
> >
> > So I think the fix is literally to just make kallsysms have the size
> > data. Of course, very annoyingly out /proc/kallsyms file format also
> > tracks the (legacy) nm output that doesn't have size information.
> >
> > But I do think that if you hit real problems, you need to fix the
> > *source* of the issue, not add another ugly hack around things.
>
> So basically the real solution is to fix kallsyms to know about the end
> of functions. Peter Zijlstra mentioned that before, but it would take a
> bit more work and understanding of kallsyms to fix it properly.
>
> I'm fine not doing the hack and hopefully one day someone will have the
> time to fix kallsyms. This was just something I could do quickly,
> knowing it is mostly keeping with the status quo and not actually
> fixing the root of the issue. I also needed to refresh my ELF parsing
> abilities ;-)
>
> I may take a look at kallsyms internals. I have some spare time before
> the new year to try and work on things I don't normally get time to
> work on.
>
> -- Steve


So, does the kallsyms patch set from Zheng Yejian fix the issues?

It was a long time ago that I reviewed his v1.
I need to take a look if we decide to go with that approach.
Linus Torvalds Dec. 27, 2024, 2:58 a.m. UTC | #4
On Thu, 26 Dec 2024 at 18:19, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> So basically the real solution is to fix kallsyms to know about the end
> of functions. Peter Zijlstra mentioned that before, but it would take a
> bit more work and understanding of kallsyms to fix it properly.

Yeah. The kallsyms code *used* to be pretty simple - really just a
list of symbols and addresses.

But it took up a lot of memory, so back in the day (two decades ago by
now) it started growing some name compression code, and then some
serious speedups for lookup.

See

    https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=e10392112d315c45f054c22c862e3a7ae27d17d4

for when it went from basically a very simple array of names to be a
lot less obvious with that table lookup name compression (but it had
_some_ name compression even before that).

That said, I think it's really mainly just the name compression that
is a bit obscure and looks odd, and it's only used for the builtin
kernel symbols (module symbols are still just a per-module array of
"const Elf_Sym *").

And you can actually largely ignore the odd name compression - because
the *rest* is fairly straightforward.

For example, the actual offset of the symbol is simply a plain array
still: kallsyms_offsets[]. It's slightly oddly encoded (see
kallsyms_sym_address() if you care), but that's because it's an array
of 32-bit values used to encode kernel symbol offsets that can
obviously be 64-bit.

Encoding the size of the symbols should be trivial: just add another
array: "kallsyms_sizes[]", and it wouldn't even need that odd
encoding.

So I actually think it *should* be fairly straightforward to do for
anybnody who knows the kallsyms code at all.

The main pain-point would be *if* we want to actually expose the sizes
in /proc/kallsyms. That would be a file format change. Which we can't
do, so we'd have to do it as some kind of strange ioctl setting (or
just add a new file under a new name).

But maybe we don't even need that. If all the uses are in-kernel, just
adding the kallsyms_sizes[] (and accessor helper functions) should be
fairly straightforward.

Of course, I say that without having done it. I might be overlooking something.

           Linus
Linus Torvalds Dec. 27, 2024, 3:03 a.m. UTC | #5
On Thu, 26 Dec 2024 at 18:52, Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> So, does the kallsyms patch set from Zheng Yejian fix the issues?

[ Me goes searching ]

Oh. Instead of adding size information, it seems to add fake "hole"
entries to the kallsyms.

And I guess that works too. It does feel a bit hacky, but at least
it's integrated with the kallsyms logic.

                Linus
Linus Torvalds Dec. 27, 2024, 3:35 a.m. UTC | #6
On Thu, 26 Dec 2024 at 13:49, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> But then, when the linker removes these functions because they were
> overridden, the code does not disappear, leaving the pointers in the
> __mcount_loc locations.

Btw, does this actually happen when the compiler does the mcount thing for us?

It *feels* like this might be a bug in the FTRACE_MCOUNT_USE_OBJTOOL
logic interacting with --gc-setions.

Is the problem that --gc-sections removed the function section that
contained the weak function that was then never used, but the objtool
thing with create_mcount_loc_sections() would generate that
mcount_loc_list and nothing realized that it's no longer there?

Or does it happen even with the compiler-generated case (ie with the
-mrecord-mcount and FTRACE_MCOUNT_USE_CC)?

We can disable LD_DEAD_CODE_DATA_ELIMINATION, if that's what triggers it.

It's marked as experimental, and it does smell like either
--gc-sections is buggy, or we're doing something wrong to trigger it
(and I could easily see objtool rewriting object files being that
trigger...)

                  Linus
Steven Rostedt Dec. 27, 2024, 3:45 a.m. UTC | #7
On Thu, 26 Dec 2024 19:35:18 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 26 Dec 2024 at 13:49, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > But then, when the linker removes these functions because they were
> > overridden, the code does not disappear, leaving the pointers in the
> > __mcount_loc locations.  
> 
> Btw, does this actually happen when the compiler does the mcount thing for us?

Yes.

I believe the issue is that the mcount_loc is created during the
compile phase, and it just points to the call to fentry/mcount. The
linker phase doesn't remove the code, just the symbols that are
overridden. That means the pointer to the fentry/mcount calls still
point to the same locations, as the code is still there.

I even sent an email about this to the gcc folks, and Peter responded
basically explaining the above.

  https://lore.kernel.org/all/20241014210841.5a4764de@gandalf.local.home/

-- Steve
Linus Torvalds Dec. 27, 2024, 6:09 p.m. UTC | #8
On Thu, 26 Dec 2024 at 19:45, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> >
> > Btw, does this actually happen when the compiler does the mcount thing for us?
>
> Yes.

Ok, that's actually good.

I'm not really worried about the "unused symbols aren't in kallsyms"
issue, even if it confuses the mcount logic. THAT confusion is easy to
deal with by either adding symbol size information (which I think
would be a good thing in general, although perhaps not worth it).

Even without the symbol size, the mcount issue can be dealt with by
just knowing that the mcount location has to be at the very beginning
of the function and just taking the offset - that we already do have -
into account.

I was more worried that there might also be some much deeper confusion
with the linker actually garbage collecting the unused weak function
code away, and now an unused symbol that kallsyms doesn't know about
wouldn't just have an unexpected mcount pointer to it, but the mcount
pointer would actually be stale and point to some unrelated code.

So as long as *that* isn't what is happening, this all seems fairly benign.

               Linus
Steven Rostedt Dec. 27, 2024, 7:19 p.m. UTC | #9
On Fri, 27 Dec 2024 10:09:40 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 26 Dec 2024 at 19:45, Steven Rostedt <rostedt@goodmis.org> wrote:
> >  
> > >
> > > Btw, does this actually happen when the compiler does the mcount thing for us?  
> >
> > Yes.  
> 
> Ok, that's actually good.
> 
> I'm not really worried about the "unused symbols aren't in kallsyms"
> issue, even if it confuses the mcount logic. THAT confusion is easy to
> deal with by either adding symbol size information (which I think
> would be a good thing in general, although perhaps not worth it).
> 
> Even without the symbol size, the mcount issue can be dealt with by
> just knowing that the mcount location has to be at the very beginning
> of the function and just taking the offset - that we already do have -
> into account.

This is actually what we do today. A check is made against the location of
the fentry/mcount, and if it's too far away from the function entry, it is
considered a weak function. The distance is architecture dependent, as well
as options (adding ENDBR and such needs to be accounted for).

Now we still need to list them in the available_filter_functions, as the
order of what's in mcount_loc is used by the set_filter_function, and if we
"hide" any, it can screw up the accounting, because you can enable
functions via the index into that file. Using an index is an O(1)
operation, where as by name it needs to search all addresses and call
kallsyms for a name look up and then compare to what was passed in. Tooling
uses the index, as it's much faster to enable several functions (which by
name turns into an O(n^2) operation). Enabling a thousand functions by name
can take over a minute to complete, whereas by index takes less than a
second.

For now, we have in available_filter_functions:

  trace_initcall_finish_cb
  initcall_blacklisted
  do_one_initcall
  __ftrace_invalid_address___708
  rootfs_init_fs_context
  wait_for_initramfs
  __ftrace_invalid_address___68
  calibration_delay_done
  calibrate_delay

Where those "__ftrace_invalid_address_*" is detected as a weak function.
This is ugly and a hack, but we still need a place holder for them :-(
I would love to find a better way to handle this.

Now, the .mcount_loc is sorted at build time. If there's a way to find
which addresses are no longer visible, then that code could possibly remove
them from the mcount_loc. Perhaps, it could do a 'nm -S' and check to make
sure that all addresses are within a listed size, and if not, remove it.
That way, we could get rid of those place holders.

> 
> I was more worried that there might also be some much deeper confusion
> with the linker actually garbage collecting the unused weak function
> code away, and now an unused symbol that kallsyms doesn't know about
> wouldn't just have an unexpected mcount pointer to it, but the mcount
> pointer would actually be stale and point to some unrelated code.
> 
> So as long as *that* isn't what is happening, this all seems fairly benign.

For now, the mcount_loc is just an annoyance as we have those ugly place
holders. The real bug is with live kernel patching, where there's a
mismatch in the function size. IIUC, the building of the module for the
live patching finds the size of the function it's patching with 'nm'. And
then before patching, asks kallsyms for the size of the function. Because
kallsyms uses the next function to determine the size, it returns the size
of the function plus the size of the weak function. There's a mismatch and
live kernel patching refuses to patch the function.

By adding the actual size of the function to kallsyms, that would fix the
problem. The size doesn't need to be exported to /proc/kallsyms, as you
mentioned before, all users happen to be in-kernel.

The one downsize of storing the numbers, is that we will need to store a
size for every function, and that can add up.

 $ wc -l /proc/kallsyms 
 208784 /proc/kallsyms

That's 208 thousand numbers to store!

Perhaps this is the one advantage of having a weak placeholder in kallsyms
as well. It may save memory.

-- Steve
diff mbox series

Patch

diff --git a/scripts/Makefile b/scripts/Makefile
index 6bcda4b9d054..c1336c0baf6e 100644
--- a/scripts/Makefile
+++ b/scripts/Makefile
@@ -5,6 +5,7 @@ 
 
 hostprogs-always-$(CONFIG_KALLSYMS)			+= kallsyms
 hostprogs-always-$(BUILD_C_RECORDMCOUNT)		+= recordmcount
+hostprogs-always-y					+= weakfuncs
 hostprogs-always-$(CONFIG_BUILDTIME_TABLE_SORT)		+= sorttable
 hostprogs-always-$(CONFIG_ASN1)				+= asn1_compiler
 hostprogs-always-$(CONFIG_MODULE_SIG_FORMAT)		+= sign-file
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index c16e4cf54d77..0b6c3b9de28b 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -148,6 +148,19 @@  cmd_gen_symversions_c =	$(call gen_symversions,c)
 
 endif
 
+# Due to recursion, we must skip empty.o.
+# The empty.o file is created in the make process in order to determine
+# the target endianness and word size. It is made before all other C
+# files, including recordmcount.
+cmd_weakfuncs =					\
+	if [ $(@) != "scripts/mod/empty.o" ]; then	\
+		$(objtree)/scripts/weakfuncs "$(@)";	\
+	fi;
+weakfuncs_source := $(srctree)/scripts/weakfuncs.c
+$(obj)/%.o: $(obj)/%.c $(weakfuncs_source) FORCE
+	$(call if_changed_rule,cc_o_c)
+	$(call cmd,force_checksrc)
+
 ifdef CONFIG_FTRACE_MCOUNT_USE_RECORDMCOUNT
 # compiler will not generate __mcount_loc use recordmcount or recordmcount.pl
 ifdef BUILD_C_RECORDMCOUNT
diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 7395200538da..c20659251914 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -321,6 +321,7 @@  define rule_cc_o_c
 	$(call cmd,gen_objtooldep)
 	$(call cmd,gen_symversions_c)
 	$(call cmd,record_mcount)
+	$(call cmd,weakfuncs)
 	$(call cmd,warn_shared_object)
 endef
 
diff --git a/scripts/weakfuncs.c b/scripts/weakfuncs.c
new file mode 100644
index 000000000000..4051720dde49
--- /dev/null
+++ b/scripts/weakfuncs.c
@@ -0,0 +1,489 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <stdbool.h>
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <unistd.h>
+#include <fcntl.h>
+#include <elf.h>
+
+#include <sys/types.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+
+#define APPEND_STR "__weak__"
+
+/* Will append APPEND_STR to the start and end */
+#define APPEND_STR_SIZE 16
+
+#define ALIGN(x, y) (((x) + (y) - 1) / (y) * (y))
+
+typedef union {
+	Elf32_Ehdr	e32;
+	Elf64_Ehdr	e64;
+} Elf_Ehdr;
+
+typedef union {
+	Elf32_Shdr	s32;
+	Elf64_Shdr	s64;
+} Elf_Shdr;
+
+typedef union {
+	Elf32_Sym	s32;
+	Elf64_Sym	s64;
+} Elf_Sym;
+
+struct elf {
+	void 		*map;
+	Elf_Ehdr	*ehdr;
+	Elf_Shdr	*shsym;
+	Elf_Shdr	*shstr;
+	const char	*shstrings;
+	const char	*strings;
+	char		*newstrings;
+	off_t		file_size;
+	off_t		newsym;
+	off_t		symend;
+	off_t		newstrstart;
+	size_t		origstrsize;
+	size_t		newstrend;
+	size_t		newstrsize;
+	int		fd;
+	bool		is_32;
+};
+
+static inline int e_shnum(struct elf *elf)
+{
+	if (elf->is_32)
+		return elf->ehdr->e32.e_shnum;
+	else
+		return elf->ehdr->e64.e_shnum;
+}
+
+static inline int e_shstrndx(struct elf *elf)
+{
+	if (elf->is_32)
+		return elf->ehdr->e32.e_shstrndx;
+	else
+		return elf->ehdr->e64.e_shstrndx;
+}
+
+static inline int sh_type(struct elf *elf, Elf_Shdr *shdr)
+{
+	if (elf->is_32)
+		return shdr->s32.sh_type;
+	else
+		return shdr->s64.sh_type;
+}
+
+static inline size_t sh_offset(struct elf *elf, Elf_Shdr *shdr)
+{
+	if (elf->is_32)
+		return shdr->s32.sh_offset;
+	else
+		return shdr->s64.sh_offset;
+}
+
+static inline size_t sh_size(struct elf *elf, Elf_Shdr *shdr)
+{
+	if (elf->is_32)
+		return shdr->s32.sh_size;
+	else
+		return shdr->s64.sh_size;
+}
+
+static inline size_t sh_entsize(struct elf *elf, Elf_Shdr *shdr)
+{
+	if (elf->is_32)
+		return shdr->s32.sh_entsize;
+	else
+		return shdr->s64.sh_entsize;
+}
+
+static inline const char *sh_name(struct elf *elf, Elf_Shdr *shdr)
+{
+	if (elf->is_32)
+		return elf->shstrings + shdr->s32.sh_name;
+	else
+		return elf->shstrings + shdr->s64.sh_name;
+}
+
+static inline const char *sym_name(struct elf *elf, Elf_Sym *sym)
+{
+	if (elf->is_32)
+		return elf->strings + sym->s32.st_name;
+	else
+		return elf->strings + sym->s64.st_name;
+}
+
+static inline int sym_bind(struct elf *elf, Elf_Sym *sym)
+{
+	if (elf->is_32)
+		return ELF32_ST_BIND(sym->s32.st_info);
+	else
+		return ELF64_ST_BIND(sym->s64.st_info);
+}
+
+static inline size_t sym_value(struct elf *elf, Elf_Sym *sym)
+{
+	if (elf->is_32)
+		return sym->s32.st_value;
+	else
+		return sym->s64.st_value;
+}
+
+static inline int sym_size(struct elf *elf, Elf_Sym *sym)
+{
+	if (elf->is_32)
+		return sym->s32.st_size;
+	else
+		return sym->s64.st_size;
+}
+
+static inline int sym_shndx(struct elf *elf, Elf_Sym *sym)
+{
+	if (elf->is_32)
+		return sym->s32.st_shndx;
+	else
+		return sym->s64.st_shndx;
+}
+
+static inline Elf_Shdr *get_shdr(struct elf *elf, int index)
+{
+	if (elf->is_32) {
+		return elf->map + elf->ehdr->e32.e_shoff +
+			(index * elf->ehdr->e32.e_shentsize);
+	} else {
+		return elf->map + elf->ehdr->e64.e_shoff +
+			(index * elf->ehdr->e64.e_shentsize);
+	}
+}
+
+static void update_sym_bind(struct elf *elf, Elf_Sym *sym)
+{
+	int t;
+
+	if (elf->is_32) {
+		t = ELF32_ST_TYPE(sym->s32.st_info);
+		sym->s32.st_info = ELF32_ST_INFO(STB_GLOBAL, t);
+	} else {
+		t = ELF64_ST_TYPE(sym->s64.st_info);
+		sym->s64.st_info = ELF64_ST_INFO(STB_GLOBAL, t);
+	}	
+}
+
+static void update_sym_name(struct elf *elf, Elf_Sym *sym, size_t idx)
+{
+	if (elf->is_32)
+		sym->s32.st_name = idx;
+	else
+		sym->s64.st_name = idx;
+}
+
+static Elf_Shdr *find_sym_table(struct elf *elf)
+{
+	Elf_Shdr *shdr;
+
+	for (int i = 0; i < e_shnum(elf); i++) {
+		shdr = get_shdr(elf, i);
+		if (sh_type(elf, shdr) == SHT_SYMTAB)
+			return shdr;
+	}
+	return NULL;
+}
+
+static const char *find_strings(struct elf *elf)
+{
+	Elf_Shdr *shdr;
+
+	shdr = get_shdr(elf, e_shstrndx(elf));
+	elf->shstrings = elf->map + sh_offset(elf, shdr);
+
+	for (int i = 0; i < e_shnum(elf); i++) {
+		shdr = get_shdr(elf, i);
+		if (sh_type(elf, shdr) != SHT_STRTAB)
+			continue;
+
+		if (!strcmp(sh_name(elf, shdr), ".strtab")) {
+			elf->shstr = shdr;
+			elf->origstrsize = sh_size(elf, shdr);
+			return elf->map + sh_offset(elf, shdr);
+		}
+	}
+	return NULL;
+}
+
+static int add_string(struct elf *elf, const char *str)
+{
+	size_t size;
+	int len = strlen(str);
+	char *newstr;
+
+	size = elf->newstrsize + len + APPEND_STR_SIZE + 1;
+	newstr = realloc(elf->newstrings, size);
+	if (!newstr)
+		return -1;
+
+	elf->newstrings = newstr;
+
+	newstr += elf->newstrsize;
+
+	memcpy(newstr, APPEND_STR, APPEND_STR_SIZE / 2);
+	newstr += APPEND_STR_SIZE / 2;
+
+	memcpy(newstr, str, len);
+	newstr += len;
+
+	memcpy(newstr, APPEND_STR, APPEND_STR_SIZE / 2 + 1);
+
+	elf->newstrsize += len + APPEND_STR_SIZE + 1;
+
+	return 0;
+}
+
+static int process_weak_func(struct elf *elf, Elf_Sym *sym)
+{
+	static off_t symsize;
+	static void *buf;
+	const char *str;
+	Elf_Shdr *shdr;
+	off_t end;
+	ssize_t r;
+
+	shdr = get_shdr(elf, sym_shndx(elf, sym));
+	/* Make sure the section is executable */
+	if (sh_type(elf, shdr) != SHT_PROGBITS)
+		return 0;
+
+	if (!symsize) {
+		symsize = sh_entsize(elf, elf->shsym);
+		buf = calloc(1, symsize);
+		if (!buf) {
+			perror("Allocated temp buffer");
+			return -1;
+		}
+	}
+
+	if (!elf->newsym) {
+		void *symtab;
+		size_t size;
+
+		/* copy the symbol table */
+		end = elf->file_size;
+		/* Make aligned by symbol entry size */
+		end = ALIGN(end, symsize);
+
+		if (lseek(elf->fd, end, SEEK_SET) != end) {
+			perror("lseek");
+			return -1;
+		}
+
+		elf->newsym = end;
+		symtab = elf->map + sh_offset(elf, elf->shsym);
+		size = sh_size(elf, elf->shsym);
+		r = write(elf->fd, symtab, size);
+		if (r != size) {
+			perror("Copying symtable");
+			return -1;
+		}
+
+		/* Make sure it's still aligned */
+		end = lseek(elf->fd, 0, SEEK_END);
+		if (end < 0) {
+			perror("Getting new end of file");
+			return -1;
+		}
+
+		elf->symend = end;
+		end = ALIGN(end, symsize);
+		if (end != elf->symend) {
+			/* Add padding */
+			r = write(elf->fd, buf, end - elf->symend);
+			free(buf);
+			if (r < 0) {
+				perror("Adding padding");
+				return -1;
+			}
+			elf->symend = end;
+		}
+		elf->newstrend = elf->origstrsize;
+	}
+
+	memcpy(buf, sym, symsize);
+	sym = buf;
+	update_sym_bind(elf, sym);
+	str = sym_name(elf, sym);
+	update_sym_name(elf, sym, elf->newstrend);
+	elf->newstrend += strlen(str) + APPEND_STR_SIZE + 1;
+
+	if (add_string(elf, str) < 0) {
+		perror("Failed adding new string");
+		return -1;
+	}
+
+	r = write(elf->fd, sym, symsize);
+	if (r < 0) {
+		perror("Creating new symbol");
+		return -1;
+	}
+	elf->symend += symsize;
+
+	return 0;
+}
+
+static int add_new_strings(struct elf *elf)
+{
+	off_t end;
+	size_t r;
+
+	end = lseek(elf->fd, 0, SEEK_END);
+	if (end < 0) {
+		perror("lseek");
+		return -1;
+	}
+	elf->newstrstart = end;
+
+	r = write(elf->fd, elf->strings, elf->origstrsize);
+	if (r != elf->origstrsize) {
+		perror("Appending copy of string section");
+		return -1;
+	}
+
+	r = write(elf->fd, elf->newstrings, elf->newstrsize);
+	if (r != elf->newstrsize) {
+		perror("Appending new strings to section");
+		return -1;
+	}
+	
+	end = lseek(elf->fd, 0, SEEK_END);
+	if (end < 0) {
+		perror("lseek");
+		return -1;
+	}
+	elf->newstrend = end;
+
+	return 0;
+}
+
+static void update_shdr(struct elf *elf, Elf_Shdr *shdr,
+			size_t offset, size_t size)
+{
+	if (elf->is_32) {
+		shdr->s32.sh_offset = offset;
+		shdr->s32.sh_size = size;
+	} else {
+		shdr->s64.sh_offset = offset;
+		shdr->s64.sh_size = size;
+	}
+}
+
+static void update_elf(struct elf *elf)
+{
+	update_shdr(elf, elf->shsym, elf->newsym, elf->symend - elf->newsym);
+	update_shdr(elf, elf->shstr, elf->newstrstart,
+		    elf->newstrend - elf->newstrstart);
+}
+
+static int add_weak_funcs(struct elf *elf)
+{
+	Elf_Shdr *sh_sym = find_sym_table(elf);
+	Elf_Sym *sym;
+	void *end_sym;
+
+	if (!sh_sym) {
+		fprintf(stderr, "Could not find symbol table\n");
+		return -1;
+	}
+
+	elf->shsym = sh_sym;
+	elf->strings = find_strings(elf);
+	if (!elf->strings) {
+		fprintf(stderr, "Could not find string table\n");
+		return -1;
+	}
+
+	sym = elf->map + sh_offset(elf, sh_sym);
+	end_sym = elf->map + sh_offset(elf, sh_sym) + sh_size(elf, sh_sym);
+
+	while ((void *)sym < end_sym) {
+		if (sym_bind(elf, sym) == STB_WEAK) {
+			if (process_weak_func(elf, sym) < 0)
+				return -1;
+		}
+		sym = (void *)sym + sh_entsize(elf, sh_sym);
+	}
+
+	if (elf->newstrings) {
+		if (add_new_strings(elf) < 0)
+			return -1;
+		update_elf(elf);
+	}
+
+	return 0;
+}
+
+static int handle_weak_funcs(const char *file)
+{
+	struct elf elf = {};
+	struct stat st;
+	void *map;
+	int ret = -1;
+	int fd;
+
+	if (stat(file, &st) < 0) {
+		perror(file);
+		return -1;
+	}
+
+	fd = open(file, O_RDWR);
+	if (fd < 0) {
+		perror(file);
+		return -1;
+	}
+
+	map = mmap(NULL, st.st_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
+	if (map == MAP_FAILED) {
+		perror("mmap");
+		return -1;
+	}
+
+	elf.map = map;
+	elf.ehdr = map;
+	elf.fd = fd;
+	elf.file_size = st.st_size;
+
+	switch (elf.ehdr->e32.e_machine) {
+	case EM_386:
+		elf.is_32 = true;
+		break;
+	case EM_X86_64:
+		elf.is_32 = false;
+		break;
+	default:
+		fprintf(stderr, "ELF type %d is unsupported\n", elf.ehdr->e32.e_machine);
+		goto out;
+	}
+
+	ret = add_weak_funcs(&elf);
+	ret = 0;
+ out:
+	munmap(map, st.st_size);
+	close(fd);
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	if (argc < 2) {
+		fprintf(stderr, "usage: weakfunc file.o ...\n");
+		exit(-1);
+	}
+
+	for (int i = 1; i < argc; i++) {
+		int ret = handle_weak_funcs(argv[i]);
+
+		if (ret < 0)
+			exit(ret);
+	}
+}