Message ID | 20230615145607.3469985-1-revest@chromium.org (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | BPF |
Headers | show |
Series | [bpf] bpf/btf: Accept function names that contain dots | expand |
On Thu, Jun 15, 2023 at 4:56 PM Florent Revest <revest@chromium.org> wrote: > > When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM > leaves DWARF tags for the "asan.module_ctor" & co symbols. To be fair I can't tell if this is an LLVM bug. It's sort of curious that with LLVM_IAS=1, these debugging symbols are not kept and they are with LLVM_IAS=0 but I don't know what the expected behavior should be and how BTF should deal with it. I'll let people with more context comment on this! :) An easy reproducer is: $ touch pwet.c $ clang -g -fsanitize=kernel-address -c -o pwet.o pwet.c $ llvm-dwarfdump pwet.o | grep module_ctor $ clang -fno-integrated-as -g -fsanitize=kernel-address -c -o pwet.o pwet.c $ llvm-dwarfdump pwet.o | grep module_ctor DW_AT_name ("asan.module_ctor") > In a dramatic turn of event, this BTF verification failure can cause > the netfilter_bpf initialization to fail, causing netfilter_core to > free the netfilter_helper hashmap and netfilter_ftp to trigger a > use-after-free. The risk of u-a-f in netfilter will be addressed > separately To be precise, I meant "netfilter conntrack". I sent the following patch as a more targeted mitigation for the uaf https://lore.kernel.org/netfilter-devel/20230615152918.3484699-1-revest@chromium.org/T/#u
On Thu, 2023-06-15 at 17:44 +0200, Florent Revest wrote: > On Thu, Jun 15, 2023 at 4:56 PM Florent Revest <revest@chromium.org> wrote: > > > > When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM > > leaves DWARF tags for the "asan.module_ctor" & co symbols. > > To be fair I can't tell if this is an LLVM bug. It's sort of curious > that with LLVM_IAS=1, these debugging symbols are not kept and they > are with LLVM_IAS=0 but I don't know what the expected behavior should > be and how BTF should deal with it. I'll let people with more context > comment on this! :) > > An easy reproducer is: > > $ touch pwet.c > > $ clang -g -fsanitize=kernel-address -c -o pwet.o pwet.c > $ llvm-dwarfdump pwet.o | grep module_ctor > > $ clang -fno-integrated-as -g -fsanitize=kernel-address -c -o pwet.o pwet.c > $ llvm-dwarfdump pwet.o | grep module_ctor > DW_AT_name ("asan.module_ctor") Interestingly, I am unable to reproduce it using either clang version 14.0.0-1ubuntu1 or clang main (bd66f4b1da30). > > > In a dramatic turn of event, this BTF verification failure can cause > > the netfilter_bpf initialization to fail, causing netfilter_core to > > free the netfilter_helper hashmap and netfilter_ftp to trigger a > > use-after-free. The risk of u-a-f in netfilter will be addressed > > separately > > To be precise, I meant "netfilter conntrack". > > I sent the following patch as a more targeted mitigation for the uaf > https://lore.kernel.org/netfilter-devel/20230615152918.3484699-1-revest@chromium.org/T/#u >
On Thu, Jun 15, 2023 at 7:56 AM Florent Revest <revest@chromium.org> wrote: > > When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM > leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, > pahole creates BTF_KIND_FUNC entries for these and this makes the BTF > metadata validation fail because they contain a dot. > > In a dramatic turn of event, this BTF verification failure can cause > the netfilter_bpf initialization to fail, causing netfilter_core to > free the netfilter_helper hashmap and netfilter_ftp to trigger a > use-after-free. The risk of u-a-f in netfilter will be addressed > separately but the existence of "asan.module_ctor" debug info under some > build conditions sounds like a good enough reason to accept functions > that contain dots in BTF. I don't see much harm in allowing dots. There are also all those .isra and other modifications to functions that we currently don't have in BTF, but with the discussions about recording function addrs we might eventually have those as well. So: Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Cc: stable@vger.kernel.org > Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") > Signed-off-by: Florent Revest <revest@chromium.org> > --- > kernel/bpf/btf.c | 20 ++++++++------------ > 1 file changed, 8 insertions(+), 12 deletions(-) > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > index 6b682b8e4b50..72b32b7cd9cd 100644 > --- a/kernel/bpf/btf.c > +++ b/kernel/bpf/btf.c > @@ -744,13 +744,12 @@ static bool btf_name_offset_valid(const struct btf *btf, u32 offset) > return offset < btf->hdr.str_len; > } > > -static bool __btf_name_char_ok(char c, bool first, bool dot_ok) > +static bool __btf_name_char_ok(char c, bool first) > { > if ((first ? !isalpha(c) : > !isalnum(c)) && > c != '_' && > - ((c == '.' && !dot_ok) || > - c != '.')) > + c != '.') > return false; > return true; > } > @@ -767,20 +766,20 @@ static const char *btf_str_by_offset(const struct btf *btf, u32 offset) > return NULL; > } > > -static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) > +static bool __btf_name_valid(const struct btf *btf, u32 offset) > { > /* offset must be valid */ > const char *src = btf_str_by_offset(btf, offset); > const char *src_limit; > > - if (!__btf_name_char_ok(*src, true, dot_ok)) > + if (!__btf_name_char_ok(*src, true)) > return false; > > /* set a limit on identifier length */ > src_limit = src + KSYM_NAME_LEN; > src++; > while (*src && src < src_limit) { > - if (!__btf_name_char_ok(*src, false, dot_ok)) > + if (!__btf_name_char_ok(*src, false)) > return false; > src++; > } > @@ -788,17 +787,14 @@ static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) > return !*src; > } > > -/* Only C-style identifier is permitted. This can be relaxed if > - * necessary. > - */ > static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) > { > - return __btf_name_valid(btf, offset, false); > + return __btf_name_valid(btf, offset); > } > > static bool btf_name_valid_section(const struct btf *btf, u32 offset) > { > - return __btf_name_valid(btf, offset, true); > + return __btf_name_valid(btf, offset); > } > > static const char *__btf_name_by_offset(const struct btf *btf, u32 offset) > @@ -4422,7 +4418,7 @@ static s32 btf_var_check_meta(struct btf_verifier_env *env, > } > > if (!t->name_off || > - !__btf_name_valid(env->btf, t->name_off, true)) { > + !__btf_name_valid(env->btf, t->name_off)) { > btf_verifier_log_type(env, t, "Invalid name"); > return -EINVAL; > } > -- > 2.41.0.162.gfafddb0af9-goog >
On Thu, Jun 15, 2023 at 7:05 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > On Thu, 2023-06-15 at 17:44 +0200, Florent Revest wrote: > > An easy reproducer is: > > > > $ touch pwet.c > > > > $ clang -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > $ llvm-dwarfdump pwet.o | grep module_ctor > > > > $ clang -fno-integrated-as -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > $ llvm-dwarfdump pwet.o | grep module_ctor > > DW_AT_name ("asan.module_ctor") > > Interestingly, I am unable to reproduce it using either > clang version 14.0.0-1ubuntu1 or clang main (bd66f4b1da30). Somehow, I didn't think of trying other clang versions! Thanks, that's a good point Eduard. :) I also can't reproduce it on a 14x build. However, I seem to be able to reproduce it on main: git clone https://github.com/llvm/llvm-project.git mkdir llvm-project/build cd llvm-project/build git checkout bd66f4b1da30 cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G "Unix Makefiles" ../llvm make -j $(nproc) bin/clang -fno-integrated-as -g -fsanitize=kernel-address -c -o ~/pwet.o ~/pwet.c bin/llvm-dwarfdump ~/pwet.o | grep module_ctor # Shows module_ctor I started a bisection, hopefully that will point to something interesting
On Mon, Jun 19, 2023 at 1:20 PM Florent Revest <revest@chromium.org> wrote: > > On Thu, Jun 15, 2023 at 7:05 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > On Thu, 2023-06-15 at 17:44 +0200, Florent Revest wrote: > > > An easy reproducer is: > > > > > > $ touch pwet.c > > > > > > $ clang -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > > $ llvm-dwarfdump pwet.o | grep module_ctor > > > > > > $ clang -fno-integrated-as -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > > $ llvm-dwarfdump pwet.o | grep module_ctor > > > DW_AT_name ("asan.module_ctor") > > > > Interestingly, I am unable to reproduce it using either > > clang version 14.0.0-1ubuntu1 or clang main (bd66f4b1da30). > > Somehow, I didn't think of trying other clang versions! Thanks, that's > a good point Eduard. :) > > I also can't reproduce it on a 14x build. > > However, I seem to be able to reproduce it on main: > > git clone https://github.com/llvm/llvm-project.git > mkdir llvm-project/build > cd llvm-project/build > git checkout bd66f4b1da30 > cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G > "Unix Makefiles" ../llvm > make -j $(nproc) > > bin/clang -fno-integrated-as -g -fsanitize=kernel-address -c -o > ~/pwet.o ~/pwet.c > bin/llvm-dwarfdump ~/pwet.o | grep module_ctor > # Shows module_ctor > > I started a bisection, hopefully that will point to something interesting The bisection pointed to a LLVM patch from Nick in October 2022: e3bb359aacdd ("[clang][Toolchains][Gnu] pass -g through to assembler") Based on the context I have, that commit sounds fair enough. I don't think LLVM does anything wrong here, it seems like BPF should be the one dealing with dots in function debug info.
On Fri, Jun 16, 2023 at 6:57 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Thu, Jun 15, 2023 at 7:56 AM Florent Revest <revest@chromium.org> wrote: > > > > When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM > > leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, > > pahole creates BTF_KIND_FUNC entries for these and this makes the BTF > > metadata validation fail because they contain a dot. > > > > In a dramatic turn of event, this BTF verification failure can cause > > the netfilter_bpf initialization to fail, causing netfilter_core to > > free the netfilter_helper hashmap and netfilter_ftp to trigger a > > use-after-free. The risk of u-a-f in netfilter will be addressed > > separately but the existence of "asan.module_ctor" debug info under some > > build conditions sounds like a good enough reason to accept functions > > that contain dots in BTF. > > I don't see much harm in allowing dots. There are also all those .isra > and other modifications to functions that we currently don't have in > BTF, but with the discussions about recording function addrs we might > eventually have those as well. So: > > Acked-by: Andrii Nakryiko <andrii@kernel.org> Thanks Andrii! :) > > Cc: stable@vger.kernel.org > > Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") So do you think these trailers should be kept ? I suppose we can either see this as a "new feature" to accommodate .isra that should go through bpf-next or as a bug fix that goes through bpf and gets backported to stable (without this, BTF wouldn't work on old kernels built under a new clang and with LLVM_IAS=0 and CONFIG_KASAN=y so this sounds like a legitimate bug fix to me, I just wanted to double check)
On Mon, 2023-06-19 at 15:55 +0200, Florent Revest wrote: > On Mon, Jun 19, 2023 at 1:20 PM Florent Revest <revest@chromium.org> wrote: > > > > On Thu, Jun 15, 2023 at 7:05 PM Eduard Zingerman <eddyz87@gmail.com> wrote: > > > > > > On Thu, 2023-06-15 at 17:44 +0200, Florent Revest wrote: > > > > An easy reproducer is: > > > > > > > > $ touch pwet.c > > > > > > > > $ clang -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > > > $ llvm-dwarfdump pwet.o | grep module_ctor > > > > > > > > $ clang -fno-integrated-as -g -fsanitize=kernel-address -c -o pwet.o pwet.c > > > > $ llvm-dwarfdump pwet.o | grep module_ctor > > > > DW_AT_name ("asan.module_ctor") > > > > > > Interestingly, I am unable to reproduce it using either > > > clang version 14.0.0-1ubuntu1 or clang main (bd66f4b1da30). > > > > Somehow, I didn't think of trying other clang versions! Thanks, that's > > a good point Eduard. :) > > > > I also can't reproduce it on a 14x build. > > > > However, I seem to be able to reproduce it on main: > > > > git clone https://github.com/llvm/llvm-project.git > > mkdir llvm-project/build > > cd llvm-project/build > > git checkout bd66f4b1da30 > > cmake -DLLVM_ENABLE_PROJECTS=clang -DCMAKE_BUILD_TYPE=Release -G > > "Unix Makefiles" ../llvm > > make -j $(nproc) > > > > bin/clang -fno-integrated-as -g -fsanitize=kernel-address -c -o > > ~/pwet.o ~/pwet.c > > bin/llvm-dwarfdump ~/pwet.o | grep module_ctor > > # Shows module_ctor > > > > I started a bisection, hopefully that will point to something interesting > > The bisection pointed to a LLVM patch from Nick in October 2022: > e3bb359aacdd ("[clang][Toolchains][Gnu] pass -g through to assembler") > > Based on the context I have, that commit sounds fair enough. I don't > think LLVM does anything wrong here, it seems like BPF should be the > one dealing with dots in function debug info. That explains why I could not reproduce the issue: I tried with gas 2.38. Using gas 2.40 I see the same behavior as you. If one tries to generate assembly file with '-fsanitize': $ clang -fno-integrated-as -g -fsanitize=kernel-address -S -o pwet.s pwet.c $ cat pwet.s .text .file "pwet.c" .p2align 4, 0x90 # -- Begin function asan.module_ctor .type asan.module_ctor,@function asan.module_ctor: # @asan.module_ctor .Lfunc_begin0: ... And then compile it using Gnu assembler: $ as --64 -o pwet.o pwet.s -g -gdwarf-5 The behavior differs between 2.38 and 2.40, the older version does not produce debug entry for 'asan.module_ctor', while newer does.
On 6/19/23 7:03 AM, Florent Revest wrote: > On Fri, Jun 16, 2023 at 6:57 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: >> >> On Thu, Jun 15, 2023 at 7:56 AM Florent Revest <revest@chromium.org> wrote: >>> >>> When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM >>> leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, >>> pahole creates BTF_KIND_FUNC entries for these and this makes the BTF >>> metadata validation fail because they contain a dot. >>> >>> In a dramatic turn of event, this BTF verification failure can cause >>> the netfilter_bpf initialization to fail, causing netfilter_core to >>> free the netfilter_helper hashmap and netfilter_ftp to trigger a >>> use-after-free. The risk of u-a-f in netfilter will be addressed >>> separately but the existence of "asan.module_ctor" debug info under some >>> build conditions sounds like a good enough reason to accept functions >>> that contain dots in BTF. >> >> I don't see much harm in allowing dots. There are also all those .isra >> and other modifications to functions that we currently don't have in >> BTF, but with the discussions about recording function addrs we might >> eventually have those as well. So: >> >> Acked-by: Andrii Nakryiko <andrii@kernel.org> > > Thanks Andrii! :) > >>> Cc: stable@vger.kernel.org >>> Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") > > So do you think these trailers should be kept ? I suppose we can > either see this as a "new feature" to accommodate .isra that should go > through bpf-next or as a bug fix that goes through bpf and gets > backported to stable (without this, BTF wouldn't work on old kernels > built under a new clang and with LLVM_IAS=0 and CONFIG_KASAN=y so this > sounds like a legitimate bug fix to me, I just wanted to double check) How many people really build the kernel with LLVM=1 LLVM_IAS=0 which uses clang compiler ans gcc 'as'. I think distro most likely won't do this if they intend to build the kernel with clang. Note that LLVM=1 implies to use both clang compiler and clang assembler. Using clang17 and 'LLVM=1 LLVM_IAS=0', with latest bpf-next, I actually hit some build errors like: /tmp/video-bios-59fa52.s: Assembler messages: /tmp/video-bios-59fa52.s:4: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:4: Error: file number less than one /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first unrecognized character is `"' /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first unrecognized character is `"' clang: error: assembler command failed with exit code 1 (use -v to see invocation) make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: arch/x86/realmode/rm/video-bios.o] Error 1 make[4]: *** Waiting for unfinished jobs.... /tmp/wakemain-88777c.s: Assembler messages: /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:4: Error: file number less than one /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:81: Error: junk at end of line, first unrecognized character is `"' /tmp/wakemain-88777c.s:312: Error: junk at end of line, first unrecognized character is `"' clang: error: assembler command failed with exit code 1 (use -v to see invocation) Potentially because of my local gnu assembler 2.30-120.el8 won't work with some syntax generated by clang. Mixing clang compiler and arbitrary gnu assembler are not a good idea (see the above example). It might work with close-to-latest gnu assembler. To support function name like '<fname>.isra', some llvm work will be needed, and it may take some time. So in my opinion, this patch is NOT a bug fix. It won't affect distro. Whether we should backport to the old kernel, I am not sure whether it is absolutely necessary as casual build can always remove LLVM_IAS=0 or hack the kernel source itself.
On Mon, Jun 19, 2023 at 8:17 PM Yonghong Song <yhs@meta.com> wrote: > > > > On 6/19/23 7:03 AM, Florent Revest wrote: > > On Fri, Jun 16, 2023 at 6:57 PM Andrii Nakryiko > > <andrii.nakryiko@gmail.com> wrote: > >> > >> On Thu, Jun 15, 2023 at 7:56 AM Florent Revest <revest@chromium.org> wrote: > >>> > >>> When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM > >>> leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, > >>> pahole creates BTF_KIND_FUNC entries for these and this makes the BTF > >>> metadata validation fail because they contain a dot. > >>> > >>> In a dramatic turn of event, this BTF verification failure can cause > >>> the netfilter_bpf initialization to fail, causing netfilter_core to > >>> free the netfilter_helper hashmap and netfilter_ftp to trigger a > >>> use-after-free. The risk of u-a-f in netfilter will be addressed > >>> separately but the existence of "asan.module_ctor" debug info under some > >>> build conditions sounds like a good enough reason to accept functions > >>> that contain dots in BTF. > >> > >> I don't see much harm in allowing dots. There are also all those .isra > >> and other modifications to functions that we currently don't have in > >> BTF, but with the discussions about recording function addrs we might > >> eventually have those as well. So: > >> > >> Acked-by: Andrii Nakryiko <andrii@kernel.org> > > > > Thanks Andrii! :) > > > >>> Cc: stable@vger.kernel.org > >>> Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") > > > > So do you think these trailers should be kept ? I suppose we can > > either see this as a "new feature" to accommodate .isra that should go > > through bpf-next or as a bug fix that goes through bpf and gets > > backported to stable (without this, BTF wouldn't work on old kernels > > built under a new clang and with LLVM_IAS=0 and CONFIG_KASAN=y so this > > sounds like a legitimate bug fix to me, I just wanted to double check) > > How many people really build the kernel with > LLVM=1 LLVM_IAS=0 > which uses clang compiler ans gcc 'as'. > I think distro most likely won't do this if they intend to > build the kernel with clang. I tend to agree with you > Note that > LLVM=1 > implies to use both clang compiler and clang assembler. However, this is only true since: f12b034afeb3 ("scripts/Makefile.clang: default to LLVM_IAS=1") 5.10 stable for example does not have that commit and LLVM_IAS=0 is still the default there. (actually that's how I stumbled upon this: by building a 5.10 LTS and then finding a way to reproduce it upstream by disabling LLVM_IAS) > Using clang17 and 'LLVM=1 LLVM_IAS=0', with latest bpf-next, > I actually hit some build errors like: > > /tmp/video-bios-59fa52.s: Assembler messages: > /tmp/video-bios-59fa52.s:4: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:4: Error: file number less than one > /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first > unrecognized character is `"' > clang: error: assembler command failed with exit code 1 (use -v to see > invocation) > make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: > arch/x86/realmode/rm/video-bios.o] Error 1 > make[4]: *** Waiting for unfinished jobs.... > /tmp/wakemain-88777c.s: Assembler messages: > /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:4: Error: file number less than one > /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:81: Error: junk at end of line, first > unrecognized character is `"' > /tmp/wakemain-88777c.s:312: Error: junk at end of line, first > unrecognized character is `"' > clang: error: assembler command failed with exit code 1 (use -v to see > invocation) > > Potentially because of my local gnu assembler 2.30-120.el8 won't work > with some syntax generated by clang. Mixing clang compiler and arbitrary > gnu assembler are not a good idea (see the above example). It might > work with close-to-latest gnu assembler. I did not hit this bug with clang 17 and bpf-next so it's probably an incompatibility with that gnu assembler indeed. > To support function name like '<fname>.isra', some llvm work will be > needed, and it may take some time. > > So in my opinion, this patch is NOT a bug fix. It won't affect distro. > Whether we should backport to the old kernel, I am not sure whether it > is absolutely necessary as casual build can always remove LLVM_IAS=0 or > hack the kernel source itself. If you think it's not worth backporting to 5.10 (where LLVM_IAS=0 is the default) then I could drop these trailers and send it to bpf-next with a different justification. Either way is fine by me.
On Mon, Jun 19, 2023 at 2:17 PM Yonghong Song <yhs@meta.com> wrote: > > How many people really build the kernel with > LLVM=1 LLVM_IAS=0 > which uses clang compiler ans gcc 'as'. > I think distro most likely won't do this if they intend to > build the kernel with clang. > > Note that > LLVM=1 > implies to use both clang compiler and clang assembler. Yes, we prefer folks to build with LLVM=1. The problem exists for users of stable kernels that predate LLVM_IAS=1 support working well (4.19 is when we had most of the assembler related issues sorted out, actually later but we backported most fixes to 4.19). > > Using clang17 and 'LLVM=1 LLVM_IAS=0', with latest bpf-next, > I actually hit some build errors like: > > /tmp/video-bios-59fa52.s: Assembler messages: > /tmp/video-bios-59fa52.s:4: Error: junk at end of line, first > unrecognized character is `"' Probably because: 1. CONFIG_DEBUG_INFO_DWARF5=y was set or CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y and you're using a version of clang which implicitly defaults to DWARFv5. 2. you're using a version of GAS that does not understand DWARFv5. 3. you did not run defconfig/menuconfig to have kconfig check for DWARFv5 support. The kconfigs should prevent you from selecting DWARFv5 if your toolchain combination doesn't support it; if you run kconfig. > /tmp/video-bios-59fa52.s:4: Error: file number less than one > /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first > unrecognized character is `"' > /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first > unrecognized character is `"' > clang: error: assembler command failed with exit code 1 (use -v to see > invocation) > make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: > arch/x86/realmode/rm/video-bios.o] Error 1 > make[4]: *** Waiting for unfinished jobs.... > /tmp/wakemain-88777c.s: Assembler messages: > /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:4: Error: file number less than one > /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized > character is `"' > /tmp/wakemain-88777c.s:81: Error: junk at end of line, first > unrecognized character is `"' > /tmp/wakemain-88777c.s:312: Error: junk at end of line, first > unrecognized character is `"' > clang: error: assembler command failed with exit code 1 (use -v to see > invocation) > > Potentially because of my local gnu assembler 2.30-120.el8 won't work It's recorded in lib/Kconfig.debug that 2.35.2 is required for DWARFv5 support if you're using GAS. My machine has 2.40. > with some syntax generated by clang. Mixing clang compiler and arbitrary > gnu assembler are not a good idea (see the above example). It might I agree, but for older branches of stable which are still supported, we didn't quite have clang assembler support usable. We still need to support those branches of stable. > work with close-to-latest gnu assembler. > > To support function name like '<fname>.isra', some llvm work will be > needed, and it may take some time. > > So in my opinion, this patch is NOT a bug fix. It won't affect distro. > Whether we should backport to the old kernel, I am not sure whether it > is absolutely necessary as casual build can always remove LLVM_IAS=0 or > hack the kernel source itself.
On 6/20/23 7:38 AM, Nick Desaulniers wrote: > On Mon, Jun 19, 2023 at 2:17 PM Yonghong Song <yhs@meta.com> wrote: >> >> How many people really build the kernel with >> LLVM=1 LLVM_IAS=0 >> which uses clang compiler ans gcc 'as'. >> I think distro most likely won't do this if they intend to >> build the kernel with clang. >> >> Note that >> LLVM=1 >> implies to use both clang compiler and clang assembler. > > Yes, we prefer folks to build with LLVM=1. The problem exists for > users of stable kernels that predate LLVM_IAS=1 support working well > (4.19 is when we had most of the assembler related issues sorted out, > actually later but we backported most fixes to 4.19). > >> >> Using clang17 and 'LLVM=1 LLVM_IAS=0', with latest bpf-next, >> I actually hit some build errors like: >> >> /tmp/video-bios-59fa52.s: Assembler messages: >> /tmp/video-bios-59fa52.s:4: Error: junk at end of line, first >> unrecognized character is `"' > > Probably because: > 1. CONFIG_DEBUG_INFO_DWARF5=y was set or > CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y and you're using a version > of clang which implicitly defaults to DWARFv5. > 2. you're using a version of GAS that does not understand DWARFv5. > 3. you did not run defconfig/menuconfig to have kconfig check for > DWARFv5 support. Yes, I am using llvm17 which I think the default dwarf is v5, and indeed I have CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y My gas is 2.30, probably too old to support dwarf 5. Yes, I didn't run defconfig/menuconfig. > > The kconfigs should prevent you from selecting DWARFv5 if your > toolchain combination doesn't support it; if you run kconfig. > >> /tmp/video-bios-59fa52.s:4: Error: file number less than one >> /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first >> unrecognized character is `"' >> clang: error: assembler command failed with exit code 1 (use -v to see >> invocation) >> make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: >> arch/x86/realmode/rm/video-bios.o] Error 1 >> make[4]: *** Waiting for unfinished jobs.... >> /tmp/wakemain-88777c.s: Assembler messages: >> /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized >> character is `"' >> /tmp/wakemain-88777c.s:4: Error: file number less than one >> /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized >> character is `"' >> /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized >> character is `"' >> /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized >> character is `"' >> /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized >> character is `"' >> /tmp/wakemain-88777c.s:81: Error: junk at end of line, first >> unrecognized character is `"' >> /tmp/wakemain-88777c.s:312: Error: junk at end of line, first >> unrecognized character is `"' >> clang: error: assembler command failed with exit code 1 (use -v to see >> invocation) >> >> Potentially because of my local gnu assembler 2.30-120.el8 won't work > > It's recorded in lib/Kconfig.debug that 2.35.2 is required for DWARFv5 > support if you're using GAS. My machine has 2.40. > >> with some syntax generated by clang. Mixing clang compiler and arbitrary >> gnu assembler are not a good idea (see the above example). It might > > I agree, but for older branches of stable which are still supported, > we didn't quite have clang assembler support usable. We still need to > support those branches of stable. Thanks Florent pointing out 5.10 stable kernels which have this issue. I am okay with backporting to old stable kernels if that is needed. But the patch going to bpf-next should not have a bug-fix tag and the patch commit message can be tweaked for backport to 5.10 though. > >> work with close-to-latest gnu assembler. >> >> To support function name like '<fname>.isra', some llvm work will be >> needed, and it may take some time. >> >> So in my opinion, this patch is NOT a bug fix. It won't affect distro. >> Whether we should backport to the old kernel, I am not sure whether it >> is absolutely necessary as casual build can always remove LLVM_IAS=0 or >> hack the kernel source itself. > > >
On Tue, Jun 20, 2023 at 10:53 AM Yonghong Song <yhs@meta.com> wrote: > > > > On 6/20/23 7:38 AM, Nick Desaulniers wrote: > > 3. you did not run defconfig/menuconfig to have kconfig check for > > DWARFv5 support. > > Yes, I didn't run defconfig/menuconfig. That doesn't mean the odd combo of clang+gas doesn't work. Just like how using scripts/config is a hazard since it also doesn't run kconfig and allows you to set incompatible configurations. Garbage in; garbage out. > > > > > The kconfigs should prevent you from selecting DWARFv5 if your > > toolchain combination doesn't support it; if you run kconfig. > > > >> /tmp/video-bios-59fa52.s:4: Error: file number less than one > >> /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first > >> unrecognized character is `"' > >> clang: error: assembler command failed with exit code 1 (use -v to see > >> invocation) > >> make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: > >> arch/x86/realmode/rm/video-bios.o] Error 1 > >> make[4]: *** Waiting for unfinished jobs.... > >> /tmp/wakemain-88777c.s: Assembler messages: > >> /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized > >> character is `"' > >> /tmp/wakemain-88777c.s:4: Error: file number less than one > >> /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized > >> character is `"' > >> /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized > >> character is `"' > >> /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized > >> character is `"' > >> /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized > >> character is `"' > >> /tmp/wakemain-88777c.s:81: Error: junk at end of line, first > >> unrecognized character is `"' > >> /tmp/wakemain-88777c.s:312: Error: junk at end of line, first > >> unrecognized character is `"' > >> clang: error: assembler command failed with exit code 1 (use -v to see > >> invocation) > >> > >> Potentially because of my local gnu assembler 2.30-120.el8 won't work > > > > It's recorded in lib/Kconfig.debug that 2.35.2 is required for DWARFv5 > > support if you're using GAS. My machine has 2.40. > > > >> with some syntax generated by clang. Mixing clang compiler and arbitrary > >> gnu assembler are not a good idea (see the above example). It might > > > > I agree, but for older branches of stable which are still supported, > > we didn't quite have clang assembler support usable. We still need to > > support those branches of stable. > > Thanks Florent pointing out 5.10 stable kernels which have this issue. No, all kernels have this issue, when using `LLVM=1 LLVM_IAS=0`. It's more likely that someone is using that combination for branches of stable that predate 4.19 (such as 4.14) but we do still try to support that combination somewhat, even if we recommend just using `LLVM=1`. Interop between toolchains is still important, even if "why would you do that?" > I am okay with backporting to old stable kernels if that is needed. > But the patch going to bpf-next should not have a bug-fix tag and > the patch commit message can be tweaked for backport to 5.10 though. > > > > >> work with close-to-latest gnu assembler. > >> > >> To support function name like '<fname>.isra', some llvm work will be > >> needed, and it may take some time. > >> > >> So in my opinion, this patch is NOT a bug fix. It won't affect distro. > >> Whether we should backport to the old kernel, I am not sure whether it > >> is absolutely necessary as casual build can always remove LLVM_IAS=0 or > >> hack the kernel source itself. > > > > > >
On 6/20/23 8:07 AM, Nick Desaulniers wrote: > On Tue, Jun 20, 2023 at 10:53 AM Yonghong Song <yhs@meta.com> wrote: >> >> >> >> On 6/20/23 7:38 AM, Nick Desaulniers wrote: >>> 3. you did not run defconfig/menuconfig to have kconfig check for >>> DWARFv5 support. >> >> Yes, I didn't run defconfig/menuconfig. > > That doesn't mean the odd combo of clang+gas doesn't work. > > Just like how using scripts/config is a hazard since it also doesn't > run kconfig and allows you to set incompatible configurations. Garbage > in; garbage out. > >> >>> >>> The kconfigs should prevent you from selecting DWARFv5 if your >>> toolchain combination doesn't support it; if you run kconfig. >>> >>>> /tmp/video-bios-59fa52.s:4: Error: file number less than one >>>> /tmp/video-bios-59fa52.s:5: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:6: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:7: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:8: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:9: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:10: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/video-bios-59fa52.s:68: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> clang: error: assembler command failed with exit code 1 (use -v to see >>>> invocation) >>>> make[4]: *** [/home/yhs/work/bpf-next/scripts/Makefile.build:252: >>>> arch/x86/realmode/rm/video-bios.o] Error 1 >>>> make[4]: *** Waiting for unfinished jobs.... >>>> /tmp/wakemain-88777c.s: Assembler messages: >>>> /tmp/wakemain-88777c.s:4: Error: junk at end of line, first unrecognized >>>> character is `"' >>>> /tmp/wakemain-88777c.s:4: Error: file number less than one >>>> /tmp/wakemain-88777c.s:5: Error: junk at end of line, first unrecognized >>>> character is `"' >>>> /tmp/wakemain-88777c.s:6: Error: junk at end of line, first unrecognized >>>> character is `"' >>>> /tmp/wakemain-88777c.s:7: Error: junk at end of line, first unrecognized >>>> character is `"' >>>> /tmp/wakemain-88777c.s:8: Error: junk at end of line, first unrecognized >>>> character is `"' >>>> /tmp/wakemain-88777c.s:81: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> /tmp/wakemain-88777c.s:312: Error: junk at end of line, first >>>> unrecognized character is `"' >>>> clang: error: assembler command failed with exit code 1 (use -v to see >>>> invocation) >>>> >>>> Potentially because of my local gnu assembler 2.30-120.el8 won't work >>> >>> It's recorded in lib/Kconfig.debug that 2.35.2 is required for DWARFv5 >>> support if you're using GAS. My machine has 2.40. >>> >>>> with some syntax generated by clang. Mixing clang compiler and arbitrary >>>> gnu assembler are not a good idea (see the above example). It might >>> >>> I agree, but for older branches of stable which are still supported, >>> we didn't quite have clang assembler support usable. We still need to >>> support those branches of stable. >> >> Thanks Florent pointing out 5.10 stable kernels which have this issue. > > No, all kernels have this issue, when using `LLVM=1 LLVM_IAS=0`. It's > more likely that someone is using that combination for branches of > stable that predate 4.19 (such as 4.14) but we do still try to support > that combination somewhat, even if we recommend just using `LLVM=1`. > Interop between toolchains is still important, even if "why would you > do that?" Okay, yes, although 'LLVM=1' is recommended way to compiler clang based kernel, users can certainly do 'LLVM=1 LLVM_IAS=0' as well although not recommended. Then it is okay to put a bug fix in the commit message. Just need to clarify that - > 5.10 kernel, LLVM=1 (LLVM_IAS=0 is not the default) is recommended but user can still have LLVM=1 LLVM_IAS=0 to trigger the issue - <= 5.10 kernel, LLVM=1 (LLVM_IAS=0 is the default) is recommended in which case gnu as will be used. > >> I am okay with backporting to old stable kernels if that is needed. >> But the patch going to bpf-next should not have a bug-fix tag and >> the patch commit message can be tweaked for backport to 5.10 though. >> >>> >>>> work with close-to-latest gnu assembler. >>>> >>>> To support function name like '<fname>.isra', some llvm work will be >>>> needed, and it may take some time. >>>> >>>> So in my opinion, this patch is NOT a bug fix. It won't affect distro. >>>> Whether we should backport to the old kernel, I am not sure whether it >>>> is absolutely necessary as casual build can always remove LLVM_IAS=0 or >>>> hack the kernel source itself. >>> >>> >>> > > >
On 6/21/23 5:28 AM, Yonghong Song wrote: > On 6/20/23 8:07 AM, Nick Desaulniers wrote: [...] >> No, all kernels have this issue, when using `LLVM=1 LLVM_IAS=0`. It's >> more likely that someone is using that combination for branches of >> stable that predate 4.19 (such as 4.14) but we do still try to support >> that combination somewhat, even if we recommend just using `LLVM=1`. >> Interop between toolchains is still important, even if "why would you >> do that?" > > Okay, yes, although 'LLVM=1' is recommended way to compiler clang > based kernel, users can certainly do 'LLVM=1 LLVM_IAS=0' as well > although not recommended. Then it is okay to put a bug fix in > the commit message. Just need to clarify that > - > 5.10 kernel, LLVM=1 (LLVM_IAS=0 is not the default) > is recommended but user can still have LLVM=1 LLVM_IAS=0 > to trigger the issue > - <= 5.10 kernel, LLVM=1 (LLVM_IAS=0 is the default) is > recommended in which case gnu as will be used. Given this was already applied to bpf few days ago, I've just updated the commit message to reflect the above. Agree that this is valuable info to retain for the log. Thanks everyone, Daniel
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 6b682b8e4b50..72b32b7cd9cd 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -744,13 +744,12 @@ static bool btf_name_offset_valid(const struct btf *btf, u32 offset) return offset < btf->hdr.str_len; } -static bool __btf_name_char_ok(char c, bool first, bool dot_ok) +static bool __btf_name_char_ok(char c, bool first) { if ((first ? !isalpha(c) : !isalnum(c)) && c != '_' && - ((c == '.' && !dot_ok) || - c != '.')) + c != '.') return false; return true; } @@ -767,20 +766,20 @@ static const char *btf_str_by_offset(const struct btf *btf, u32 offset) return NULL; } -static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) +static bool __btf_name_valid(const struct btf *btf, u32 offset) { /* offset must be valid */ const char *src = btf_str_by_offset(btf, offset); const char *src_limit; - if (!__btf_name_char_ok(*src, true, dot_ok)) + if (!__btf_name_char_ok(*src, true)) return false; /* set a limit on identifier length */ src_limit = src + KSYM_NAME_LEN; src++; while (*src && src < src_limit) { - if (!__btf_name_char_ok(*src, false, dot_ok)) + if (!__btf_name_char_ok(*src, false)) return false; src++; } @@ -788,17 +787,14 @@ static bool __btf_name_valid(const struct btf *btf, u32 offset, bool dot_ok) return !*src; } -/* Only C-style identifier is permitted. This can be relaxed if - * necessary. - */ static bool btf_name_valid_identifier(const struct btf *btf, u32 offset) { - return __btf_name_valid(btf, offset, false); + return __btf_name_valid(btf, offset); } static bool btf_name_valid_section(const struct btf *btf, u32 offset) { - return __btf_name_valid(btf, offset, true); + return __btf_name_valid(btf, offset); } static const char *__btf_name_by_offset(const struct btf *btf, u32 offset) @@ -4422,7 +4418,7 @@ static s32 btf_var_check_meta(struct btf_verifier_env *env, } if (!t->name_off || - !__btf_name_valid(env->btf, t->name_off, true)) { + !__btf_name_valid(env->btf, t->name_off)) { btf_verifier_log_type(env, t, "Invalid name"); return -EINVAL; }
When building a kernel with LLVM=1, LLVM_IAS=0 and CONFIG_KASAN=y, LLVM leaves DWARF tags for the "asan.module_ctor" & co symbols. In turn, pahole creates BTF_KIND_FUNC entries for these and this makes the BTF metadata validation fail because they contain a dot. In a dramatic turn of event, this BTF verification failure can cause the netfilter_bpf initialization to fail, causing netfilter_core to free the netfilter_helper hashmap and netfilter_ftp to trigger a use-after-free. The risk of u-a-f in netfilter will be addressed separately but the existence of "asan.module_ctor" debug info under some build conditions sounds like a good enough reason to accept functions that contain dots in BTF. Cc: stable@vger.kernel.org Fixes: 1dc92851849c ("bpf: kernel side support for BTF Var and DataSec") Signed-off-by: Florent Revest <revest@chromium.org> --- kernel/bpf/btf.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-)