Message ID | 20220225123953.3251327-1-alexandre.ghiti@canonical.com (mailing list archive) |
---|---|
Series | Fixes KASAN and other along the way |

On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>
> As reported by Aleksandr, syzbot riscv is broken since commit
> 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> breaks KASAN_INLINE which is not fixed in this series, that will come later
> when found.
>
> Nevertheless, this series fixes small things that made the syzbot
> configuration + KASAN_OUTLINE fail to boot.
>
> Note that even though the config at [1] boots fine with this series, I
> was not able to boot the small config at [2] which fails because
> kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> kasan person could provide some hint about what happens below in
> do_ctors -> __asan_register_globals):

asan_register_globals is responsible for poisoning redzones around
globals. As hinted by 'do_ctors', it calls constructors, and in this
case a compiler-generated constructor that calls
__asan_register_globals with metadata generated by the compiler. That
metadata contains information about global variables. Note, these
constructors are called on initial boot, but also every time a kernel
module (that has globals) is loaded.

It may also be a toolchain issue, but it's hard to say. If you're
using GCC to test, try Clang (11 or later), and vice-versa.
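
The mechanism described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel's implementation: `struct global_meta`, `arena`, `shadow`, `GRANULE`, `module_ctor` and the `0xF9` marker value are all hypothetical stand-ins chosen for the example. It shows a constructor handing per-global metadata to a registration routine, which marks the global itself as accessible and poisons the redzone behind it. The `do_ctors` name mirrors the function mentioned in the report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GRANULE        8     /* bytes of memory described by one shadow byte */
#define GLOBAL_REDZONE 0xF9  /* shadow marker for a global's redzone (assumed) */

/* Simplified, hypothetical stand-in for the compiler-emitted metadata. */
struct global_meta {
    const void *beg;           /* address of the global */
    size_t size;               /* size of the global itself */
    size_t size_with_redzone;  /* size including the trailing redzone */
    const char *name;
};

/* Toy memory and its shadow; a real kernel maps actual addresses instead. */
static _Alignas(8) uint8_t arena[64];
static uint8_t shadow[sizeof(arena) / GRANULE];

static size_t shadow_index(const void *addr)
{
    const uint8_t *p = addr;
    /* A garbage 'beg' in the metadata would be caught here in this model. */
    assert(p >= arena && p < arena + sizeof(arena));
    return (size_t)(p - arena) / GRANULE;
}

static void poison(const void *addr, size_t size, uint8_t value)
{
    memset(&shadow[shadow_index(addr)], value, size / GRANULE);
}

static void unpoison(const void *addr, size_t size)
{
    size_t idx = shadow_index(addr);

    memset(&shadow[idx], 0, size / GRANULE);
    if (size % GRANULE)  /* last granule is only partially accessible */
        shadow[idx + size / GRANULE] = (uint8_t)(size % GRANULE);
}

/* Unpoison each global and poison the redzone that follows it. */
static void register_globals(const struct global_meta *g, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t aligned = (g[i].size + GRANULE - 1) / GRANULE * GRANULE;

        unpoison(g[i].beg, g[i].size);
        poison((const uint8_t *)g[i].beg + aligned,
               g[i].size_with_redzone - aligned, GLOBAL_REDZONE);
    }
}

/* Models what a compiler-generated constructor does for one object file. */
static void module_ctor(void)
{
    static const struct global_meta meta[] = {
        { &arena[0], 13, 32, "example_global" },
    };

    register_globals(meta, sizeof(meta) / sizeof(meta[0]));
}

typedef void (*ctor_fn)(void);
static const ctor_fn ctors[] = { module_ctor };

/* Runs every registered constructor, as at boot or module load. */
void do_ctors(void)
{
    for (size_t i = 0; i < sizeof(ctors) / sizeof(ctors[0]); i++)
        ctors[i]();
}
```

After `do_ctors()` runs in this model, the first shadow byte is 0 (fully accessible), the second is 5 (only 5 bytes of that granule belong to the 13-byte global), and the next two hold the redzone marker. If the metadata contained a bogus `beg`, the poisoning step would be handed a bogus address, which is one way an address like the one in the report could reach `kasan_poison`; whether that is what happens here is not established in this thread.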

On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>
> On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
> <alexandre.ghiti@canonical.com> wrote:
> >
> > As reported by Aleksandr, syzbot riscv is broken since commit
> > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
> > breaks KASAN_INLINE which is not fixed in this series, that will come later
> > when found.
> >
> > Nevertheless, this series fixes small things that made the syzbot
> > configuration + KASAN_OUTLINE fail to boot.
> >
> > Note that even though the config at [1] boots fine with this series, I
> > was not able to boot the small config at [2] which fails because
> > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
> > kasan person could provide some hint about what happens below in
> > do_ctors -> __asan_register_globals):
>
> asan_register_globals is responsible for poisoning redzones around
> globals. As hinted by 'do_ctors', it calls constructors, and in this
> case a compiler-generated constructor that calls
> __asan_register_globals with metadata generated by the compiler. That
> metadata contains information about global variables. Note, these
> constructors are called on initial boot, but also every time a kernel
> module (that has globals) is loaded.
>
> It may also be a toolchain issue, but it's hard to say. If you're
> using GCC to test, try Clang (11 or later), and vice-versa.

I tried 3 different gcc toolchains already, but that did not fix the
issue. The only thing that worked was setting asan-globals=0 in
scripts/Makefile.kasan, but ok, that's not a fix.
I tried to bisect this issue but our kasan implementation has been
broken quite a few times, so it failed.

I keep digging!

Thanks for the tips,

Alex

On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>>
>> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>> >
>> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>> > <alexandre.ghiti@canonical.com> wrote:
>> > >
>> > > As reported by Aleksandr, syzbot riscv is broken since commit
>> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
>> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
>> > > when found.
>> > >
>> > > Nevertheless, this series fixes small things that made the syzbot
>> > > configuration + KASAN_OUTLINE fail to boot.
>> > >
>> > > Note that even though the config at [1] boots fine with this series, I
>> > > was not able to boot the small config at [2] which fails because
>> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
>> > > kasan person could provide some hint about what happens below in
>> > > do_ctors -> __asan_register_globals):
>> >
>> > asan_register_globals is responsible for poisoning redzones around
>> > globals. As hinted by 'do_ctors', it calls constructors, and in this
>> > case a compiler-generated constructor that calls
>> > __asan_register_globals with metadata generated by the compiler. That
>> > metadata contains information about global variables. Note, these
>> > constructors are called on initial boot, but also every time a kernel
>> > module (that has globals) is loaded.
>> >
>> > It may also be a toolchain issue, but it's hard to say. If you're
>> > using GCC to test, try Clang (11 or later), and vice-versa.
>>
>> I tried 3 different gcc toolchains already, but that did not fix the
>> issue. The only thing that worked was setting asan-globals=0 in
>> scripts/Makefile.kasan, but ok, that's not a fix.
>> I tried to bisect this issue but our kasan implementation has been
>> broken quite a few times, so it failed.
>>
>> I keep digging!
>>
>
> The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.

Do you mean you reach userspace? Because my image boots too, and fails
at some point:

[    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
[    0.015847] Console: colour dummy device 80x25
[    0.016899] printk: console [tty0] enabled
[    0.020326] printk: bootconsole [ns16550a0] disabled

It traps here.

> FWIW here is how I run them:
>
> qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
> -device virtio-rng-pci -machine virt -device \
> virtio-net-pci,netdev=net0 -netdev \
> user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
> virtio-blk-device,drive=hd0 -drive \
> file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
> -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
>
>>
>> Thanks for the tips,
>>
>> Alex
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Managing directors: Paul Manicle, Liana Sebastian
> Registration court and number: Hamburg, HRB 86891
> Registered office: Hamburg
>
> This e-mail is confidential. If you received this communication by mistake, please don't forward it to anyone else, please erase all copies and attachments, and please let me know that it has gone to the wrong person.

On Fri, Feb 25, 2022 at 3:31 PM Alexander Potapenko <glider@google.com> wrote:
>
> On Fri, Feb 25, 2022 at 3:15 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>>
>> On Fri, Feb 25, 2022 at 3:10 PM Alexander Potapenko <glider@google.com> wrote:
>> >
>> > On Fri, Feb 25, 2022 at 3:04 PM Alexandre Ghiti <alexandre.ghiti@canonical.com> wrote:
>> >>
>> >> On Fri, Feb 25, 2022 at 2:06 PM Marco Elver <elver@google.com> wrote:
>> >> >
>> >> > On Fri, 25 Feb 2022 at 13:40, Alexandre Ghiti
>> >> > <alexandre.ghiti@canonical.com> wrote:
>> >> > >
>> >> > > As reported by Aleksandr, syzbot riscv is broken since commit
>> >> > > 54c5639d8f50 ("riscv: Fix asan-stack clang build"). This commit actually
>> >> > > breaks KASAN_INLINE which is not fixed in this series, that will come later
>> >> > > when found.
>> >> > >
>> >> > > Nevertheless, this series fixes small things that made the syzbot
>> >> > > configuration + KASAN_OUTLINE fail to boot.
>> >> > >
>> >> > > Note that even though the config at [1] boots fine with this series, I
>> >> > > was not able to boot the small config at [2] which fails because
>> >> > > kasan_poison receives a really weird address 0x4075706301000000 (maybe a
>> >> > > kasan person could provide some hint about what happens below in
>> >> > > do_ctors -> __asan_register_globals):
>> >> >
>> >> > asan_register_globals is responsible for poisoning redzones around
>> >> > globals. As hinted by 'do_ctors', it calls constructors, and in this
>> >> > case a compiler-generated constructor that calls
>> >> > __asan_register_globals with metadata generated by the compiler. That
>> >> > metadata contains information about global variables. Note, these
>> >> > constructors are called on initial boot, but also every time a kernel
>> >> > module (that has globals) is loaded.
>> >> >
>> >> > It may also be a toolchain issue, but it's hard to say. If you're
>> >> > using GCC to test, try Clang (11 or later), and vice-versa.
>> >>
>> >> I tried 3 different gcc toolchains already, but that did not fix the
>> >> issue. The only thing that worked was setting asan-globals=0 in
>> >> scripts/Makefile.kasan, but ok, that's not a fix.
>> >> I tried to bisect this issue but our kasan implementation has been
>> >> broken quite a few times, so it failed.
>> >>
>> >> I keep digging!
>> >>
>> >
>> > The problem does not reproduce for me with GCC 11.2.0: kernels built with both [1] and [2] are bootable.
>>
>> Do you mean you reach userspace? Because my image boots too, and fails
>> at some point:
>>
>> [    0.000150] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
>> [    0.015847] Console: colour dummy device 80x25
>> [    0.016899] printk: console [tty0] enabled
>> [    0.020326] printk: bootconsole [ns16550a0] disabled
>>
>
> In my case, QEMU successfully boots to the login prompt.
> I am running QEMU 6.2.0 (Debian 1:6.2+dfsg-2) and an image Aleksandr shared with me (guess it was built according to this instruction: https://github.com/google/syzkaller/blob/master/docs/linux/setup_linux-host_qemu-vm_riscv64-kernel.md)
>

Nice thanks guys! I always use the latest opensbi and not the one that
is embedded in qemu, which is the only difference between your command
line (which works) and mine (which does not work). So the issue is
probably there, I really need to investigate that now.

That means I only need to fix KASAN_INLINE and we're good.

I imagine Palmer can add your Tested-by on the series then?

Thanks again!

Alex

>>
>> It traps here.
>>
>> > FWIW here is how I run them:
>> >
>> > qemu-system-riscv64 -m 2048 -smp 1 -nographic -no-reboot \
>> > -device virtio-rng-pci -machine virt -device \
>> > virtio-net-pci,netdev=net0 -netdev \
>> > user,id=net0,restrict=on,hostfwd=tcp:127.0.0.1:12529-:22 -device \
>> > virtio-blk-device,drive=hd0 -drive \
>> > file=${IMAGE},if=none,format=raw,id=hd0 -snapshot \
>> > -kernel ${KERNEL_SRC_DIR}/arch/riscv/boot/Image -append "root=/dev/vda console=ttyS0 earlyprintk=serial"
>> >
>> >>
>> >> Thanks for the tips,
>> >>
>> >> Alex