Message ID | 20191224135404.389039-1-Jason@zx2c4.com (mailing list archive) |
---|---|
State | Rejected |
Delegated to: | Paul Burton |
Headers | show |
Series | mips: vdso: conditionalize 32-bit time functions on COMPAT_32BIT_TIME | expand |
On Tue, Dec 24, 2019 at 2:54 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > When the VDSO falls back to 32-bit time functions on kernels with > COMPAT_32BIT_TIME=n, userspace becomes corrupted and appears to crash > shortly after, with something like: > > [ 0.359617] do_page_fault(): sending SIGSEGV to init for invalid read access from 000000007ff790d0 > [ 0.359843] epc = 0000000077e45df4 in libc.so[77da6000+de000] > [ 0.360319] ra = 0000000010000c50 in init[10000000+2000] > [ 0.364456] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b > > This can be reproduced with simply calling `clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts)`, > since `CLOCK_PROCESS_CPUTIME_ID` is not exported to the VDSO, invoking > the syscall callback branch. This crash was observed with musl 1.20's > clock_gettime implementation: Thanks for the bug report! I'm not completely sure why this fails in this particular way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot (the version 1.20 you list does not exist), so the combination you are testing is supposed to just return -ENOSYS here to match the behavior of hte system call. > --- a/arch/mips/include/asm/vdso/gettimeofday.h > +++ b/arch/mips/include/asm/vdso/gettimeofday.h > @@ -107,7 +107,7 @@ static __always_inline int clock_getres_fallback( > return error ? -ret : ret; > } > > -#if _MIPS_SIM != _MIPS_SIM_ABI64 > +#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_COMPAT_32BIT_TIME) > > #define VDSO_HAS_32BIT_FALLBACK 1 > I don't think this is the correct fix, it may actually make it worse by changing the vdso implementation for clock_gettime32() to fall back to clock_gettime64(), which would appear to work correctly before y2038 but fail afterwards. How about this one: diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S index da4627430aba..0bdc6a026be8 100644 --- a/arch/mips/vdso/vdso.lds.S +++ b/arch/mips/vdso/vdso.lds.S @@ -93,9 +93,11 @@ VERSION LINUX_2.6 { #ifndef DISABLE_MIPS_VDSO global: +#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME) __vdso_clock_gettime; __vdso_gettimeofday; __vdso_clock_getres; +#endif #if _MIPS_SIM != _MIPS_SIM_ABI64 __vdso_clock_gettime64; #endif That should ensure that no user space can call the old vdso functions on a kernel that intentionally breaks the actual syscalls. Arnd
On Mon, Dec 30, 2019 at 12:58 PM Arnd Bergmann <arnd@arndb.de> wrote: > Thanks for the bug report! I'm not completely sure why this fails in > this particular > way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot Yes, that's the one, sorry. > diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S > index da4627430aba..0bdc6a026be8 100644 > --- a/arch/mips/vdso/vdso.lds.S > +++ b/arch/mips/vdso/vdso.lds.S > @@ -93,9 +93,11 @@ VERSION > LINUX_2.6 { > #ifndef DISABLE_MIPS_VDSO > global: > +#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME) > __vdso_clock_gettime; > __vdso_gettimeofday; > __vdso_clock_getres; > +#endif > #if _MIPS_SIM != _MIPS_SIM_ABI64 > __vdso_clock_gettime64; > #endif > > That should ensure that no user space can call the old vdso > functions on a kernel that intentionally breaks the actual > syscalls. I can confirm that patch fixes things. Thanks. Jason
On Mon, Dec 30, 2019 at 1:27 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Mon, Dec 30, 2019 at 12:58 PM Arnd Bergmann <arnd@arndb.de> wrote: > > Thanks for the bug report! I'm not completely sure why this fails in > > this particular > > way though. I assume you are using musl-1.1.20, not a musl-1.2.0 snapshot > > Yes, that's the one, sorry. > > > diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S > > index da4627430aba..0bdc6a026be8 100644 > > --- a/arch/mips/vdso/vdso.lds.S > > +++ b/arch/mips/vdso/vdso.lds.S > > @@ -93,9 +93,11 @@ VERSION > > LINUX_2.6 { > > #ifndef DISABLE_MIPS_VDSO > > global: > > +#if (_MIPS_SIM == _MIPS_SIM_ABI64) || defined(CONFIG_COMPAT_32BIT_TIME) > > __vdso_clock_gettime; > > __vdso_gettimeofday; > > __vdso_clock_getres; > > +#endif > > #if _MIPS_SIM != _MIPS_SIM_ABI64 > > __vdso_clock_gettime64; > > #endif > > > > That should ensure that no user space can call the old vdso > > functions on a kernel that intentionally breaks the actual > > syscalls. > > I can confirm that patch fixes things. Thanks. Ok, that's good news, but I think we still need to answer two questions: - Why does it crash in the first place rather than returning -ENOSYS? - How does it actually work if you run an application built against an old musl version on a kernel that tries to make this not work? Do you just get a random time (uninitialized user space stack) and work with that without checking the error code? Arnd
On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote: > > - Why does it crash in the first place rather than returning -ENOSYS? There's a bit of speculation about this in the original thread that prompted this patch (you're CC'd). > > - How does it actually work if you run an application built against > an old musl version on a kernel that tries to make this not work? > Do you just get a random time (uninitialized user space stack) and > work with that without checking the error code? Actually, your patch fails here. The ts struct remains as it was before, filled with garbage. No good. My original patch in this thread, though, does result in the correct value being written to ts. Jason
On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > - Why does it crash in the first place rather than returning -ENOSYS? > > There's a bit of speculation about this in the original thread that > prompted this patch (you're CC'd). The following will provoke the crash: __attribute__((noinline)) void somefunc(void) { } int __clock_gettime(clockid_t clk, struct timespec *ts) { ((int (*)(clockid_t, struct timespec *))vdso_func)(clk, ts); somefunc(); return 88; } It seems like the VDSO is doing something to the stack.
On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > - Why does it crash in the first place rather than returning -ENOSYS? > > There's a bit of speculation about this in the original thread that > prompted this patch (you're CC'd). > > > > > - How does it actually work if you run an application built against > > an old musl version on a kernel that tries to make this not work? > > Do you just get a random time (uninitialized user space stack) and > > work with that without checking the error code? > > Actually, your patch fails here. The ts struct remains as it was > before, filled with garbage. No good. My original patch in this > thread, though, does result in the correct value being written to ts. Ok, that is the intended behavior then, clock_gettime() needs to fail with -EINVAL or -ENOSYS here (depending on the libc implementation), and of course the data is not updated. Returning success from clock_gettime() on a kernel with only time64 support and a libc with only time32 support (or vice versa) would be a bug. Arnd
On Mon, Dec 30, 2019 at 4:37 PM Arnd Bergmann <arnd@arndb.de> wrote: > > On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > > > - Why does it crash in the first place rather than returning -ENOSYS? > > > > There's a bit of speculation about this in the original thread that > > prompted this patch (you're CC'd). > > > > > > > > - How does it actually work if you run an application built against > > > an old musl version on a kernel that tries to make this not work? > > > Do you just get a random time (uninitialized user space stack) and > > > work with that without checking the error code? > > > > Actually, your patch fails here. The ts struct remains as it was > > before, filled with garbage. No good. My original patch in this > > thread, though, does result in the correct value being written to ts. > > Ok, that is the intended behavior then, clock_gettime() needs > to fail with -EINVAL or -ENOSYS here (depending on the libc > implementation), and of course the data is not updated. > > Returning success from clock_gettime() on a kernel with only > time64 support and a libc with only time32 support (or vice > versa) would be a bug. Ah, right, hence why the 32-bit compat code is behind a still-on-by-default-but-not-for-long menu option.
On Mon, Dec 30, 2019 at 4:39 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Mon, Dec 30, 2019 at 4:37 PM Arnd Bergmann <arnd@arndb.de> wrote: > > > > On Mon, Dec 30, 2019 at 3:37 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > > > > > On Mon, Dec 30, 2019 at 1:34 PM Arnd Bergmann <arnd@arndb.de> wrote: > > Returning success from clock_gettime() on a kernel with only > > time64 support and a libc with only time32 support (or vice > > versa) would be a bug. > > Ah, right, hence why the 32-bit compat code is behind a > still-on-by-default-but-not-for-long menu option. I expect this to remain on-by-default for a rather long time, as we still to run old binaries well into 2030s. Making it configurable is done for two reasons: - on embedded systems that have all user space built with time64, turning this off can make the kernel slightly smaller - turning it off lets you check that no code relies on calling the old interfaces internally, bypassing the C library, in a way that works at the moment but may break after 2038. Arnd
Makes sense w.r.t. time32 situation. I still think that in spite of that there's still something weird happening with the mips VDSO. Here's a register dump before the call: $ 0 : 0000000000000000 0000000000000001 0000000010000000 fffffffffffffffc $ 4 : 0000000000000002 000000007fff2e40 0000000000000000 0000000000000001 $ 8 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 $12 : 0000000000000000 000000000000000a ffffffff80000000 000000007fffffda $16 : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000 $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2ae8 $24 : 0000000000000005 0000000077fa1d18 $28 : 0000000010019cf0 000000007fff2e40 0000000000000000 0000000010000c30 Hi : 0000000000000000 Lo : 0000000000000000 epc : 0000000077fa1d18 0x77fa1d18 ra : 0000000010000c30 0x10000c30 And here it is immediately after: $ 0 : 0000000000000000 0000000000000001 ffffffffffffffa7 000000007fff5000 $ 4 : 0000000000000002 000000007fff2e40 0000000077ff2000 0000000000000001 $ 8 : 0000000000000006 0000000000000020 0000000000000002 0000000000000000 $12 : 0000000000000000 0000000000001852 ffffffff80156160 000000007fffffda $16 : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000 $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2b00 $24 : 0000000000000005 0000000000000000 $28 : 000000007fff5000 000000007fff2e30 0000000000000000 0000000077fa1e00 Hi : 0000000000000000 Lo : 0000000000000000 epc : 0000000077fa1e00 0x77fa1e00 ra : 0000000077fa1e00 0x77fa1e00 I wonder if a toolchain option or compiler bug or something is causing the vdso to not restore certain registers (gp? ra?).
On Mon, Dec 30, 2019 at 4:58 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > Makes sense w.r.t. time32 situation. > > I still think that in spite of that there's still something weird > happening with the mips VDSO. Agreed. > Here's a register dump before the call: > > $ 0 : 0000000000000000 0000000000000001 0000000010000000 fffffffffffffffc > $ 4 : 0000000000000002 000000007fff2e40 0000000000000000 0000000000000001 > $ 8 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 > $12 : 0000000000000000 000000000000000a ffffffff80000000 000000007fffffda > $16 : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000 > $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2ae8 > $24 : 0000000000000005 0000000077fa1d18 > $28 : 0000000010019cf0 000000007fff2e40 0000000000000000 0000000010000c30 > Hi : 0000000000000000 > Lo : 0000000000000000 > epc : 0000000077fa1d18 0x77fa1d18 > ra : 0000000010000c30 0x10000c30 > > And here it is immediately after: > > $ 0 : 0000000000000000 0000000000000001 ffffffffffffffa7 000000007fff5000 > $ 4 : 0000000000000002 000000007fff2e40 0000000077ff2000 0000000000000001 > $ 8 : 0000000000000006 0000000000000020 0000000000000002 0000000000000000 > $12 : 0000000000000000 0000000000001852 ffffffff80156160 000000007fffffda > $16 : 0000000010001ba8 0000005800000015 0000000010000000 0000000010000000 > $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff2b00 > $24 : 0000000000000005 0000000000000000 > $28 : 000000007fff5000 000000007fff2e30 0000000000000000 0000000077fa1e00 > Hi : 0000000000000000 > Lo : 0000000000000000 > epc : 0000000077fa1e00 0x77fa1e00 > ra : 0000000077fa1e00 0x77fa1e00 Is this immediately before/after the syscall instruction or the indirect function call? > I wonder if a toolchain option or compiler bug or something is causing > the vdso to not restore certain registers (gp? ra?). Here is the assembler output I see for the o32 vdso, hopefully I got all the relevant bits: # /git/arm-soc/lib/vdso/gettimeofday.c:130: if (unlikely(ret)) .set noreorder .set nomacro beqz $2,$L86 #,, lw $28,16($sp) #, .set macro .set reorder $L46: # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:118: register struct old_timespec32 *ts asm("a1") = _ts; move $5,$16 # ts, ts # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:119: register clockid_t clkid asm("a0") = _clkid; move $4,$17 # clkid, clock # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:121: register long nr asm("v0") = __NR_clock_gettime; li $2,4263 # 0x10a7 # nr, # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:124: asm volatile( #APP # 124 "/git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h" 1 syscall # 0 "" 2 # /git/arm-soc/arch/mips/vdso/vgettimeofday.c:18: } #NO_APP lw $31,60($sp) #, lw $19,56($sp) #, # /git/arm-soc/arch/mips/include/asm/vdso/gettimeofday.h:131: return error ? -ret : ret; subu $3,$0,$2 # <retval>, ret selnez $3,$3,$7 # tmp406, <retval>, error seleqz $2,$2,$7 # tmp407, ret, error or $3,$3,$2 # <retval>, tmp406, tmp407 # /git/arm-soc/arch/mips/vdso/vgettimeofday.c:18: } lw $18,52($sp) #, lw $17,48($sp) #, lw $16,44($sp) #, move $2,$3 #, <retval> .set noreorder .set nomacro jr $31 # addiu $sp,$sp,64 #,, .set macro .set reorder gp ($28) and ra ($31) sound like good guesses to me, SP ($r29) changed from 000000007fff2e40 to 000000007fff2e30, if that is not the intention, it would clearly explain why anything afterwards crashes, but that seems unlikely. r3 contains the error code -ENOSYS on mips, so that's good. $23 is supposed to be preserved across function calls and is consequently not part of the clobber list but is modified. $25 is part of the clobber list and is also modified, but there is no code to save/restore it in the assembler output. Arnd
On Mon, Dec 30, 2019 at 6:33 PM Arnd Bergmann <arnd@arndb.de> wrote > Is this immediately before/after the syscall instruction or the > indirect function call? It's immediately after/before the call to the VDSO function itself. Next I'll try to instrument the VDSO to get closer to that syscall. I produced those reg dumps by hooking the page fault handler in the kernel to print them and then disabling aslr and sticking a `*(volatile int *)0 = 0;` in the code. Pretty gnarly.
On Mon, Dec 30, 2019 at 10:09 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote: > > On Mon, Dec 30, 2019 at 6:33 PM Arnd Bergmann <arnd@arndb.de> wrote > > Is this immediately before/after the syscall instruction or the > > indirect function call? > > It's immediately after/before the call to the VDSO function itself. > Next I'll try to instrument the VDSO to get closer to that syscall. > > I produced those reg dumps by hooking the page fault handler in the > kernel to print them and then disabling aslr and sticking a > `*(volatile int *)0 = 0;` in the code. Pretty gnarly. Here's immediately before and immediately after the syscall asm that the vdso has in mips/include/asm/vdso/gettimeofday.h. sp and ra are wrong? Before: [ 0.546364] $ 0 : 0000000000000000 0000000000000001 0000000000000002 0000000000000000 [ 0.546545] $ 4 : 000000007fff4000 0000000000000000 0000000077ff0000 0000000000000406 [ 0.546762] $ 8 : 000000007fff5000 0000000000000020 0000000000000002 0000000000000000 [ 0.546912] $12 : 0000000000000000 000000000000000a ffffffff80000000 000000000000006d [ 0.547046] $16 : 000000007fff2e40 000000007fff2e40 0000000010000000 0000000010000000 [ 0.547178] $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff0000 [ 0.547548] $24 : 0000000000000005 0000000000000000 [ 0.547743] $28 : 000000007fff5000 000000007fff2df0 0000000000000000 000000007fff550c [ 0.547898] Hi : 0000000000000000 [ 0.548010] Lo : 0000000000000000 [ 0.548175] epc : 000000007fff5580 0x7fff5580 [ 0.548358] ra : 000000007fff550c 0x7fff550c [ 0.549305] Stack : 0000000000000002 000000007fff2e40 0000000000000002 0000000077f9e80c [ 0.549500] 0000000000000000 0000000000000000 ffffffffffffffff 0000000010000000 [ 0.549687] 0000000010019dd0 0000000010000c20 0000000077ff0000 0000000077fa4868 [ 0.549951] 0000000377ff19b8 0000000000000000 000000007fff2f04 0000000000000001 [ 0.550127] 0000000010000870 0000000077ff0000 0000000077fa4868 0000000077ff19b8 [ 0.550277] 0000000077ff7180 0000000077f297ac 7fff2f0c77ff7180 0000000077f29800 [ 0.550458] 0000000000000000 000000007fff2f00 0000000077ff19b8 0000000077ff1e30 [ 0.550613] 0000000010019dd0 0000000010000dec 0000000010019dd0 0000000010000db0 [ 0.550811] 0000000000000000 0000000000000000 000000017fff2fda 000000007fff2fe0 [ 0.550957] 7fff2fe700000000 000000217fff5000 0000001000000020 0000000600001000 After: [ 0.577975] $ 0 : 0000000000000000 0000000000000001 0000000000000059 000000007fff5000 [ 0.578191] $ 4 : 0000000000000002 000000007fff2e40 0000000077ff0000 0000000000000001 [ 0.578392] $ 8 : 0000000000000006 0000000000000020 0000000000000002 0000000000000000 [ 0.578611] $12 : 0000000000000000 0000000000001852 ffffffff801560e0 000000000000006d [ 0.578817] $16 : 0000000000000002 000000007fff2e40 0000000010000000 0000000010000000 [ 0.579004] $20 : 0000000010000000 0000000010000000 0000000000000000 0000000077ff0000 [ 0.579149] $24 : 0000000000000005 0000000000000000 [ 0.579375] $28 : 000000007fff5000 000000007fff2de0 0000000000000000 000000007fff551c [ 0.579640] Hi : 0000000000000000 [ 0.579799] Lo : 0000000000000000 [ 0.579974] epc : 000000007fff55a0 0x7fff55a0 [ 0.580134] ra : 000000007fff551c 0x7fff551c [ 0.581293] Stack : 0000000000000000 0000000077f9e760 0000000000000002 000000007fff2e40 [ 0.581456] 0000000077ff0000 0000000077f9e80c 0000000000000000 0000000000000000 [ 0.581619] ffffffffffffffff 0000000010000000 0000000010019dd0 0000000010000c20 [ 0.581834] 0000000077ff0000 0000000077fa4868 0000000377ff19b8 0000000000000000 [ 0.581985] 000000007fff2f04 0000000000000001 0000000010000870 0000000077ff0000 [ 0.582136] 0000000077fa4868 0000000077ff19b8 0000000077ff7180 0000000077f297ac [ 0.582288] 7fff2f0c77ff7180 0000000077f29800 0000000000000000 000000007fff2f00 [ 0.582438] 0000000077ff19b8 0000000077ff1e30 0000000010019dd0 0000000010000dec [ 0.582585] 0000000010019dd0 0000000010000db0 0000000000000000 0000000000000000 [ 0.582732] 000000017fff2fda 000000007fff2fe0 7fff2fe700000000 000000217fff5000
Here's a "one click" reproducer: https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz Untar that and hit `make -j$(nproc)`, and you'll get a freshly built and crashing kernel+userland.
Hi Jason, On Tue, Dec 31, 2019 at 05:14:41PM +0100, Jason A. Donenfeld wrote: > Here's a "one click" reproducer: > https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz > > Untar that and hit `make -j$(nproc)`, and you'll get a freshly built > and crashing kernel+userland. Thanks for the test case. It seems like the VDSO code isn't saving & restoring $gp/$28, even though it's meant to be callee-saved in both the n32 & n64 ABIs. With some digging I found that the below seems to resolve the issue. Could you check whether it works for you? I'm still not quite sure *why* this happens; perhaps GCC just decides it doesn't need to save & restore $gp/$28 when it spots that it's being "used" for __current_thread_info (even though that's never actually referenced in the VDSO)? Just moving the declaration of __current_thread_info inside the current_thread_info() function seems to do the trick too, and is probably a bit neater. Thanks, Paul --- diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h index 4993db40482c..ac33959bbb1f 100644 --- a/arch/mips/include/asm/thread_info.h +++ b/arch/mips/include/asm/thread_info.h @@ -50,7 +50,11 @@ struct thread_info { } /* How to get the thread information struct from C. */ +#ifdef __VDSO__ +register struct thread_info *__current_thread_info __asm__("$0"); +#else register struct thread_info *__current_thread_info __asm__("$28"); +#endif static inline struct thread_info *current_thread_info(void) {
Hi Jason, On Tue, Dec 31, 2019 at 08:10:56PM -0800, Paul Burton wrote: > I'm still not quite sure *why* this happens; perhaps GCC just decides it > doesn't need to save & restore $gp/$28 when it spots that it's being > "used" for __current_thread_info (even though that's never actually > referenced in the VDSO)? Ah: > After defining a global register variable, for the current compilation > unit: > > - If the register is a call-saved register, call ABI is affected: the > register will not be restored in function epilogue sequences after > the variable has been assigned. Therefore, functions cannot safely > return to callers that assume standard ABI. https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html That makes sense then. What doesn't make sense is how this ever worked. A question for another day... Thanks, Paul
On Wed, Jan 1, 2020 at 5:08 AM Paul Burton <paulburton@kernel.org> wrote: > > Hi Jason, > > On Tue, Dec 31, 2019 at 05:14:41PM +0100, Jason A. Donenfeld wrote: > > Here's a "one click" reproducer: > > https://data.zx2c4.com/mips-musl-libc-weird-crash-time32-compat.tar.xz > > > > Untar that and hit `make -j$(nproc)`, and you'll get a freshly built > > and crashing kernel+userland. > > Thanks for the test case. It seems like the VDSO code isn't saving & > restoring $gp/$28, even though it's meant to be callee-saved in both the > n32 & n64 ABIs. With some digging I found that the below seems to > resolve the issue. Could you check whether it works for you? > > I'm still not quite sure *why* this happens; perhaps GCC just decides it > doesn't need to save & restore $gp/$28 when it spots that it's being > "used" for __current_thread_info (even though that's never actually > referenced in the VDSO)? > > Just moving the declaration of __current_thread_info inside the > current_thread_info() function seems to do the trick too, and is > probably a bit neater. > > Thanks, > Paul > > --- > diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h > index 4993db40482c..ac33959bbb1f 100644 > --- a/arch/mips/include/asm/thread_info.h > +++ b/arch/mips/include/asm/thread_info.h > @@ -50,7 +50,11 @@ struct thread_info { > } > > /* How to get the thread information struct from C. */ > +#ifdef __VDSO__ > +register struct thread_info *__current_thread_info __asm__("$0"); > +#else > register struct thread_info *__current_thread_info __asm__("$28"); > +#endif > > static inline struct thread_info *current_thread_info(void) > { Holy guacamole, nice catch. That's interesting behavior indeed... I'll leave it to you to submit for 5.5?
On Wed, Jan 1, 2020 at 5:22 AM Paul Burton <paulburton@kernel.org> wrote: > That makes sense then. What doesn't make sense is how this ever > worked. A question for another day... In most other cases, calls to the vdso would be followed in the parent function immediately by a return, restoring gp then.
diff --git a/arch/mips/include/asm/vdso/gettimeofday.h b/arch/mips/include/asm/vdso/gettimeofday.h index b08825531e9f..7f1aa610e68e 100644 --- a/arch/mips/include/asm/vdso/gettimeofday.h +++ b/arch/mips/include/asm/vdso/gettimeofday.h @@ -107,7 +107,7 @@ static __always_inline int clock_getres_fallback( return error ? -ret : ret; } -#if _MIPS_SIM != _MIPS_SIM_ABI64 +#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_COMPAT_32BIT_TIME) #define VDSO_HAS_32BIT_FALLBACK 1
When the VDSO falls back to 32-bit time functions on kernels with COMPAT_32BIT_TIME=n, userspace becomes corrupted and appears to crash shortly after, with something like: [ 0.359617] do_page_fault(): sending SIGSEGV to init for invalid read access from 000000007ff790d0 [ 0.359843] epc = 0000000077e45df4 in libc.so[77da6000+de000] [ 0.360319] ra = 0000000010000c50 in init[10000000+2000] [ 0.364456] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b This can be reproduced with simply calling `clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts)`, since `CLOCK_PROCESS_CPUTIME_ID` is not exported to the VDSO, invoking the syscall callback branch. This crash was observed with musl 1.20's clock_gettime implementation: int __clock_gettime(clockid_t clk, struct timespec *ts) { int r; int (*f)(clockid_t, struct timespec *) = (int (*)(clockid_t, struct timespec *))vdso_func; if (f) { r = f(clk, ts); if (!r) { return r; } if (r == -EINVAL) return __syscall_ret(r); } r = __syscall(SYS_clock_gettime, clk, ts); // <-- CRASH if (r == -ENOSYS) { if (clk == CLOCK_REALTIME) { __syscall(SYS_gettimeofday, ts, 0); ts->tv_nsec = (int)ts->tv_nsec * 1000; return 0; } r = -EINVAL; } return __syscall_ret(r); } The particular kernel and libc are built as part of the MIPS64 CI on build.wireguard.com which generally uses as minimal configurations as possible. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Paul Burton <paulburton@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Cc: Christian Brauner <christian.brauner@canonical.com> Fixes: 942437c97fd9 ("y2038: allow disabling time32 system calls") --- arch/mips/include/asm/vdso/gettimeofday.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)