Message ID | 20220214163452.1568807-5-arnd@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | clean up asm/uaccess.h, kill set_fs for good |
On Mon, Feb 14, 2022 at 05:34:42PM +0100, Arnd Bergmann wrote:
> +#define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
> +#define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))

Can we just kill these off instead of letting them obfuscate the code?
On Mon, Feb 14, 2022 at 6:02 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Mon, Feb 14, 2022 at 05:34:42PM +0100, Arnd Bergmann wrote:
> > +#define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
> > +#define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))
>
> Can we just kill these off instead of letting them obfuscate the code?

As Al pointed out, they turned out to be necessary on sparc64, but the only
definitions are on sparc64 and x86, so it's possible that they serve a similar
purpose here, in which case changing the limit from TASK_SIZE to
TASK_SIZE_MAX is probably wrong as well.

So either I need to revert the original definition as I did on sparc64, or
they can be removed completely. Hopefully Al or the x86 maintainers
can clarify.

        Arnd
On Mon, Feb 14, 2022 at 08:45:52PM +0100, Arnd Bergmann wrote:
> As Al pointed out, they turned out to be necessary on sparc64, but the only
> definitions are on sparc64 and x86, so it's possible that they serve a similar
> purpose here, in which case changing the limit from TASK_SIZE to
> TASK_SIZE_MAX is probably wrong as well.
>
> So either I need to revert the original definition as I did on sparc64, or
> they can be removed completely. Hopefully Al or the x86 maintainers
> can clarify.

Looking at the x86 users I think:

 - valid_user_frame should go away and the caller should use get_user
   instead of __get_user
 - the one in copy_code can just go away, as there is another check in
   copy_from_user_nmi
 - copy_stack_frame should just use access_ok
 - as does copy_from_user_nmi

but yes, having someone who actually knows this code look over it would be
very helpful.
On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> As Al pointed out, they turned out to be necessary on sparc64, but the only
> definitions are on sparc64 and x86, so it's possible that they serve a similar
> purpose here, in which case changing the limit from TASK_SIZE to
> TASK_SIZE_MAX is probably wrong as well.

x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
get_user() assembler implementation does the same.

I think any __range_not_ok() users that use TASK_SIZE are entirely
historical, and should be just fixed.

        Linus

(*) And by "always" I mean "as far back as I bothered to go". In the
2.6.12 git import, we had

    #define USER_DS         MAKE_MM_SEG(PAGE_OFFSET)

so the user access limit was actually not really TASK_SIZE_MAX at all,
but the beginning of the kernel mapping, which on x86-64 is much much
higher.
On Mon, Feb 14, 2022 at 12:01:05PM -0800, Linus Torvalds wrote:
> On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > As Al pointed out, they turned out to be necessary on sparc64, but the only
> > definitions are on sparc64 and x86, so it's possible that they serve a similar
> > purpose here, in which case changing the limit from TASK_SIZE to
> > TASK_SIZE_MAX is probably wrong as well.
>
> x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> get_user() assembler implementation does the same.
>
> I think any __range_not_ok() users that use TASK_SIZE are entirely
> historical, and should be just fixed.

IIRC, that was mostly userland stack trace collection in perf.
I'll try to dig in archives and see what shows up - it's been
a while ago...
On Mon, Feb 14, 2022 at 12:01 PM Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
> x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> get_user() assembler implementation does the same.

Side note: we could just check the sign bit instead, and avoid big
constants that way.

Right now we actually have this complexity in the x86-64 user access code:

  #ifdef CONFIG_X86_5LEVEL
  #define LOAD_TASK_SIZE_MINUS_N(n) \
          ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rdx), \
                      __stringify(mov $((1 << 56) - 4096 - (n)),%rdx), X86_FEATURE_LA57
  #else
  #define LOAD_TASK_SIZE_MINUS_N(n) \
          mov $(TASK_SIZE_MAX - (n)),%_ASM_DX
  #endif

just because the code tries to get that TASK_SIZE_MAX boundary just right.

And getting that boundary just right is important on 32-bit x86, but
it's *much* less important on x86-64.

There's still a (weak) reason to do it even for 64-bit code: page
faults outside the valid user space range don't actually cause a #PF
fault - they cause #GP - and then we have the #GP handler warn about
"this address hasn't been checked".

Which is nice and useful for doing syzbot kind of randomization loads
(ie user accesses that didn't go through access_ok() will stand out
nicely), but maybe it's not worth this. syzbot would be fine with only
the "sign bit set" case warning for the same thing.

So on x86-64, we could just check the sign of the address instead, and
simplify and shrink those get/put_user() code sequences (but
array_index_mask_nospec() currently uses the carry flag computation
too, so we'd have to change that part as well, maybe not worth it).

        Linus
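The trade-off in the side note above can be modelled in user space. This is a minimal sketch, not kernel code: MODEL_TASK_SIZE_MAX assumes the 4-level paging value from the macro quoted above, and the helper names exact_limit_ok(), sign_bit_ok() and uaccess_model_self_check() are hypothetical. The sign-bit helper looks only at the start address and ignores the access size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, i.e. (1 << 47) - 4096 as in the macro above. */
#define MODEL_TASK_SIZE_MAX ((UINT64_C(1) << 47) - 4096)

/* Exact boundary test for a small fixed-size access of n bytes. */
bool exact_limit_ok(uint64_t addr, uint64_t n)
{
	return addr <= MODEL_TASK_SIZE_MAX - n;
}

/* Sign-bit test: accept anything in the lower half of the address space. */
bool sign_bit_ok(uint64_t addr)
{
	return (addr >> 63) == 0;
}

/* Shows where the two tests agree and where they differ. */
void uaccess_model_self_check(void)
{
	const uint64_t user = UINT64_C(0x00007ffff7a00000);   /* below the limit */
	const uint64_t upper = UINT64_C(0xffff888000000000);  /* bit 63 set */
	const uint64_t between = UINT64_C(1) << 50;           /* above limit, bit 63 clear */

	assert(exact_limit_ok(user, 8) && sign_bit_ok(user));
	assert(!exact_limit_ok(upper, 8) && !sign_bit_ok(upper));
	/* Only the exact test rejects this one. */
	assert(!exact_limit_ok(between, 8) && sign_bit_ok(between));
}
```

The third case is the one the email discusses: the sign-bit test lets such an address through, so the access would fault instead of being rejected up front, and the "this address hasn't been checked" warning would have to be limited to the sign-bit-set case.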
From: Linus Torvalds
> Sent: 14 February 2022 20:24
> >
> > x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> > get_user() assembler implementation does the same.
>
> Side note: we could just check the sign bit instead, and avoid big
> constants that way.

The cheap test for most 64bit is (addr | size) >> 62 != 0.

I did some tests last week and the compilers correctly optimise
out constant size.

Doesn't sparc64 still need a wrap test?
Or is that assumed because there is always an unmapped page
and transfers are 'adequately' done on increasing addresses?

	David
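The test David describes can be written out as a small user-space model. This is only a sketch under stated assumptions: the names cheap_range_bad() and cheap_range_self_check() are hypothetical, and the code makes no claim about which architectures could actually use the test. The point it illustrates is arithmetic: if neither addr nor size has bit 62 or bit 63 set, then addr + size is below 2^63, so the sum cannot wrap and cannot reach the upper half of a 64-bit address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when the range must be rejected: addr or size has bit 62 or 63 set. */
bool cheap_range_bad(uint64_t addr, uint64_t size)
{
	return ((addr | size) >> 62) != 0;
}

/* Checks the arithmetic property behind the test. */
void cheap_range_self_check(void)
{
	const uint64_t max_ok = (UINT64_C(1) << 62) - 1; /* largest accepted value */

	/* Worst accepted case: the end of the range is still below 2^63. */
	assert(!cheap_range_bad(max_ok, max_ok));
	assert(max_ok + max_ok < (UINT64_C(1) << 63));

	/* A huge size that would wrap addr + size is rejected without adding. */
	assert(cheap_range_bad(0x1000, UINT64_MAX));
}
```

Because a single OR and shift cover both the address and the size, no separate wraparound comparison is needed in this model.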
On Mon, Feb 14, 2022 at 08:17:07PM +0000, Al Viro wrote:
> On Mon, Feb 14, 2022 at 12:01:05PM -0800, Linus Torvalds wrote:
> > On Mon, Feb 14, 2022 at 11:46 AM Arnd Bergmann <arnd@kernel.org> wrote:
> > >
> > > As Al pointed out, they turned out to be necessary on sparc64, but the only
> > > definitions are on sparc64 and x86, so it's possible that they serve a similar
> > > purpose here, in which case changing the limit from TASK_SIZE to
> > > TASK_SIZE_MAX is probably wrong as well.
> >
> > x86-64 has always(*) used TASK_SIZE_MAX for access_ok(), and the
> > get_user() assembler implementation does the same.
> >
> > I think any __range_not_ok() users that use TASK_SIZE are entirely
> > historical, and should be just fixed.
>
> IIRC, that was mostly userland stack trace collection in perf.
> I'll try to dig in archives and see what shows up - it's been
> a while ago...

After some digging:

access_ok() needs only to make sure that MMU won't go anywhere near
the kernel page tables; address limit for 32bit threads is none of its
concern, so TASK_SIZE_MAX is right for it.

valid_user_frame() in arch/x86/events/core.c: used while walking
the userland call chain. The reason it's not access_ok() is only that
perf_callchain_user() might've been called from interrupt that came while
we'd been under KERNEL_DS.

That had been back in 2015 and it had been obsoleted since 2017, commit
88b0193d9418 (perf/callchain: Force USER_DS when invoking perf_callchain_user()).
We had been guaranteed USER_DS ever since.

IOW, it could've reverted to use of access_ok() at any point after that.
TASK_SIZE vs TASK_SIZE_MAX is pretty much an accident there - might've been
TASK_SIZE_MAX from the very beginning.

copy_stack_frame() in arch/x86/kernel/stacktrace.c: similar story, except
the commit that made sure callers will have USER_DS - cac9b9a4b083
(stacktrace: Force USER_DS for stack_trace_save_user()) in this case.
Also could've been using access_ok() just fine. Amusingly, access_ok()
used to be there, until it had been replaced with explicit check on
Jul 22 2019 - 4 days after that had been made useless by fix in the caller...

copy_from_user_nmi(). That one is a bit more interesting. We have a call
chain from perf_output_sample_ustack() (covered by force_uaccess_begin()
these days, not that it mattered for x86 now), there's something odd
in dumpstack.c:copy_code() (with explicit check for TASK_SIZE_MAX in
the caller) and there's a couple of callers in Intel PMU code.
AFAICS, there's no reason whatsoever to use TASK_SIZE in that one -
the point is to prevent copyin from the kernel memory, and in that respect
TASK_SIZE_MAX isn't any worse. The check in copy_code() probably should go.

So all of those guys should be simply switched to access_ok(). Might be
worth making that a preliminary patch - it's independent from everything
else and there's no point folding it into any of the patches in the series.
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index ac96f9b2d64b..6956a63291b6 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -16,30 +16,13 @@
  * Test whether a block of memory is a valid user space address.
  * Returns 0 if the range is valid, nonzero otherwise.
  */
-static inline bool __chk_range_not_ok(unsigned long addr, unsigned long size, unsigned long limit)
+static inline bool __access_ok(void __user *ptr, unsigned long size)
 {
-	/*
-	 * If we have used "sizeof()" for the size,
-	 * we know it won't overflow the limit (but
-	 * it might overflow the 'addr', so it's
-	 * important to subtract the size from the
-	 * limit, not add it to the address).
-	 */
-	if (__builtin_constant_p(size))
-		return unlikely(addr > limit - size);
-
-	/* Arbitrary sizes? Be careful about overflow */
-	addr += size;
-	if (unlikely(addr < size))
-		return true;
-	return unlikely(addr > limit);
-}
+	unsigned long limit = TASK_SIZE_MAX;
+	unsigned long addr = ptr;
 
-#define __range_not_ok(addr, size, limit)				\
-({									\
-	__chk_user_ptr(addr);						\
-	__chk_range_not_ok((unsigned long __force)(addr), size, limit); \
-})
+	return (size <= limit) && (addr <= (limit - size));
+}
 
 #ifdef CONFIG_DEBUG_ATOMIC_SLEEP
 static inline bool pagefault_disabled(void);
@@ -66,12 +49,15 @@ static inline bool pagefault_disabled(void);
  * Return: true (nonzero) if the memory block may be valid, false (zero)
  * if it is definitely invalid.
  */
-#define access_ok(addr, size)					\
-({									\
-	WARN_ON_IN_IRQ();						\
-	likely(!__range_not_ok(addr, size, TASK_SIZE_MAX));		\
+#define access_ok(addr, size)						\
+({									\
+	WARN_ON_IN_IRQ();						\
+	likely(__access_ok(addr, size));\
 })
 
+#define __range_not_ok(addr, size, limit)	(!__access_ok(addr, size))
+#define __chk_range_not_ok(addr, size, limit)	(!__access_ok((void __user *)addr, size))
+
 extern int __get_user_1(void);
 extern int __get_user_2(void);
 extern int __get_user_4(void);
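To see why the patch can drop the explicit wraparound test, the removed and added checks can be compared in a user-space model. This is a hedged sketch rather than the kernel code: MODEL_LIMIT is an assumed stand-in for TASK_SIZE_MAX (the 4-level paging value), the model_* names are hypothetical, and plain uint64_t values replace __user pointers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for TASK_SIZE_MAX with 4-level paging. */
#define MODEL_LIMIT ((UINT64_C(1) << 47) - 4096)

/* Removed variable-size path: add first, then detect wraparound. */
bool model_range_not_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	addr += size;
	if (addr < size)	/* addr + size wrapped past 2^64 */
		return true;
	return addr > limit;
}

/* Added form: subtracts from the limit, so nothing can wrap. */
bool model_access_ok(uint64_t addr, uint64_t size)
{
	const uint64_t limit = MODEL_LIMIT;

	/* size <= limit guarantees that limit - size does not underflow. */
	return size <= limit && addr <= limit - size;
}

/* True when both formulations give the same verdict for this range. */
bool model_checks_agree(uint64_t addr, uint64_t size)
{
	return model_access_ok(addr, size) ==
	       !model_range_not_ok(addr, size, MODEL_LIMIT);
}

/* Exercises the boundary and the wraparound case. */
void model_self_check(void)
{
	assert(model_access_ok(MODEL_LIMIT - 64, 64));   /* ends exactly at the limit */
	assert(!model_access_ok(MODEL_LIMIT - 63, 64));  /* one byte past the limit */
	assert(!model_access_ok(UINT64_MAX - 8, 64));    /* addr + size would wrap */
	assert(!model_access_ok(0, MODEL_LIMIT + 1));    /* size alone exceeds the limit */
}
```

In this model the two functions accept exactly the same ranges when the old one is called with MODEL_LIMIT, which is consistent with the new macros ignoring their limit argument only if every caller was meant to use TASK_SIZE_MAX, the question the thread discusses.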