Message ID | 151571802258.27429.932636277047687877.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jan 11, 2018 at 04:47:02PM -0800, Dan Williams wrote: > For 'get_user' paths, do not allow the kernel to speculate on the value > of a user controlled pointer. In addition to the 'stac' instruction for > Supervisor Mode Access Protection, an 'ifence' causes the 'access_ok' > result to resolve in the pipeline before the cpu might take any > speculative action on the pointer value. So I understand the need to "patch first and ask questions later". I also understand that usercopy is an obvious attack point for speculative bugs. However, I'm still hopelessly confused about what exactly this patch (and the next one) are supposed to accomplish. I can't figure out if: a) I'm missing something completely obvious; b) this is poorly described; or c) it doesn't actually fix/protect/harden anything. The commit log doesn't help me at all. In fact, it confuses me more. For example, this paragraph: > Since this is a major kernel interface that deals with user controlled > data, the '__uaccess_begin_nospec' mechanism will prevent speculative > execution past an 'access_ok' permission check. While speculative > execution past 'access_ok' is not enough to lead to a kernel memory > leak, it is a necessary precondition. That just sounds wrong. What if the speculation starts *after* the access_ok() check? Then the barrier has no purpose. Most access_ok/get_user/copy_from_user calls are like this: if (copy_from_user(...uptr..)) /* or access_ok() or get_user() */ return -EFAULT; So in other words, the usercopy function is called *before* the branch. But to halt speculation, the lfence needs to come *after* the branch. So putting lfences *before* the branch doesn't solve anything. So what am I missing?
On Fri, Jan 12, 2018 at 9:51 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Thu, Jan 11, 2018 at 04:47:02PM -0800, Dan Williams wrote: >> For 'get_user' paths, do not allow the kernel to speculate on the value >> of a user controlled pointer. In addition to the 'stac' instruction for >> Supervisor Mode Access Protection, an 'ifence' causes the 'access_ok' >> result to resolve in the pipeline before the cpu might take any >> speculative action on the pointer value. > > So I understand the need to "patch first and ask questions later". I > also understand that usercopy is an obvious attack point for speculative > bugs. However, I'm still hopelessly confused about what exactly this > patch (and the next one) are supposed to accomplish. > > I can't figure out if: > > a) I'm missing something completely obvious; > b) this is poorly described; or > c) it doesn't actually fix/protect/harden anything. > > The commit log doesn't help me at all. In fact, it confuses me more. > For example, this paragraph: > >> Since this is a major kernel interface that deals with user controlled >> data, the '__uaccess_begin_nospec' mechanism will prevent speculative >> execution past an 'access_ok' permission check. While speculative >> execution past 'access_ok' is not enough to lead to a kernel memory >> leak, it is a necessary precondition. > > That just sounds wrong. What if the speculation starts *after* the > access_ok() check? Then the barrier has no purpose. > > Most access_ok/get_user/copy_from_user calls are like this: > > if (copy_from_user(...uptr..)) /* or access_ok() or get_user() */ > return -EFAULT; > > So in other words, the usercopy function is called *before* the branch. > > But to halt speculation, the lfence needs to come *after* the branch. > So putting lfences *before* the branch doesn't solve anything. > > So what am I missing? We're trying to prevent a pointer under user control from being de-referenced inside the kernel, before we know it has been limited to something safe. In the following sequence the branch we are worried about speculating is the privilege check: if (access_ok(uptr)) /* <--- Privelege Check */ if (copy_from_user_(uptr)) The cpu can speculatively skip that access_ok() check and cause a read of kernel memory.
On Fri, Jan 12, 2018 at 10:21:43AM -0800, Dan Williams wrote: > > That just sounds wrong. What if the speculation starts *after* the > > access_ok() check? Then the barrier has no purpose. > > > > Most access_ok/get_user/copy_from_user calls are like this: > > > > if (copy_from_user(...uptr..)) /* or access_ok() or get_user() */ > > return -EFAULT; > > > > So in other words, the usercopy function is called *before* the branch. > > > > But to halt speculation, the lfence needs to come *after* the branch. > > So putting lfences *before* the branch doesn't solve anything. > > > > So what am I missing? > > We're trying to prevent a pointer under user control from being > de-referenced inside the kernel, before we know it has been limited to > something safe. In the following sequence the branch we are worried > about speculating is the privilege check: > > if (access_ok(uptr)) /* <--- Privelege Check */ > if (copy_from_user_(uptr)) > > The cpu can speculatively skip that access_ok() check and cause a read > of kernel memory. Converting your example code to assembly: call access_ok # no speculation which started before this point is allowed to continue past this point test %rax, %rax jne error_path dereference_uptr: (do nefarious things with the user pointer) error_path: mov -EINVAL, %rax ret So the CPU is still free to speculately execute the dereference_uptr block because the lfence was before the 'jne error_path' branch.
On Fri, Jan 12, 2018 at 10:58 AM, Josh Poimboeuf <jpoimboe@redhat.com> wrote: > On Fri, Jan 12, 2018 at 10:21:43AM -0800, Dan Williams wrote: >> > That just sounds wrong. What if the speculation starts *after* the >> > access_ok() check? Then the barrier has no purpose. >> > >> > Most access_ok/get_user/copy_from_user calls are like this: >> > >> > if (copy_from_user(...uptr..)) /* or access_ok() or get_user() */ >> > return -EFAULT; >> > >> > So in other words, the usercopy function is called *before* the branch. >> > >> > But to halt speculation, the lfence needs to come *after* the branch. >> > So putting lfences *before* the branch doesn't solve anything. >> > >> > So what am I missing? >> >> We're trying to prevent a pointer under user control from being >> de-referenced inside the kernel, before we know it has been limited to >> something safe. In the following sequence the branch we are worried >> about speculating is the privilege check: >> >> if (access_ok(uptr)) /* <--- Privelege Check */ >> if (copy_from_user_(uptr)) >> >> The cpu can speculatively skip that access_ok() check and cause a read >> of kernel memory. > > Converting your example code to assembly: > > call access_ok # no speculation which started before this point is allowed to continue past this point > test %rax, %rax > jne error_path > dereference_uptr: > (do nefarious things with the user pointer) > > error_path: > mov -EINVAL, %rax > ret > > So the CPU is still free to speculately execute the dereference_uptr > block because the lfence was before the 'jne error_path' branch. By the time we get to de-reference uptr we know it is not pointing at kernel memory, because access_ok would have failed and the cpu would have waited for that failure result before doing anything else. Now the vulnerability that still exists after this fence is if we do something with the value returned from de-referencing that pointer. I.e. if we do: get_user(val, uptr); if (val < array_max) array[val]; ...then we're back to having a user controlled pointer in the kernel. This is the point where we need array_ptr() to sanitize things.
On Fri, Jan 12, 2018 at 11:26 AM, Dan Williams <dan.j.williams@intel.com> wrote: > > By the time we get to de-reference uptr we know it is not pointing at > kernel memory, because access_ok would have failed and the cpu would > have waited for that failure result before doing anything else. I'm not actually convinced that's right in the original patches, exactly because of the issue that Josh pointed out: even if there is a comparison inside access_ok() that will be properly serialized, then that comparison can (and sometimes does) just cause a truth value to be generated, and then there might be *another* comparison of that return value after the lfence. And while the return value is table, the conditional branch on that comparison isn't. The new model of just doing it together with the STAC should be fine, though. I do think that it would be a good idea to very expressly document the fact that it's not that the user access itself is unsafe. I do agree that things like "get_user()" want to be protected, but not because of any direct bugs or problems with get_user() and friends, but simply because get_user() is an excellent source of a pointer that is obviously controlled from a potentially attacking user space. So it's a prime candidate for then finding _subsequent_ accesses that can then be used to perturb the cache. Linus
On Fri, Jan 12, 2018 at 12:01:04PM -0800, Linus Torvalds wrote: > On Fri, Jan 12, 2018 at 11:26 AM, Dan Williams <dan.j.williams@intel.com> wrote: > > > > By the time we get to de-reference uptr we know it is not pointing at > > kernel memory, because access_ok would have failed and the cpu would > > have waited for that failure result before doing anything else. > > I'm not actually convinced that's right in the original patches, > exactly because of the issue that Josh pointed out: even if there is a > comparison inside access_ok() that will be properly serialized, then > that comparison can (and sometimes does) just cause a truth value to > be generated, and then there might be *another* comparison of that > return value after the lfence. And while the return value is table, > the conditional branch on that comparison isn't. > > The new model of just doing it together with the STAC should be fine, though. Aha, that clears it up for me, thanks. I was still thinking about the previous version of the patch which had the barrier in access_ok(). I didn't realize the new version moved the barrier to after the access_ok() checks.
diff --git a/arch/x86/include/asm/smap.h b/arch/x86/include/asm/smap.h index db333300bd4b..0b59707e0b46 100644 --- a/arch/x86/include/asm/smap.h +++ b/arch/x86/include/asm/smap.h @@ -40,6 +40,10 @@ #endif /* CONFIG_X86_SMAP */ +#define ASM_IFENCE \ + ALTERNATIVE_2 "", "mfence", X86_FEATURE_MFENCE_RDTSC, \ + "lfence", X86_FEATURE_LFENCE_RDTSC + #else /* __ASSEMBLY__ */ #include <asm/alternative.h> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 574dff4d2913..a31fd4fc6483 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -124,6 +124,11 @@ extern int __get_user_bad(void); #define __uaccess_begin() stac() #define __uaccess_end() clac() +#define __uaccess_begin_nospec() \ +({ \ + stac(); \ + ifence(); \ +}) /* * This is a type: either unsigned long, if the argument fits into @@ -487,6 +492,11 @@ struct __large_struct { unsigned long buf[100]; }; __uaccess_begin(); \ barrier(); +#define uaccess_try_nospec do { \ + current->thread.uaccess_err = 0; \ + __uaccess_begin_nospec(); \ + barrier(); + #define uaccess_catch(err) \ __uaccess_end(); \ (err) |= (current->thread.uaccess_err ? -EFAULT : 0); \
For 'get_user' paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection, an 'ifence' causes the 'access_ok' result to resolve in the pipeline before the cpu might take any speculative action on the pointer value. Since this is a major kernel interface that deals with user controlled data, the '__uaccess_begin_nospec' mechanism will prevent speculative execution past an 'access_ok' permission check. While speculative execution past 'access_ok' is not enough to lead to a kernel memory leak, it is a necessary precondition. To be clear, '__uaccess_begin_nospec' and ASM_IFENCE are not addressing any known issues with 'get_user' they are addressing a class of potential problems that could be near 'get_user' usages. In other words, these helpers are for hygiene not clinical fixes. There are no functional changes in this patch. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: x86@kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- arch/x86/include/asm/smap.h | 4 ++++ arch/x86/include/asm/uaccess.h | 10 ++++++++++ 2 files changed, 14 insertions(+)