Message ID | 151703973427.26578.15693075353773519333.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
* Dan Williams <dan.j.williams@intel.com> wrote: > For '__get_user' paths, do not allow the kernel to speculate on the > value of a user controlled pointer. In addition to the 'stac' > instruction for Supervisor Mode Access Protection, an 'ifence' causes > the 'access_ok' result to resolve in the pipeline before the cpu might > take any speculative action on the pointer value. > > Since __get_user is a major kernel interface that deals with user > controlled pointers, the '__uaccess_begin_nospec' mechanism will prevent > speculative execution past an 'access_ok' permission check. While > speculative execution past 'access_ok' is not enough to lead to a kernel > memory leak, it is a necessary precondition. > > To be clear, '__uaccess_begin_nospec' is addressing a class of potential > problems near '__get_user' usages. > > Note, that while ifence is used to protect '__get_user', pointer masking > will be used for 'get_user' since it incorporates a bounds check near > the usage. > > There are no functional changes in this patch. The style problems/inconsistencies of the #2 patch are repeated here too. Also, please split this patch into two patches: - #1 introducing ifence() and using it where it was open coded before - #2 introducing the _nospec() uaccess variants > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > Suggested-by: Andi Kleen <ak@linux.intel.com> > Cc: Tom Lendacky <thomas.lendacky@amd.com> > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Kees Cook <keescook@chromium.org> > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: "H. Peter Anvin" <hpa@zytor.com> > Cc: Ingo Molnar <mingo@redhat.com> > Cc: x86@kernel.org > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > arch/x86/include/asm/barrier.h | 4 ++++ > arch/x86/include/asm/msr.h | 3 +-- > arch/x86/include/asm/uaccess.h | 9 +++++++++ > 3 files changed, 14 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h > index 30419b674ebd..5f11d4c5c862 100644 > --- a/arch/x86/include/asm/barrier.h > +++ b/arch/x86/include/asm/barrier.h > @@ -46,6 +46,10 @@ static inline unsigned long array_idx_mask(unsigned long idx, unsigned long sz) > return mask; > } > > +/* prevent speculative execution past this barrier */ Please use consistent capitalization in comments. Thanks, Ingo
* Dan Williams <dan.j.williams@intel.com> wrote: > --- a/arch/x86/include/asm/uaccess.h > +++ b/arch/x86/include/asm/uaccess.h > @@ -124,6 +124,11 @@ extern int __get_user_bad(void); > > #define __uaccess_begin() stac() > #define __uaccess_end() clac() > +#define __uaccess_begin_nospec() \ > +({ \ > + stac(); \ > + ifence(); \ > +}) BTW., wouldn't it be better to switch the barrier order here, i.e. to do: ifence(); \ stac(); \ ? The reason is that stac()/clac() is usually paired, so there's a chance with short sequences that it would resolve with 'no externally visible changes to flags'. Also, there's many cases where flags are modified _inside_ the STAC/CLAC section, so grouping them together inside a speculation atom could be beneficial. The flip side is that if the MFENCE stalls the STAC that is ahead of it could be processed for 'free' - while it's always post barrier with my suggestion. But in any case it would be nice to see a discussion of this aspect in the changelog, even if the patch does not change. Thanks, Ingo
On Sun, Jan 28, 2018 at 1:14 AM, Ingo Molnar <mingo@kernel.org> wrote: > > * Dan Williams <dan.j.williams@intel.com> wrote: > >> --- a/arch/x86/include/asm/uaccess.h >> +++ b/arch/x86/include/asm/uaccess.h >> @@ -124,6 +124,11 @@ extern int __get_user_bad(void); >> >> #define __uaccess_begin() stac() >> #define __uaccess_end() clac() >> +#define __uaccess_begin_nospec() \ >> +({ \ >> + stac(); \ >> + ifence(); \ >> +}) > > BTW., wouldn't it be better to switch the barrier order here, i.e. to do: > > ifence(); \ > stac(); \ > > ? > > The reason is that stac()/clac() is usually paired, so there's a chance with short > sequences that it would resolve with 'no externally visible changes to flags'. > > Also, there's many cases where flags are modified _inside_ the STAC/CLAC section, > so grouping them together inside a speculation atom could be beneficial. I'm struggling a bit to grok this. Are you referring to the state of the flags from the address limit comparison? That's the result that needs fencing before speculative execution leaks to to the user pointer de-reference. > The flip side is that if the MFENCE stalls the STAC that is ahead of it could be > processed for 'free' - while it's always post barrier with my suggestion. This 'for free' aspect is what I aiming for. > > But in any case it would be nice to see a discussion of this aspect in the > changelog, even if the patch does not change. I'll add a note to the changelog that having the fence after the 'stac' hopefully allows some overlap of the cost of 'stac' and the flushing of the instruction pipeline.
* Dan Williams <dan.j.williams@intel.com> wrote: > > The flip side is that if the MFENCE stalls the STAC that is ahead of it could be > > processed for 'free' - while it's always post barrier with my suggestion. > > This 'for free' aspect is what I aiming for. Ok. > > > > But in any case it would be nice to see a discussion of this aspect in the > > changelog, even if the patch does not change. > > I'll add a note to the changelog that having the fence after the > 'stac' hopefully allows some overlap of the cost of 'stac' and the > flushing of the instruction pipeline. Perfect! Thanks, Ingo
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h index 30419b674ebd..5f11d4c5c862 100644 --- a/arch/x86/include/asm/barrier.h +++ b/arch/x86/include/asm/barrier.h @@ -46,6 +46,10 @@ static inline unsigned long array_idx_mask(unsigned long idx, unsigned long sz) return mask; } +/* prevent speculative execution past this barrier */ +#define ifence() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \ + "lfence", X86_FEATURE_LFENCE_RDTSC) + #ifdef CONFIG_X86_PPRO_FENCE #define dma_rmb() rmb() #else diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h index 07962f5f6fba..e426d2a33ff3 100644 --- a/arch/x86/include/asm/msr.h +++ b/arch/x86/include/asm/msr.h @@ -214,8 +214,7 @@ static __always_inline unsigned long long rdtsc_ordered(void) * that some other imaginary CPU is updating continuously with a * time stamp. */ - alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, - "lfence", X86_FEATURE_LFENCE_RDTSC); + ifence(); return rdtsc(); } diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 574dff4d2913..626caf58183a 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -124,6 +124,11 @@ extern int __get_user_bad(void); #define __uaccess_begin() stac() #define __uaccess_end() clac() +#define __uaccess_begin_nospec() \ +({ \ + stac(); \ + ifence(); \ +}) /* * This is a type: either unsigned long, if the argument fits into @@ -487,6 +492,10 @@ struct __large_struct { unsigned long buf[100]; }; __uaccess_begin(); \ barrier(); +#define uaccess_try_nospec do { \ + current->thread.uaccess_err = 0; \ + __uaccess_begin_nospec(); \ + #define uaccess_catch(err) \ __uaccess_end(); \ (err) |= (current->thread.uaccess_err ? -EFAULT : 0); \
For '__get_user' paths, do not allow the kernel to speculate on the value of a user controlled pointer. In addition to the 'stac' instruction for Supervisor Mode Access Protection, an 'ifence' causes the 'access_ok' result to resolve in the pipeline before the cpu might take any speculative action on the pointer value. Since __get_user is a major kernel interface that deals with user controlled pointers, the '__uaccess_begin_nospec' mechanism will prevent speculative execution past an 'access_ok' permission check. While speculative execution past 'access_ok' is not enough to lead to a kernel memory leak, it is a necessary precondition. To be clear, '__uaccess_begin_nospec' is addressing a class of potential problems near '__get_user' usages. Note, that while ifence is used to protect '__get_user', pointer masking will be used for 'get_user' since it incorporates a bounds check near the usage. There are no functional changes in this patch. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Suggested-by: Andi Kleen <ak@linux.intel.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Kees Cook <keescook@chromium.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: x86@kernel.org Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- arch/x86/include/asm/barrier.h | 4 ++++ arch/x86/include/asm/msr.h | 3 +-- arch/x86/include/asm/uaccess.h | 9 +++++++++ 3 files changed, 14 insertions(+), 2 deletions(-)