Message ID | ed52cfb852d2772bf20f48614d75f1d1b1451995.1582841072.git.jpoimboe@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm/i915: Minimize uaccess exposure in i915_gem_execbuffer2_ioctl() |
Quoting Josh Poimboeuf (2020-02-27 22:08:26)
> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>
>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>
> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> -- and indirectly, sign_extend64() -- from the user_access_begin/end
> critical region (i.e., with SMAP disabled).
>
> While it's probably harmless in this case, in general we like to avoid
> extra function calls in SMAP-disabled regions because it can open up
> inadvertent security holes.
>
> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> before user_access_begin() is called.
>
> Note that gen8_canonical_addr() is now called *before* masking off the
> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> extension and ignores the masked lower bits anyway.
>
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> index d5a0f5ae4a8b..183cab13e028 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> @@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
>  			u64_to_user_ptr(args->buffers_ptr);
>  		unsigned int i;
>
> +		/*
> +		 * Do the call to gen8_canonical_addr() outside the
> +		 * uaccess-enabled region to minimize uaccess exposure.
> +		 */
> +		for (i = 0; i < args->buffer_count; i++)
> +			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);

Another loop over all the objects, where we intentionally try and skip
unmodified entries? To save 2 instructions from inside the second loop?

Colour me skeptical.
-Chris

On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>
>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>
> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> -- and indirectly, sign_extend64() -- from the user_access_begin/end
> critical region (i.e., with SMAP disabled).
>
> While it's probably harmless in this case, in general we like to avoid
> extra function calls in SMAP-disabled regions because it can open up
> inadvertent security holes.
>
> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> before user_access_begin() is called.
>
> Note that gen8_canonical_addr() is now called *before* masking off the
> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> extension and ignores the masked lower bits anyway.

How painful would it be to inline the damn thing?
<looks>
static inline u64 gen8_canonical_addr(u64 address)
{
	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
}
static inline __s64 sign_extend64(__u64 value, int index)
{
	__u8 shift = 63 - index;
	return (__s64)(value << shift) >> shift;
}

What the hell? Josh, what kind of .config do you have that these are
_not_ inlined? And why not mark gen8_canonical_addr() __always_inline?

On Thu, Feb 27, 2020 at 10:35:42PM +0000, Al Viro wrote:
> On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
> > With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
> >
> >   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
> >
> > This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
> > -- and indirectly, sign_extend64() -- from the user_access_begin/end
> > critical region (i.e., with SMAP disabled).
> >
> > While it's probably harmless in this case, in general we like to avoid
> > extra function calls in SMAP-disabled regions because it can open up
> > inadvertent security holes.
> >
> > Fix it by moving the gen8_canonical_addr() conversion to a separate loop
> > before user_access_begin() is called.
> >
> > Note that gen8_canonical_addr() is now called *before* masking off the
> > PIN_OFFSET_MASK bits. That should be ok because it just does a sign
> > extension and ignores the masked lower bits anyway.
>
> How painful would it be to inline the damn thing?
> <looks>
> static inline u64 gen8_canonical_addr(u64 address)
> {
> 	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
> }
> static inline __s64 sign_extend64(__u64 value, int index)
> {
> 	__u8 shift = 63 - index;
> 	return (__s64)(value << shift) >> shift;
> }
>
> What the hell? Josh, what kind of .config do you have that these are
> _not_ inlined?

I think this was seen with CONFIG_CC_OPTIMIZE_FOR_SIZE, which tends to
ignore inline.

> And why not mark gen8_canonical_addr() __always_inline?

Right, marking those two functions as __always_inline is the other
option. The problem is, if you keep doing it, eventually you end up
with __always_inline-itis spreading all over the place. And it affects
all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
At least this fix is localized.

But I agree my patch isn't ideal either.

On 2/27/20 5:03 PM, Josh Poimboeuf wrote:
> On Thu, Feb 27, 2020 at 10:35:42PM +0000, Al Viro wrote:
>> On Thu, Feb 27, 2020 at 04:08:26PM -0600, Josh Poimboeuf wrote:
>>> With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:
>>>
>>>   drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled
>>>
>>> This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
>>> -- and indirectly, sign_extend64() -- from the user_access_begin/end
>>> critical region (i.e., with SMAP disabled).
>>>
>>> While it's probably harmless in this case, in general we like to avoid
>>> extra function calls in SMAP-disabled regions because it can open up
>>> inadvertent security holes.
>>>
>>> Fix it by moving the gen8_canonical_addr() conversion to a separate loop
>>> before user_access_begin() is called.
>>>
>>> Note that gen8_canonical_addr() is now called *before* masking off the
>>> PIN_OFFSET_MASK bits. That should be ok because it just does a sign
>>> extension and ignores the masked lower bits anyway.
>>
>> How painful would it be to inline the damn thing?
>> <looks>
>> static inline u64 gen8_canonical_addr(u64 address)
>> {
>> 	return sign_extend64(address, GEN8_HIGH_ADDRESS_BIT);
>> }
>> static inline __s64 sign_extend64(__u64 value, int index)
>> {
>> 	__u8 shift = 63 - index;
>> 	return (__s64)(value << shift) >> shift;
>> }
>>
>> What the hell? Josh, what kind of .config do you have that these are
>> _not_ inlined?
>
> I think this was seen with CONFIG_CC_OPTIMIZE_FOR_SIZE, which tends to
> ignore inline.

so the commit message correctly says.

>> And why not mark gen8_canonical_addr() __always_inline?
>
> Right, marking those two functions as __always_inline is the other
> option. The problem is, if you keep doing it, eventually you end up
> with __always_inline-itis spreading all over the place. And it affects
> all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> At least this fix is localized.
>
> But I agree my patch isn't ideal either.

fwiw,
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested

thanks.

On Thu, Feb 27, 2020 at 10:26:00PM +0000, Chris Wilson wrote:
> > --- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
> > @@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
> >  			u64_to_user_ptr(args->buffers_ptr);
> >  		unsigned int i;
> >
> > +		/*
> > +		 * Do the call to gen8_canonical_addr() outside the
> > +		 * uaccess-enabled region to minimize uaccess exposure.
> > +		 */
> > +		for (i = 0; i < args->buffer_count; i++)
> > +			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);
>
> Another loop over all the objects, where we intentionally try and skip
> unmodified entries? To save 2 instructions from inside the second loop?
>
> Colour me skeptical.

So are you saying these arrays can be large and that you have
performance concerns?

On Thu, Feb 27, 2020 at 07:03:42PM -0600, Josh Poimboeuf wrote:
> > And why not mark gen8_canonical_addr() __always_inline?
>
> Right, marking those two functions as __always_inline is the other
> option. The problem is, if you keep doing it, eventually you end up
> with __always_inline-itis spreading all over the place. And it affects
> all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> At least this fix is localized.

I'm all for __always_inline in this case, the compiler not inlining sign
extension is just retarded.

On Fri, Feb 28, 2020 at 07:04:41PM +0100, Peter Zijlstra wrote:
> On Thu, Feb 27, 2020 at 07:03:42PM -0600, Josh Poimboeuf wrote:
> > > And why not mark gen8_canonical_addr() __always_inline?
> >
> > Right, marking those two functions as __always_inline is the other
> > option. The problem is, if you keep doing it, eventually you end up
> > with __always_inline-itis spreading all over the place. And it affects
> > all the other callers, at least in the CONFIG_CC_OPTIMIZE_FOR_SIZE case.
> > At least this fix is localized.
>
> I'm all for __always_inline in this case, the compiler not inlining sign
> extension is just retarded.

FWIW, in this case it's
	salq	$8, %rax
	sarq	$8, %rax
i.e. 8 bytes. Sure, that's 3 bytes longer than call, but really, WTF?

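For readers following the __always_inline suggestion, here is a rough userspace sketch of what forcing the two helpers inline could look like. It is not the kernel source: `force_inline` is a hypothetical stand-in for the kernel's `__always_inline`, the `_forced` function names and `check_canonical_addr_forced()` are made up for illustration, and the value 47 for the high address bit is an assumption about GEN8_HIGH_ADDRESS_BIT. It also assumes GCC or Clang, where right-shifting a negative signed value is an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's __always_inline (GCC/Clang). */
#define force_inline inline __attribute__((__always_inline__))

/* Assumption: the driver sign-extends from bit 47. */
#define HIGH_ADDRESS_BIT 47

/* Same shape as the sign_extend64() quoted in the thread, but forced inline. */
static force_inline int64_t sign_extend64_forced(uint64_t value, int index)
{
	uint8_t shift = 63 - index;

	/* Relies on arithmetic right shift of signed values (GCC and Clang). */
	return (int64_t)(value << shift) >> shift;
}

/* Same shape as gen8_canonical_addr(), but forced inline. */
static force_inline uint64_t canonical_addr_forced(uint64_t address)
{
	return (uint64_t)sign_extend64_forced(address, HIGH_ADDRESS_BIT);
}

/* Illustrative self-check: bit 47 set fills the top 16 bits, bit 47 clear leaves them zero. */
static void check_canonical_addr_forced(void)
{
	assert(canonical_addr_forced(UINT64_C(0x0000800000000000)) ==
	       UINT64_C(0xffff800000000000));
	assert(canonical_addr_forced(UINT64_C(0x00007ffffffff000)) ==
	       UINT64_C(0x00007ffffffff000));
}
```

With the attribute in place the compiler is expected to emit the shift pair at each call site even when optimizing for size, which is the trade-off debated above: no call inside the uaccess region, at the cost of a few bytes per caller.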
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index d5a0f5ae4a8b..183cab13e028 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2947,6 +2947,13 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			u64_to_user_ptr(args->buffers_ptr);
 		unsigned int i;
 
+		/*
+		 * Do the call to gen8_canonical_addr() outside the
+		 * uaccess-enabled region to minimize uaccess exposure.
+		 */
+		for (i = 0; i < args->buffer_count; i++)
+			exec2_list[i].offset = gen8_canonical_addr(exec2_list[i].offset);
+
 		/* Copy the new buffer offsets back to the user's exec list. */
 		/*
 		 * Note: count * sizeof(*user_exec_list) does not overflow,
@@ -2962,9 +2969,7 @@ i915_gem_execbuffer2_ioctl(struct drm_device *dev, void *data,
 			if (!(exec2_list[i].offset & UPDATE))
 				continue;
 
-			exec2_list[i].offset =
-				gen8_canonical_addr(exec2_list[i].offset & PIN_OFFSET_MASK);
-			unsafe_put_user(exec2_list[i].offset,
+			unsafe_put_user(exec2_list[i].offset & PIN_OFFSET_MASK,
 					&user_exec_list[i].offset,
 					end_user);
 		}

With CONFIG_CC_OPTIMIZE_FOR_SIZE, objtool reports:

  drivers/gpu/drm/i915/gem/i915_gem_execbuffer.o: warning: objtool: i915_gem_execbuffer2_ioctl()+0x5b7: call to gen8_canonical_addr() with UACCESS enabled

This means i915_gem_execbuffer2_ioctl() is calling gen8_canonical_addr()
-- and indirectly, sign_extend64() -- from the user_access_begin/end
critical region (i.e., with SMAP disabled).

While it's probably harmless in this case, in general we like to avoid
extra function calls in SMAP-disabled regions because it can open up
inadvertent security holes.

Fix it by moving the gen8_canonical_addr() conversion to a separate loop
before user_access_begin() is called.

Note that gen8_canonical_addr() is now called *before* masking off the
PIN_OFFSET_MASK bits. That should be ok because it just does a sign
extension and ignores the masked lower bits anyway.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
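
The claim that the sign extension can be moved ahead of the PIN_OFFSET_MASK masking can be sanity-checked in userspace. The sketch below is only an illustration under stated assumptions: it assumes the sign bit is bit 47 and that the mask clears only the low 12 bits (both are guesses about GEN8_HIGH_ADDRESS_BIT and PIN_OFFSET_MASK, not taken from this patch), and the names `HIGH_ADDRESS_BIT`, `OFFSET_MASK`, `mask_then_canonicalize()`, `canonicalize_then_mask()` and `check_orders_agree()` are hypothetical. It relies on GCC/Clang arithmetic right shift for signed values.

```c
#include <assert.h>
#include <stdint.h>

#define HIGH_ADDRESS_BIT 47               /* assumption: GEN8_HIGH_ADDRESS_BIT */
#define OFFSET_MASK (~UINT64_C(0xfff))    /* assumption: mask clears the low 12 bits */

/* Sign-extend value from bit index, same shape as the kernel helper. */
static inline int64_t sign_extend64(uint64_t value, int index)
{
	uint8_t shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

static inline uint64_t canonical_addr(uint64_t address)
{
	return (uint64_t)sign_extend64(address, HIGH_ADDRESS_BIT);
}

/* Order before the patch: mask first, then sign-extend. */
static uint64_t mask_then_canonicalize(uint64_t offset)
{
	return canonical_addr(offset & OFFSET_MASK);
}

/* Order after the patch: sign-extend first, then mask. */
static uint64_t canonicalize_then_mask(uint64_t offset)
{
	return canonical_addr(offset) & OFFSET_MASK;
}

/* Both orders agree because the mask never touches bit 47 or anything above it. */
static void check_orders_agree(void)
{
	const uint64_t samples[] = {
		UINT64_C(0x0000800000001fff),  /* bit 47 set, low flag bits set */
		UINT64_C(0x00007fffffffffff),  /* bit 47 clear */
		UINT64_C(0x1234000000002345),  /* garbage above bit 47 */
	};

	for (unsigned int i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		assert(mask_then_canonicalize(samples[i]) ==
		       canonicalize_then_mask(samples[i]));
}
```

Under those assumptions the two orders produce identical results, since the mask only affects bits below the sign bit while the sign extension only rewrites bits above it.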