[00/11,v3] x86: load FPU registers on return to userland

Message ID 20181004140547.13014-1-bigeasy@linutronix.de (mailing list archive)

Sebastian Andrzej Siewior Oct. 4, 2018, 2:05 p.m. UTC
This is a refurbished series originally started by Rik van Riel. The
goal is to load the FPU registers on return to userland and not on every
context switch. With this optimisation we can:
- avoid loading the registers if the task stays in kernel and does
  not return to userland
- make kernel_fpu_begin() cheaper: it only saves the registers on the
  first invocation. The second invocation does not need to save them again.
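
A toy userland model of the two points above, as a rough sketch only. All
names here (struct model_task, model_switch_to() and friends) are made up
for illustration and are not the functions touched by the series; the
load_fpu field merely stands in for TIF_LOAD_FPU.

```c
#include <stdbool.h>

/* Hypothetical per-task bookkeeping, not a kernel structure. */
struct model_task {
	bool load_fpu;	/* stands in for TIF_LOAD_FPU */
	int fpu_loads;	/* how often the registers were restored */
	int fpu_saves;	/* how often the registers were saved */
};

/* Context switch to a task: only mark its registers as stale. */
static void model_switch_to(struct model_task *t)
{
	t->load_fpu = true;
}

/* Return to userland: restore the registers only if they are stale. */
static void model_return_to_user(struct model_task *t)
{
	if (t->load_fpu) {
		t->fpu_loads++;
		t->load_fpu = false;
	}
}

/*
 * Model of kernel_fpu_begin(): save the user registers only while they
 * are still live in the CPU. A second call finds the flag set and
 * skips the save.
 */
static void model_kernel_fpu_begin(struct model_task *t)
{
	if (!t->load_fpu) {
		t->fpu_saves++;
		t->load_fpu = true;
	}
}
```

In this model, three switches followed by one return to userland cost a
single restore, and two back-to-back model_kernel_fpu_begin() calls cost a
single save.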

To access the FPU registers in the kernel we need to:
- disable preemption so that the scheduler cannot switch tasks. A task
  switch would set TIF_LOAD_FPU and the FPU registers would no longer be
  valid.
- disable BH because a softirq might use kernel_fpu_begin() and then
  set TIF_LOAD_FPU instead of loading the FPU registers on completion.

v1…v3:
v2 was never posted. I followed the idea to completely decouple PKRU
from xstate. This didn't quite work and made a few things complicated.
One obvious required fixup is copy_fpstate_to_sigframe() where the PKRU
state needs to be fiddled into xstate. This required another
xfeatures_mask so that the sanity checks were performed and
xstate_offsets would be computed. Additionally ptrace also reads/sets
xstate in order to get/set the registers, and PKRU is one of them. So this
would need some fiddling, too.
In v3 I dropped that decouple idea. I also learned that the wrpkru
instruction is not privileged and so caching it in kernel does not work.
Instead I keep PKRU in xstate area and load it at context switch time
while the remaining registers are deferred (until return to userland).
The offset of PKRU within xstate is enumerated at boot time, so why not
use it?
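
A toy userland model of the v3 split described above, again only a sketch.
The structures and function names are hypothetical, and a single xmm0
field stands in for all the deferred registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical saved state of a task: PKRU lives next to the rest. */
struct model_xstate {
	uint32_t pkru;
	uint64_t xmm0;
};

/* Hypothetical view of what is currently loaded in the CPU. */
struct model_cpu {
	uint32_t pkru;
	uint64_t xmm0;
};

struct model_task {
	struct model_xstate xs;
	bool load_fpu;	/* stands in for TIF_LOAD_FPU */
};

/* Context switch: PKRU is loaded right away, the rest is deferred. */
static void model_context_switch(struct model_cpu *cpu,
				 struct model_task *next)
{
	cpu->pkru = next->xs.pkru;
	next->load_fpu = true;
}

/* Return to userland: load the deferred registers if still pending. */
static void model_return_to_user(struct model_cpu *cpu,
				 struct model_task *t)
{
	if (t->load_fpu) {
		cpu->xmm0 = t->xs.xmm0;
		t->load_fpu = false;
	}
}
```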

This seems to work with my in-kernel test case and a userland test case
which use xmm registers. The pkey feature was tested in
non-KVM-accelerated qemu and it seems to work, too.

Sebastian

Comments

Rik van Riel Oct. 4, 2018, 4:45 p.m. UTC | #1
On Thu, 2018-10-04 at 16:05 +0200, Sebastian Andrzej Siewior wrote:


> In v3 I dropped that decouple idea. I also learned that the wrpkru
> instruction is not privileged and so caching it in kernel does not
> work.

Wait, so any thread can bypass its memory protection
keys, even if there is a seccomp filter preventing
it from calling the PKRU syscalls?

Is that intended?

Is that simply a hardware limitation, or something
where we can set a flag somewhere to force tasks to
go through the kernel?

Andy Lutomirski Oct. 4, 2018, 4:50 p.m. UTC | #2
> On Oct 4, 2018, at 9:45 AM, Rik van Riel <riel@surriel.com> wrote:
> 
> On Thu, 2018-10-04 at 16:05 +0200, Sebastian Andrzej Siewior wrote:
> 
> 
>> In v3 I dropped that decouple idea. I also learned that the wrpkru
>> instruction is not privileged and so caching it in kernel does not
>> work.
> 
> Wait, so any thread can bypass its memory protection
> keys, even if there is a seccomp filter preventing
> it from calling the PKRU syscalls?
> 
> Is that intended?
> 
> Is that simply a hardware limitation, or something
> where we can set a flag somewhere to force tasks to
> go through the kernel?
> 
> 

Hardware limitation.

Sebastian Andrzej Siewior Oct. 5, 2018, 11:55 a.m. UTC | #3
On 2018-10-04 12:45:08 [-0400], Rik van Riel wrote:
> Wait, so any thread can bypass its memory protection
> keys, even if there is a seccomp filter preventing
> it from calling the PKRU syscalls?

We have SYS_pkey_alloc, SYS_pkey_free and SYS_pkey_mprotect. For read/write
of the register value, libc uses the rdpkru and wrpkru opcodes.

> Is that intended?

Either that or it ended like that because someone failed to attend a
meeting where this was discussed. Here is something from pkeys(7):

| Protection keys have the potential to add a layer of security and
| reliability to applications.  But they have not been primarily designed as a
| security feature.  For instance, WRPKRU is a completely unprivileged
| instruction, so pkeys are useless in any case that an attacker controls the
| PKRU register or can execute arbitrary instructions.
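
To make the "no syscall needed" point concrete, here is a small sketch of
the bit arithmetic behind a PKRU update. PKRU holds two bits per key: bit
2*key disables access, bit 2*key+1 disables writes. The helper names and
the MODEL_* constants are made up for this example (the constants mirror
the values of PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE). The sketch only
computes the new value; a real task would read the old value with rdpkru
and write the result with wrpkru, for instance via glibc's pkey_set().

```c
#include <stdint.h>

#define MODEL_PKEY_DISABLE_ACCESS	0x1u
#define MODEL_PKEY_DISABLE_WRITE	0x2u

/* Return the two rights bits that pkru holds for key (0..15). */
static uint32_t model_pkru_rights(uint32_t pkru, unsigned int key)
{
	return (pkru >> (2 * (key & 15))) & 3u;
}

/* Return pkru with the rights bits of key replaced by rights. */
static uint32_t model_pkru_with_rights(uint32_t pkru, unsigned int key,
				       uint32_t rights)
{
	unsigned int shift = 2 * (key & 15);

	pkru &= ~(UINT32_C(3) << shift);
	pkru |= (rights & 3u) << shift;
	return pkru;
}
```

Nothing in this computation involves the kernel, which is why a seccomp
filter on the pkey syscalls does not stop a task from changing its own
rights.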

Sebastian