diff mbox series

[RFC] x86/fpu: Don't unconditionally add XFEATURE_MASK_FPSSE on sigentry

Message ID 20190425173545.sogxyptbaqvoofm6@linutronix.de (mailing list archive)
State New, archived
Headers show
Series [RFC] x86/fpu: Don't unconditionally add XFEATURE_MASK_FPSSE on sigentry | expand

Commit Message

Sebastian Andrzej Siewior April 25, 2019, 5:35 p.m. UTC
This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits
in the xsave header in the user sigcontext"). The commit claims that it
is required for legacy applications but fails to explain why this is
needed and it is not obvious to me why the application would require the
FP/SSE state in the signal handler.

In the compacted form, the XSAVES may save only XMM+SSE but skip FP.
This is denoted by header->xfeatures = 6. The fastpath
(copy_fpregs_to_sigframe()) does that but _also_ initialises the FP
state (cwd to 0x37f, mxcsr as we do, remaining fields to 0).
The slowpath (copy_xstate_to_user()) leaves most of the FP untouched.
Only mxcsr and mxcsr_flags are set due to xfeatures_mxcsr_quirk().
Now that XFEATURE_MASK_FP is set unconditionally we load on return from
signal random garbage as the FP state.
This results in SIGFPE on one particular machine (the same testcase
(starting a java application) on older generations seems to work).

One way to fix this is not to unconditionally set XFEATURE_MASK_FPSSE
since it was never saved. This avoids loading garbage on the return
path.
We could also initialise the FP state in the xfeatures_mxcsr_quirk()
case like xsave does.

Reported-by: Kurt Kanzenbach <kurt.kanzenbach@linutronix.de>
Fixes: 69277c98f5eef ("x86/fpu: Always store the registers in copy_fpstate_to_sigframe()")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 arch/x86/kernel/fpu/signal.c | 23 -----------------------
 1 file changed, 23 deletions(-)

Comments

Dave Hansen April 25, 2019, 9:13 p.m. UTC | #1
On 4/25/19 10:35 AM, Sebastian Andrzej Siewior wrote:
> This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits
> in the xsave header in the user sigcontext"). The commit claims that it
> is required for legacy applications but fails to explain why this is
> needed and it is not obvious to me why the application would require the
> FP/SSE state in the signal handler.

Any software that understands XSAVE is OK.  I think the legacy software
would be that which groks 'fxregs_state, and FXSAVE/FXRSTOR but does not
comprehend XSAVE/XRSTOR.  *That* software might change fxregs_state in
the signal frame, but the lack of XFEATURE_MASK_FPSSE in xfeatures would
prevent XRSTOR from restoring it.

That's just a guess, though.

If we care, I think we should just use XSAVE instead of XSAVEOPT and
trying to reconstruct the init state in software.
Sebastian Andrzej Siewior April 26, 2019, 7:26 a.m. UTC | #2
On 2019-04-25 14:13:05 [-0700], Dave Hansen wrote:
> On 4/25/19 10:35 AM, Sebastian Andrzej Siewior wrote:
> > This commit reverts commit 04944b793e18e ("x86: xsave: set FP, SSE bits
> > in the xsave header in the user sigcontext"). The commit claims that it
> > is required for legacy applications but fails to explain why this is
> > needed and it is not obvious to me why the application would require the
> > FP/SSE state in the signal handler.
> 
> Any software that understands XSAVE is OK.  I think the legacy software
> would be that which groks 'fxregs_state, and FXSAVE/FXRSTOR but does not
> comprehend XSAVE/XRSTOR.  *That* software might change fxregs_state in
> the signal frame, but the lack of XFEATURE_MASK_FPSSE in xfeatures would
> prevent XRSTOR from restoring it.

but it would edit its FP state before it has been used.

> That's just a guess, though.
> 
> If we care, I think we should just use XSAVE instead of XSAVEOPT and
> trying to reconstruct the init state in software.

We can't use XSAVE directly in the slowpath. We need to reconstruct the
init state. We have the mxcsr quirk. We would need just to extend it and
set the FP area to init state if the FP state is missing like we do in
fpstate_sanitize_xstate().

Sebastian
Dave Hansen April 26, 2019, 4:33 p.m. UTC | #3
On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote:
>> That's just a guess, though.
>>
>> If we care, I think we should just use XSAVE instead of XSAVEOPT and
>> trying to reconstruct the init state in software.
> We can't use XSAVE directly in the slowpath. We need to reconstruct the
> init state. We have the mxcsr quirk. We would need just to extend it and
> set the FP area to init state if the FP state is missing like we do in
> fpstate_sanitize_xstate().

Can you remind me why we can't use XSAVE directly in the slow path?
Sebastian Andrzej Siewior April 26, 2019, 4:50 p.m. UTC | #4
On 2019-04-26 09:33:28 [-0700], Dave Hansen wrote:
> On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote:
> >> That's just a guess, though.
> >>
> >> If we care, I think we should just use XSAVE instead of XSAVEOPT and
> >> trying to reconstruct the init state in software.
> > We can't use XSAVE directly in the slowpath. We need to reconstruct the
> > init state. We have the mxcsr quirk. We would need just to extend it and
> > set the FP area to init state if the FP state is missing like we do in
> > fpstate_sanitize_xstate().
> 
> Can you remind me why we can't use XSAVE directly in the slow path?

Where to?
In the fastpath we XSAVE directly to task's stack. We are in the
slowpath because this failed. Task's FPU-state is using compacted form.
So we use this as source and copy_to_user() to task's stack.
I don't think we can XSAVE to task's FPU-state because the compacted
form may need less memory than the non-compacted form.

Currently I'm leaning towards cleaning the FP area so we behave like
XSAVE does. Independently of that, I would like to revert that commit.
Based on the comment and patch description it does not say that it fixes
a real problem. It *may* fix something.

Sebastian
Dave Hansen April 26, 2019, 7:04 p.m. UTC | #5
On 4/26/19 9:50 AM, Sebastian Andrzej Siewior wrote:
> On 2019-04-26 09:33:28 [-0700], Dave Hansen wrote:
>> On 4/26/19 12:26 AM, Sebastian Andrzej Siewior wrote:
>>>> That's just a guess, though.
>>>>
>>>> If we care, I think we should just use XSAVE instead of XSAVEOPT and
>>>> trying to reconstruct the init state in software.
>>> We can't use XSAVE directly in the slowpath. We need to reconstruct the
>>> init state. We have the mxcsr quirk. We would need just to extend it and
>>> set the FP area to init state if the FP state is missing like we do in
>>> fpstate_sanitize_xstate().
>>
>> Can you remind me why we can't use XSAVE directly in the slow path?
> 
> Where to? In the fastpath we XSAVE directly to task's stack. We are
> in the slowpath because this failed.
I'm looking at the code and having a bit of a hard time connecting it to
what you're saying here.

We are in copy_fpstate_to_sigframe(), right?  Let's assume we are
using_compacted_format().  If copy_fpregs_to_sigframe() fails to copy,
we return immediately and never get to the save_xstate_epilog() code in
question.

So, to even get to save_xstate_epilog(), we had to do a *successful*
copy_fpstate_to_sigframe() which, on XSAVE systems will use
copy_xregs_to_user() which already uses plain XSAVE (not XSAVEOPT).

save_xstate_epilog() goes and tries to set XFEATURE_MASK_FPSSE on this
XSAVE (literally the XSAVE instruction) generated header.

But, if we're dealing with header.xfeatures generated by an XSAVE with
the RFBM=-1, I don't understand how header.xfeatures could ever *not*
have XFEATURE_MASK_FPSSE set.  The only way would be if we had gotten
here after using FXSAVE, but I believe that's impossible since those
systems have use_xsave()==0.

IOW, I think the patch is right, but I'm not sure I totally agree with
the reasoning.
Sebastian Andrzej Siewior April 26, 2019, 8:05 p.m. UTC | #6
On 2019-04-26 12:04:55 [-0700], Dave Hansen wrote:
> I'm looking at the code and having a bit of a hard time connecting it to
> what you're saying here.
> 
> We are in copy_fpstate_to_sigframe(), right?  Let's assume we are
> using_compacted_format().  If copy_fpregs_to_sigframe() fails to copy,
> we return immediately and never get to the save_xstate_epilog() code in
> question.
> 
> So, to even get to save_xstate_epilog(), we had to do a *successful*
> copy_fpstate_to_sigframe() which, on XSAVE systems will use
> copy_xregs_to_user() which already uses plain XSAVE (not XSAVEOPT).

We are in:

copy_fpstate_to_sigframe()
 |
 copy_fpregs_to_sigframe() fails.
 |
 using_compacted_format()
  |
  copy_xstate_to_user()
   |
   __copy_xstate_to_user() the xstate_header
   |
   for (i = 0; i < XFEATURE_MAX; i++)
    copy only XFEATURE_SSE + XFEATURE_YMM, other aren't set.
   |
   xfeatures_mxcsr_quirk() true, we copy
   |
   __copy_xstate_to_user() the xstate_fx_sw_bytes

Now. The FP part of the xsave has never been touched which means the
bytes are not initialized. The are is filled with random conten on user
stack. The saved xfeatures say only (XFEATURE_SSE | XFEATURE_YMM) so a
XRESTOR of that gives us exactly what we stored.

> save_xstate_epilog() goes and tries to set XFEATURE_MASK_FPSSE on this
> XSAVE (literally the XSAVE instruction) generated header.

Correct. But we never saved the "data" for XFEATURE_MASK_FP and, as noted
above, the content is random garbage on the stack.
Be setting XFEATURE_MASK_FPSSE we claim that suddenly that the FP part
of the XSAVE area is valid which is not the case.

> But, if we're dealing with header.xfeatures generated by an XSAVE with
> the RFBM=-1, I don't understand how header.xfeatures could ever *not*
> have XFEATURE_MASK_FPSSE set.  The only way would be if we had gotten
> here after using FXSAVE, but I believe that's impossible since those
> systems have use_xsave()==0.

Look at the code path above. XSAVES stored the state in task's
&fpu->state.xsave using XSAVES and we copied the *saved* bits so it
looks like XSAVE. XFEATURE_MASK_FP was not set, so we skipped that area.

> IOW, I think the patch is right, but I'm not sure I totally agree with
> the reasoning.

As I described in the path description: XSAVE (the fastpath, the code
before I started touching it) would also not set XFEATURE_MASK_FP in
xfeatures *but* would set FP area to its ini-state. Therefore the
unconditional OR of XFEATURE_MASK_FPSSE does not hurt because the FP
area was initialized to its initstate. The following restore would load
a valid FP state.

Sebastian
Dave Hansen April 26, 2019, 8:44 p.m. UTC | #7
On 4/26/19 1:05 PM, Sebastian Andrzej Siewior wrote:
> 
> copy_fpstate_to_sigframe()
>  |
>  copy_fpregs_to_sigframe() fails.
>  |
>  using_compacted_format()

Aw, crud.  I was looking at Linus's tree.  Sorry about that.  In Linus's
tree, if copy_fpregs_to_sigframe() fails, we just return from
copy_fpstate_to_sigframe() immediately.  In other words, either XSAVE
works, and we don't have this issue, or XSAVE fails and we don't do the
fixup code in question.

I'll say it again, though: I think we should be using the XSAVE
instruction to save this state.  If we need to map some pages in order
to make XSAVE work, then I think we should bring some pages in and we
can do *that* in the slow path.  We can even do that without taking page
faults by just calling get_user_pages() for instance.

Or, the slow path could be to fpregs_unlock(), zero the user area
(taking and handling page faults), then fpregs_lock() and retry the
copy_fpregs_to_sigframe().
diff mbox series

Patch

diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
index 7026f1c4e5e30..6c66203b19f3c 100644
--- a/arch/x86/kernel/fpu/signal.c
+++ b/arch/x86/kernel/fpu/signal.c
@@ -81,7 +81,6 @@  static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
 {
 	struct xregs_state __user *x = buf;
 	struct _fpx_sw_bytes *sw_bytes;
-	u32 xfeatures;
 	int err;
 
 	/* Setup the bytes not touched by the [f]xsave and reserved for SW. */
@@ -93,28 +92,6 @@  static inline int save_xstate_epilog(void __user *buf, int ia32_frame)
 
 	err |= __put_user(FP_XSTATE_MAGIC2,
 			  (__u32 __user *)(buf + fpu_user_xstate_size));
-
-	/*
-	 * Read the xfeatures which we copied (directly from the cpu or
-	 * from the state in task struct) to the user buffers.
-	 */
-	err |= __get_user(xfeatures, (__u32 __user *)&x->header.xfeatures);
-
-	/*
-	 * For legacy compatible, we always set FP/SSE bits in the bit
-	 * vector while saving the state to the user context. This will
-	 * enable us capturing any changes(during sigreturn) to
-	 * the FP/SSE bits by the legacy applications which don't touch
-	 * xfeatures in the xsave header.
-	 *
-	 * xsave aware apps can change the xfeatures in the xsave
-	 * header as well as change any contents in the memory layout.
-	 * xrestore as part of sigreturn will capture all the changes.
-	 */
-	xfeatures |= XFEATURE_MASK_FPSSE;
-
-	err |= __put_user(xfeatures, (__u32 __user *)&x->header.xfeatures);
-
 	return err;
 }