diff mbox series

arm64/signal: Don't assume that TIF_SVE means we saved SVE state

Message ID 20240119-arm64-sve-signal-regs-v1-1-b9fd61b0289a@kernel.org (mailing list archive)
State New, archived
Headers show
Series arm64/signal: Don't assume that TIF_SVE means we saved SVE state | expand

Commit Message

Mark Brown Jan. 19, 2024, 12:29 p.m. UTC
When we are in a syscall we will only save the FPSIMD subset even though
the task still has access to the full register set, and on context switch
we will only remove TIF_SVE when loading the register state. This means
that the signal handling code should not assume that TIF_SVE means that
the register state is stored in SVE format, it should instead check the
format that was recorded during save.

Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc:  <stable@vger.kernel.org>
---
 arch/arm64/kernel/fpsimd.c | 2 +-
 arch/arm64/kernel/signal.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)


---
base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
change-id: 20240118-arm64-sve-signal-regs-5711e0d10425

Best regards,

Comments

Dave Martin Jan. 19, 2024, 4:31 p.m. UTC | #1
On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:
> When we are in a syscall we will only save the FPSIMD subset even though
> the task still has access to the full register set, and on context switch

(Pedantic nit: "A even if B" (= "A applies even in that subset of cases
where B"), instead of "A even though B" (= "A applies notwithstanding
that it is always the case that B") (?)  If the SVE trapping were
ripped out altogether, it would be a different and rather simpler
story...)

> we will only remove TIF_SVE when loading the register state. This means
> that the signal handling code should not assume that TIF_SVE means that
> the register state is stored in SVE format, it should instead check the
> format that was recorded during save.
> 
> Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc:  <stable@vger.kernel.org>
> ---
>  arch/arm64/kernel/fpsimd.c | 2 +-
>  arch/arm64/kernel/signal.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 1559c706d32d..80133c190136 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1626,7 +1626,7 @@ void fpsimd_preserve_current_state(void)
>  void fpsimd_signal_preserve_current_state(void)
>  {
>  	fpsimd_preserve_current_state();
> -	if (test_thread_flag(TIF_SVE))
> +	if (current->thread.fp_type == FP_STATE_SVE)
>  		sve_to_fpsimd(current);
>  }
>  
> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 0e8beb3349ea..425b1bc17a3f 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -242,7 +242,7 @@ static int preserve_sve_context(struct sve_context __user *ctx)
>  		vl = task_get_sme_vl(current);
>  		vq = sve_vq_from_vl(vl);
>  		flags |= SVE_SIG_FLAG_SM;
> -	} else if (test_thread_flag(TIF_SVE)) {
> +	} else if (current->thread.fp_type == FP_STATE_SVE) {
>  		vq = sve_vq_from_vl(vl);
>  	}
>  
> @@ -878,7 +878,7 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>  	if (system_supports_sve() || system_supports_sme()) {
>  		unsigned int vq = 0;
>  
> -		if (add_all || test_thread_flag(TIF_SVE) ||
> +		if (add_all || current->thread.fp_type == FP_STATE_SVE ||
>  		    thread_sm_enabled(&current->thread)) {
>  			int vl = max(sve_max_vl(), sme_max_vl());
>  
> 
> ---
> base-commit: 0dd3ee31125508cd67f7e7172247f05b7fd1753a
> change-id: 20240118-arm64-sve-signal-regs-5711e0d10425
> 
> Best regards,
> -- 
> Mark Brown <broonie@kernel.org>
> 

[...]

If the historical meanings of TIF_SVE have been split up (which seems a
good idea), does that resolve all of the "bare"
test_thread_flag(TIF_SVE) that were still there?

If there are any others remaining, they may need looking at if there is
any question over what condition they are trying to test for...

Cheers
---Dave
Mark Brown Jan. 19, 2024, 5:47 p.m. UTC | #2
On Fri, Jan 19, 2024 at 04:31:13PM +0000, Dave Martin wrote:
> On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:

> > When we are in a syscall we will only save the FPSIMD subset even though
> > the task still has access to the full register set, and on context switch

> (Pedantic nit: "A even if B" (= "A applies even in that subset of cases
> where B"), instead of "A even though B" (= "A applies notwithstanding
> that it is always the case that B") (?)  If the SVE trapping were
> ripped out altogether, it would be a different and rather simpler
> story...)

I really can't follow what you're trying to say here.  I'm not sure I
where the bit about "always" comes from here?

> If the historical meanings of TIF_SVE have been split up (which seems a
> good idea), does that resolve all of the "bare"
> test_thread_flag(TIF_SVE) that were still there?

There's a couple more, but this is all of them in the signal handling
code - I should have one or two more patches.  Most of the usage is
actually checking the trapping and therefore fine.
Dave Martin Jan. 23, 2024, 3:15 p.m. UTC | #3
On Fri, Jan 19, 2024 at 05:47:55PM +0000, Mark Brown wrote:
> On Fri, Jan 19, 2024 at 04:31:13PM +0000, Dave Martin wrote:
> > On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:
> 
> > > When we are in a syscall we will only save the FPSIMD subset even though
> > > the task still has access to the full register set, and on context switch
> 
> > (Pedantic nit: "A even if B" (= "A applies even in that subset of cases
> > where B"), instead of "A even though B" (= "A applies notwithstanding
> > that it is always the case that B") (?)  If the SVE trapping were
> > ripped out altogether, it would be a different and rather simpler
> > story...)
> 
> I really can't follow what you're trying to say here.  I'm not sure I
> where the bit about "always" comes from here?

The sentence seemed to me to be lacking some context, but I still
haven't fully familiarised myself with the changes to the code...

If it's what you intended to write and nobody else is confused, it's
probably good.


> > If the historical meanings of TIF_SVE have been split up (which seems a
> > good idea), does that resolve all of the "bare"
> > test_thread_flag(TIF_SVE) that were still there?
> 
> There's a couple more, but this is all of them in the signal handling
> code - I should have one or two more patches.  Most of the usage is
> actually checking the trapping and therefore fine.

I see, I guess this area needs keeping an eye on generally, but if there
are no more cases considered urgent then I guess that's fine for now
(modulo other patches in flight).

Cheers
---Dave
Mark Brown Jan. 23, 2024, 6:55 p.m. UTC | #4
On Tue, Jan 23, 2024 at 03:15:27PM +0000, Dave Martin wrote:
> On Fri, Jan 19, 2024 at 05:47:55PM +0000, Mark Brown wrote:

> > There's a couple more, but this is all of them in the signal handling
> > code - I should have one or two more patches.  Most of the usage is
> > actually checking the trapping and therefore fine.

> I see, I guess this area needs keeping an eye on generally, but if there
> are no more cases considered urgent then I guess that's fine for now
> (modulo other patches in flight).

Yeah, it needs keeping an eye on going forwards and there should be at
least one more patch to come probably this week but orthogonal to this
one in content (similar changes in different parts of the code).
Will Deacon Jan. 30, 2024, 11:51 a.m. UTC | #5
On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:
> When we are in a syscall we will only save the FPSIMD subset even though
> the task still has access to the full register set, and on context switch
> we will only remove TIF_SVE when loading the register state. This means
> that the signal handling code should not assume that TIF_SVE means that
> the register state is stored in SVE format, it should instead check the
> format that was recorded during save.
> 
> Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch")
> Signed-off-by: Mark Brown <broonie@kernel.org>
> Cc:  <stable@vger.kernel.org>
> ---
>  arch/arm64/kernel/fpsimd.c | 2 +-
>  arch/arm64/kernel/signal.c | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index 1559c706d32d..80133c190136 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -1626,7 +1626,7 @@ void fpsimd_preserve_current_state(void)
>  void fpsimd_signal_preserve_current_state(void)
>  {
>  	fpsimd_preserve_current_state();
> -	if (test_thread_flag(TIF_SVE))
> +	if (current->thread.fp_type == FP_STATE_SVE)
>  		sve_to_fpsimd(current);
>  }

I don't think this hunk applies on -rc2 ^^.

> diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
> index 0e8beb3349ea..425b1bc17a3f 100644
> --- a/arch/arm64/kernel/signal.c
> +++ b/arch/arm64/kernel/signal.c
> @@ -242,7 +242,7 @@ static int preserve_sve_context(struct sve_context __user *ctx)
>  		vl = task_get_sme_vl(current);
>  		vq = sve_vq_from_vl(vl);
>  		flags |= SVE_SIG_FLAG_SM;
> -	} else if (test_thread_flag(TIF_SVE)) {
> +	} else if (current->thread.fp_type == FP_STATE_SVE) {
>  		vq = sve_vq_from_vl(vl);
>  	}
>  
> @@ -878,7 +878,7 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
>  	if (system_supports_sve() || system_supports_sme()) {
>  		unsigned int vq = 0;
>  
> -		if (add_all || test_thread_flag(TIF_SVE) ||
> +		if (add_all || current->thread.fp_type == FP_STATE_SVE ||
>  		    thread_sm_enabled(&current->thread)) {
>  			int vl = max(sve_max_vl(), sme_max_vl());

I think this code is preemptible, so I'm struggling to understand what
happens if the fp_type changes under our feet as a result of a context
switch.

Will
Mark Brown Jan. 30, 2024, 2:09 p.m. UTC | #6
On Tue, Jan 30, 2024 at 11:51:07AM +0000, Will Deacon wrote:
> On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:

> > -	if (test_thread_flag(TIF_SVE))
> > +	if (current->thread.fp_type == FP_STATE_SVE)
> >  		sve_to_fpsimd(current);
> >  }

> I don't think this hunk applies on -rc2 ^^.

Hrm, git seemed to figure out a rebase with no intervention - I've
thrown it at my CI and will resend assuming no changes from the rest of
the discussion.

> > -		if (add_all || test_thread_flag(TIF_SVE) ||
> > +		if (add_all || current->thread.fp_type == FP_STATE_SVE ||
> >  		    thread_sm_enabled(&current->thread)) {
> >  			int vl = max(sve_max_vl(), sme_max_vl());

> I think this code is preemptible, so I'm struggling to understand what
> happens if the fp_type changes under our feet as a result of a context
> switch.

We are relying here on having forced a flush of the floating point
register state prior to this code running, simple preemption won't
change the state from what was already saved.  The same consideration
also applies to the check for streaming mode here.

That said if this is preempted ptrace *could* come in and rewrite the
data or at worst change the vector length (which could leave us with
sve_state deallocated or a different size, possibly while we're in the
middle of accessing it).  This could also happen with the existing check
for TIF_SVE so I don't think there's anything new here - AFAICT this has
always been an issue with the vector code, unless I'm missing some
bigger thing which excludes ptrace.  I think any change that's needed
there won't overlap with this one, I'm looking.
Dave Martin Jan. 30, 2024, 2:44 p.m. UTC | #7
On Tue, Jan 30, 2024 at 02:09:34PM +0000, Mark Brown wrote:
> On Tue, Jan 30, 2024 at 11:51:07AM +0000, Will Deacon wrote:
> > On Fri, Jan 19, 2024 at 12:29:13PM +0000, Mark Brown wrote:
> 
> > > -	if (test_thread_flag(TIF_SVE))
> > > +	if (current->thread.fp_type == FP_STATE_SVE)
> > >  		sve_to_fpsimd(current);
> > >  }
> 
> > I don't think this hunk applies on -rc2 ^^.
> 
> Hrm, git seemed to figure out a rebase with no intervention - I've
> thrown it at my CI and will resend assuming no changes from the rest of
> the discussion.
> 
> > > -		if (add_all || test_thread_flag(TIF_SVE) ||
> > > +		if (add_all || current->thread.fp_type == FP_STATE_SVE ||
> > >  		    thread_sm_enabled(&current->thread)) {
> > >  			int vl = max(sve_max_vl(), sme_max_vl());
> 
> > I think this code is preemptible, so I'm struggling to understand what
> > happens if the fp_type changes under our feet as a result of a context
> > switch.
> 
> We are relying here on having forced a flush of the floating point
> register state prior to this code running, simple preemption won't
> change the state from what was already saved.  The same consideration
> also applies to the check for streaming mode here.
> 
> That said if this is preempted ptrace *could* come in and rewrite the
> data or at worst change the vector length (which could leave us with
> sve_state deallocated or a different size, possibly while we're in the
> middle of accessing it).  This could also happen with the existing check
> for TIF_SVE so I don't think there's anything new here - AFAICT this has
> always been an issue with the vector code, unless I'm missing some
> bigger thing which excludes ptrace.  I think any change that's needed
> there won't overlap with this one, I'm looking.

I'm pretty sure that terrible things will happen treewide if ptrace can
ever access or manipulate the internal state of a _running_ task.

I think the logic is that any ptrace call that can access or manipulate
the state of a task is gated on the task being ptrace-stopped.  Once we
have committed to deliveing a signal, we have obviously run past the
opportunity to stop (and hence be ptraced) on that signal.

Cases where a multiple signals are delivered before acutally reaching
userspace might want some thought.

I haven't tracked down the smokeproof gun in the code yet, though.


From memory, I think that the above forced flush was there to protect
against the context switch code rather than ptrace, and guarantees that
any change that ctxsw _might_ spontaneously make to the task state has
already been done and dusted before we do the actual signal delivery.
This may be a red herring so far as ptrace hazards are concerned.

Cheers
---Dave
Mark Brown Jan. 30, 2024, 2:53 p.m. UTC | #8
On Tue, Jan 30, 2024 at 02:44:51PM +0000, Dave Martin wrote:
> On Tue, Jan 30, 2024 at 02:09:34PM +0000, Mark Brown wrote:

> > That said if this is preempted ptrace *could* come in and rewrite the
> > data or at worst change the vector length (which could leave us with
> > sve_state deallocated or a different size, possibly while we're in the
> > middle of accessing it).  This could also happen with the existing check
> > for TIF_SVE so I don't think there's anything new here - AFAICT this has
> > always been an issue with the vector code, unless I'm missing some
> > bigger thing which excludes ptrace.  I think any change that's needed
> > there won't overlap with this one, I'm looking.

> I'm pretty sure that terrible things will happen treewide if ptrace can
> ever access or manipulate the internal state of a _running_ task.

> I think the logic is that any ptrace call that can access or manipulate
> the state of a task is gated on the task being ptrace-stopped.  Once we

...

> I haven't tracked down the smokeproof gun in the code yet, though.

Yes, exactly - this feels like something that surely must be handled
already with exclusion along the lines that you're describing but I
didn't yet spot exactly what the mechanism is.

> From memory, I think that the above forced flush was there to protect
> against the context switch code rather than ptrace, and guarantees that
> any change that ctxsw _might_ spontaneously make to the task state has
> already been done and dusted before we do the actual signal delivery.
> This may be a red herring so far as ptrace hazards are concerned.

Indeed, it's all about the current task and won't help at for ptrace.
Mark Brown Jan. 30, 2024, 3:42 p.m. UTC | #9
On Tue, Jan 30, 2024 at 02:44:51PM +0000, Dave Martin wrote:

> I think the logic is that any ptrace call that can access or manipulate
> the state of a task is gated on the task being ptrace-stopped.  Once we
> have committed to deliveing a signal, we have obviously run past the
> opportunity to stop (and hence be ptraced) on that signal.

This seems to be all there, the core ptrace and the signal handling code
talk to each other and ensure that we won't try to rewrite the state in
the middle of signal handling so we should be safe here.
Dave Martin Jan. 30, 2024, 3:47 p.m. UTC | #10
On Tue, Jan 30, 2024 at 02:53:45PM +0000, Mark Brown wrote:
> On Tue, Jan 30, 2024 at 02:44:51PM +0000, Dave Martin wrote:
> > On Tue, Jan 30, 2024 at 02:09:34PM +0000, Mark Brown wrote:
> 
> > > That said if this is preempted ptrace *could* come in and rewrite the
> > > data or at worst change the vector length (which could leave us with
> > > sve_state deallocated or a different size, possibly while we're in the
> > > middle of accessing it).  This could also happen with the existing check
> > > for TIF_SVE so I don't think there's anything new here - AFAICT this has
> > > always been an issue with the vector code, unless I'm missing some
> > > bigger thing which excludes ptrace.  I think any change that's needed
> > > there won't overlap with this one, I'm looking.
> 
> > I'm pretty sure that terrible things will happen treewide if ptrace can
> > ever access or manipulate the internal state of a _running_ task.
> 
> > I think the logic is that any ptrace call that can access or manipulate
> > the state of a task is gated on the task being ptrace-stopped.  Once we
> 
> ...
> 
> > I haven't tracked down the smokeproof gun in the code yet, though.
> 
> Yes, exactly - this feels like something that surely must be handled
> already with exclusion along the lines that you're describing but I
> didn't yet spot exactly what the mechanism is.

I think the critical thing is the task_is_traced() check in
kernel/ptrace.c.  This seems to be what gates every ptrace call that
requires a traced task (i.e., stopped under ptrace).

So long as nothing during the delivery of a single signal can trigger a
ptrace-stop itself, I think ptrace can't get in the middle of it.  This
would imply calling the signal delivery code recursively (not just
raising one signal while delivering another).  I'd be pretty confident
that this is meant to be impossible from a design standpoint.


> > From memory, I think that the above forced flush was there to protect
> > against the context switch code rather than ptrace, and guarantees that
> > any change that ctxsw _might_ spontaneously make to the task state has
> > already been done and dusted before we do the actual signal delivery.
> > This may be a red herring so far as ptrace hazards are concerned.
> 
> Indeed, it's all about the current task and won't help at for ptrace.

Agreed

Cheers
---Dave
diff mbox series

Patch

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 1559c706d32d..80133c190136 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1626,7 +1626,7 @@  void fpsimd_preserve_current_state(void)
 void fpsimd_signal_preserve_current_state(void)
 {
 	fpsimd_preserve_current_state();
-	if (test_thread_flag(TIF_SVE))
+	if (current->thread.fp_type == FP_STATE_SVE)
 		sve_to_fpsimd(current);
 }
 
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 0e8beb3349ea..425b1bc17a3f 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -242,7 +242,7 @@  static int preserve_sve_context(struct sve_context __user *ctx)
 		vl = task_get_sme_vl(current);
 		vq = sve_vq_from_vl(vl);
 		flags |= SVE_SIG_FLAG_SM;
-	} else if (test_thread_flag(TIF_SVE)) {
+	} else if (current->thread.fp_type == FP_STATE_SVE) {
 		vq = sve_vq_from_vl(vl);
 	}
 
@@ -878,7 +878,7 @@  static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
 	if (system_supports_sve() || system_supports_sme()) {
 		unsigned int vq = 0;
 
-		if (add_all || test_thread_flag(TIF_SVE) ||
+		if (add_all || current->thread.fp_type == FP_STATE_SVE ||
 		    thread_sm_enabled(&current->thread)) {
 			int vl = max(sve_max_vl(), sme_max_vl());