[v2,11/12] arm64: BTI: Reset BTYPE when skipping emulated instructions

Message ID 1570733080-21015-12-git-send-email-Dave.Martin@arm.com (mailing list archive)
State New, archived
Series arm64: ARMv8.5-A: Branch Target Identification support

Commit Message

Dave Martin Oct. 10, 2019, 6:44 p.m. UTC
Normal execution of any non-branch instruction resets the PSTATE
BTYPE field to 0, so do the same thing when emulating a trapped
instruction.

Branches don't trap directly, so we should never need to assign a
non-zero value to BTYPE here.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
---
 arch/arm64/kernel/traps.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Mark Rutland Oct. 11, 2019, 2:21 p.m. UTC | #1
On Thu, Oct 10, 2019 at 07:44:39PM +0100, Dave Martin wrote:
> Since normal execution of any non-branch instruction resets the
> PSTATE BTYPE field to 0, so do the same thing when emulating a
> trapped instruction.
> 
> Branches don't trap directly, so we should never need to assign a
> non-zero value to BTYPE here.
> 
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> ---
>  arch/arm64/kernel/traps.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 3af2768..4d8ce50 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -331,6 +331,8 @@ void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
>  
>  	if (regs->pstate & PSR_MODE32_BIT)
>  		advance_itstate(regs);
> +	else
> +		regs->pstate &= ~(u64)PSR_BTYPE_MASK;

This looks good to me, with one nit below.

We don't (currently) need the u64 cast here, and it's inconsistent with
what we do elsewhere. If the upper 32 bits of pstate get allocated, we'll
need to fix up all the other masking we do:

[mark@lakrids:~/src/linux]% git grep 'pstate &= ~'
arch/arm64/kernel/armv8_deprecated.c:           regs->pstate &= ~PSR_AA32_E_BIT;
arch/arm64/kernel/cpufeature.c:         regs->pstate &= ~PSR_SSBS_BIT;
arch/arm64/kernel/debug-monitors.c:     regs->pstate &= ~DBG_SPSR_SS;
arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~PSR_D_BIT;
arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~DAIF_MASK;
arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH32_RES0_BITS;
arch/arm64/kernel/ptrace.c:                     regs->pstate &= ~PSR_AA32_E_BIT;
arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH64_RES0_BITS;
arch/arm64/kernel/ptrace.c:             regs->pstate &= ~DBG_SPSR_SS;
arch/arm64/kernel/ssbd.c:       task_pt_regs(task)->pstate &= ~val;
arch/arm64/kernel/traps.c:      regs->pstate &= ~PSR_AA32_IT_MASK;

... and at that point I'd suggest we should just ensure the bit
definitions are all defined as unsigned long in the first place since
adding casts to each use is error-prone.

Thanks,
Mark.
Dave Martin Oct. 11, 2019, 2:47 p.m. UTC | #2
On Fri, Oct 11, 2019 at 03:21:58PM +0100, Mark Rutland wrote:
> On Thu, Oct 10, 2019 at 07:44:39PM +0100, Dave Martin wrote:
> > Since normal execution of any non-branch instruction resets the
> > PSTATE BTYPE field to 0, so do the same thing when emulating a
> > trapped instruction.
> > 
> > Branches don't trap directly, so we should never need to assign a
> > non-zero value to BTYPE here.
> > 
> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > ---
> >  arch/arm64/kernel/traps.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > index 3af2768..4d8ce50 100644
> > --- a/arch/arm64/kernel/traps.c
> > +++ b/arch/arm64/kernel/traps.c
> > @@ -331,6 +331,8 @@ void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
> >  
> >  	if (regs->pstate & PSR_MODE32_BIT)
> >  		advance_itstate(regs);
> > +	else
> > +		regs->pstate &= ~(u64)PSR_BTYPE_MASK;
> 
> This looks good to me, with one nit below.
> 
> We don't (currently) need the u64 cast here, and it's inconsistent with
> what we do elsewhere. If the upper 32-bit of pstate get allocated, we'll
> need to fix up all the other masking we do:

Huh, looks like I missed that.  Dang.  Will fix.

> [mark@lakrids:~/src/linux]% git grep 'pstate &= ~'
> arch/arm64/kernel/armv8_deprecated.c:           regs->pstate &= ~PSR_AA32_E_BIT;
> arch/arm64/kernel/cpufeature.c:         regs->pstate &= ~PSR_SSBS_BIT;
> arch/arm64/kernel/debug-monitors.c:     regs->pstate &= ~DBG_SPSR_SS;
> arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~PSR_D_BIT;
> arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~DAIF_MASK;
> arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH32_RES0_BITS;
> arch/arm64/kernel/ptrace.c:                     regs->pstate &= ~PSR_AA32_E_BIT;
> arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH64_RES0_BITS;
> arch/arm64/kernel/ptrace.c:             regs->pstate &= ~DBG_SPSR_SS;
> arch/arm64/kernel/ssbd.c:       task_pt_regs(task)->pstate &= ~val;
> arch/arm64/kernel/traps.c:      regs->pstate &= ~PSR_AA32_IT_MASK;
> 
> ... and at that point I'd suggest we should just ensure the bit
> definitions are all defined as unsigned long in the first place since
> adding casts to each use is error-prone.

Are we concerned about changing the types of UAPI #defines?  That can
cause subtle and unexpected breakage, especially when the signedness
of a #define changes.

Ideally, we'd just change all these to 1UL << n.
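
Something like this, for example (a sketch only -- it assumes BTYPE sits
at PSTATE bits [11:10] and reuses the existing PSR_BTYPE_* names; the
real values would come from the current header):

	#define PSR_BTYPE_SHIFT		10
	#define PSR_BTYPE_MASK		(0x3UL << PSR_BTYPE_SHIFT)

	/* call sites then need no per-use cast: */
	regs->pstate &= ~PSR_BTYPE_MASK;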

Cheers
---Dave
Mark Rutland Oct. 18, 2019, 11:04 a.m. UTC | #3
On Fri, Oct 11, 2019 at 03:47:43PM +0100, Dave Martin wrote:
> On Fri, Oct 11, 2019 at 03:21:58PM +0100, Mark Rutland wrote:
> > On Thu, Oct 10, 2019 at 07:44:39PM +0100, Dave Martin wrote:
> > > Since normal execution of any non-branch instruction resets the
> > > PSTATE BTYPE field to 0, so do the same thing when emulating a
> > > trapped instruction.
> > > 
> > > Branches don't trap directly, so we should never need to assign a
> > > non-zero value to BTYPE here.
> > > 
> > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > ---
> > >  arch/arm64/kernel/traps.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > index 3af2768..4d8ce50 100644
> > > --- a/arch/arm64/kernel/traps.c
> > > +++ b/arch/arm64/kernel/traps.c
> > > @@ -331,6 +331,8 @@ void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
> > >  
> > >  	if (regs->pstate & PSR_MODE32_BIT)
> > >  		advance_itstate(regs);
> > > +	else
> > > +		regs->pstate &= ~(u64)PSR_BTYPE_MASK;
> > 
> > This looks good to me, with one nit below.
> > 
> > We don't (currently) need the u64 cast here, and it's inconsistent with
> > what we do elsewhere. If the upper 32-bit of pstate get allocated, we'll
> > need to fix up all the other masking we do:
> 
> Huh, looks like I missed that.  Dang.  Will fix.
> 
> > [mark@lakrids:~/src/linux]% git grep 'pstate &= ~'
> > arch/arm64/kernel/armv8_deprecated.c:           regs->pstate &= ~PSR_AA32_E_BIT;
> > arch/arm64/kernel/cpufeature.c:         regs->pstate &= ~PSR_SSBS_BIT;
> > arch/arm64/kernel/debug-monitors.c:     regs->pstate &= ~DBG_SPSR_SS;
> > arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> > arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> > arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~PSR_D_BIT;
> > arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~DAIF_MASK;
> > arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH32_RES0_BITS;
> > arch/arm64/kernel/ptrace.c:                     regs->pstate &= ~PSR_AA32_E_BIT;
> > arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH64_RES0_BITS;
> > arch/arm64/kernel/ptrace.c:             regs->pstate &= ~DBG_SPSR_SS;
> > arch/arm64/kernel/ssbd.c:       task_pt_regs(task)->pstate &= ~val;
> > arch/arm64/kernel/traps.c:      regs->pstate &= ~PSR_AA32_IT_MASK;
> > 
> > ... and at that point I'd suggest we should just ensure the bit
> > definitions are all defined as unsigned long in the first place since
> > adding casts to each use is error-prone.
> 
> Are we concerned about changing the types of UAPI #defines?  That can
> cause subtle and unexpected breakage, especially when the signedness
> of a #define changes.
> 
> Ideally, we'd just change all these to 1UL << n.

I agree that's the ideal -- I don't know how concerned we are w.r.t. the
UAPI headers, I'm afraid.

Thanks,
Mark.
Dave Martin Oct. 18, 2019, 2:49 p.m. UTC | #4
On Fri, Oct 18, 2019 at 12:04:29PM +0100, Mark Rutland wrote:
> On Fri, Oct 11, 2019 at 03:47:43PM +0100, Dave Martin wrote:
> > On Fri, Oct 11, 2019 at 03:21:58PM +0100, Mark Rutland wrote:
> > > On Thu, Oct 10, 2019 at 07:44:39PM +0100, Dave Martin wrote:
> > > > Since normal execution of any non-branch instruction resets the
> > > > PSTATE BTYPE field to 0, so do the same thing when emulating a
> > > > trapped instruction.
> > > > 
> > > > Branches don't trap directly, so we should never need to assign a
> > > > non-zero value to BTYPE here.
> > > > 
> > > > Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> > > > ---
> > > >  arch/arm64/kernel/traps.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > > > index 3af2768..4d8ce50 100644
> > > > --- a/arch/arm64/kernel/traps.c
> > > > +++ b/arch/arm64/kernel/traps.c
> > > > @@ -331,6 +331,8 @@ void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
> > > >  
> > > >  	if (regs->pstate & PSR_MODE32_BIT)
> > > >  		advance_itstate(regs);
> > > > +	else
> > > > +		regs->pstate &= ~(u64)PSR_BTYPE_MASK;
> > > 
> > > This looks good to me, with one nit below.
> > > 
> > > We don't (currently) need the u64 cast here, and it's inconsistent with
> > > what we do elsewhere. If the upper 32-bit of pstate get allocated, we'll
> > > need to fix up all the other masking we do:
> > 
> > Huh, looks like I missed that.  Dang.  Will fix.
> > 
> > > [mark@lakrids:~/src/linux]% git grep 'pstate &= ~'
> > > arch/arm64/kernel/armv8_deprecated.c:           regs->pstate &= ~PSR_AA32_E_BIT;
> > > arch/arm64/kernel/cpufeature.c:         regs->pstate &= ~PSR_SSBS_BIT;
> > > arch/arm64/kernel/debug-monitors.c:     regs->pstate &= ~DBG_SPSR_SS;
> > > arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> > > arch/arm64/kernel/insn.c:       pstate &= ~(pstate >> 1);       /* PSR_C_BIT &= ~PSR_Z_BIT */
> > > arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~PSR_D_BIT;
> > > arch/arm64/kernel/probes/kprobes.c:     regs->pstate &= ~DAIF_MASK;
> > > arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH32_RES0_BITS;
> > > arch/arm64/kernel/ptrace.c:                     regs->pstate &= ~PSR_AA32_E_BIT;
> > > arch/arm64/kernel/ptrace.c:     regs->pstate &= ~SPSR_EL1_AARCH64_RES0_BITS;
> > > arch/arm64/kernel/ptrace.c:             regs->pstate &= ~DBG_SPSR_SS;
> > > arch/arm64/kernel/ssbd.c:       task_pt_regs(task)->pstate &= ~val;
> > > arch/arm64/kernel/traps.c:      regs->pstate &= ~PSR_AA32_IT_MASK;
> > > 
> > > ... and at that point I'd suggest we should just ensure the bit
> > > definitions are all defined as unsigned long in the first place since
> > > adding casts to each use is error-prone.
> > 
> > Are we concerned about changing the types of UAPI #defines?  That can
> > cause subtle and unexpected breakage, especially when the signedness
> > of a #define changes.
> > 
> > Ideally, we'd just change all these to 1UL << n.
> 
> I agree that's the ideal -- I don't know how concerned we are w.r.t. the
> UAPI headers, I'm afraid.

OK, I'll follow the existing convention for now, keep the #define as
(implicitly) signed, and drop the u64 casts.

At some point in the future we may want to refactor the headers so that
the kernel uses shadow register bit definitions that are always u64.
The new HWCAP definitions provide a reasonable template for doing that
kind of thing.
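
As a rough sketch of what I mean (hypothetical -- PSR_BTYPE_MASK_64 does
not exist today, and this just mirrors the UAPI value, assumed here to be
0x00000c00, into an always-64-bit kernel-internal definition):

	/* UAPI header: unchanged, stays (implicitly) signed int */
	#define PSR_BTYPE_MASK		0x00000c00

	/* hypothetical kernel-internal shadow, always 64-bit */
	#define PSR_BTYPE_MASK_64	((u64)PSR_BTYPE_MASK)

	/* call sites would then use the shadow definition: */
	regs->pstate &= ~PSR_BTYPE_MASK_64;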

It's probably best not to do anything to alter the types of the UAPI
definitions.

I will shamelessly duck this for now :|

Cheers
---Dave

Patch

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 3af2768..4d8ce50 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -331,6 +331,8 @@  void arm64_skip_faulting_instruction(struct pt_regs *regs, unsigned long size)
 
 	if (regs->pstate & PSR_MODE32_BIT)
 		advance_itstate(regs);
+	else
+		regs->pstate &= ~(u64)PSR_BTYPE_MASK;
 }
 
 static LIST_HEAD(undef_hook);