diff mbox series

[1/3] KVM: arm64: Don't set PSTATE.SS when Software Step state is Active-pending

Message ID 20220909044636.1997755-2-reijiw@google.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Fix a bug of single-step execution enabled by userspace | expand

Commit Message

Reiji Watanabe Sept. 9, 2022, 4:46 a.m. UTC
Currently, PSTATE.SS is set on every guest entry if single-step is
enabled for the vCPU by userspace.  However, it could cause extra
single-step execution without returning to userspace, which shouldn't
be performed, if the Software Step state at the last guest exit was
Active-pending (i.e. the last exit was not triggered by Software Step
exception, but by an asynchronous exception after the single-step
execution is performed).

Fix this by not setting PSTATE.SS on guest entry if the Software
Step state at the last exit was Active-pending.

Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
Signed-off-by: Reiji Watanabe <reijiw@google.com>
---
 arch/arm64/include/asm/kvm_host.h |  3 +++
 arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
 arch/arm64/kvm/guest.c            |  1 +
 arch/arm64/kvm/handle_exit.c      |  2 ++
 4 files changed, 24 insertions(+), 1 deletion(-)

Comments

Marc Zyngier Sept. 9, 2022, 12:41 p.m. UTC | #1
Hi Reiji,

On Fri, 09 Sep 2022 05:46:34 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Currently, PSTATE.SS is set on every guest entry if single-step is
> enabled for the vCPU by userspace.  However, it could cause extra
> single-step execution without returning to userspace, which shouldn't
> be performed, if the Software Step state at the last guest exit was
> Active-pending (i.e. the last exit was not triggered by Software Step
> exception, but by an asynchronous exception after the single-step
> execution is performed).

For my own enlightenment, could you describe a sequence of events that
leads to this issue?

> 
> Fix this by not setting PSTATE.SS on guest entry if the Software
> Step state at the last exit was Active-pending.
> 
> Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
> Signed-off-by: Reiji Watanabe <reijiw@google.com>
> ---
>  arch/arm64/include/asm/kvm_host.h |  3 +++
>  arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
>  arch/arm64/kvm/guest.c            |  1 +
>  arch/arm64/kvm/handle_exit.c      |  2 ++
>  4 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e9c9388ccc02..4cf6eef02565 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -535,6 +535,9 @@ struct kvm_vcpu_arch {
>  #define IN_WFIT			__vcpu_single_flag(sflags, BIT(3))
>  /* vcpu system registers loaded on physical CPU */
>  #define SYSREGS_ON_CPU		__vcpu_single_flag(sflags, BIT(4))
> +/* Software step state is Active-pending */
> +#define DBG_SS_ACTIVE_PENDING	__vcpu_single_flag(sflags, BIT(5))
> +
>  
>  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
>  #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 0b28d7db7c76..125cfb94b4ad 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -188,7 +188,16 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
>  		 * debugging the system.
>  		 */
>  		if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> -			*vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
> +			/*
> +			 * If the software step state at the last guest exit
> +			 * was Active-pending, we don't set DBG_SPSR_SS so
> +			 * that the state is maintained (to not run another
> +			 * single-step until the pending Software Step
> +			 * exception is taken).
> +			 */
> +			if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
> +				*vcpu_cpsr(vcpu) |= DBG_SPSR_SS;

I guess my confusion stems from my (probably wrong) interpretation if
the SS state is A+P, there is no harm in making it pending again
(setting the SS bit in PSTATE).

> +
>  			mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
>  			mdscr |= DBG_MDSCR_SS;

But it looks like the *pending* state is actually stored in MDSCR
instead? The spec only mentions this for the A+P state, so this is
quite likely a bug indeed.

Now, where does the asynchronous exception comes into play? I found
this intriguing remark in the ARM ARM:

<quote>
The Software Step exception has higher priority than all other types
of synchronous exception. However, the prioritization of this
exception with respect to any unmasked pending asynchronous exception
is not defined by the architecture.
</quote>

Is this what you were referring to in the commit message? I think you
need to spell it out for us, as I don't fully understand what you are
fixing nor do I understand the gory details of single-stepping...

Thanks,

	M.
Reiji Watanabe Sept. 10, 2022, 4:12 a.m. UTC | #2
Hi Marc,

On Fri, Sep 9, 2022 at 5:41 AM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Reiji,
>
> On Fri, 09 Sep 2022 05:46:34 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> >
> > Currently, PSTATE.SS is set on every guest entry if single-step is
> > enabled for the vCPU by userspace.  However, it could cause extra
> > single-step execution without returning to userspace, which shouldn't
> > be performed, if the Software Step state at the last guest exit was
> > Active-pending (i.e. the last exit was not triggered by Software Step
> > exception, but by an asynchronous exception after the single-step
> > execution is performed).
>
> For my own enlightenment, could you describe a sequence of events that
> leads to this issue?

Here is an example of the sequences.

 [Usersace]
  | - ioctl(SET_GUEST_DEBUG)
  | - ioctl(KVM_RUN) (vCPU PC==X)
  v
 [KVM]
  | - *vcpu_cpsr(vcpu) |= SPSR_SS;
  | - mdscr |= DBG_MDSCR_SS;
  | - VM Entry
  v
 [Guest] vCPU PC==X
  | - Execute an instruction at PC==X.
  |   PC is updated with X+4, and PSTATE.SS is cleared.
  |
  | !! Asynchronous exception !!
  v
 [KVM] vCPU PC==X+4
  | - The kernel processes the async exception.
  | - handle_exit() returns 1 (don't return to userspace)
  | - *vcpu_cpsr(vcpu) |= SPSR_SS;
  | - mdscr |= DBG_MDSCR_SS;
  | - VM Entry
  v
 [Guest] vCPU PC==X+4
  | - Execute an instruction at PC==X+4.
  |   PC is updated with X+8, PSTATE.SS is cleared.
  |
  | !! Software Step Exception !!
  v
 [KVM] vCPU PC==X+8
  | - run->exit_reason = KVM_EXIT_DEBUG;
  | - Return to userspace
  v
 [Userspace]
    - Userspace sees PC==X+8 (Userspace didn't see PC==X+4).


> > Fix this by not setting PSTATE.SS on guest entry if the Software
> > Step state at the last exit was Active-pending.
> >
> > Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  3 +++
> >  arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
> >  arch/arm64/kvm/guest.c            |  1 +
> >  arch/arm64/kvm/handle_exit.c      |  2 ++
> >  4 files changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index e9c9388ccc02..4cf6eef02565 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -535,6 +535,9 @@ struct kvm_vcpu_arch {
> >  #define IN_WFIT                      __vcpu_single_flag(sflags, BIT(3))
> >  /* vcpu system registers loaded on physical CPU */
> >  #define SYSREGS_ON_CPU               __vcpu_single_flag(sflags, BIT(4))
> > +/* Software step state is Active-pending */
> > +#define DBG_SS_ACTIVE_PENDING        __vcpu_single_flag(sflags, BIT(5))
> > +
> >
> >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> >  #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +   \
> > diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> > index 0b28d7db7c76..125cfb94b4ad 100644
> > --- a/arch/arm64/kvm/debug.c
> > +++ b/arch/arm64/kvm/debug.c
> > @@ -188,7 +188,16 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
> >                * debugging the system.
> >                */
> >               if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> > -                     *vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
> > +                     /*
> > +                      * If the software step state at the last guest exit
> > +                      * was Active-pending, we don't set DBG_SPSR_SS so
> > +                      * that the state is maintained (to not run another
> > +                      * single-step until the pending Software Step
> > +                      * exception is taken).
> > +                      */
> > +                     if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
> > +                             *vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
>
> I guess my confusion stems from my (probably wrong) interpretation if
> the SS state is A+P, there is no harm in making it pending again
> (setting the SS bit in PSTATE).

Setting the SS bit in PSTATE (with MDSCR_EL1.SS=1) makes the state
Active-not-pending (not Active-pending) if my interpretation of
the spec is correct.

> > +
> >                       mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
> >                       mdscr |= DBG_MDSCR_SS;
>
> But it looks like the *pending* state is actually stored in MDSCR
> instead? The spec only mentions this for the A+P state, so this is
> quite likely a bug indeed.

MDSCR_EL1.SS is set even in Active-not-pending state (not just
Active-pending state).  (Or do you mean the state is stored in
some other field in MDSCR ??)


> Now, where does the asynchronous exception comes into play? I found
> this intriguing remark in the ARM ARM:
>
> <quote>
> The Software Step exception has higher priority than all other types
> of synchronous exception. However, the prioritization of this
> exception with respect to any unmasked pending asynchronous exception
> is not defined by the architecture.
> </quote>
>
> Is this what you were referring to in the commit message? I think you
> need to spell it out for us, as I don't fully understand what you are
> fixing nor do I understand the gory details of single-stepping...

Yes, that is what I was referring to.
In "Figure D2-3 Software step state machine" in Arm ARM (DDI 0487I.a),
since KVM currently sets PSTATE.SS to 1 on every Guest Entry (when
single-step is enabled by userspace), KVM always has the CPU in
"Inactive" (the second inactive state from the top) transition to
"Active-not-pending" on the Guest Entry.  With this patch, KVM
have the CPU transitions to "Active-pending" if the state before
"Inactive" was "Active-pending", which indicates the step completed
but Software Step exception is not taken yet, so that Software
Step exception is taken before further single-step is executed.

I'm sorry for the unclear explanation, and
I hope my comments clarify the problem I'm trying to fix.

Thank you,
Reiji
Marc Zyngier Sept. 10, 2022, 9:58 a.m. UTC | #3
Hi Reiji,

On Sat, 10 Sep 2022 05:12:57 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> > > Currently, PSTATE.SS is set on every guest entry if single-step is
> > > enabled for the vCPU by userspace.  However, it could cause extra
> > > single-step execution without returning to userspace, which shouldn't
> > > be performed, if the Software Step state at the last guest exit was
> > > Active-pending (i.e. the last exit was not triggered by Software Step
> > > exception, but by an asynchronous exception after the single-step
> > > execution is performed).
> >
> > For my own enlightenment, could you describe a sequence of events that
> > leads to this issue?
> 
> Here is an example of the sequences.
> 
>  [Usersace]
>   | - ioctl(SET_GUEST_DEBUG)
>   | - ioctl(KVM_RUN) (vCPU PC==X)
>   v
>  [KVM]
>   | - *vcpu_cpsr(vcpu) |= SPSR_SS;
>   | - mdscr |= DBG_MDSCR_SS;
>   | - VM Entry
>   v
>  [Guest] vCPU PC==X
>   | - Execute an instruction at PC==X.
>   |   PC is updated with X+4, and PSTATE.SS is cleared.
>   |
>   | !! Asynchronous exception !!
>   v
>  [KVM] vCPU PC==X+4
>   | - The kernel processes the async exception.
>   | - handle_exit() returns 1 (don't return to userspace)
>   | - *vcpu_cpsr(vcpu) |= SPSR_SS;
>   | - mdscr |= DBG_MDSCR_SS;
>   | - VM Entry
>   v
>  [Guest] vCPU PC==X+4
>   | - Execute an instruction at PC==X+4.
>   |   PC is updated with X+8, PSTATE.SS is cleared.
>   |
>   | !! Software Step Exception !!
>   v
>  [KVM] vCPU PC==X+8
>   | - run->exit_reason = KVM_EXIT_DEBUG;
>   | - Return to userspace
>   v
>  [Userspace]
>     - Userspace sees PC==X+8 (Userspace didn't see PC==X+4).
>

OK, I think I get it now, and I got confused because of the naming
which is similar to what we use for interrupts, but the semantics are
very different (let's pretend that this is the cause of most of my
stupid questions earlier...).

The states are described as such:

Active+non-Pending: MDSCR.SS=1, PSTATE.SS=1
Active+Pending:     MDSCR.SS=1, PSTATE.SS=0

and it is the inversion of PSTATE.SS that got me.

The pending state describe the state of the *exception*. Before
executing the instruction, no exception is pending. Once executed, an
exception is pending.

Of course, if we get an interrupt right after a single step (and that
the implementation prioritises them over synchronous exceptions), we
exit because of the interrupt, and the bug you uncovered sends us back
to Active+non-Pending, losing the SS exception. Boo.

Your fix is to not set PSTATE.SS=1 until we have actually handled a
debug exception. In the above example, this would result in:

 [Guest] vCPU PC==X
  | - Execute an instruction at PC==X.
  |   PC is updated with X+4, and PSTATE.SS is cleared.
  |
  | !! Asynchronous exception !!
  v
 [KVM] vCPU PC==X+4
  | - The kernel processes the async exception.
  | - handle_exit() returns 1 (don't return to userspace)
  | - vcpu_set_flag(vcpu, DBG_SS_ACTIVE_PENDING);
  | - mdscr |= DBG_MDSCR_SS;
  | - VM Entry
  v
 [Guest] vCPU PC==X+4
  | - Pending SS exception
  |
  | !! Software Step Exception !!
  v
 [KVM] vCPU PC==X+4
  | - run->exit_reason = KVM_EXIT_DEBUG;
  | - vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
  | - Return to userspace
  v
 [Userspace]
    - Userspace sees PC==X+4

It is amusing that we never saw that one before, but I guess the
IMPDEF nature of the exception prioritisation caught us here. Also,
VM debugging is a relatively rare use case.

> > Now, where does the asynchronous exception comes into play? I found
> > this intriguing remark in the ARM ARM:
> >
> > <quote>
> > The Software Step exception has higher priority than all other types
> > of synchronous exception. However, the prioritization of this
> > exception with respect to any unmasked pending asynchronous exception
> > is not defined by the architecture.
> > </quote>
> >
> > Is this what you were referring to in the commit message? I think you
> > need to spell it out for us, as I don't fully understand what you are
> > fixing nor do I understand the gory details of single-stepping...
> 
> Yes, that is what I was referring to.
> In "Figure D2-3 Software step state machine" in Arm ARM (DDI 0487I.a),
> since KVM currently sets PSTATE.SS to 1 on every Guest Entry (when
> single-step is enabled by userspace), KVM always has the CPU in
> "Inactive" (the second inactive state from the top) transition to
> "Active-not-pending" on the Guest Entry.  With this patch, KVM
> have the CPU transitions to "Active-pending" if the state before
> "Inactive" was "Active-pending", which indicates the step completed
> but Software Step exception is not taken yet, so that Software
> Step exception is taken before further single-step is executed.
> 
> I'm sorry for the unclear explanation, and
> I hope my comments clarify the problem I'm trying to fix.

Please do not apologise! This is excellent work, and I'm really glad
you got to the bottom of this. It'd be good to capture some of this
discussion in the commit message though, as I'm pretty we will all
blissfully forget all about it shortly!

	M.
Marc Zyngier Sept. 10, 2022, 10:36 a.m. UTC | #4
On Fri, 09 Sep 2022 05:46:34 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Currently, PSTATE.SS is set on every guest entry if single-step is
> enabled for the vCPU by userspace.  However, it could cause extra
> single-step execution without returning to userspace, which shouldn't
> be performed, if the Software Step state at the last guest exit was
> Active-pending (i.e. the last exit was not triggered by Software Step
> exception, but by an asynchronous exception after the single-step
> execution is performed).
> 
> Fix this by not setting PSTATE.SS on guest entry if the Software
> Step state at the last exit was Active-pending.
> 
> Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
> Signed-off-by: Reiji Watanabe <reijiw@google.com>

Now that I'm a bit more clued about what the architecture actually
mandates, I can try and review this patch.

> ---
>  arch/arm64/include/asm/kvm_host.h |  3 +++
>  arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
>  arch/arm64/kvm/guest.c            |  1 +
>  arch/arm64/kvm/handle_exit.c      |  2 ++
>  4 files changed, 24 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index e9c9388ccc02..4cf6eef02565 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -535,6 +535,9 @@ struct kvm_vcpu_arch {
>  #define IN_WFIT			__vcpu_single_flag(sflags, BIT(3))
>  /* vcpu system registers loaded on physical CPU */
>  #define SYSREGS_ON_CPU		__vcpu_single_flag(sflags, BIT(4))
> +/* Software step state is Active-pending */
> +#define DBG_SS_ACTIVE_PENDING	__vcpu_single_flag(sflags, BIT(5))
> +
>  
>  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
>  #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 0b28d7db7c76..125cfb94b4ad 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -188,7 +188,16 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
>  		 * debugging the system.
>  		 */
>  		if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> -			*vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
> +			/*
> +			 * If the software step state at the last guest exit
> +			 * was Active-pending, we don't set DBG_SPSR_SS so
> +			 * that the state is maintained (to not run another
> +			 * single-step until the pending Software Step
> +			 * exception is taken).
> +			 */
> +			if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
> +				*vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
> +
>  			mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
>  			mdscr |= DBG_MDSCR_SS;
>  			vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
> @@ -279,6 +288,14 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
>  						&vcpu->arch.debug_ptr->dbg_wcr[0],
>  						&vcpu->arch.debug_ptr->dbg_wvr[0]);
>  		}
> +
> +		if ((vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) &&
> +		    !(*vcpu_cpsr(vcpu) & DBG_SPSR_SS))
> +			/*
> +			 * Mark the vcpu as ACTIVE_PENDING
> +			 * until Software Step exception is confirmed.

s/confirmed/taken/? This would match the comment in the previous hunk.

> +			 */
> +			vcpu_set_flag(vcpu, DBG_SS_ACTIVE_PENDING);
>  	}
>  }
>  
> diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> index f802a3b3f8db..2ff13a3f8479 100644
> --- a/arch/arm64/kvm/guest.c
> +++ b/arch/arm64/kvm/guest.c
> @@ -937,6 +937,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
>  	} else {
>  		/* If not enabled clear all flags */
>  		vcpu->guest_debug = 0;
> +		vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
>  	}
>  
>  out:
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index bbe5b393d689..8e43b2668d67 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -154,6 +154,8 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu)
>  
>  	if (ESR_ELx_EC(esr) == ESR_ELx_EC_WATCHPT_LOW)
>  		run->debug.arch.far = vcpu->arch.fault.far_el2;
> +	else if (ESR_ELx_EC(esr) == ESR_ELx_EC_SOFTSTP_LOW)
> +		vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);

Can we write this as a switch/case statement?

>  
>  	return 0;
>  }

I think we also need to do something if userspace decides to write to
PSTATE as a result of a non-debug exit (such as a signal) when this
DBG_SS_ACTIVE_PENDING is set. I came up with the following
complicated, but not impossible scenario:

- guest single step, PSTATE.SS=0
- exit due to interrupt
- DBG_SS_ACTIVE_PENDING set
- reenter guest
- exit again due to another interrupt
- exit to userspace due to signal pending
- userspace writes PSTATE.SS=1 for no good reason
- we now have an inconsistent state between PSTATE.SS and the vcpu flags

My gut feeling is that we need something like the vcpu flag being set
to !PSTATE.SS if written while debug is enabled.

Thoughts?

	M.
Reiji Watanabe Sept. 14, 2022, 6:13 a.m. UTC | #5
Hi Marc,

Thank you for the review!

On Sat, Sep 10, 2022 at 3:36 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Fri, 09 Sep 2022 05:46:34 +0100,
> Reiji Watanabe <reijiw@google.com> wrote:
> >
> > Currently, PSTATE.SS is set on every guest entry if single-step is
> > enabled for the vCPU by userspace.  However, it could cause extra
> > single-step execution without returning to userspace, which shouldn't
> > be performed, if the Software Step state at the last guest exit was
> > Active-pending (i.e. the last exit was not triggered by Software Step
> > exception, but by an asynchronous exception after the single-step
> > execution is performed).
> >
> > Fix this by not setting PSTATE.SS on guest entry if the Software
> > Step state at the last exit was Active-pending.
> >
> > Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
> > Signed-off-by: Reiji Watanabe <reijiw@google.com>
>
> Now that I'm a bit more clued about what the architecture actually
> mandates, I can try and review this patch.
>
> > ---
> >  arch/arm64/include/asm/kvm_host.h |  3 +++
> >  arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
> >  arch/arm64/kvm/guest.c            |  1 +
> >  arch/arm64/kvm/handle_exit.c      |  2 ++
> >  4 files changed, 24 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index e9c9388ccc02..4cf6eef02565 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -535,6 +535,9 @@ struct kvm_vcpu_arch {
> >  #define IN_WFIT                      __vcpu_single_flag(sflags, BIT(3))
> >  /* vcpu system registers loaded on physical CPU */
> >  #define SYSREGS_ON_CPU               __vcpu_single_flag(sflags, BIT(4))
> > +/* Software step state is Active-pending */
> > +#define DBG_SS_ACTIVE_PENDING        __vcpu_single_flag(sflags, BIT(5))
> > +
> >
> >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> >  #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +   \
> > diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> > index 0b28d7db7c76..125cfb94b4ad 100644
> > --- a/arch/arm64/kvm/debug.c
> > +++ b/arch/arm64/kvm/debug.c
> > @@ -188,7 +188,16 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
> >                * debugging the system.
> >                */
> >               if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> > -                     *vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
> > +                     /*
> > +                      * If the software step state at the last guest exit
> > +                      * was Active-pending, we don't set DBG_SPSR_SS so
> > +                      * that the state is maintained (to not run another
> > +                      * single-step until the pending Software Step
> > +                      * exception is taken).
> > +                      */
> > +                     if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
> > +                             *vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
> > +
> >                       mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
> >                       mdscr |= DBG_MDSCR_SS;
> >                       vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
> > @@ -279,6 +288,14 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
> >                                               &vcpu->arch.debug_ptr->dbg_wcr[0],
> >                                               &vcpu->arch.debug_ptr->dbg_wvr[0]);
> >               }
> > +
> > +             if ((vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) &&
> > +                 !(*vcpu_cpsr(vcpu) & DBG_SPSR_SS))
> > +                     /*
> > +                      * Mark the vcpu as ACTIVE_PENDING
> > +                      * until Software Step exception is confirmed.
>
> s/confirmed/taken/? This would match the comment in the previous hunk.

Yes, I will fix that.

>
> > +                      */
> > +                     vcpu_set_flag(vcpu, DBG_SS_ACTIVE_PENDING);
> >       }
> >  }
> >
> > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > index f802a3b3f8db..2ff13a3f8479 100644
> > --- a/arch/arm64/kvm/guest.c
> > +++ b/arch/arm64/kvm/guest.c
> > @@ -937,6 +937,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> >       } else {
> >               /* If not enabled clear all flags */
> >               vcpu->guest_debug = 0;
> > +             vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
> >       }
> >
> >  out:
> > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > index bbe5b393d689..8e43b2668d67 100644
> > --- a/arch/arm64/kvm/handle_exit.c
> > +++ b/arch/arm64/kvm/handle_exit.c
> > @@ -154,6 +154,8 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu)
> >
> >       if (ESR_ELx_EC(esr) == ESR_ELx_EC_WATCHPT_LOW)
> >               run->debug.arch.far = vcpu->arch.fault.far_el2;
> > +     else if (ESR_ELx_EC(esr) == ESR_ELx_EC_SOFTSTP_LOW)
> > +             vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
>
> Can we write this as a switch/case statement?

Sure, I will change this to switch/case statement.


>
> >
> >       return 0;
> >  }
>
> I think we also need to do something if userspace decides to write to
> PSTATE as a result of a non-debug exit (such as a signal) when this
> DBG_SS_ACTIVE_PENDING is set. I came up with the following
> complicated, but not impossible scenario:
>
> - guest single step, PSTATE.SS=0
> - exit due to interrupt
> - DBG_SS_ACTIVE_PENDING set
> - reenter guest
> - exit again due to another interrupt
> - exit to userspace due to signal pending
> - userspace writes PSTATE.SS=1 for no good reason
> - we now have an inconsistent state between PSTATE.SS and the vcpu flags
>
> My gut feeling is that we need something like the vcpu flag being set
> to !PSTATE.SS if written while debug is enabled.
>
> Thoughts?

Ah, that's a good point.
Values that KVM is going to set in debug registers (e.g. MDSCR_EL1,
dbg_bcr, etc) at guest-entry cannot be changed by userspace via
SET_ONE_REG when debug is enabled.  I'm inclined to apply the same
for PSTATE.SS (clear PSTATE.SS if the vcpu flag is set on guest entry,
and set PSTATE.SS to 1 otherwise). Since  MDSCR_EL1 value that KVM is
going to set is not visible from userspace, changing Software-step
state when userspace updates PSTATE.SS might be a bit odd IMHO
(something odd anyway though).

Related to the above scenario, I found another bug (I think).
After guest exits with Active-not-pending (PSTATE.SS==1) due to an
interrupt, and then KVM exits to userspace due to signal pending,
if userspace disables single-step, PSTATE.SS will remain 1 on
subsequent guest entries (or it might have been originally 1, and
KVM might clear it.  Most of the time it doesn't matter, and when the
guest is also using single-step, things will go wrong anyway though).

Considering those, I am thinking of changing the patch as follows,
 - Change kvm_arm_setup_debug() to clear PSTATE.SS if the vcpu flag
   (DBG_SS_ACTIVE_PENDING) is set, and set PSTATE.SS to 1 otherwise.
 - Change save_guest_debug_regs()/restore_guest_debug_regs() to
   save/restore the guest value of PSTATE.SS
   (Add a new field in kvm_vcpu_arch.guest_debug_preserved to save
    the guest value of PSTATE.SS)
keeping the other changes in the patch below.
 - Clear DBG_SS_ACTIVE_PENDING in kvm_handle_guest_debug()
 - Clear DBG_SS_ACTIVE_PENDING when userspace disables single-step

With this, PSTATE.SS value that KVM is going to set on guest-entry
won't be exposed to userspace, and PSTATE.SS value that is set by
userspace will not be used for the guest until single-step is
disabled (similar to MDSCR_EL1).

What do you think ?

Thank you,
Reiji
Marc Zyngier Sept. 14, 2022, 8:12 a.m. UTC | #6
On Wed, 14 Sep 2022 07:13:04 +0100,
Reiji Watanabe <reijiw@google.com> wrote:
> 
> Hi Marc,
> 
> Thank you for the review!
> 
> On Sat, Sep 10, 2022 at 3:36 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Fri, 09 Sep 2022 05:46:34 +0100,
> > Reiji Watanabe <reijiw@google.com> wrote:
> > >
> > > Currently, PSTATE.SS is set on every guest entry if single-step is
> > > enabled for the vCPU by userspace.  However, it could cause extra
> > > single-step execution without returning to userspace, which shouldn't
> > > be performed, if the Software Step state at the last guest exit was
> > > Active-pending (i.e. the last exit was not triggered by Software Step
> > > exception, but by an asynchronous exception after the single-step
> > > execution is performed).
> > >
> > > Fix this by not setting PSTATE.SS on guest entry if the Software
> > > Step state at the last exit was Active-pending.
> > >
> > > Fixes: 337b99bf7edf ("KVM: arm64: guest debug, add support for single-step")
> > > Signed-off-by: Reiji Watanabe <reijiw@google.com>
> >
> > Now that I'm a bit more clued about what the architecture actually
> > mandates, I can try and review this patch.
> >
> > > ---
> > >  arch/arm64/include/asm/kvm_host.h |  3 +++
> > >  arch/arm64/kvm/debug.c            | 19 ++++++++++++++++++-
> > >  arch/arm64/kvm/guest.c            |  1 +
> > >  arch/arm64/kvm/handle_exit.c      |  2 ++
> > >  4 files changed, 24 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > > index e9c9388ccc02..4cf6eef02565 100644
> > > --- a/arch/arm64/include/asm/kvm_host.h
> > > +++ b/arch/arm64/include/asm/kvm_host.h
> > > @@ -535,6 +535,9 @@ struct kvm_vcpu_arch {
> > >  #define IN_WFIT                      __vcpu_single_flag(sflags, BIT(3))
> > >  /* vcpu system registers loaded on physical CPU */
> > >  #define SYSREGS_ON_CPU               __vcpu_single_flag(sflags, BIT(4))
> > > +/* Software step state is Active-pending */
> > > +#define DBG_SS_ACTIVE_PENDING        __vcpu_single_flag(sflags, BIT(5))
> > > +
> > >
> > >  /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
> > >  #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +   \
> > > diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> > > index 0b28d7db7c76..125cfb94b4ad 100644
> > > --- a/arch/arm64/kvm/debug.c
> > > +++ b/arch/arm64/kvm/debug.c
> > > @@ -188,7 +188,16 @@ void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
> > >                * debugging the system.
> > >                */
> > >               if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
> > > -                     *vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
> > > +                     /*
> > > +                      * If the software step state at the last guest exit
> > > +                      * was Active-pending, we don't set DBG_SPSR_SS so
> > > +                      * that the state is maintained (to not run another
> > > +                      * single-step until the pending Software Step
> > > +                      * exception is taken).
> > > +                      */
> > > +                     if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
> > > +                             *vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
> > > +
> > >                       mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
> > >                       mdscr |= DBG_MDSCR_SS;
> > >                       vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
> > > @@ -279,6 +288,14 @@ void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
> > >                                               &vcpu->arch.debug_ptr->dbg_wcr[0],
> > >                                               &vcpu->arch.debug_ptr->dbg_wvr[0]);
> > >               }
> > > +
> > > +             if ((vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) &&
> > > +                 !(*vcpu_cpsr(vcpu) & DBG_SPSR_SS))
> > > +                     /*
> > > +                      * Mark the vcpu as ACTIVE_PENDING
> > > +                      * until Software Step exception is confirmed.
> >
> > s/confirmed/taken/? This would match the comment in the previous hunk.
> 
> Yes, I will fix that.
> 
> >
> > > +                      */
> > > +                     vcpu_set_flag(vcpu, DBG_SS_ACTIVE_PENDING);
> > >       }
> > >  }
> > >
> > > diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
> > > index f802a3b3f8db..2ff13a3f8479 100644
> > > --- a/arch/arm64/kvm/guest.c
> > > +++ b/arch/arm64/kvm/guest.c
> > > @@ -937,6 +937,7 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
> > >       } else {
> > >               /* If not enabled clear all flags */
> > >               vcpu->guest_debug = 0;
> > > +             vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
> > >       }
> > >
> > >  out:
> > > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > > index bbe5b393d689..8e43b2668d67 100644
> > > --- a/arch/arm64/kvm/handle_exit.c
> > > +++ b/arch/arm64/kvm/handle_exit.c
> > > @@ -154,6 +154,8 @@ static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu)
> > >
> > >       if (ESR_ELx_EC(esr) == ESR_ELx_EC_WATCHPT_LOW)
> > >               run->debug.arch.far = vcpu->arch.fault.far_el2;
> > > +     else if (ESR_ELx_EC(esr) == ESR_ELx_EC_SOFTSTP_LOW)
> > > +             vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
> >
> > Can we write this as a switch/case statement?
> 
> Sure, I will change this to switch/case statement.
> 
> 
> >
> > >
> > >       return 0;
> > >  }
> >
> > I think we also need to do something if userspace decides to write to
> > PSTATE as a result of a non-debug exit (such as a signal) when this
> > DBG_SS_ACTIVE_PENDING is set. I came up with the following
> > complicated, but not impossible scenario:
> >
> > - guest single step, PSTATE.SS=0
> > - exit due to interrupt
> > - DBG_SS_ACTIVE_PENDING set
> > - reenter guest
> > - exit again due to another interrupt
> > - exit to userspace due to signal pending
> > - userspace writes PSTATE.SS=1 for no good reason
> > - we now have an inconsistent state between PSTATE.SS and the vcpu flags
> >
> > My gut feeling is that we need something like the vcpu flag being set
> > to !PSTATE.SS if written while debug is enabled.
> >
> > Thoughts?
> 
> Ah, that's a good point.
> Values that KVM is going to set in debug registers (e.g. MDSCR_EL1,
> dbg_bcr, etc) at guest-entry cannot be changed by userspace via
> SET_ONE_REG when debug is enabled.  I'm inclined to apply the same
> for PSTATE.SS (clear PSTATE.SS if the vcpu flag is set on guest entry,
> and set PSTATE.SS to 1 otherwise). Since  MDSCR_EL1 value that KVM is
> going to set is not visible from userspace, changing Software-step
> state when userspace updates PSTATE.SS might be a bit odd IMHO
> (something odd anyway though).

Indeed, that's probably for the best, although it is a userspace
visible change (hopefully not one that someone relies on). It'd be
worth documenting that when debugging is enabled, PSTATE.SS is not
controllable by userspace. But what you say below may change things.

> Related to the above scenario, I found another bug (I think).
> After guest exits with Active-not-pending (PSTATE.SS==1) due to an
> interrupt, and then KVM exits to userspace due to signal pending,
> if userspace disables single-step, PSTATE.SS will remain 1 on
> subsequent guest entries (or it might have been originally 1, and
> KVM might clear it.  Most of the time it doesn't matter, and when the
> guest is also using single-step, things will go wrong anyway though).
> 
> Considering those, I am thinking of changing the patch as follows,
>  - Change kvm_arm_setup_debug() to clear PSTATE.SS if the vcpu flag
>    (DBG_SS_ACTIVE_PENDING) is set, and set PSTATE.SS to 1 otherwise.
>  - Change save_guest_debug_regs()/restore_guest_debug_regs() to
>    save/restore the guest value of PSTATE.SS
>    (Add a new field in kvm_vcpu_arch.guest_debug_preserved to save
>     the guest value of PSTATE.SS)
> keeping the other changes in the patch below.
>  - Clear DBG_SS_ACTIVE_PENDING in kvm_handle_guest_debug()
>  - Clear DBG_SS_ACTIVE_PENDING when userspace disables single-step
> 
> With this, PSTATE.SS value that KVM is going to set on guest-entry
> won't be exposed to userspace, and PSTATE.SS value that is set by
> userspace will not be used for the guest until single-step is
> disabled (similar to MDSCR_EL1).
> 
> What do you think ?

Is it expected that userspace could then manipulate the guest's
PSTATE.SS state when guest debug is active? I'm tempted to say yes.

Otherwise, this seems to match my understanding of the architecture
(but I though I understood it 8 years ago, so I'm probably talking
rubbish).

Anyway, please post a new version with these changes, as I'd really
like to have this in 6.1 (too late such a change in 6.0 I'm afraid).

Thanks,

	M.
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e9c9388ccc02..4cf6eef02565 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -535,6 +535,9 @@  struct kvm_vcpu_arch {
 #define IN_WFIT			__vcpu_single_flag(sflags, BIT(3))
 /* vcpu system registers loaded on physical CPU */
 #define SYSREGS_ON_CPU		__vcpu_single_flag(sflags, BIT(4))
+/* Software step state is Active-pending */
+#define DBG_SS_ACTIVE_PENDING	__vcpu_single_flag(sflags, BIT(5))
+
 
 /* Pointer to the vcpu's SVE FFR for sve_{save,load}_state() */
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 0b28d7db7c76..125cfb94b4ad 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -188,7 +188,16 @@  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 		 * debugging the system.
 		 */
 		if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
-			*vcpu_cpsr(vcpu) |=  DBG_SPSR_SS;
+			/*
+			 * If the software step state at the last guest exit
+			 * was Active-pending, we don't set DBG_SPSR_SS so
+			 * that the state is maintained (to not run another
+			 * single-step until the pending Software Step
+			 * exception is taken).
+			 */
+			if (!vcpu_get_flag(vcpu, DBG_SS_ACTIVE_PENDING))
+				*vcpu_cpsr(vcpu) |= DBG_SPSR_SS;
+
 			mdscr = vcpu_read_sys_reg(vcpu, MDSCR_EL1);
 			mdscr |= DBG_MDSCR_SS;
 			vcpu_write_sys_reg(vcpu, mdscr, MDSCR_EL1);
@@ -279,6 +288,14 @@  void kvm_arm_clear_debug(struct kvm_vcpu *vcpu)
 						&vcpu->arch.debug_ptr->dbg_wcr[0],
 						&vcpu->arch.debug_ptr->dbg_wvr[0]);
 		}
+
+		if ((vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) &&
+		    !(*vcpu_cpsr(vcpu) & DBG_SPSR_SS))
+			/*
+			 * Mark the vcpu as ACTIVE_PENDING
+			 * until Software Step exception is confirmed.
+			 */
+			vcpu_set_flag(vcpu, DBG_SS_ACTIVE_PENDING);
 	}
 }
 
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index f802a3b3f8db..2ff13a3f8479 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -937,6 +937,7 @@  int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 	} else {
 		/* If not enabled clear all flags */
 		vcpu->guest_debug = 0;
+		vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
 	}
 
 out:
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index bbe5b393d689..8e43b2668d67 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -154,6 +154,8 @@  static int kvm_handle_guest_debug(struct kvm_vcpu *vcpu)
 
 	if (ESR_ELx_EC(esr) == ESR_ELx_EC_WATCHPT_LOW)
 		run->debug.arch.far = vcpu->arch.fault.far_el2;
+	else if (ESR_ELx_EC(esr) == ESR_ELx_EC_SOFTSTP_LOW)
+		vcpu_clear_flag(vcpu, DBG_SS_ACTIVE_PENDING);
 
 	return 0;
 }