diff mbox series

[RFC,v3,01/16] KVM: arm64: Initialize VCPU mdcr_el2 before loading it

Message ID 20201027172705.15181-2-alexandru.elisei@arm.com (mailing list archive)
State New, archived
Headers show
Series KVM: arm64: Add Statistical Profiling Extension (SPE) support | expand

Commit Message

Alexandru Elisei Oct. 27, 2020, 5:26 p.m. UTC
When a VCPU is created, the kvm_vcpu struct is initialized to zero in
kvm_vm_ioctl_create_vcpu(). On VHE systems, the first time
vcpu.arch.mdcr_el2 is loaded on hardware is in vcpu_load(), before it is
set to a sensible value in kvm_arm_setup_debug() later in the run loop. The
result is that KVM executes for a short time with MDCR_EL2 set to zero.

This is mostly harmless as we don't need to trap debug and SPE register
accesses from EL1 (we're still running in the host at EL2), but we do set
MDCR_EL2.HPMN to 0 which is constrained unpredictable according to ARM DDI
0487F.b, page D13-3620; the required behavior from the hardware in this
case is to reserve an unkown number of registers for EL2 and EL3 exclusive
use.

Initialize mdcr_el2 in kvm_vcpu_vcpu_first_run_init(), so we can avoid the
constrained unpredictable behavior and to ensure that the MDCR_EL2 register
has the same value after each vcpu_load(), including the first time the
VCPU is run.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arch/arm64/include/asm/kvm_host.h |  1 +
 arch/arm64/kvm/arm.c              |  3 +-
 arch/arm64/kvm/debug.c            | 81 +++++++++++++++++++++----------
 3 files changed, 58 insertions(+), 27 deletions(-)

Comments

James Morse Nov. 19, 2020, 4:58 p.m. UTC | #1
Hi Alex,

On 27/10/2020 17:26, Alexandru Elisei wrote:
> When a VCPU is created, the kvm_vcpu struct is initialized to zero in
> kvm_vm_ioctl_create_vcpu(). On VHE systems, the first time
> vcpu.arch.mdcr_el2 is loaded on hardware is in vcpu_load(), before it is
> set to a sensible value in kvm_arm_setup_debug() later in the run loop. The
> result is that KVM executes for a short time with MDCR_EL2 set to zero.
> 
> This is mostly harmless as we don't need to trap debug and SPE register
> accesses from EL1 (we're still running in the host at EL2), but we do set
> MDCR_EL2.HPMN to 0 which is constrained unpredictable according to ARM DDI
> 0487F.b, page D13-3620; the required behavior from the hardware in this
> case is to reserve an unkown number of registers for EL2 and EL3 exclusive
> use.
> 
> Initialize mdcr_el2 in kvm_vcpu_vcpu_first_run_init(), so we can avoid the
> constrained unpredictable behavior and to ensure that the MDCR_EL2 register
> has the same value after each vcpu_load(), including the first time the
> VCPU is run.


> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
> index 7a7e425616b5..22ee448aee2b 100644
> --- a/arch/arm64/kvm/debug.c
> +++ b/arch/arm64/kvm/debug.c
> @@ -68,6 +68,59 @@ void kvm_arm_init_debug(void)

> +static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu, u32 host_mdcr)
> +{
> +	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
> +
> +	/*
> +	 * This also clears MDCR_EL2_E2PB_MASK to disable guest access
> +	 * to the profiling buffer.
> +	 */
> +	vcpu->arch.mdcr_el2 = host_mdcr & MDCR_EL2_HPMN_MASK;
> +	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
> +				MDCR_EL2_TPMS |
> +				MDCR_EL2_TPMCR |
> +				MDCR_EL2_TDRA |
> +				MDCR_EL2_TDOSA);

> +	if (vcpu->guest_debug) {
> +		/* Route all software debug exceptions to EL2 */
> +		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDE;
> +		if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW)
> +			trap_debug = true;
> +	}

This had me confused for a while... could you hint that this is when the guest is being
'external' debugged by the VMM? (its clear-er before this change)


Thanks,

James


> +	/* Trap debug register access */
> +	if (trap_debug)
> +		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;
> +
> +	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
> +}
Alexandru Elisei Dec. 2, 2020, 2:25 p.m. UTC | #2
Hi James,

Many thanks for having a look at the series!

On 11/19/20 4:58 PM, James Morse wrote:
> Hi Alex,
>
> On 27/10/2020 17:26, Alexandru Elisei wrote:
>> When a VCPU is created, the kvm_vcpu struct is initialized to zero in
>> kvm_vm_ioctl_create_vcpu(). On VHE systems, the first time
>> vcpu.arch.mdcr_el2 is loaded on hardware is in vcpu_load(), before it is
>> set to a sensible value in kvm_arm_setup_debug() later in the run loop. The
>> result is that KVM executes for a short time with MDCR_EL2 set to zero.
>>
>> This is mostly harmless as we don't need to trap debug and SPE register
>> accesses from EL1 (we're still running in the host at EL2), but we do set
>> MDCR_EL2.HPMN to 0 which is constrained unpredictable according to ARM DDI
>> 0487F.b, page D13-3620; the required behavior from the hardware in this
>> case is to reserve an unkown number of registers for EL2 and EL3 exclusive
>> use.
>>
>> Initialize mdcr_el2 in kvm_vcpu_vcpu_first_run_init(), so we can avoid the
>> constrained unpredictable behavior and to ensure that the MDCR_EL2 register
>> has the same value after each vcpu_load(), including the first time the
>> VCPU is run.
>
>> diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
>> index 7a7e425616b5..22ee448aee2b 100644
>> --- a/arch/arm64/kvm/debug.c
>> +++ b/arch/arm64/kvm/debug.c
>> @@ -68,6 +68,59 @@ void kvm_arm_init_debug(void)
>> +static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu, u32 host_mdcr)
>> +{
>> +	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
>> +
>> +	/*
>> +	 * This also clears MDCR_EL2_E2PB_MASK to disable guest access
>> +	 * to the profiling buffer.
>> +	 */
>> +	vcpu->arch.mdcr_el2 = host_mdcr & MDCR_EL2_HPMN_MASK;
>> +	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
>> +				MDCR_EL2_TPMS |
>> +				MDCR_EL2_TPMCR |
>> +				MDCR_EL2_TDRA |
>> +				MDCR_EL2_TDOSA);
>> +	if (vcpu->guest_debug) {
>> +		/* Route all software debug exceptions to EL2 */
>> +		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDE;
>> +		if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW)
>> +			trap_debug = true;
>> +	}
> This had me confused for a while... could you hint that this is when the guest is being
> 'external' debugged by the VMM? (its clear-er before this change)

I can put a comment above the if statement similar to the one from
kvm_arm_setup_debug(), where this code is lifted from:

        /* Is the VCPU being debugged by userspace? */

What do you think?

Thanks,
Alex
diff mbox series

Patch

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0aecbab6a7fb..25d326aecded 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -597,6 +597,7 @@  static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}
 static inline void kvm_arch_vcpu_block_finish(struct kvm_vcpu *vcpu) {}
 
 void kvm_arm_init_debug(void);
+void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_clear_debug(struct kvm_vcpu *vcpu);
 void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index f56122eedffc..e51d8f328c7e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -544,6 +544,8 @@  static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 		static_branch_inc(&userspace_irqchip_in_use);
 	}
 
+	kvm_arm_vcpu_init_debug(vcpu);
+
 	ret = kvm_timer_enable(vcpu);
 	if (ret)
 		return ret;
@@ -739,7 +741,6 @@  int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 		}
 
 		kvm_arm_setup_debug(vcpu);
-
 		/**************************************************************
 		 * Enter the guest
 		 */
diff --git a/arch/arm64/kvm/debug.c b/arch/arm64/kvm/debug.c
index 7a7e425616b5..22ee448aee2b 100644
--- a/arch/arm64/kvm/debug.c
+++ b/arch/arm64/kvm/debug.c
@@ -68,6 +68,59 @@  void kvm_arm_init_debug(void)
 	__this_cpu_write(mdcr_el2, kvm_call_hyp_ret(__kvm_get_mdcr_el2));
 }
 
+/**
+ * kvm_arm_setup_mdcr_el2 - configure vcpu mdcr_el2 value
+ *
+ * @vcpu:	the vcpu pointer
+ * @host_mdcr:  host mdcr_el2 value
+ *
+ * This ensures we will trap access to:
+ *  - Performance monitors (MDCR_EL2_TPM/MDCR_EL2_TPMCR)
+ *  - Debug ROM Address (MDCR_EL2_TDRA)
+ *  - OS related registers (MDCR_EL2_TDOSA)
+ *  - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
+ */
+static void kvm_arm_setup_mdcr_el2(struct kvm_vcpu *vcpu, u32 host_mdcr)
+{
+	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
+
+	/*
+	 * This also clears MDCR_EL2_E2PB_MASK to disable guest access
+	 * to the profiling buffer.
+	 */
+	vcpu->arch.mdcr_el2 = host_mdcr & MDCR_EL2_HPMN_MASK;
+	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
+				MDCR_EL2_TPMS |
+				MDCR_EL2_TPMCR |
+				MDCR_EL2_TDRA |
+				MDCR_EL2_TDOSA);
+
+	if (vcpu->guest_debug) {
+		/* Route all software debug exceptions to EL2 */
+		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDE;
+		if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW)
+			trap_debug = true;
+	}
+
+	/* Trap debug register access */
+	if (trap_debug)
+		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;
+
+	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
+}
+
+/**
+ * kvm_arm_vcpu_init_debug - setup vcpu debug traps
+ *
+ * @vcpu:	the vcpu pointer
+ *
+ * Set vcpu initial mdcr_el2 value.
+ */
+void kvm_arm_vcpu_init_debug(struct kvm_vcpu *vcpu)
+{
+	kvm_arm_setup_mdcr_el2(vcpu, this_cpu_read(mdcr_el2));
+}
+
 /**
  * kvm_arm_reset_debug_ptr - reset the debug ptr to point to the vcpu state
  */
@@ -83,12 +136,7 @@  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
  * @vcpu:	the vcpu pointer
  *
  * This is called before each entry into the hypervisor to setup any
- * debug related registers. Currently this just ensures we will trap
- * access to:
- *  - Performance monitors (MDCR_EL2_TPM/MDCR_EL2_TPMCR)
- *  - Debug ROM Address (MDCR_EL2_TDRA)
- *  - OS related registers (MDCR_EL2_TDOSA)
- *  - Statistical profiler (MDCR_EL2_TPMS/MDCR_EL2_E2PB)
+ * debug related registers.
  *
  * Additionally, KVM only traps guest accesses to the debug registers if
  * the guest is not actively using them (see the KVM_ARM64_DEBUG_DIRTY
@@ -100,27 +148,14 @@  void kvm_arm_reset_debug_ptr(struct kvm_vcpu *vcpu)
 
 void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 {
-	bool trap_debug = !(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY);
 	unsigned long mdscr, orig_mdcr_el2 = vcpu->arch.mdcr_el2;
 
 	trace_kvm_arm_setup_debug(vcpu, vcpu->guest_debug);
 
-	/*
-	 * This also clears MDCR_EL2_E2PB_MASK to disable guest access
-	 * to the profiling buffer.
-	 */
-	vcpu->arch.mdcr_el2 = __this_cpu_read(mdcr_el2) & MDCR_EL2_HPMN_MASK;
-	vcpu->arch.mdcr_el2 |= (MDCR_EL2_TPM |
-				MDCR_EL2_TPMS |
-				MDCR_EL2_TPMCR |
-				MDCR_EL2_TDRA |
-				MDCR_EL2_TDOSA);
+	kvm_arm_setup_mdcr_el2(vcpu, __this_cpu_read(mdcr_el2));
 
 	/* Is Guest debugging in effect? */
 	if (vcpu->guest_debug) {
-		/* Route all software debug exceptions to EL2 */
-		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDE;
-
 		/* Save guest debug state */
 		save_guest_debug_regs(vcpu);
 
@@ -174,7 +209,6 @@  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 
 			vcpu->arch.debug_ptr = &vcpu->arch.external_debug_state;
 			vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
-			trap_debug = true;
 
 			trace_kvm_arm_set_regset("BKPTS", get_num_brps(),
 						&vcpu->arch.debug_ptr->dbg_bcr[0],
@@ -189,10 +223,6 @@  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 	BUG_ON(!vcpu->guest_debug &&
 		vcpu->arch.debug_ptr != &vcpu->arch.vcpu_debug_state);
 
-	/* Trap debug register access */
-	if (trap_debug)
-		vcpu->arch.mdcr_el2 |= MDCR_EL2_TDA;
-
 	/* If KDE or MDE are set, perform a full save/restore cycle. */
 	if (vcpu_read_sys_reg(vcpu, MDSCR_EL1) & (DBG_MDSCR_KDE | DBG_MDSCR_MDE))
 		vcpu->arch.flags |= KVM_ARM64_DEBUG_DIRTY;
@@ -201,7 +231,6 @@  void kvm_arm_setup_debug(struct kvm_vcpu *vcpu)
 	if (has_vhe() && orig_mdcr_el2 != vcpu->arch.mdcr_el2)
 		write_sysreg(vcpu->arch.mdcr_el2, mdcr_el2);
 
-	trace_kvm_arm_set_dreg32("MDCR_EL2", vcpu->arch.mdcr_el2);
 	trace_kvm_arm_set_dreg32("MDSCR_EL1", vcpu_read_sys_reg(vcpu, MDSCR_EL1));
 }