[1/2] KVM: arm64: Don't write junk to sysregs on reset

Message ID 20190805121555.130897-2-maz@kernel.org (mailing list archive)
State Mainlined
Commit 03fdfb2690099c19160a3f2c5b77db60b3afeded
Series KVM: arm/arm64: Revamp sysreg reset checks

Commit Message

Marc Zyngier Aug. 5, 2019, 12:15 p.m. UTC
At the moment, the way we reset system registers is mildly insane:
We write junk to them, call the reset functions, and then check that
we have something else in them.

The "fun" thing is that this can happen while the guest is running
(PSCI, for example). If anything in KVM has to evaluate the state
of a system register while junk is in there, bad things may happen.

Let's stop doing that. Instead, we track that we have called a
reset function for that register, and assume that the reset
function has done something. This requires fixing a couple of
sysreg definitions in the trap table.

In the end, the very need of this reset check is pretty dubious,
as it doesn't check everything (a lot of the sysregs live outside of
the sys_regs[] array). It may well be axed in the near future.

Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 arch/arm64/kvm/sys_regs.c | 32 ++++++++++++++++++--------------
 1 file changed, 18 insertions(+), 14 deletions(-)
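
To make the hazard described in the commit message concrete, here is a minimal user-space sketch of the old poison-then-check scheme. It is only a model: `struct toy_vcpu`, the `TOY_*` register indices and every `toy_*` function are hypothetical names invented for illustration, not KVM code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register indices; index 0 is reserved as "invalid". */
enum { TOY_INVALID, TOY_REG_A, TOY_REG_B, TOY_NR_REGS };

/* Hypothetical stand-in for the per-vcpu register file. */
struct toy_vcpu {
	uint64_t sys_regs[TOY_NR_REGS];
};

#define TOY_POISON 0x4242424242424242ULL

/* Old scheme, step 1: fill the whole register file with a junk pattern. */
static void toy_poison(struct toy_vcpu *vcpu)
{
	memset(vcpu->sys_regs, 0x42, sizeof(vcpu->sys_regs));
}

/* Old scheme, step 2: the reset functions overwrite the junk. */
static void toy_reset_all(struct toy_vcpu *vcpu)
{
	vcpu->sys_regs[TOY_REG_A] = 0;
	vcpu->sys_regs[TOY_REG_B] = 0x10;
}

/* Old scheme, step 3: a register still holding junk was never reset. */
static int toy_count_unreset(const struct toy_vcpu *vcpu)
{
	int i, missed = 0;

	for (i = 1; i < TOY_NR_REGS; i++)
		if (vcpu->sys_regs[i] == TOY_POISON)
			missed++;
	return missed;
}

/* Returns what a reader sees if it looks between steps 1 and 2. */
static uint64_t toy_observe_mid_reset(void)
{
	struct toy_vcpu vcpu = { { 0 } };
	uint64_t seen;

	vcpu.sys_regs[TOY_REG_B] = 0x10;	/* state before the reset */
	toy_poison(&vcpu);
	seen = vcpu.sys_regs[TOY_REG_B];	/* junk is visible here */
	toy_reset_all(&vcpu);
	assert(toy_count_unreset(&vcpu) == 0);	/* the final check still passes */
	return seen;
}
```

In this model the final check passes, yet anything that reads the register between the poisoning and the reset observes the junk pattern, which is the window the patch closes.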

Comments

Zenghui Yu Aug. 6, 2019, 6:29 a.m. UTC | #1
Hi Marc,

On 2019/8/5 20:15, Marc Zyngier wrote:
> At the moment, the way we reset system registers is mildly insane:
> We write junk to them, call the reset functions, and then check that
> we have something else in them.
> 
> The "fun" thing is that this can happen while the guest is running
> (PSCI, for example). If anything in KVM has to evaluate the state
> of a system register while junk is in there, bad things may happen.
> 
> Let's stop doing that. Instead, we track that we have called a
> reset function for that register, and assume that the reset
> function has done something. This requires fixing a couple of
> sysreg definitions in the trap table.
> 
> In the end, the very need of this reset check is pretty dubious,
> as it doesn't check everything (a lot of the sysregs live outside of
> the sys_regs[] array). It may well be axed in the near future.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>

(Regardless of whether this check is needed or not,) I tested this patch
with kvm-unit-tests:

for i in {1..100}; do QEMU=/path/to/qemu-system-aarch64 accel=kvm arch=arm64 ./run_tests.sh; done

And all the tests passed!


Thanks,
zenghui

Marc Zyngier Aug. 6, 2019, 8:35 a.m. UTC | #2
On 06/08/2019 07:29, Zenghui Yu wrote:
> Hi Marc,
> 
> On 2019/8/5 20:15, Marc Zyngier wrote:
>> At the moment, the way we reset system registers is mildly insane:
>> We write junk to them, call the reset functions, and then check that
>> we have something else in them.
>>
>> The "fun" thing is that this can happen while the guest is running
>> (PSCI, for example). If anything in KVM has to evaluate the state
>> of a system register while junk is in there, bad things may happen.
>>
>> Let's stop doing that. Instead, we track that we have called a
>> reset function for that register, and assume that the reset
>> function has done something. This requires fixing a couple of
>> sysreg definitions in the trap table.
>>
>> In the end, the very need of this reset check is pretty dubious,
>> as it doesn't check everything (a lot of the sysregs live outside of
>> the sys_regs[] array). It may well be axed in the near future.
>>
>> Signed-off-by: Marc Zyngier <maz@kernel.org>
> 
> (Regardless of whether this check is needed or not,) I tested this patch
> with kvm-unit-tests:
> 
> for i in {1..100}; do QEMU=/path/to/qemu-system-aarch64 accel=kvm arch=arm64 ./run_tests.sh; done
> 
> And all the tests passed!

Great! Can I take this as a 'Tested-by:'?

Thanks,

	M.

Zenghui Yu Aug. 6, 2019, 8:52 a.m. UTC | #3
On 2019/8/6 16:35, Marc Zyngier wrote:
> On 06/08/2019 07:29, Zenghui Yu wrote:
>> Hi Marc,
>>
>> On 2019/8/5 20:15, Marc Zyngier wrote:
>>> At the moment, the way we reset system registers is mildly insane:
>>> We write junk to them, call the reset functions, and then check that
>>> we have something else in them.
>>>
>>> The "fun" thing is that this can happen while the guest is running
>>> (PSCI, for example). If anything in KVM has to evaluate the state
>>> of a system register while junk is in there, bad things may happen.
>>>
>>> Let's stop doing that. Instead, we track that we have called a
>>> reset function for that register, and assume that the reset
>>> function has done something. This requires fixing a couple of
>>> sysreg definitions in the trap table.
>>>
>>> In the end, the very need of this reset check is pretty dubious,
>>> as it doesn't check everything (a lot of the sysregs live outside of
>>> the sys_regs[] array). It may well be axed in the near future.
>>>
>>> Signed-off-by: Marc Zyngier <maz@kernel.org>
>>
>> (Regardless of whether this check is needed or not,) I tested this patch
>> with kvm-unit-tests:
>>
>> for i in {1..100}; do QEMU=/path/to/qemu-system-aarch64 accel=kvm arch=arm64 ./run_tests.sh; done
>>
>> And all the tests passed!
> 
> Great! Can I take this as a 'Tested-by:'?

Yes, please add:

Tested-by: Zenghui Yu <yuzenghui@huawei.com>


Zenghui

Patch

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index f26e181d881c..2071260a275b 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -632,7 +632,7 @@  static void reset_pmcr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 	 */
 	val = ((pmcr & ~ARMV8_PMU_PMCR_MASK)
 	       | (ARMV8_PMU_PMCR_MASK & 0xdecafbad)) & (~ARMV8_PMU_PMCR_E);
-	__vcpu_sys_reg(vcpu, PMCR_EL0) = val;
+	__vcpu_sys_reg(vcpu, r->reg) = val;
 }
 
 static bool check_pmu_access_disabled(struct kvm_vcpu *vcpu, u64 flags)
@@ -981,13 +981,13 @@  static bool access_pmuserenr(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
 /* Silly macro to expand the DBG{BCR,BVR,WVR,WCR}n_EL1 registers in one go */
 #define DBG_BCR_BVR_WCR_WVR_EL1(n)					\
 	{ SYS_DESC(SYS_DBGBVRn_EL1(n)),					\
-	  trap_bvr, reset_bvr, n, 0, get_bvr, set_bvr },		\
+	  trap_bvr, reset_bvr, 0, 0, get_bvr, set_bvr },		\
 	{ SYS_DESC(SYS_DBGBCRn_EL1(n)),					\
-	  trap_bcr, reset_bcr, n, 0, get_bcr, set_bcr },		\
+	  trap_bcr, reset_bcr, 0, 0, get_bcr, set_bcr },		\
 	{ SYS_DESC(SYS_DBGWVRn_EL1(n)),					\
-	  trap_wvr, reset_wvr, n, 0,  get_wvr, set_wvr },		\
+	  trap_wvr, reset_wvr, 0, 0,  get_wvr, set_wvr },		\
 	{ SYS_DESC(SYS_DBGWCRn_EL1(n)),					\
-	  trap_wcr, reset_wcr, n, 0,  get_wcr, set_wcr }
+	  trap_wcr, reset_wcr, 0, 0,  get_wcr, set_wcr }
 
 /* Macro to expand the PMEVCNTRn_EL0 register */
 #define PMU_PMEVCNTR_EL0(n)						\
@@ -1540,7 +1540,7 @@  static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
 	{ SYS_DESC(SYS_CTR_EL0), access_ctr },
 
-	{ SYS_DESC(SYS_PMCR_EL0), access_pmcr, reset_pmcr, },
+	{ SYS_DESC(SYS_PMCR_EL0), access_pmcr, reset_pmcr, PMCR_EL0 },
 	{ SYS_DESC(SYS_PMCNTENSET_EL0), access_pmcnten, reset_unknown, PMCNTENSET_EL0 },
 	{ SYS_DESC(SYS_PMCNTENCLR_EL0), access_pmcnten, NULL, PMCNTENSET_EL0 },
 	{ SYS_DESC(SYS_PMOVSCLR_EL0), access_pmovs, NULL, PMOVSSET_EL0 },
@@ -2254,13 +2254,19 @@  static int emulate_sys_reg(struct kvm_vcpu *vcpu,
 }
 
 static void reset_sys_reg_descs(struct kvm_vcpu *vcpu,
-			      const struct sys_reg_desc *table, size_t num)
+				const struct sys_reg_desc *table, size_t num,
+				unsigned long *bmap)
 {
 	unsigned long i;
 
 	for (i = 0; i < num; i++)
-		if (table[i].reset)
+		if (table[i].reset) {
+			int reg = table[i].reg;
+
 			table[i].reset(vcpu, &table[i]);
+			if (reg > 0 && reg < NR_SYS_REGS)
+				set_bit(reg, bmap);
+		}
 }
 
 /**
@@ -2774,18 +2780,16 @@  void kvm_reset_sys_regs(struct kvm_vcpu *vcpu)
 {
 	size_t num;
 	const struct sys_reg_desc *table;
-
-	/* Catch someone adding a register without putting in reset entry. */
-	memset(&vcpu->arch.ctxt.sys_regs, 0x42, sizeof(vcpu->arch.ctxt.sys_regs));
+	DECLARE_BITMAP(bmap, NR_SYS_REGS) = { 0, };
 
 	/* Generic chip reset first (so target could override). */
-	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs));
+	reset_sys_reg_descs(vcpu, sys_reg_descs, ARRAY_SIZE(sys_reg_descs), bmap);
 
 	table = get_target_table(vcpu->arch.target, true, &num);
-	reset_sys_reg_descs(vcpu, table, num);
+	reset_sys_reg_descs(vcpu, table, num, bmap);
 
 	for (num = 1; num < NR_SYS_REGS; num++) {
-		if (WARN(__vcpu_sys_reg(vcpu, num) == 0x4242424242424242,
+		if (WARN(!test_bit(num, bmap),
 			 "Didn't reset __vcpu_sys_reg(%zi)\n", num))
 			break;
 	}
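
For readers who want to experiment with the tracking scheme outside the kernel, the following is a rough user-space model of what the patch does: each descriptor that has a reset function and a valid register index marks a bit, and the final loop checks the bits instead of the register contents. All names here (`struct toy_desc`, `toy_reset_descs`, `toy_set_bit`, and so on) are hypothetical, and the bit helpers are plain non-atomic stand-ins, not the kernel's `set_bit()`/`test_bit()`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register indices; index 0 means "no sys_regs[] slot". */
enum { TOY_INVALID, TOY_REG_A, TOY_REG_B, TOY_REG_C, TOY_NR_REGS };

struct toy_vcpu {
	uint64_t sys_regs[TOY_NR_REGS];
};

/* Hypothetical, trimmed-down counterpart of a sysreg descriptor. */
struct toy_desc {
	void (*reset)(struct toy_vcpu *vcpu, const struct toy_desc *r);
	int reg;
};

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define TOY_BITMAP_LONGS \
	((TOY_NR_REGS + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Plain (non-atomic) bit helpers, assumed sufficient for this model. */
static void toy_set_bit(int nr, unsigned long *bmap)
{
	assert(nr >= 0 && nr < TOY_NR_REGS);
	bmap[nr / TOY_BITS_PER_LONG] |= 1UL << (nr % TOY_BITS_PER_LONG);
}

static int toy_test_bit(int nr, const unsigned long *bmap)
{
	return (bmap[nr / TOY_BITS_PER_LONG] >> (nr % TOY_BITS_PER_LONG)) & 1UL;
}

/* The reset function writes through r->reg, as reset_pmcr() now does. */
static void toy_reset_zero(struct toy_vcpu *vcpu, const struct toy_desc *r)
{
	vcpu->sys_regs[r->reg] = 0;
}

/* Call every reset function and record which register it covered. */
static void toy_reset_descs(struct toy_vcpu *vcpu, const struct toy_desc *table,
			    size_t num, unsigned long *bmap)
{
	size_t i;

	for (i = 0; i < num; i++) {
		if (!table[i].reset)
			continue;
		table[i].reset(vcpu, &table[i]);
		if (table[i].reg > 0 && table[i].reg < TOY_NR_REGS)
			toy_set_bit(table[i].reg, bmap);
	}
}

/* Returns the first register index with no recorded reset, or 0 if none. */
static int toy_first_unreset(const unsigned long *bmap)
{
	int i;

	for (i = 1; i < TOY_NR_REGS; i++)
		if (!toy_test_bit(i, bmap))
			return i;
	return 0;
}

/* Reset using only the first num descriptors and report the first gap. */
static int toy_demo(size_t num)
{
	static const struct toy_desc table[] = {
		{ toy_reset_zero, TOY_REG_A },
		{ toy_reset_zero, TOY_REG_B },
		{ toy_reset_zero, TOY_REG_C },
	};
	struct toy_vcpu vcpu = { { 0 } };
	unsigned long bmap[TOY_BITMAP_LONGS] = { 0 };

	assert(num <= sizeof(table) / sizeof(table[0]));
	toy_reset_descs(&vcpu, table, num, bmap);
	return toy_first_unreset(bmap);
}
```

With all three descriptors the model reports no gap, and dropping the last one makes it report `TOY_REG_C`. Like the patch, the model trusts that a reset function which was called actually wrote its register, and it never places junk in the register file.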