
[RFC,v2,3/3] KVM: arm64: Enable errata based on migration target CPUs

Message ID 20241024094012.29452-4-shameerali.kolothum.thodi@huawei.com
State New, archived
Series KVM: arm64: Errata management for VM Live migration

Commit Message

Shameer Kolothum Oct. 24, 2024, 9:40 a.m. UTC
If the Guest has migration target CPUs set, enable all errata
that are based on target MIDR/REVIDR.

Also make sure we call the paravirt helper to retrieve migration
targets if any.

Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
---
 arch/arm64/kernel/cpu_errata.c | 46 ++++++++++++++++++++++++++--------
 arch/arm64/kernel/cpufeature.c |  3 +++
 2 files changed, 39 insertions(+), 10 deletions(-)

Comments

Oliver Upton Oct. 25, 2024, 1:36 a.m. UTC | #1
nitpick: shortlog shouldn't use a KVM prefix if the patch isn't touching
KVM.

On Thu, Oct 24, 2024 at 10:40:12AM +0100, Shameer Kolothum wrote:
> If the Guest has migration target CPUs set, enable all errata
> that are based on target MIDR/REVIDR.
> 
> Also make sure we call the paravirt helper to retrieve migration
> targets if any.
> 
> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>

I don't know if you saw my suggestion on v1 [*], but it'd be great if we
can hide the array of implementations from users of is_midr_in_range()
and friends.

There's other junk keyed off MIDR (e.g. Spectre) that also needs to be
aware of all the implementations where the VM might run. The easiest way
to do that is to stop using a caller-provided MIDR and have
is_midr_in_range() either walk the array of implementations or read
MIDR_EL1.

[*]: https://lore.kernel.org/kvmarm/ZwlbTCwoKQyh3vmF@linux.dev/
Shameer Kolothum Oct. 28, 2024, 5:29 p.m. UTC | #2
> nitpick: shortlog shouldn't use a KVM prefix if the patch isn't touching
> KVM.
> 
> On Thu, Oct 24, 2024 at 10:40:12AM +0100, Shameer Kolothum wrote:
> > If the Guest has migration target CPUs set, enable all errata
> > that are based on target MIDR/REVIDR.
> >
> > Also make sure we call the paravirt helper to retrieve migration
> > targets if any.
> >
> > Signed-off-by: Shameer Kolothum
> <shameerali.kolothum.thodi@huawei.com>
> 
> I don't know if you saw my suggestion on v1 [*], but it'd be great if we
> can hide the array of implementations from users of is_midr_in_range()
> and friends.

I did see your suggestion but, my bad, I misunderstood it and thought you
were referring to the _midr_range() functions in cpu_errata.c only.
 
> There's other junk keyed off MIDR (e.g. Spectre) that also needs to be
> aware of all the implementations where the VM might run. The easiest way
> to do that is to stop using a caller-provided MIDR and have
> is_midr_in_range() either walk the array of implementations or read
> MIDR_EL1.

So the suggestion is to use something like this?

bool is_midr_in_range(struct midr_range const *range)
{
        int i;

        for (i = 0; i < errata_migrn_target_num; i++) {
                if (midr_is_cpu_model_range(errata_migrn_target_cpus[i].midr,
                                            range->model,
                                            range->rv_min, range->rv_max))
                        return true;
        }

        return midr_is_cpu_model_range(read_cpuid_id(), range->model,
                                       range->rv_min, range->rv_max);
}

Or do we need some kind of hint to these functions to specify which
MIDR to use? I think there are at least a couple of places where it
makes sense to do the check against MIDR_EL1 only
(e.g. has_neoverse_n1_erratum_1542419()).

Thanks,
Shameer
Oliver Upton Oct. 30, 2024, 4:33 a.m. UTC | #3
Hey Shameer,

On Mon, Oct 28, 2024 at 05:29:14PM +0000, Shameerali Kolothum Thodi wrote:
> > I don't know if you saw my suggestion on v1 [*], but it'd be great if we
> > can hide the array of implementations from users of is_midr_in_range()
> > and friends.
> 
> I did see your suggestion but, my bad, I misunderstood it and thought you
> were referring to the _midr_range() functions in cpu_errata.c only.

Ah, no worries. And yeah, for this to be a general solution we're going
to need to cover all our bases where MIDR checks are used.

> > There's other junk keyed off MIDR (e.g. Spectre) that also needs to be
> > aware of all the implementations where the VM might run. The easiest way
> > to do that is to stop using a caller-provided MIDR and have
> > is_midr_in_range() either walk the array of implementations or read
> > MIDR_EL1.
> 
> So the suggestion is to use something like this?
> 
> bool is_midr_in_range(struct midr_range const *range)
> {
>         int i;
> 
>         for (i = 0; i < errata_migrn_target_num; i++) {
>                 if (midr_is_cpu_model_range(errata_migrn_target_cpus[i].midr,
>                                             range->model,
>                                             range->rv_min, range->rv_max))
>                         return true;
>         }
> 
>         return midr_is_cpu_model_range(read_cpuid_id(), range->model,
>                                        range->rv_min, range->rv_max);
> }

Yep, pretty much like that. Except that the guest should either use the
values from the hypercall *or* the value of MIDR_EL1, not both.
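
Something like the below, as a rough sketch only, reusing the
errata_migrn_target_* names from this patch (the final interface may
well differ):

bool is_midr_in_range(struct midr_range const *range)
{
	int i;

	/* Prefer the migration targets advertised by the hypervisor... */
	if (errata_migrn_target_num) {
		for (i = 0; i < errata_migrn_target_num; i++) {
			if (midr_is_cpu_model_range(errata_migrn_target_cpus[i].midr,
						    range->model,
						    range->rv_min, range->rv_max))
				return true;
		}
		/* ...and deliberately ignore the local MIDR_EL1. */
		return false;
	}

	/* No hypercall data: check the CPU we are actually running on. */
	return midr_is_cpu_model_range(read_cpuid_id(), range->model,
				       range->rv_min, range->rv_max);
}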

> Or do we need some kind of hint to these functions to specify which
> MIDR to use? I think there are at least a couple of places where it
> makes sense to do the check against MIDR_EL1 only
> (e.g. has_neoverse_n1_erratum_1542419()).

I don't think we need a hint or anything. The fact that the hypervisor
is exposing this hypercall interface indicates the MIDR of the
implementation the vCPU currently runs on is subject to change.

So the concept of a per-CPU erratum is completely out of the window, and
every CPU is subject to the errata of every implementation advertised by
the hypervisor.
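
Put differently, callers would then stop reading MIDR themselves. As a
sketch only (not a literal diff of any current caller), a list-based
check would go from

	if (is_midr_in_range_list(read_cpuid_id(), entry->midr_range_list))

to

	if (is_midr_in_range_list(entry->midr_range_list))

with is_midr_in_range_list() doing the walk over the advertised
implementations internally.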

Patch

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index feaaf2b11f46..df6d32dfd5c0 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -18,17 +18,14 @@  u32 __ro_after_init errata_migrn_target_num;
 struct migrn_target __ro_after_init errata_migrn_target_cpus[MAX_MIGRN_TARGET_CPUS];
 
 static bool __maybe_unused
-is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
+__is_affected_midr_range(const struct arm64_cpu_capabilities *entry, u32 midr, u32 revidr)
 {
 	const struct arm64_midr_revidr *fix;
-	u32 midr = read_cpuid_id(), revidr;
 
-	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
 	if (!is_midr_in_range(midr, &entry->midr_range))
 		return false;
 
 	midr &= MIDR_REVISION_MASK | MIDR_VARIANT_MASK;
-	revidr = read_cpuid(REVIDR_EL1);
 	for (fix = entry->fixed_revs; fix && fix->revidr_mask; fix++)
 		if (midr == fix->midr_rv && (revidr & fix->revidr_mask))
 			return false;
@@ -37,27 +34,56 @@  is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
 }
 
 static bool __maybe_unused
-is_affected_midr_range_list(const struct arm64_cpu_capabilities *entry,
-			    int scope)
+is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
 {
+	int i;
+
+	for (i = 0; i < errata_migrn_target_num; i++) {
+		if (__is_affected_midr_range(entry, errata_migrn_target_cpus[i].midr,
+					     errata_migrn_target_cpus[i].revidr))
+			return true;
+	}
 	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
-	return is_midr_in_range_list(read_cpuid_id(), entry->midr_range_list);
+	return __is_affected_midr_range(entry, read_cpuid_id(), read_cpuid(REVIDR_EL1));
 }
 
 static bool __maybe_unused
-is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
+is_affected_midr_range_list(const struct arm64_cpu_capabilities *entry,
+			    int scope)
 {
-	u32 model;
+	int i;
 
+	for (i = 0; i < errata_migrn_target_num; i++) {
+		if (is_midr_in_range_list(errata_migrn_target_cpus[i].midr,
+					  entry->midr_range_list))
+			return true;
+	}
 	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
+	return is_midr_in_range_list(read_cpuid_id(), entry->midr_range_list);
+}
 
-	model = read_cpuid_id();
+static bool __maybe_unused
+__is_kryo_midr(const struct arm64_cpu_capabilities *entry, u32 model)
+{
 	model &= MIDR_IMPLEMENTOR_MASK | (0xf00 << MIDR_PARTNUM_SHIFT) |
 		 MIDR_ARCHITECTURE_MASK;
 
 	return model == entry->midr_range.model;
 }
 
+static bool __maybe_unused
+is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	int i;
+
+	for (i = 0; i < errata_migrn_target_num; i++) {
+		if (__is_kryo_midr(entry, errata_migrn_target_cpus[i].midr))
+			return true;
+	}
+	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
+	return __is_kryo_midr(entry, read_cpuid_id());
+}
+
 static bool
 has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 			  int scope)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index f7fd7e3259e4..390f4ffa773c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -86,6 +86,7 @@ 
 #include <asm/mmu_context.h>
 #include <asm/mpam.h>
 #include <asm/mte.h>
+#include <asm/paravirt.h>
 #include <asm/processor.h>
 #include <asm/smp.h>
 #include <asm/sysreg.h>
@@ -3597,6 +3598,8 @@  unsigned long cpu_get_elf_hwcap2(void)
 
 static void __init setup_boot_cpu_capabilities(void)
 {
+	pv_errata_migrn_target_init();
+
 	/*
 	 * The boot CPU's feature register values have been recorded. Detect
 	 * boot cpucaps and local cpucaps for the boot CPU, then enable and