
arm64: Filter out SVE hwcaps when FEAT_SVE isn't implemented

Message ID 20250103142635.1759674-1-maz@kernel.org (mailing list archive)
State New

Commit Message

Marc Zyngier Jan. 3, 2025, 2:26 p.m. UTC
The hwcaps code that exposes SVE features to userspace only
considers ID_AA64ZFR0_EL1, while this is only valid when
ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.

The expectation is that when ID_AA64PFR0_EL1.SVE is 0, the
ID_AA64ZFR0_EL1 register is also 0. So far, so good.

Things become a bit more interesting if the HW implements SME.
In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
features. And these fields overlap with their SVE interpretations.
But the architecture says that the SME and SVE feature sets must
match, so we're still hunky-dory.

This goes wrong if the HW implements SME, but not SVE. In this
case, we end up advertising some SVE features to userspace, even
though the HW has none. That's because we never consider whether
SVE is actually implemented. Oh well.

Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
being non-zero.

Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
---
 arch/arm64/kernel/cpufeature.c | 58 ++++++++++++++++++++++++----------
 1 file changed, 41 insertions(+), 17 deletions(-)

Comments

Catalin Marinas Jan. 3, 2025, 5:20 p.m. UTC | #1
On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> The hwcaps code that exposes SVE features to userspace only
> considers ID_AA64ZFR0_EL1, while this is only valid when
> ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> 
> The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> 
> Things become a bit more interesting if the HW implements SME.
> In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> features. And these fields overlap with their SVE interpretations.
> But the architecture says that the SME and SVE feature sets must
> match, so we're still hunky-dory.
> 
> This goes wrong if the HW implements SME, but not SVE. In this
> case, we end-up advertising some SVE features to userspace, even
> if the HW has none. That's because we never consider whether SVE
> is actually implemented. Oh well.
> 
> Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> being non-zero.
> 
> Reported-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: stable@vger.kernel.org

I'd add:

Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")

While at the time the code was correct, the architecture messed up our
assumptions with the introduction of SME.

> @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.matches = match,						\
>  	}
>  
> +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
> +	{									\
> +		__HWCAP_CAP(#cap, cap_type, cap)				\
> +		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
> +		.matches = match,						\
> +	}

Do we actually need this macro?

> +
>  #ifdef CONFIG_ARM64_PTR_AUTH
>  static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
>  	{
> @@ -3050,6 +3062,18 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
>  };
>  #endif
>  
> +#ifdef CONFIG_ARM64_SVE
> +static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
> +{
> +	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
> +
> +	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
> +		return false;
> +
> +	return has_user_cpuid_feature(cap, scope);
> +}
> +#endif

We can name this has_sve_feature() and use it with the existing
HWCAP_CAP_MATCH() macro. I think it would look identical.

We might even be able to use system_supports_sve() directly and avoid
changing read_scoped_sysreg(). setup_user_features() is called in
smp_cpus_done() after setup_system_features(), so using
system_supports_sve() directly should be fine here.
Catalin Marinas Jan. 3, 2025, 5:39 p.m. UTC | #2
On Fri, Jan 03, 2025 at 05:20:05PM +0000, Catalin Marinas wrote:
> On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> > @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >  		.matches = match,						\
> >  	}
> >  
> > +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
> > +	{									\
> > +		__HWCAP_CAP(#cap, cap_type, cap)				\
> > +		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
> > +		.matches = match,						\
> > +	}
> 
> Do we actually need this macro?

Ignore me, we still need this macro as HWCAP_CAP_MATCH does not take all
the arguments. Maybe not the read_scoped_sysreg() change though.
Marc Zyngier Jan. 3, 2025, 5:58 p.m. UTC | #3
On Fri, 03 Jan 2025 17:20:05 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> > The hwcaps code that exposes SVE features to userspace only
> > considers ID_AA64ZFR0_EL1, while this is only valid when
> > ID_AA64PFR0_EL1.SVE advertises that SVE is actually supported.
> > 
> > The expectations are that when ID_AA64PFR0_EL1.SVE is 0, the
> > ID_AA64ZFR0_EL1 register is also 0. So far, so good.
> > 
> > Things become a bit more interesting if the HW implements SME.
> > In this case, a few ID_AA64ZFR0_EL1 fields indicate *SME*
> > features. And these fields overlap with their SVE interpretations.
> > But the architecture says that the SME and SVE feature sets must
> > match, so we're still hunky-dory.
> > 
> > This goes wrong if the HW implements SME, but not SVE. In this
> > case, we end-up advertising some SVE features to userspace, even
> > if the HW has none. That's because we never consider whether SVE
> > is actually implemented. Oh well.
> > 
> > Fix it by restricting all SVE capabilities to ID_AA64PFR0_EL1.SVE
> > being non-zero.
> > 
> > Reported-by: Catalin Marinas <catalin.marinas@arm.com>
> > Signed-off-by: Marc Zyngier <maz@kernel.org>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: stable@vger.kernel.org
> 
> I'd add:
> 
> Fixes: 06a916feca2b ("arm64: Expose SVE2 features for userspace")
> 
> While at the time the code was correct, the architecture messed up our
> assumptions with the introduction of SME.

Good point.

> 
> > @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> >  		.matches = match,						\
> >  	}
> >  
> > +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
> > +	{									\
> > +		__HWCAP_CAP(#cap, cap_type, cap)				\
> > +		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
> > +		.matches = match,						\
> > +	}
> 
> Do we actually need this macro?

It is either this macro, or a large-ish switch/case statement doing
the same thing. See below.

> 
> > +
> >  #ifdef CONFIG_ARM64_PTR_AUTH
> >  static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
> >  	{
> > @@ -3050,6 +3062,18 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
> >  };
> >  #endif
> >  
> > +#ifdef CONFIG_ARM64_SVE
> > +static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
> > +{
> > +	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
> > +
> > +	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
> > +		return false;
> > +
> > +	return has_user_cpuid_feature(cap, scope);
> > +}
> > +#endif
> 
> We can name this has_sve_feature() and use it with the existing
> HWCAP_CAP_MATCH() macro. I think it would look identical.

I don't think that works. HWCAP_CAP_MATCH() doesn't take the
reg/field/limit information that we need to compute the
capability. Without such information neatly populated in
arm64_cpu_capabilities, you can't call has_user_cpuid_feature().

A previous incarnation of the same patch was using that macro. But you
then end up having to map the cap to its field/limit and perform the
check "by hand". Roughly, this would look like this:

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d793ca08549cd..76566a8bcdd3c 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -3065,12 +3065,22 @@ static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
 #ifdef CONFIG_ARM64_SVE
 static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
 {
-	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
+	u64 zfr0;
 
-	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
+	if (!system_supports_sve())
 		return false;
 
-	return has_user_cpuid_feature(cap, scope);
+	zfr0 = __read_scoped_sysreg(SYS_ID_AA64ZFR0_EL1, scope);
+
+	switch (cap->cap) {
+	case KERNEL_HWCAP_SVE2P1:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2p1;
+	case KERNEL_HWCAP_SVE2:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, SVEver, zfr0) >= ID_AA64ZFR0_EL1_SVEver_SVE2;
+	case KERNEL_HWCAP_SVEAES:
+		return SYS_FIELD_GET(ID_AA64ZFR0_EL1, AES, zfr0) >= ID_AA64ZFR0_EL1_AES_IMP;
+	[...]
+	}
 }
 #endif
 
Frankly, I don't think this is worth it, and you still need to hack
read_scoped_sysreg().

> 
> We might even be able to use system_supports_sve() directly and avoid
> changing read_scoped_sysreg(). setup_user_features() is called in
> smp_cpus_done() after setup_system_features(), so using
> system_supports_sve() directly should be fine here.

Yeah, that should work.

Thanks,

	M.
Marc Zyngier Jan. 3, 2025, 6 p.m. UTC | #4
On Fri, 03 Jan 2025 17:39:39 +0000,
Catalin Marinas <catalin.marinas@arm.com> wrote:
> 
> On Fri, Jan 03, 2025 at 05:20:05PM +0000, Catalin Marinas wrote:
> > On Fri, Jan 03, 2025 at 02:26:35PM +0000, Marc Zyngier wrote:
> > > @@ -3022,6 +3027,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> > >  		.matches = match,						\
> > >  	}
> > >  
> > > +#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
> > > +	{									\
> > > +		__HWCAP_CAP(#cap, cap_type, cap)				\
> > > +		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
> > > +		.matches = match,						\
> > > +	}
> > 
> > Do we actually need this macro?
> 
> Ignore me, we still need this macro as HWCAP_CAP_MATCH does not take all
> the arguments. Maybe not the read_scoped_sysreg() change though.

Ah, I just replied to your earlier email. Agreed on avoiding the
read_scoped_sysreg() change, replacing it with system_supports_sve().

I'll respin the change now.

	M.

Patch

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 6ce71f444ed84..d793ca08549cd 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1593,14 +1593,19 @@  feature_matches(u64 reg, const struct arm64_cpu_capabilities *entry)
 	return val >= min && val <= max;
 }
 
-static u64
-read_scoped_sysreg(const struct arm64_cpu_capabilities *entry, int scope)
+static u64 __read_scoped_sysreg(u64 reg, int scope)
 {
 	WARN_ON(scope == SCOPE_LOCAL_CPU && preemptible());
 	if (scope == SCOPE_SYSTEM)
-		return read_sanitised_ftr_reg(entry->sys_reg);
+		return read_sanitised_ftr_reg(reg);
 	else
-		return __read_sysreg_by_encoding(entry->sys_reg);
+		return __read_sysreg_by_encoding(reg);
+}
+
+static u64
+read_scoped_sysreg(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	return __read_scoped_sysreg(entry->sys_reg, scope);
 }
 
 static bool
@@ -3022,6 +3027,13 @@  static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = match,						\
 	}
 
+#define HWCAP_CAP_MATCH_ID(match, reg, field, min_value, cap_type, cap)		\
+	{									\
+		__HWCAP_CAP(#cap, cap_type, cap)				\
+		HWCAP_CPUID_MATCH(reg, field, min_value) 			\
+		.matches = match,						\
+	}
+
 #ifdef CONFIG_ARM64_PTR_AUTH
 static const struct arm64_cpu_capabilities ptr_auth_hwcap_addr_matches[] = {
 	{
@@ -3050,6 +3062,18 @@  static const struct arm64_cpu_capabilities ptr_auth_hwcap_gen_matches[] = {
 };
 #endif
 
+#ifdef CONFIG_ARM64_SVE
+static bool has_sve(const struct arm64_cpu_capabilities *cap, int scope)
+{
+	u64 aa64pfr0 = __read_scoped_sysreg(SYS_ID_AA64PFR0_EL1, scope);
+
+	if (FIELD_GET(ID_AA64PFR0_EL1_SVE, aa64pfr0) < ID_AA64PFR0_EL1_SVE_IMP)
+		return false;
+
+	return has_user_cpuid_feature(cap, scope);
+}
+#endif
+
 static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, PMULL, CAP_HWCAP, KERNEL_HWCAP_PMULL),
 	HWCAP_CAP(ID_AA64ISAR0_EL1, AES, AES, CAP_HWCAP, KERNEL_HWCAP_AES),
@@ -3092,19 +3116,19 @@  static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
 #ifdef CONFIG_ARM64_SVE
 	HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
-	HWCAP_CAP(ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SVEver, SVE2p1, CAP_HWCAP, KERNEL_HWCAP_SVE2P1),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SVEver, SVE2, CAP_HWCAP, KERNEL_HWCAP_SVE2),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, AES, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEAES),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, AES, PMULL128, CAP_HWCAP, KERNEL_HWCAP_SVEPMULL),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, BitPerm, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBITPERM),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE_B16B16),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, BF16, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEBF16),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_SVE_EBF16),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SHA3, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESHA3),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, SM4, IMP, CAP_HWCAP, KERNEL_HWCAP_SVESM4),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEI8MM),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, F32MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF32MM),
+	HWCAP_CAP_MATCH_ID(has_sve, ID_AA64ZFR0_EL1, F64MM, IMP, CAP_HWCAP, KERNEL_HWCAP_SVEF64MM),
 #endif
 #ifdef CONFIG_ARM64_GCS
 	HWCAP_CAP(ID_AA64PFR1_EL1, GCS, IMP, CAP_HWCAP, KERNEL_HWCAP_GCS),