diff mbox series

arm64/sme: Add hwcap for Scalable Matrix Extension

Message ID 20220414115544.36204-1-tianjia.zhang@linux.alibaba.com (mailing list archive)
State New, archived

Commit Message

tianjia.zhang April 14, 2022, 11:55 a.m. UTC
Allow userspace to detect support for SME (Scalable Matrix Extension)
by providing a hwcap for it, using the official feature name FEAT_SME,
declared in the ARM DDI 0487H.a specification.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
---
 Documentation/arm64/elf_hwcaps.rst  |  4 ++++
 arch/arm64/include/asm/hwcap.h      |  1 +
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/include/uapi/asm/hwcap.h |  1 +
 arch/arm64/kernel/cpufeature.c      | 13 +++++++++++++
 arch/arm64/kernel/cpuinfo.c         |  1 +
 arch/arm64/tools/cpucaps            |  1 +
 7 files changed, 22 insertions(+)

Comments

Mark Brown April 14, 2022, 12:02 p.m. UTC | #1
On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:

> Allow userspace to detect support for SME (Scalable Matrix Extension)
> by providing a hwcap for it, using the official feature name FEAT_SME,
> declared in ARM DDI 0487H.a specification.

There's already a hwcap for the core feature and all the subfeatures
added as part of the series I've been posting for SME:

   https://lore.kernel.org/linux-arm-kernel/20220408114328.1401034-1-broonie@kernel.org/

Why add something independently, especially given that there is no way
for userspace to do anything constructive with the feature without the
rest of the kernel support?  Any attempt to use SME instructions without
kernel support will trap and generate a SIGILL even if the feature is
present in hardware.

Do you have a system with SME that you're trying to use?  Review/testing
on the current series would be appreciated.
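
For readers following along, a minimal sketch of how userspace would consume such a hwcap once a kernel exports it. The bit value `(1 << 23)` is copied from the uapi hunk of the patch under discussion and is assumed to match the kernel actually in use; `hwcap2_has()` and `sme_advertised()` are hypothetical helper names, not part of any kernel or libc API. Checking the hwcap before executing any SME instruction matters because, as noted above, those instructions raise SIGILL when the kernel has not enabled the feature.

```c
#include <stdbool.h>
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP2 */
#endif

/* Assumption: bit value taken from the uapi hunk in this patch. */
#ifndef HWCAP2_SME
#define HWCAP2_SME (1UL << 23)
#endif

/* Pure helper: is the given capability bit set in an AT_HWCAP2 word? */
static bool hwcap2_has(unsigned long hwcap2, unsigned long bit)
{
    return (hwcap2 & bit) != 0;
}

/*
 * True only on arm64 Linux when the kernel advertises SME.
 * On other architectures bit 23 of AT_HWCAP2 means something else
 * (or nothing), so the check is compiled out there.
 */
static bool sme_advertised(void)
{
#if defined(__aarch64__) && defined(__linux__)
    return hwcap2_has(getauxval(AT_HWCAP2), HWCAP2_SME);
#else
    return false;
#endif
}
```

A program would call `sme_advertised()` once at startup and select an SME or non-SME code path accordingly.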
Marc Zyngier April 14, 2022, 12:06 p.m. UTC | #2
On Thu, 14 Apr 2022 12:55:44 +0100,
Tianjia Zhang <tianjia.zhang@linux.alibaba.com> wrote:
> 
> Allow userspace to detect support for SME (Scalable Matrix Extension)
> by providing a hwcap for it, using the official feature name FEAT_SME,
> declared in ARM DDI 0487H.a specification.

Err, not just that, for sure. What does this patch buy you on its
own, given that the kernel doesn't implement anything yet and that all
the SME instructions will UNDEF?

[1] is the real deal.

Thanks,

	M.

[1] https://lore.kernel.org/r/20220408114328.1401034-1-broonie@kernel.org
tianjia.zhang April 15, 2022, 2:25 a.m. UTC | #3
Hi Mark,

On 4/14/22 8:02 PM, Mark Brown wrote:
> On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:
> 
>> Allow userspace to detect support for SME (Scalable Matrix Extension)
>> by providing a hwcap for it, using the official feature name FEAT_SME,
>> declared in ARM DDI 0487H.a specification.
> 
> There's already a hwcap for the core feature and all the subfeatures
> added as part of the series I've been posting for SME:
> 
>     https://lore.kernel.org/linux-arm-kernel/20220408114328.1401034-1-broonie@kernel.org/
> 
> Why add something independently, especially given that there is no way
> for userspace to do anything constructive with the feature without the
> rest of the kernel support?  Any attempt to use SME instructions without
> kernel support will trap and generate a SIGILL even if the feature is
> present in hardware.

Great job. I ran into an invalid-instruction issue with REVD (which
requires FEAT_SME) while developing SVE2 programs, which is why I
planned to add SME support to the kernel gradually. Thanks for your
contribution; you can ignore my patch.

In addition, I would like to ask whether there is an alternative SVE2
instruction that can perform the same operation as REVD if the machine
does not support SME.

> 
> Do you have a system with SME that you're trying to use?  Review/testing
> on the current series would be appreciated.

Unfortunately, the value of the ID_AA64PFR1_EL1 register read on my
machine is 0x121, so it seems that the hardware does not support SME.
Is there any other help I can provide?

Kind regards,
Tianjia
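
As an illustration of the reading above, a small sketch that extracts 4-bit fields from an ID_AA64PFR1_EL1 value. The SME and MTE shifts mirror the sysreg.h hunk in this patch; the SSBS and BT shifts are the architectural positions, and the enum and `id_field()` names are hypothetical. With the reported value 0x121, the SME field at bits [27:24] is zero, which is consistent with the conclusion that this hardware lacks SME.

```c
#include <stdint.h>

/* Field offsets within ID_AA64PFR1_EL1 (names are illustrative). */
enum {
    PFR1_BT_SHIFT   = 0,
    PFR1_SSBS_SHIFT = 4,
    PFR1_MTE_SHIFT  = 8,
    PFR1_SME_SHIFT  = 24   /* same value as ID_AA64PFR1_SME_SHIFT in the patch */
};

/* Extract the unsigned 4-bit ID field that starts at bit 'shift'. */
static unsigned int id_field(uint64_t reg, unsigned int shift)
{
    return (unsigned int)((reg >> shift) & 0xf);
}
```

For 0x121 this gives BT = 1, SSBS = 2, MTE = 1 and SME = 0.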
tianjia.zhang April 15, 2022, 2:30 a.m. UTC | #4
Hi Marc,

On 4/14/22 8:06 PM, Marc Zyngier wrote:
> On Thu, 14 Apr 2022 12:55:44 +0100,
> Tianjia Zhang <tianjia.zhang@linux.alibaba.com> wrote:
>>
>> Allow userspace to detect support for SME (Scalable Matrix Extension)
>> by providing a hwcap for it, using the official feature name FEAT_SME,
>> declared in ARM DDI 0487H.a specification.
> 
> Err, not just that, for sure. What does this patch buy you on its
> own, given that the kernel doesn't implement anything yet and that all
> the SME instructions will UNDEF?
> 
> [1] is the real deal.
> 
> Thanks,
> 
> 	M.
> 
> [1] https://lore.kernel.org/r/20220408114328.1401034-1-broonie@kernel.org
> 

Thanks for your suggestion. My scenario is very simple: I wanted to be
able to see in cpuinfo whether the SME feature is supported. That seems
impractical at the moment.

Kind regards,
Tianjia
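
The cpuinfo scenario described above could be sketched roughly as follows: given the text of a `Features` line from /proc/cpuinfo, look for the `sme` flag as a whole token. The flag string comes from the cpuinfo.c hunk of this patch; `features_has_flag()` is a hypothetical helper, and whole-token matching is a design choice so that a longer flag sharing the same prefix is not mistaken for `sme`.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if 'flag' appears as a whole token in a cpuinfo Features line. */
static bool features_has_flag(const char *features, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = features;

    if (len == 0)
        return false;

    while ((p = strstr(p, flag)) != NULL) {
        /* Token must start at the beginning or after a separator... */
        bool starts = (p == features) || p[-1] == ' ' ||
                      p[-1] == '\t' || p[-1] == ':';
        /* ...and end at a separator or the end of the string. */
        bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

        if (starts && ends)
            return true;
        p++;    /* keep scanning past this partial match */
    }
    return false;
}
```

Reading the line itself (for example with `fgets()` on /proc/cpuinfo) is left out to keep the sketch short.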
Mark Brown April 19, 2022, 1:58 p.m. UTC | #5
On Fri, Apr 15, 2022 at 10:25:33AM +0800, Tianjia Zhang wrote:
> On 4/14/22 8:02 PM, Mark Brown wrote:
> > On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:

> > Why add something independently, especially given that there is no way
> > for userspace to do anything constructive with the feature without the
> > rest of the kernel support?  Any attempt to use SME instructions without
> > kernel support will trap and generate a SIGILL even if the feature is
> > present in hardware.

> Great job, I encountered the issue of invalid REVD (requires FEAT_SME)
> instruction when developing SVE2 programs, so I plan to gradually
> support SME in the kernel, thanks for your contribution, you can ignore
> my patch.

I see.  Unfortunately all the new registers mean that we really need to
define all the ABI as soon as we enable anything and the only thing we
can really skip out on when doing initial enablement is KVM (which I
have in fact skipped for the time being, I'll look at that at some point
after the initial support is landed).

> In addition, I would like to ask a question, whether there is an
> alternative SVE2 instruction for the REVD instruction that can complete
> this operation, if the machine does not support SME.

I'm not aware of anything, but I am mostly focused on the OS support
rather than any of the actual mathematical operations that are more the
point of these architecture features so I might be missing something.

> > Do you have a system with SME that you're trying to use?  Review/testing
> > on the current series would be appreciated.

> Unfortunately, the value currently read by my machine ID_AA64PFR1_EL1
> register is 0x121. It seems that the hardware does not support SME. Is
> there any other help I can provide?

Other than verifying that the series doesn't cause trouble for systems
without SME
tianjia.zhang April 24, 2022, 9:27 a.m. UTC | #6
Hi Mark,

On 4/19/22 9:58 PM, Mark Brown wrote:
> On Fri, Apr 15, 2022 at 10:25:33AM +0800, Tianjia Zhang wrote:
>> On 4/14/22 8:02 PM, Mark Brown wrote:
>>> On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:
> 
>>> Why add something independently, especially given that there is no way
>>> for userspace to do anything constructive with the feature without the
>>> rest of the kernel support?  Any attempt to use SME instructions without
>>> kernel support will trap and generate a SIGILL even if the feature is
>>> present in hardware.
> 
>> Great job, I encountered the issue of invalid REVD (requires FEAT_SME)
>> instruction when developing SVE2 programs, so I plan to gradually
>> support SME in the kernel, thanks for your contribution, you can ignore
>> my patch.
> 
> I see.  Unfortunately all the new registers mean that we really need to
> define all the ABI as soon as we enable anything and the only thing we
> can really skip out on when doing initial enablement is KVM (which I
> have in fact skipped for the time being, I'll look at that at some point
> after the initial support is landed).
> 
>> In addition, I would like to ask a question, whether there is an
>> alternative SVE2 instruction for the REVD instruction that can complete
>> this operation, if the machine does not support SME.
> 
> I'm not aware of anything, but I am mostly focused on the OS support
> rather than any of the actual mathematical operations that are more the
> point of these architecture features so I might be missing something.
> 
>>> Do you have a system with SME that you're trying to use?  Review/testing
>>> on the current series would be appreciated.
> 
>> Unfortunately, the value currently read by my machine ID_AA64PFR1_EL1
>> register is 0x121. It seems that the hardware does not support SME. Is
>> there any other help I can provide?
> 
> Other than verifying that the series doesn't cause trouble for systems
> without SME

Thanks for your reply. I have implemented the functionality of the REVD
instruction indirectly, using the tbl instruction, on a machine that
does not support SME.

For this patch series, I will run some tests later. That may take a
while, as I currently have no dedicated machine at hand.

Best regards,
Tianjia
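
The thread does not show the actual workaround, so the following is only a scalar model of the idea, under stated assumptions: REVD swaps the two 64-bit doublewords inside each 128-bit quadword, and a TBL on 64-bit elements with the index vector 1, 0, 3, 2, ... produces the same permutation. Predication is ignored here, and `tbl_u64()` and `revd_indices()` are hypothetical names used for illustration, not intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scalar model of TBL on 64-bit elements: dst[i] = src[idx[i]],
 * with out-of-range indices producing zero. dst must not alias src.
 */
static void tbl_u64(uint64_t *dst, const uint64_t *src,
                    const uint64_t *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (idx[i] < n) ? src[idx[i]] : 0;
}

/*
 * Build the index vector that swaps the two doublewords of every
 * 128-bit quadword: 1, 0, 3, 2, ...
 */
static void revd_indices(uint64_t *idx, size_t n)
{
    assert(n % 2 == 0);   /* a vector holds whole quadwords */
    for (size_t i = 0; i < n; i++)
        idx[i] = i ^ 1;
}
```

On real hardware the index vector would be built once and reused for each TBL, which costs one extra vector register compared with a single REVD.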

Patch

diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index a8f30963e550..50d2309a60d5 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -264,6 +264,10 @@  HWCAP2_MTE3
     Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0011, as described
     by Documentation/arm64/memory-tagging-extension.rst.
 
+HWCAP2_SME
+
+    Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001.
+
 4. Unused AT_HWCAP bits
 -----------------------
 
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 8db5ec0089db..5299afc30fb0 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -109,6 +109,7 @@ 
 #define KERNEL_HWCAP_AFP		__khwcap2_feature(AFP)
 #define KERNEL_HWCAP_RPRES		__khwcap2_feature(RPRES)
 #define KERNEL_HWCAP_MTE3		__khwcap2_feature(MTE3)
+#define KERNEL_HWCAP_SME		__khwcap2_feature(SME)
 
 /*
  * This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index fbf5f8bb9055..e66f9360cd93 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -836,6 +836,7 @@ 
 #define ID_AA64PFR0_ELx_32BIT_64BIT	0x2
 
 /* id_aa64pfr1 */
+#define ID_AA64PFR1_SME_SHIFT		24
 #define ID_AA64PFR1_MPAMFRAC_SHIFT	16
 #define ID_AA64PFR1_RASFRAC_SHIFT	12
 #define ID_AA64PFR1_MTE_SHIFT		8
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 99cb5d383048..0371779c7ca2 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -79,5 +79,6 @@ 
 #define HWCAP2_AFP		(1 << 20)
 #define HWCAP2_RPRES		(1 << 21)
 #define HWCAP2_MTE3		(1 << 22)
+#define HWCAP2_SME		(1 << 23)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d72c4b4d389c..55c5e4b9c50e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -261,6 +261,7 @@  static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SME_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
@@ -2442,6 +2443,17 @@  static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		.min_field_value = 1,
 	},
+	{
+		.desc = "Scalable Matrix Extension",
+		.capability = ARM64_SME,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR1_EL1,
+		.field_pos = ID_AA64PFR1_SME_SHIFT,
+		.field_width = 4,
+		.sign = FTR_UNSIGNED,
+		.min_field_value = 1,
+	},
 	{},
 };
 
@@ -2572,6 +2584,7 @@  static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE),
 	HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_MTE_ASYMM, CAP_HWCAP, KERNEL_HWCAP_MTE3),
 #endif /* CONFIG_ARM64_MTE */
+	HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SME),
 	HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV),
 	HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP),
 	HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 330b92ea863a..87be4ba601eb 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -98,6 +98,7 @@  static const char *const hwcap_str[] = {
 	[KERNEL_HWCAP_AFP]		= "afp",
 	[KERNEL_HWCAP_RPRES]		= "rpres",
 	[KERNEL_HWCAP_MTE3]		= "mte3",
+	[KERNEL_HWCAP_SME]		= "sme",
 };
 
 #ifdef CONFIG_COMPAT
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 3ed418f70e3b..c0c05399b24a 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -49,6 +49,7 @@  SPECTRE_V4
 SPECTRE_BHB
 SSBS
 SVE
+SME
 UNMAP_KERNEL_AT_EL0
 WORKAROUND_834220
 WORKAROUND_843419