[1/2] arm64: Expose address bits (physical/virtual) via cpuinfo

Message ID 1548709076-22317-2-git-send-email-bhsharma@redhat.com
State New
Series
  • arm64: Expose physical and virtual address capabilities to user-space

Commit Message

Bhupesh Sharma Jan. 28, 2019, 8:57 p.m. UTC
With the ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
supports these extensions can support up to 52-bit virtual and 52-bit
physical addresses respectively.

At the moment we enable support for these extensions via CONFIG
flags, e.g.
 - LPA via CONFIG_ARM64_PA_BITS_52, and
 - LVA via CONFIG_ARM64_FORCE_52BIT

so the easiest way for a user to determine the physical/virtual
address sizes supported by the hardware is via the '/proc/cpuinfo'
interface.

This patch enables the same.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
---
 arch/arm64/include/asm/cpufeature.h | 59 ++++++++++++++++++++++++-------------
 arch/arm64/include/asm/sysreg.h     | 19 ++++++++++++
 arch/arm64/kernel/cpuinfo.c         |  4 ++-
 3 files changed, 61 insertions(+), 21 deletions(-)

Comments

Suzuki K Poulose Jan. 29, 2019, 10:09 a.m. UTC | #1
Hi Bhupesh

On 28/01/2019 20:57, Bhupesh Sharma wrote:
> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
> supports these extensions can support upto 52-bit virtual and 52-bit
> physical addresses respectively.
> 
> Since at the moment we enable the support of these extensions via CONFIG
> flags, e.g.
>   - LPA via CONFIG_ARM64_PA_BITS_52, and
>   - LVA via CONFIG_ARM64_FORCE_52BIT
> 
> The easiest way a user can determine the physical/virtual
> addresses supported on the hardware, is via the '/proc/cpuinfo'
> interface.

Why do we need this information ?

Btw, this keeps coming up all the time and the answer to this approach is
always no. We cannot break the "unwritten" ABI of /proc/cpuinfo, again.
See :

https://patchwork.kernel.org/patch/8669301/


Suzuki
Bhupesh Sharma Jan. 30, 2019, 7:48 p.m. UTC | #2
Hi Suzuki,

Thanks for your review.

On 01/29/2019 03:39 PM, Suzuki K Poulose wrote:
> Hi Bupesh
> 
> On 28/01/2019 20:57, Bhupesh Sharma wrote:
>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>> supports these extensions can support upto 52-bit virtual and 52-bit
>> physical addresses respectively.
>>
>> Since at the moment we enable the support of these extensions via CONFIG
>> flags, e.g.
>>   - LPA via CONFIG_ARM64_PA_BITS_52, and
>>   - LVA via CONFIG_ARM64_FORCE_52BIT
>>
>> The easiest way a user can determine the physical/virtual
>> addresses supported on the hardware, is via the '/proc/cpuinfo'
>> interface.
> 
> Why do we need this information ?

Sorry for the delay in reply, but I wanted to collect as much 
information from our test teams as possible before replying to this thread.

So here is a brief list of reasons why we need this information in 
user-space:

1. This information is useful for a non-expert user, using Linux 
distributions (like Fedora) on different arm64 platforms. The default 
configuration (.config) will be the same for a distribution flavor and 
is supposed to work fine on all underlying arm64 platforms.

a). Now some of these underlying platforms may support the ARMv8.2 
extensions while others don't.

b). Users performing performance bench-marking on these platforms run 
benchmarks with different page-sizes and address ranges.

c). Right now they have no way to know the underlying VARange and 
PARange values other than reading the config file and searching for 
the flags.

For example, let's consider the 'pg-table_tests' (See - 
<https://github.com/sanskriti-s/pg-table_tests>), which is used to test 
and verify 5-level page table behavior on x86_64 Linux. It requires 
determining if 5-level page tables are fully supported, for which it 
uses either the Intel 'la57' cpu flag in:

$ cat /proc/cpuinfo

or:

$ grep CONFIG_X86_5LEVEL /boot/config-$(uname -r)
CONFIG_X86_5LEVEL=y

This test suite is easily modifiable for verifying 52-bit ARMv8.2-LVA 
support.

d). Now when running the above suite and sharing results, it might be 
that the .config file is not available or even in the case it is 
available the CONFIG flag settings in .config file are not intuitive to 
a non-expert user for arm64 (the example below is of 64K page size, 
48-bit kernel VA, 52-bit User space VA and 52-bit PA):

   CONFIG_ARM64_64K_PAGES=y
   CONFIG_ARM64_USER_VA_BITS_52=y
   CONFIG_ARM64_VA_BITS=48
   CONFIG_ARM64_PA_BITS_52=y
   CONFIG_ARM64_PA_BITS=52

Compare it with a single CONFIG_X86_5LEVEL=y config item for x86_64.

Also the cpu flag in '/proc/cpuinfo' may not hold any descriptive value 
for ARMv8.2 hardware.

2. So, it's much easier in the above cases to see and quote the output 
of '$ cat /proc/cpuinfo' instead, for example:

$ cat /proc/cpuinfo

<..snip..>
processor	: 31
<..snip..>
CPU architecture: 8
<..snip..>
address sizes	: 52 bits physical, 48 bits virtual

> Btw, this keeps coming up all the time and the answer to this approach is
> always no. We cannot break the "unwritten" ABI of /proc/cpuinfo, again.
> See :
> 
> https://patchwork.kernel.org/patch/8669301/

I understand your point, but from a user-space/command-line point of 
view we would ideally want an arm64 server to support most features 
that are already available on an x86_64 server (I guess that was the 
whole point of the SBSA server specifications - we want all arm64 
servers to look and feel the same way in terms of user-experience).

Also right now there is an absence of a standard ABI between the 
user-space and kernel for exporting this information to the user-space, 
with two exceptions:

1. For vmcoreinfo specific user-space utilities (like makedumpfile and 
crash) I have proposed a couple of CONFIG flags to be added to the 
vmcoreinfo, so that user-space utilities can use the same (See 
<http://lists.infradead.org/pipermail/kexec/2019-January/022387.html> 
for details).

2. For other user-space utilities (especially those which make a 'mmap' 
call and pass an address hint to get the kernel to provide a high 
address), I can see only two methods to determine the underlying kernel 
support:

a). Read the CONFIG flags from .config (as I captured some paragraphs 
above), or

b). In absence of .config file on the system, read the system ID 
registers like 'ID_AA64MMFR0_EL1' and 'ID_AA64MMFR2_EL1' (which PATCH 
2/2 of this series tries to enable from kernel side) and then make a 
decision on whether to pass a hint to 'mmap'.

It might be that I am missing other standard ABI mechanisms. If so, 
please point me to the same.

Thanks,
Bhupesh
James Morse Feb. 4, 2019, 4:54 p.m. UTC | #3
Hi Bhupesh,

On 30/01/2019 19:48, Bhupesh Sharma wrote:
> On 01/29/2019 03:39 PM, Suzuki K Poulose wrote:
>> On 28/01/2019 20:57, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA and LPA architecture extensions, arm64 hardware which
>>> supports these extensions can support upto 52-bit virtual and 52-bit
>>> physical addresses respectively.
>>>
>>> Since at the moment we enable the support of these extensions via CONFIG
>>> flags, e.g.
>>>   - LPA via CONFIG_ARM64_PA_BITS_52, and
>>>   - LVA via CONFIG_ARM64_FORCE_52BIT
>>>
>>> The easiest way a user can determine the physical/virtual
>>> addresses supported on the hardware, is via the '/proc/cpuinfo'
>>> interface.
>>
>> Why do we need this information ?
> 
> Sorry for the delay in reply, but I wanted to collect as much information from
> our test teams as possible before replying to this thread.
> 
> So here is brief list of reasons, as to why we need this information in user-space:
> 
> 1. This information is useful for a non-expert user, using Linux distributions
> (like Fedora) on different arm64 platforms. The default configuration (.config)
> will be the same for a distribution flavor and is supposed to work fine on all
> underlying arm64 platforms.
> 
> a). Now some of these underlying platforms may support ARMv8-8.2 extension while
> others don't.
> 
> b). Users performing performance bench-marking on these platforms run benchmarks
> with different page-sizes and address ranges.

> c). Right now they have no way to know, about the underlying VARange and PARange
> values other than reading the config file and search for the flags.

Why do they need to know? What decision can you make with this information that
you can't make without it?


> For e.g. lets consider the 'pg-table_tests' (See -
> <https://github.com/sanskriti-s/pg-table_tests>), which is used to test and
> verify 5-level page table behavior on x86_64 Linux. It requires determining if
> 5-level page tables are fully supported, 

... but we don't have 5-level page tables ...


> for which it uses either 'Intel 'la57'
> cpu flag' in:
> 
> $ cat /proc/cpuinfo', or
> 
> $ grep CONFIG_X86_5LEVEL /boot/config-$(uname -r)
> CONFIG_X86_5LEVEL=y
> 
> This test suite is easily modifiable for verifying 52-bit ARMv8.2-LVA support.

This looks like a test to check all kernel page-table walkers have been updated
for a fifth level. We don't need to worry about this.

You should just need to remove the arch-specific test. If you provide the hint
on platforms that support it, the mapping should succeed. On platforms that
don't, it won't.

Why does user space need to know in advance of making the hint?


> d). Now when running the above suite and sharing results, it might be that the
> .config file is not available or even in the case it is available the CONFIG
> flag settings in .config file are not intuitive to a non-expert user for arm64
> (the example below is of 64K page size, 48-bit kernel VA, 52-bit User space VA
> and 52-bit PA):

I agree inspecting the Kconfig is an inappropriate way for user-space 'to know'
what the kernel supports.

I can only see a 'supports 52bit va' flag as being useful to a program that
doesn't actually want to use it, but for some bizarre reason wants to know.

For coredumps the question isn't "was it supported", but "was it in use", which
you can tell from the pagetables.


> Also right now there is an absence of a standard ABI between the user-space and
> kernel for exporting this information to the user-space, with two exceptions:
> 
> 1. For vmcoreinfo specific user-space utilities (like makedumpfile and crash) I
> have proposed a couple of CONFIG flags to be added to the vmcoreinfo, so that
> user-space utilities can use the same (See
> <http://lists.infradead.org/pipermail/kexec/2019-January/022387.html> for details).

vmcoreinfo is for things like crash/gdb/makedumpfile to provide kernel-specific
information that they couldn't possibly work without. Like the page size. 52bit
support doesn't fit here as a 52bit-aware walker works regardless of whether
52bit was in use.


> 2. For other user-space utilities (especially those which make a 'mmap' call and
> pass an address hint to the get the kernel to provide a high address), 

> I can see only two methods to determine the underlying kernel support:
> 
> a). Read the CONFIG flags from .config (as I captured some paragraphs above), or
> 
> b). In absence of .config file on the system, read the system ID registers like
> 'ID_AA64MMFR0_EL1' and 'ID_AA64MMFR2_EL1' (which PATCH 2/2 of this series tries
> to enable from kernel side) and then make a decision on whether to pass a hint
> to 'mmap'.

It seems you're expecting to know whether 52bit-VA is supported without actually
using it. What is this useful for?

The point of the hint is you want to allocate memory, and can work with 52bit-VA
if the platform supports it. If it doesn't, you still want to allocate the
memory. We shouldn't need a hint that the 52bit-va hint is supported.
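
To illustrate the "just pass the hint" approach described above, here is
a minimal user-space sketch. The helper names (map_with_high_hint,
is_above_48bit) and the choice of 1 << 51 as the hint are assumptions
made for this example, not something taken from the thread or the
kernel tree. The idea is that the caller always gets usable memory and
only afterwards looks at where it landed:

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map 'len' bytes of anonymous memory while hinting
 * at an address above the 48-bit boundary. Without MAP_FIXED the hint is
 * only advisory, so a kernel that cannot honour it is expected to fall
 * back to a lower address rather than fail the call.
 */
static void *map_with_high_hint(size_t len)
{
	void *hint = (void *)(uintptr_t)(UINT64_C(1) << 51);

	assert(len > 0);
	return mmap(hint, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/* Report whether a returned pointer uses any address bit at or above bit 48. */
static int is_above_48bit(const void *p)
{
	return ((uint64_t)(uintptr_t)p >> 48) != 0;
}
```

A caller would check the result against MAP_FAILED as usual and treat
is_above_48bit() as purely informational; the memory is usable either
way.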


> It might be that I am missing other standard ABI mechanisms. If so, please point
> me to the same.

We also have HWCAP: Documentation/arm64/elf_hwcaps.txt

These are used for things the program may need to run: like floating point, or
the presence of particular instructions. User-space absolutely has to know about
these in advance, as it will get a SIGILL if support is not present.
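
For comparison, a program that does need to know about a feature in
advance would typically query the HWCAP bits through the auxiliary
vector. The sketch below assumes a Linux libc that provides getauxval();
the helper name cpu_has_hwcap is made up for this example, and the
caller is expected to pass one of the architecture's HWCAP_* bit masks
(on arm64 these come from <asm/hwcap.h>):

```c
#include <assert.h>
#include <sys/auxv.h>

/*
 * Hypothetical helper: return 1 if every bit in 'mask' is set in the
 * AT_HWCAP word the kernel passed to this process, 0 otherwise.
 */
static int cpu_has_hwcap(unsigned long mask)
{
	unsigned long caps = getauxval(AT_HWCAP);

	assert(mask != 0);
	return (caps & mask) == mask;
}
```

The check happens before the program executes the instruction in
question, which is exactly the "must know in advance" case that memory
allocation does not fall into.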

52bit VA doesn't fit here: memory is memory. Needing to know implies user-space
is unwilling to use memory if the bits above 48bits aren't set.


Thanks,

James

Patch

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index dfcfba725d72..2f1270ddc277 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -522,6 +522,45 @@  static inline bool system_supports_32bit_el0(void)
 	return cpus_have_const_cap(ARM64_HAS_32BIT_EL0);
 }
 
+static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)
+{
+	switch (parange) {
+	case ID_AA64MMFR0_PARANGE_32: return PARANGE_32;
+	case ID_AA64MMFR0_PARANGE_36: return PARANGE_36;
+	case ID_AA64MMFR0_PARANGE_40: return PARANGE_40;
+	case ID_AA64MMFR0_PARANGE_44: return PARANGE_44;
+	case ID_AA64MMFR0_PARANGE_48: return PARANGE_48;
+	case ID_AA64MMFR0_PARANGE_52: return PARANGE_52;
+	/*
+	 * A future PE could use a value unknown to the kernel.
+	 * However, by the "D10.1.4 Principles of the ID scheme
+	 * for fields in ID registers", ARM DDI 0487C.a, any new
+	 * value is guaranteed to be higher than what we know already.
+	 * As a safe limit, we return the limit supported by the kernel.
+	 */
+	default: return CONFIG_ARM64_PA_BITS;
+	}
+}
+
+static inline u32 id_aa64mmfr0_pa_range_bits(void)
+{
+	u64 mmfr0;
+
+	mmfr0 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
+	return id_aa64mmfr0_parange_to_phys_shift(mmfr0 & 0xf);
+}
+
+static inline u32 id_aa64mmfr2_va_range_bits(void)
+{
+	u64 mmfr2;
+	u32 val;
+
+	mmfr2 =	read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
+	val = cpuid_feature_extract_unsigned_field(mmfr2,
+						ID_AA64MMFR2_LVA_SHIFT);
+	return ((val == ID_AA64MMFR2_VARANGE_52) ? VARANGE_52 : VARANGE_48);
+}
+
 static inline bool system_supports_4kb_granule(void)
 {
 	u64 mmfr0;
@@ -636,26 +675,6 @@  static inline void arm64_set_ssbd_mitigation(bool state) {}
 
 extern int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
 
-static inline u32 id_aa64mmfr0_parange_to_phys_shift(int parange)
-{
-	switch (parange) {
-	case 0: return 32;
-	case 1: return 36;
-	case 2: return 40;
-	case 3: return 42;
-	case 4: return 44;
-	case 5: return 48;
-	case 6: return 52;
-	/*
-	 * A future PE could use a value unknown to the kernel.
-	 * However, by the "D10.1.4 Principles of the ID scheme
-	 * for fields in ID registers", ARM DDI 0487C.a, any new
-	 * value is guaranteed to be higher than what we know already.
-	 * As a safe limit, we return the limit supported by the kernel.
-	 */
-	default: return CONFIG_ARM64_PA_BITS;
-	}
-}
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 72dc4c011014..70910b14b2f3 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -617,9 +617,22 @@ 
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
 #define ID_AA64MMFR0_TGRAN16_NI		0x0
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED	0x1
+#define ID_AA64MMFR0_PARANGE_32		0x0
+#define ID_AA64MMFR0_PARANGE_36		0x1
+#define ID_AA64MMFR0_PARANGE_40		0x2
+#define ID_AA64MMFR0_PARANGE_42		0x3
+#define ID_AA64MMFR0_PARANGE_44		0x4
 #define ID_AA64MMFR0_PARANGE_48		0x5
 #define ID_AA64MMFR0_PARANGE_52		0x6
 
+#define PARANGE_32			32
+#define PARANGE_36			36
+#define PARANGE_40			40
+#define PARANGE_42			42
+#define PARANGE_44			44
+#define PARANGE_48			48
+#define PARANGE_52			52
+
 #ifdef CONFIG_ARM64_PA_BITS_52
 #define ID_AA64MMFR0_PARANGE_MAX	ID_AA64MMFR0_PARANGE_52
 #else
@@ -646,6 +659,12 @@ 
 #define ID_AA64MMFR2_UAO_SHIFT		4
 #define ID_AA64MMFR2_CNP_SHIFT		0
 
+#define ID_AA64MMFR2_VARANGE_48		0x0
+#define ID_AA64MMFR2_VARANGE_52		0x1
+
+#define VARANGE_48			48
+#define VARANGE_52			52
+
 /* id_aa64dfr0 */
 #define ID_AA64DFR0_PMSVER_SHIFT	32
 #define ID_AA64DFR0_CTX_CMPS_SHIFT	28
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index ca0685f33900..66583ac3be19 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -177,7 +177,9 @@  static int c_show(struct seq_file *m, void *v)
 		seq_printf(m, "CPU architecture: 8\n");
 		seq_printf(m, "CPU variant\t: 0x%x\n", MIDR_VARIANT(midr));
 		seq_printf(m, "CPU part\t: 0x%03x\n", MIDR_PARTNUM(midr));
-		seq_printf(m, "CPU revision\t: %d\n\n", MIDR_REVISION(midr));
+		seq_printf(m, "CPU revision\t: %d\n", MIDR_REVISION(midr));
+		seq_printf(m, "address sizes\t: %d bits physical, %d bits virtual\n\n",
+				id_aa64mmfr0_pa_range_bits(), id_aa64mmfr2_va_range_bits());
 	}
 
 	return 0;
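
The field decoding that the patch performs can also be sketched as
standalone C operating on raw register values. This is an illustration
only: the function names and the 'fallback' parameter are assumptions
made for this example and do not exist in the kernel. The sketch uses
the architectural field positions (PARange in ID_AA64MMFR0_EL1 bits
[3:0], VARange in ID_AA64MMFR2_EL1 bits [19:16]) and keeps the 0x3
(42-bit) encoding that the pre-patch helper handled:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Map a 4-bit PARange encoding to a physical address width in bits.
 * Unknown encodings return 'fallback', mirroring the kernel helper's
 * use of a configured limit for values it does not recognise.
 */
static unsigned int parange_to_bits(unsigned int field, unsigned int fallback)
{
	assert(field <= 0xf);
	switch (field) {
	case 0x0: return 32;
	case 0x1: return 36;
	case 0x2: return 40;
	case 0x3: return 42;
	case 0x4: return 44;
	case 0x5: return 48;
	case 0x6: return 52;
	default:  return fallback;
	}
}

/* PARange occupies bits [3:0] of ID_AA64MMFR0_EL1. */
static unsigned int pa_bits_from_mmfr0(uint64_t mmfr0, unsigned int fallback)
{
	return parange_to_bits((unsigned int)(mmfr0 & 0xf), fallback);
}

/* VARange occupies bits [19:16] of ID_AA64MMFR2_EL1; 0x1 means 52-bit VA. */
static unsigned int va_bits_from_mmfr2(uint64_t mmfr2)
{
	return ((mmfr2 >> 16) & 0xf) == 0x1 ? 52 : 48;
}
```

For example, an ID_AA64MMFR0_EL1 value whose low nibble is 0x6 decodes
to 52 physical address bits, and an ID_AA64MMFR2_EL1 value with bit 16
set (and bits 19:17 clear) decodes to 52 virtual address bits.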