diff mbox

Report double word access atomicity on LPAE enabled targets through AUXV

Message ID 20130416172258.GE3770@mudshark.cambridge.arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Will Deacon April 16, 2013, 5:22 p.m. UTC
On Fri, Apr 12, 2013 at 09:58:43PM +0100, Vladimir Danushevsky wrote:
> On Apr 8, 2013, at 11:57 AM, Will Deacon wrote:
> > Actually, it's not just the presence of those instructions -- it's their
> > behaviour wrt atomicity. They are only guaranteed to be atomic if the CPU in
> > question supports LPAE. We could call it "lpae" but it might be set even
> > when the kernel is using the short-descriptor format.
> 
> I also think that ATOMICD would describe the intended behavior better.

I started having second thoughts about the name, on the offchance that some
other feature introduced with LPAE is found to be useful to userspace. Then
userspace would end up checking ATOMICD in order to determine something
different, which is counter-intuitive.

Maybe there is no other `killer feature' for userspace with LPAE, but at
least we'd be reporting what the hardware says, rather than the small bit
which we're interested in.

Patch below...

Will

--->8

commit c6eaaa758c7956b18aa0cfabe9500ef73514b319
Author: Will Deacon <will.deacon@arm.com>
Date:   Mon Apr 8 17:13:12 2013 +0100

    ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
    
    CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
    userspace software can avoid having to use the exclusive variants of
    these instructions if they wish.
    
    This patch advertises the atomicity of these instructions via the
    hwcaps, so userspace can detect this CPU feature.
    
    Signed-off-by: Will Deacon <will.deacon@arm.com>

Comments

Will Deacon April 17, 2013, 9:48 a.m. UTC | #1
On Tue, Apr 16, 2013 at 06:22:58PM +0100, Will Deacon wrote:
> Patch below...

[...]

> +#define HWCAP_LPAE     (1 << 20)

[...]

> @@ -875,6 +880,7 @@ static const char *hwcap_str[] = {
>         "vfpv4",
>         "idiva",
>         "idivt",
> +       "atomicd",
>         NULL
>  };

You can tell I was in two minds about this, but that string should read
"lpae".

Will
Catalin Marinas April 18, 2013, 4:22 p.m. UTC | #2
On Tue, Apr 16, 2013 at 06:22:59PM +0100, Will Deacon wrote:
> On Fri, Apr 12, 2013 at 09:58:43PM +0100, Vladimir Danushevsky wrote:
> > On Apr 8, 2013, at 11:57 AM, Will Deacon wrote:
> > > Actually, it's not just the presence of those instructions -- it's their
> > > behaviour wrt atomicity. They are only guaranteed to be atomic if the CPU in
> > > question supports LPAE. We could call it "lpae" but it might be set even
> > > when the kernel is using the short-descriptor format.
> > 
> > I also think that ATOMICD would describe the intended behavior better.
> 
> I started having second thoughts about the name, on the offchance that some
> other feature introduced with LPAE is found to be useful to userspace. Then
> userspace would end up checking ATOMICD in order to determine something
> different, which is counter-intuitive.
> 
> Maybe there is no other `killer feature' for userspace with LPAE, but at
> least we'd be reporting what the hardware says, rather than the small bit
> which we're interested in.
> 
> Patch below...
> 
> Will
> 
> --->8
> 
> commit c6eaaa758c7956b18aa0cfabe9500ef73514b319
> Author: Will Deacon <will.deacon@arm.com>
> Date:   Mon Apr 8 17:13:12 2013 +0100
> 
>     ARM: elf: add new hwcap for identifying atomic ldrd/strd instructions
>     
>     CPUs implementing LPAE have atomic ldrd/strd instructions, meaning that
>     userspace software can avoid having to use the exclusive variants of
>     these instructions if they wish.
>     
>     This patch advertises the atomicity of these instructions via the
>     hwcaps, so userspace can detect this CPU feature.
>     
>     Signed-off-by: Will Deacon <will.deacon@arm.com>
> 
> diff --git a/arch/arm/include/uapi/asm/hwcap.h b/arch/arm/include/uapi/asm/hwcap.h
> index 3688fd1..6d34d08 100644
> --- a/arch/arm/include/uapi/asm/hwcap.h
> +++ b/arch/arm/include/uapi/asm/hwcap.h
> @@ -25,6 +25,6 @@
>  #define HWCAP_IDIVT    (1 << 18)
>  #define HWCAP_VFPD32   (1 << 19)       /* set if VFP has 32 regs (not 16) */
>  #define HWCAP_IDIV     (HWCAP_IDIVA | HWCAP_IDIVT)
> -
> +#define HWCAP_LPAE     (1 << 20)

I would rather add individual entries like atomicd. User space does not
need to know whether LPAE is supported or not.
Russell King - ARM Linux April 22, 2013, 2:17 p.m. UTC | #3
On Thu, Apr 18, 2013 at 05:22:33PM +0100, Catalin Marinas wrote:
> I would rather add individual entries like atomicd. User space does not
> need to know whether LPAE is supported or not.

I'm not sure that userspace should know this detail.

| LDM, LDC, LDC2, LDRD, STM, STC, STC2, STRD, PUSH, POP, RFE, SRS, VLDM,
| VLDR, VSTM, and VSTR instructions are executed as a sequence of word-
| aligned word accesses. Each 32-bit word access is guaranteed to be
| single-copy atomic. The architecture does not require subsequences
| of two or more word accesses from the sequence to be single-copy
| atomic.
| 
| In an implementation that includes the Large Physical Address Extension,
| LDRD and STRD accesses to 64-bit aligned locations are 64-bit single-copy
| atomic as seen by translation table walks and accesses to translation
| tables.
| ...
| Load/store multiple instructions, such as LDM, LDRD, STM, and STRD,
| generate multiple word accesses, each of which is a separate access
| for the purpose of determining ordering.

Note carefully the wording in the ARM ARM in the second paragraph -
it's not saying that LDRD and STRD accesses are single-copy atomic
to all viewers, but it's limited to the translation tables.

Sure, you can argue that the translation tables are just normal memory
accesses, but in that case why use the above wording if not to permit
them to be multiple-copy accesses as stated in the paragraph above
and elsewhere in the ARM ARM in the paragraph below.

Maybe an error in the ARM ARM?  It doesn't seem wise through to permit
userspace to assume that LDRD/STRD are atomic in userspace given the
above wording unless the ARM ARM gets fixed.
Catalin Marinas April 22, 2013, 2:28 p.m. UTC | #4
On Mon, Apr 22, 2013 at 03:17:19PM +0100, Russell King - ARM Linux wrote:
> On Thu, Apr 18, 2013 at 05:22:33PM +0100, Catalin Marinas wrote:
> > I would rather add individual entries like atomicd. User space does not
> > need to know whether LPAE is supported or not.
> 
> I'm not sure that userspace should know this detail.

Which detail, LPAE or atomic double accesses?

> | LDM, LDC, LDC2, LDRD, STM, STC, STC2, STRD, PUSH, POP, RFE, SRS, VLDM,
> | VLDR, VSTM, and VSTR instructions are executed as a sequence of word-
> | aligned word accesses. Each 32-bit word access is guaranteed to be
> | single-copy atomic. The architecture does not require subsequences
> | of two or more word accesses from the sequence to be single-copy
> | atomic.
> | 
> | In an implementation that includes the Large Physical Address Extension,
> | LDRD and STRD accesses to 64-bit aligned locations are 64-bit single-copy
> | atomic as seen by translation table walks and accesses to translation
> | tables.
> | ...
> | Load/store multiple instructions, such as LDM, LDRD, STM, and STRD,
> | generate multiple word accesses, each of which is a separate access
> | for the purpose of determining ordering.
> 
> Note carefully the wording in the ARM ARM in the second paragraph -
> it's not saying that LDRD and STRD accesses are single-copy atomic
> to all viewers, but it's limited to the translation tables.

Yes, I raised this (in private I think) when Will proposed similar
kernel changes. But we discussed with the ARM architecture guys and even
though the ARM ARM states that this, in practice the CPUs just implement
full 64-bit atomic accesses.

> Sure, you can argue that the translation tables are just normal memory
> accesses, but in that case why use the above wording if not to permit
> them to be multiple-copy accesses as stated in the paragraph above
> and elsewhere in the ARM ARM in the paragraph below.
> 
> Maybe an error in the ARM ARM?  It doesn't seem wise through to permit
> userspace to assume that LDRD/STRD are atomic in userspace given the
> above wording unless the ARM ARM gets fixed.

Not an error in the ARM ARM but I don't think ARM is going to extend
this definition (not an easy process for such modifications). For
practical purposes, CPUs with LPAE implement 64-bit atomics and it's of
real benefit for some user-space cases.

Maybe a better alternative would be to add it via __v7_proc on
individual CPUs (A7 and A15).
diff mbox

Patch

diff --git a/arch/arm/include/uapi/asm/hwcap.h b/arch/arm/include/uapi/asm/hwcap.h
index 3688fd1..6d34d08 100644
--- a/arch/arm/include/uapi/asm/hwcap.h
+++ b/arch/arm/include/uapi/asm/hwcap.h
@@ -25,6 +25,6 @@ 
 #define HWCAP_IDIVT    (1 << 18)
 #define HWCAP_VFPD32   (1 << 19)       /* set if VFP has 32 regs (not 16) */
 #define HWCAP_IDIV     (HWCAP_IDIVA | HWCAP_IDIVT)
-
+#define HWCAP_LPAE     (1 << 20)
 
 #endif /* _UAPI__ASMARM_HWCAP_H */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index eaa7900..57a53c5 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -356,7 +356,7 @@  void __init early_print(const char *str, ...)
 
 static void __init cpuid_init_hwcaps(void)
 {
-       unsigned int divide_instrs;
+       unsigned int divide_instrs, vmsa;
 
        if (cpu_architecture() < CPU_ARCH_ARMv7)
                return;
@@ -369,6 +369,11 @@  static void __init cpuid_init_hwcaps(void)
        case 1:
                elf_hwcap |= HWCAP_IDIVT;
        }
+
+       /* LPAE implies atomic ldrd/strd instructions */
+       vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
+       if (vmsa >= 5)
+               elf_hwcap |= HWCAP_LPAE;
 }
 
 static void __init feat_v6_fixup(void)
@@ -875,6 +880,7 @@  static const char *hwcap_str[] = {
        "vfpv4",
        "idiva",
        "idivt",
+       "atomicd",
        NULL
 };