diff mbox series

[v2,4/6] Documentation: arm64: document support for the AMU extension

Message ID 20191218182607.21607-5-ionela.voinescu@arm.com (mailing list archive)
State New, archived
Headers show
Series arm64: ARMv8.4 Activity Monitors support | expand

Commit Message

Ionela Voinescu Dec. 18, 2019, 6:26 p.m. UTC
The activity monitors extension is an optional extension introduced
by the ARMv8.4 CPU architecture.

Add initial documentation for the AMUv1 extension:
 - arm64/amu.txt: AMUv1 documentation
 - arm64/booting.txt: system registers initialisation
 - arm64/cpu-feature-registers.txt: visibility to userspace

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
---
 Documentation/arm64/amu.rst                   | 107 ++++++++++++++++++
 Documentation/arm64/booting.rst               |  14 +++
 Documentation/arm64/cpu-feature-registers.rst |   2 +
 Documentation/arm64/index.rst                 |   1 +
 4 files changed, 124 insertions(+)
 create mode 100644 Documentation/arm64/amu.rst

Comments

Valentin Schneider Jan. 27, 2020, 4:47 p.m. UTC | #1
On 18/12/2019 18:26, Ionela Voinescu wrote:
> +Basic support
> +-------------
> +
> +The kernel can safely run a mix of CPUs with and without support for the
> +activity monitors extension. Therefore, when CONFIG_ARM64_AMU_EXTN is
> +selected we unconditionally enable the capability to allow any late CPU
> +(secondary or hotplugged) to detect and use the feature.
> +
> +When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
> +set, but this does not guarantee the correct functionality of the
> +counters, only the presence of the extension.
> +
> +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> +needed to:
> + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> +   registers.
> + - Enable the counters. If not enabled these will read as 0.

Just to make sure I understand - if AMUs are physically present but not
enabled by FW, we'll still
- see them as implemented in ID_AA64PFR0_EL1.AMU
- see some counters as available with e.g. AMCGCR_ELO.CG0NC > 0

But reading some AMEVCNTR<g><n> will return 0?

> + - Save/restore the counters before/after the CPU is being put/brought up
> +   from the 'off' power state.
> +
> +When using kernels that have this configuration enabled but boot with
> +broken firmware the user may experience panics or lockups when accessing
> +the counter registers.

Yikes

> Even if these symptoms are not observed, the
> +values returned by the register reads might not correctly reflect reality.
> +Most commonly, the counters will read as 0, indicating that they are not
> +enabled. If proper support is not provided in firmware it's best to disable
> +CONFIG_ARM64_AMU_EXTN.
> +

I haven't seen something that would try to catch this on the kernel side.
Can we try to detect that (e.g. at least one counter returns > 0) in
cpu_amu_enable() and thus not write to the CPU-local 'amu_feat'?

While we're on the topic of detecting broken stuff, what if some CPUs
implement some auxiliary counters that some others don't?

> +The fixed counters of AMUv1 are accessible though the following system
> +register definitions:
> + - SYS_AMEVCNTR0_CORE_EL0
> + - SYS_AMEVCNTR0_CONST_EL0
> + - SYS_AMEVCNTR0_INST_RET_EL0
> + - SYS_AMEVCNTR0_MEM_STALL_EL0
> +
> +Auxiliary platform specific counters can be accessed using
> +SYS_AMEVCNTR1_EL0(n), where n is a value between 0 and 15.
> +
> +Details can be found in: arch/arm64/include/asm/sysreg.h.
> +
> diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> index 5d78a6f5b0ae..a3f1a47b6f1c 100644
> --- a/Documentation/arm64/booting.rst
> +++ b/Documentation/arm64/booting.rst
> @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
>      - HCR_EL2.APK (bit 40) must be initialised to 0b1
>      - HCR_EL2.API (bit 41) must be initialised to 0b1
>  
> +  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
> +  - If EL3 is present:
> +    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
> +    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
> +    AMCNTENSET0_EL0 must be initialised to 0b1111

Nit: Or be a superset of the above, right? AIUI v1 only mandates the lower
4 bits to be set. Probably doesn't matter that much...
Ionela Voinescu Jan. 28, 2020, 4:53 p.m. UTC | #2
On Monday 27 Jan 2020 at 16:47:29 (+0000), Valentin Schneider wrote:
> On 18/12/2019 18:26, Ionela Voinescu wrote:
> > +Basic support
> > +-------------
> > +
> > +The kernel can safely run a mix of CPUs with and without support for the
> > +activity monitors extension. Therefore, when CONFIG_ARM64_AMU_EXTN is
> > +selected we unconditionally enable the capability to allow any late CPU
> > +(secondary or hotplugged) to detect and use the feature.
> > +
> > +When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
> > +set, but this does not guarantee the correct functionality of the
> > +counters, only the presence of the extension.
> > +
> > +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> > +needed to:
> > + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> > +   registers.
> > + - Enable the counters. If not enabled these will read as 0.
> 
> Just to make sure I understand - if AMUs are physically present but not
> enabled by FW, we'll still
> - see them as implemented in ID_AA64PFR0_EL1.AMU

Yes, this feature register only shows the physical presence on the unit
in hardware.

> - see some counters as available with e.g. AMCGCR_ELO.CG0NC > 0
> 

Yes, the same as above, this only shows their physical presence. For
AMUv1 - AMCGCR_ELO.CG0NC: the value of this field is set to 4.
AMCGCR_ELO.CG1NC will show the number of auxiliary counters implemented
in hardware.

> But reading some AMEVCNTR<g><n> will return 0?

Or you won't be able to access them at all. Lacking firmware support
accesses to AMU registers could be trapped in EL3. If access for EL1 and
EL2 is enabled from EL3, it's still possible that the counters
themselves are not enabled - that means they are not enabled to count
the events they are designed to be counting. That's why in this case the
event counter register could read 0.

But if we read 0, it does not necessarily mean that the counter is
disabled. It could also mean that the events is meant to count did not
happen yet.

> 
> > + - Save/restore the counters before/after the CPU is being put/brought up
> > +   from the 'off' power state.
> > +
> > +When using kernels that have this configuration enabled but boot with
> > +broken firmware the user may experience panics or lockups when accessing
> > +the counter registers.
> 
> Yikes
> 
> > Even if these symptoms are not observed, the
> > +values returned by the register reads might not correctly reflect reality.
> > +Most commonly, the counters will read as 0, indicating that they are not
> > +enabled. If proper support is not provided in firmware it's best to disable
> > +CONFIG_ARM64_AMU_EXTN.
> > +
> 
> I haven't seen something that would try to catch this on the kernel side.
> Can we try to detect that (e.g. at least one counter returns > 0) in
> cpu_amu_enable() and thus not write to the CPU-local 'amu_feat'?
> 

I'm reluctant to do this especially given that platforms might choose to
keep some counters disabled while enabling some counters that might not
have counted any events by the time we reach cpu_enable. We would end up
mistakenly disabling the feature. I would rather leave the validation of
the counters to be done at the location and for the purpose of their
use: see patch 6/6 - the use of counters for frequency invariance.

> While we're on the topic of detecting broken stuff, what if some CPUs
> implement some auxiliary counters that some others don't?
> 

I think it should be up to the user of that counter to decide if the
usecase is at CPU level or system level. My intention of this base
support was to keep it simple and allow users of some counters to
decide on their own how to validate and make use of either architected
or auxiliary counters.

For example, in the case of frequency invariance, given a platform that
does not support cpufreq based invariance, I would validate all CPUs for
the use of AMU core and constant counters. If it happens that some CPUs
do not support those counters or they are not enabled, we'd have to
disable frequency invariance at system level.

For some other scenarios only partial support is needed - only a subset
of CPUs need to support the counters for their use to be feasible.

But I believe only the user of the counters can decide, whether this is
happening in architecture code, driver code, generic code.

> > +The fixed counters of AMUv1 are accessible though the following system
> > +register definitions:
> > + - SYS_AMEVCNTR0_CORE_EL0
> > + - SYS_AMEVCNTR0_CONST_EL0
> > + - SYS_AMEVCNTR0_INST_RET_EL0
> > + - SYS_AMEVCNTR0_MEM_STALL_EL0
> > +
> > +Auxiliary platform specific counters can be accessed using
> > +SYS_AMEVCNTR1_EL0(n), where n is a value between 0 and 15.
> > +
> > +Details can be found in: arch/arm64/include/asm/sysreg.h.
> > +
> > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > index 5d78a6f5b0ae..a3f1a47b6f1c 100644
> > --- a/Documentation/arm64/booting.rst
> > +++ b/Documentation/arm64/booting.rst
> > @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
> >      - HCR_EL2.APK (bit 40) must be initialised to 0b1
> >      - HCR_EL2.API (bit 41) must be initialised to 0b1
> >  
> > +  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
> > +  - If EL3 is present:
> > +    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
> > +    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
> > +    AMCNTENSET0_EL0 must be initialised to 0b1111
> 
> Nit: Or be a superset of the above, right? AIUI v1 only mandates the lower
> 4 bits to be set. Probably doesn't matter that much...
> 

Right! This is more of a guideline: it can be a subset as well, if
platforms don't want some counters enabled. It can set the lower 4 bits
for enablement of all 4 architecture counters for v1, or more for future
versions with more architected counters.

Thanks,
Ionela.
Valentin Schneider Jan. 28, 2020, 6:36 p.m. UTC | #3
On 28/01/2020 16:53, Ionela Voinescu wrote:
> Or you won't be able to access them at all. Lacking firmware support
> accesses to AMU registers could be trapped in EL3. If access for EL1 and
> EL2 is enabled from EL3, it's still possible that the counters
> themselves are not enabled - that means they are not enabled to count
> the events they are designed to be counting. That's why in this case the
> event counter register could read 0.
> 
> But if we read 0, it does not necessarily mean that the counter is
> disabled. It could also mean that the events is meant to count did not
> happen yet.
> 

Right, which (as we discussed offline) is quite likely to happen if/when
we get stuff like SVE counters and we try to read them at boot time. Might
be worth adding a small note about that (0 != disabled).

>> I haven't seen something that would try to catch this on the kernel side.
>> Can we try to detect that (e.g. at least one counter returns > 0) in
>> cpu_amu_enable() and thus not write to the CPU-local 'amu_feat'?
>>
> 
> I'm reluctant to do this especially given that platforms might choose to
> keep some counters disabled while enabling some counters that might not
> have counted any events by the time we reach cpu_enable. We would end up
> mistakenly disabling the feature. I would rather leave the validation of
> the counters to be done at the location and for the purpose of their
> use: see patch 6/6 - the use of counters for frequency invariance.
> 

Hmph, I'm a bit torn on that one. It would be really nice to provide *some*
amount of sanity checking at core level - e.g. by checking that at least
one of the four architected counters reads non-zero. But as you say these
could be disabled, while some other arch/aux counter is enabled, and we
could then mistakenly disable the feature. So we can't really do much
unless we handle *each* individual counter. Oh well :/

>> While we're on the topic of detecting broken stuff, what if some CPUs
>> implement some auxiliary counters that some others don't?
>>
> 
> I think it should be up to the user of that counter to decide if the
> usecase is at CPU level or system level. My intention of this base
> support was to keep it simple and allow users of some counters to
> decide on their own how to validate and make use of either architected
> or auxiliary counters.
> 
> For example, in the case of frequency invariance, given a platform that
> does not support cpufreq based invariance, I would validate all CPUs for
> the use of AMU core and constant counters. If it happens that some CPUs
> do not support those counters or they are not enabled, we'd have to
> disable frequency invariance at system level.
> 
> For some other scenarios only partial support is needed - only a subset
> of CPUs need to support the counters for their use to be feasible.
> 
> But I believe only the user of the counters can decide, whether this is
> happening in architecture code, driver code, generic code.
> 

Right, the FIE support is actually a good example of that, I think.
Suzuki K Poulose Jan. 30, 2020, 3:04 p.m. UTC | #4
Hi Ionela,

On 18/12/2019 18:26, Ionela Voinescu wrote:
> The activity monitors extension is an optional extension introduced
> by the ARMv8.4 CPU architecture.
> 
> Add initial documentation for the AMUv1 extension:
>   - arm64/amu.txt: AMUv1 documentation
>   - arm64/booting.txt: system registers initialisation
>   - arm64/cpu-feature-registers.txt: visibility to userspace

We have stopped adding "invisible" fields to the list. So, you
can drop the changes to cpu-feature-registers.txt.

> 
> Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Jonathan Corbet <corbet@lwn.net>
> ---
>   Documentation/arm64/amu.rst                   | 107 ++++++++++++++++++
>   Documentation/arm64/booting.rst               |  14 +++
>   Documentation/arm64/cpu-feature-registers.rst |   2 +
>   Documentation/arm64/index.rst                 |   1 +
>   4 files changed, 124 insertions(+)
>   create mode 100644 Documentation/arm64/amu.rst
> 
> diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
> new file mode 100644
> index 000000000000..62a6635522e1
> --- /dev/null
> +++ b/Documentation/arm64/amu.rst
> @@ -0,0 +1,107 @@
> +=======================================================
> +Activity Monitors Unit (AMU) extension in AArch64 Linux
> +=======================================================
> +
> +Author: Ionela Voinescu <ionela.voinescu@arm.com>
> +
> +Date: 2019-09-10
> +
> +This document briefly describes the provision of Activity Monitors Unit
> +support in AArch64 Linux.
> +
> +
> +Architecture overview
> +---------------------
> +
> +The activity monitors extension is an optional extension introduced by the
> +ARMv8.4 CPU architecture.
> +
> +The activity monitors unit, implemented in each CPU, provides performance
> +counters intended for system management use. The AMU extension provides a
> +system register interface to the counter registers and also supports an
> +optional external memory-mapped interface.
> +
> +Version 1 of the Activity Monitors architecture implements a counter group
> +of four fixed and architecturally defined 64-bit event counters.
> +  - CPU cycle counter: increments at the frequency of the CPU.
> +  - Constant counter: increments at the fixed frequency of the system
> +    clock.
> +  - Instructions retired: increments with every architecturally executed
> +    instruction.
> +  - Memory stall cycles: counts instruction dispatch stall cycles caused by
> +    misses in the last level cache within the clock domain.
> +
> +When in WFI or WFE these counters do not increment.
> +
> +The Activity Monitors architecture provides space for up to 16 architected
> +event counters. Future versions of the architecture may use this space to
> +implement additional architected event counters.
> +
> +Additionally, version 1 implements a counter group of up to 16 auxiliary
> +64-bit event counters.
> +
> +On cold reset all counters reset to 0.
> +
> +
> +Basic support
> +-------------
> +
> +The kernel can safely run a mix of CPUs with and without support for the
> +activity monitors extension.


  Therefore, when CONFIG_ARM64_AMU_EXTN is
> +selected we unconditionally enable the capability to allow any late CPU
> +(secondary or hotplugged) to detect and use the feature.
> +
> +When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
> +set, but this does not guarantee the correct functionality of the
> +counters, only the presence of the extension.

nit: I would rather omit the implementation details (esp variable names)
in the documentation. It may become a pain to keep this in sync with the
code changes. You could simply mention, "we keep track of the 
availability of the feature" per CPU. If someone wants to figure out
how, they can always read the code.

> +
> +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> +needed to:
> + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> +   registers.
> + - Enable the counters. If not enabled these will read as 0.
> + - Save/restore the counters before/after the CPU is being put/brought up
> +   from the 'off' power state.
> +
> +When using kernels that have this configuration enabled but boot with
> +broken firmware the user may experience panics or lockups when accessing
> +the counter registers. Even if these symptoms are not observed, the
> +values returned by the register reads might not correctly reflect reality.
> +Most commonly, the counters will read as 0, indicating that they are not
> +enabled. If proper support is not provided in firmware it's best to disable
> +CONFIG_ARM64_AMU_EXTN.

For the sake of one kernel runs everywhere, do we need some other
mechanism to disable the AMU. e.g kernel parameter to disable amu
at runtime ?

> diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> index 5d78a6f5b0ae..a3f1a47b6f1c 100644
> --- a/Documentation/arm64/booting.rst
> +++ b/Documentation/arm64/booting.rst
> @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
>       - HCR_EL2.APK (bit 40) must be initialised to 0b1
>       - HCR_EL2.API (bit 41) must be initialised to 0b1
>   
> +  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
> +  - If EL3 is present:
> +    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
> +    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
> +    AMCNTENSET0_EL0 must be initialised to 0b1111
> +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> +    having 0b1 set for the corresponding bit for each of the auxiliary
> +    counters present.
> +  - If the kernel is entered at EL1:
> +    AMCNTENSET0_EL0 must be initialised to 0b1111
> +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> +    having 0b1 set for the corresponding bit for each of the auxiliary
> +    counters present.
> +
>   The requirements described above for CPU mode, caches, MMUs, architected
>   timers, coherency and system registers apply to all CPUs.  All CPUs must
>   enter the kernel in the same exception level.
> diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
> index b6e44884e3ad..4770ae54032b 100644
> --- a/Documentation/arm64/cpu-feature-registers.rst
> +++ b/Documentation/arm64/cpu-feature-registers.rst
> @@ -150,6 +150,8 @@ infrastructure:
>        +------------------------------+---------+---------+
>        | DIT                          | [51-48] |    y    |
>        +------------------------------+---------+---------+
> +     | AMU                          | [47-44] |    n    |
> +     +------------------------------+---------+---------+

As mentioned above, please drop it.


Suzuki
Ionela Voinescu Jan. 30, 2020, 4:45 p.m. UTC | #5
Hi Suzuki,

On Thursday 30 Jan 2020 at 15:04:27 (+0000), Suzuki Kuruppassery Poulose wrote:
> Hi Ionela,
> 
> On 18/12/2019 18:26, Ionela Voinescu wrote:
> > The activity monitors extension is an optional extension introduced
> > by the ARMv8.4 CPU architecture.
> > 
> > Add initial documentation for the AMUv1 extension:
> >   - arm64/amu.txt: AMUv1 documentation
> >   - arm64/booting.txt: system registers initialisation
> >   - arm64/cpu-feature-registers.txt: visibility to userspace
> 
> We have stopped adding "invisible" fields to the list. So, you
> can drop the changes to cpu-feature-registers.txt.
> 
> > 
> > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Jonathan Corbet <corbet@lwn.net>
> > ---
> >   Documentation/arm64/amu.rst                   | 107 ++++++++++++++++++
> >   Documentation/arm64/booting.rst               |  14 +++
> >   Documentation/arm64/cpu-feature-registers.rst |   2 +
> >   Documentation/arm64/index.rst                 |   1 +
> >   4 files changed, 124 insertions(+)
> >   create mode 100644 Documentation/arm64/amu.rst
> > 
> > diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
> > new file mode 100644
> > index 000000000000..62a6635522e1
> > --- /dev/null
> > +++ b/Documentation/arm64/amu.rst
> > @@ -0,0 +1,107 @@
> > +=======================================================
> > +Activity Monitors Unit (AMU) extension in AArch64 Linux
> > +=======================================================
> > +
> > +Author: Ionela Voinescu <ionela.voinescu@arm.com>
> > +
> > +Date: 2019-09-10
> > +
> > +This document briefly describes the provision of Activity Monitors Unit
> > +support in AArch64 Linux.
> > +
> > +
> > +Architecture overview
> > +---------------------
> > +
> > +The activity monitors extension is an optional extension introduced by the
> > +ARMv8.4 CPU architecture.
> > +
> > +The activity monitors unit, implemented in each CPU, provides performance
> > +counters intended for system management use. The AMU extension provides a
> > +system register interface to the counter registers and also supports an
> > +optional external memory-mapped interface.
> > +
> > +Version 1 of the Activity Monitors architecture implements a counter group
> > +of four fixed and architecturally defined 64-bit event counters.
> > +  - CPU cycle counter: increments at the frequency of the CPU.
> > +  - Constant counter: increments at the fixed frequency of the system
> > +    clock.
> > +  - Instructions retired: increments with every architecturally executed
> > +    instruction.
> > +  - Memory stall cycles: counts instruction dispatch stall cycles caused by
> > +    misses in the last level cache within the clock domain.
> > +
> > +When in WFI or WFE these counters do not increment.
> > +
> > +The Activity Monitors architecture provides space for up to 16 architected
> > +event counters. Future versions of the architecture may use this space to
> > +implement additional architected event counters.
> > +
> > +Additionally, version 1 implements a counter group of up to 16 auxiliary
> > +64-bit event counters.
> > +
> > +On cold reset all counters reset to 0.
> > +
> > +
> > +Basic support
> > +-------------
> > +
> > +The kernel can safely run a mix of CPUs with and without support for the
> > +activity monitors extension.
> 
> 
>  Therefore, when CONFIG_ARM64_AMU_EXTN is
> > +selected we unconditionally enable the capability to allow any late CPU
> > +(secondary or hotplugged) to detect and use the feature.
> > +
> > +When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
> > +set, but this does not guarantee the correct functionality of the
> > +counters, only the presence of the extension.
> 
> nit: I would rather omit the implementation details (esp variable names)
> in the documentation. It may become a pain to keep this in sync with the
> code changes. You could simply mention, "we keep track of the availability
> of the feature" per CPU. If someone wants to figure out
> how, they can always read the code.
> 
> > +
> > +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> > +needed to:
> > + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> > +   registers.
> > + - Enable the counters. If not enabled these will read as 0.
> > + - Save/restore the counters before/after the CPU is being put/brought up
> > +   from the 'off' power state.
> > +
> > +When using kernels that have this configuration enabled but boot with
> > +broken firmware the user may experience panics or lockups when accessing
> > +the counter registers. Even if these symptoms are not observed, the
> > +values returned by the register reads might not correctly reflect reality.
> > +Most commonly, the counters will read as 0, indicating that they are not
> > +enabled. If proper support is not provided in firmware it's best to disable
> > +CONFIG_ARM64_AMU_EXTN.
> 
> For the sake of one kernel runs everywhere, do we need some other
> mechanism to disable the AMU. e.g kernel parameter to disable amu
> at runtime ?
>

The reason I've not added this is twofold:
 - Even if we add this, it should be in order to disable the use of the
   counters for a certain purpose, in this case  frequency invariance.
   On its own AMU provides the counters but it does not mandate their
   use.
 - I could add something to disable the use of the core and cycle
   counters for frequency invariance at runtime, but I doubt that
   anyone would use it. Logically it makes sense to use the counters
   order to have a more accurate view of the performance that the CPUs
   are actually providing. Therefore, until anyone asks for this, I
   thought it's better to keep it simple and not add extra switches,
   until there is a use for them.

Does it make sense?

P.S. I'll make all the other changes you've suggested in v3. 

Thank you,
Ionela.



> > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > index 5d78a6f5b0ae..a3f1a47b6f1c 100644
> > --- a/Documentation/arm64/booting.rst
> > +++ b/Documentation/arm64/booting.rst
> > @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
> >       - HCR_EL2.APK (bit 40) must be initialised to 0b1
> >       - HCR_EL2.API (bit 41) must be initialised to 0b1
> > +  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
> > +  - If EL3 is present:
> > +    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
> > +    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
> > +    AMCNTENSET0_EL0 must be initialised to 0b1111
> > +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> > +    having 0b1 set for the corresponding bit for each of the auxiliary
> > +    counters present.
> > +  - If the kernel is entered at EL1:
> > +    AMCNTENSET0_EL0 must be initialised to 0b1111
> > +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> > +    having 0b1 set for the corresponding bit for each of the auxiliary
> > +    counters present.
> > +
> >   The requirements described above for CPU mode, caches, MMUs, architected
> >   timers, coherency and system registers apply to all CPUs.  All CPUs must
> >   enter the kernel in the same exception level.
> > diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
> > index b6e44884e3ad..4770ae54032b 100644
> > --- a/Documentation/arm64/cpu-feature-registers.rst
> > +++ b/Documentation/arm64/cpu-feature-registers.rst
> > @@ -150,6 +150,8 @@ infrastructure:
> >        +------------------------------+---------+---------+
> >        | DIT                          | [51-48] |    y    |
> >        +------------------------------+---------+---------+
> > +     | AMU                          | [47-44] |    n    |
> > +     +------------------------------+---------+---------+
> 
> As mentioned above, please drop it.
> 
> 
> Suzuki
Suzuki K Poulose Jan. 30, 2020, 6:26 p.m. UTC | #6
Hi Ionela,

On Thu, Jan 30, 2020 at 04:45:42PM +0000, Ionela Voinescu wrote:
> Hi Suzuki,
> 
> On Thursday 30 Jan 2020 at 15:04:27 (+0000), Suzuki Kuruppassery Poulose wrote:
> > Hi Ionela,
> > 
> > On 18/12/2019 18:26, Ionela Voinescu wrote:
> > > The activity monitors extension is an optional extension introduced
> > > by the ARMv8.4 CPU architecture.
> > > 
> > > Add initial documentation for the AMUv1 extension:
> > >   - arm64/amu.txt: AMUv1 documentation
> > >   - arm64/booting.txt: system registers initialisation
> > >   - arm64/cpu-feature-registers.txt: visibility to userspace
> > 
> > We have stopped adding "invisible" fields to the list. So, you
> > can drop the changes to cpu-feature-registers.txt.
> > 
> > > 
> > > Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
> > > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > > Cc: Will Deacon <will@kernel.org>
> > > Cc: Jonathan Corbet <corbet@lwn.net>
> > > ---
> > >   Documentation/arm64/amu.rst                   | 107 ++++++++++++++++++
> > >   Documentation/arm64/booting.rst               |  14 +++
> > >   Documentation/arm64/cpu-feature-registers.rst |   2 +
> > >   Documentation/arm64/index.rst                 |   1 +
> > >   4 files changed, 124 insertions(+)
> > >   create mode 100644 Documentation/arm64/amu.rst
> > > 
> > > diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
> > > new file mode 100644
> > > index 000000000000..62a6635522e1
> > > --- /dev/null
> > > +++ b/Documentation/arm64/amu.rst
> > > @@ -0,0 +1,107 @@
> > > +-------------
> > > +
> > > +The kernel can safely run a mix of CPUs with and without support for the
> > > +activity monitors extension.
> > 
> > 
> >  Therefore, when CONFIG_ARM64_AMU_EXTN is
> > > +selected we unconditionally enable the capability to allow any late CPU
> > > +(secondary or hotplugged) to detect and use the feature.
> > > +
> > > +When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
> > > +set, but this does not guarantee the correct functionality of the
> > > +counters, only the presence of the extension.
> > 
> > nit: I would rather omit the implementation details (esp variable names)
> > in the documentation. It may become a pain to keep this in sync with the
> > code changes. You could simply mention, "we keep track of the availability
> > of the feature" per CPU. If someone wants to figure out
> > how, they can always read the code.
> > 
> > > +
> > > +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> > > +needed to:
> > > + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> > > +   registers.
> > > + - Enable the counters. If not enabled these will read as 0.
> > > + - Save/restore the counters before/after the CPU is being put/brought up
> > > +   from the 'off' power state.
> > > +
> > > +When using kernels that have this configuration enabled but boot with
> > > +broken firmware the user may experience panics or lockups when accessing
> > > +the counter registers. Even if these symptoms are not observed, the
> > > +values returned by the register reads might not correctly reflect reality.
> > > +Most commonly, the counters will read as 0, indicating that they are not
> > > +enabled. If proper support is not provided in firmware it's best to disable
> > > +CONFIG_ARM64_AMU_EXTN.
> > 
> > For the sake of one kernel runs everywhere, do we need some other
> > mechanism to disable the AMU. e.g kernel parameter to disable amu
> > at runtime ?
> >
> 
> The reason I've not added this is twofold:
>  - Even if we add this, it should be in order to disable the use of the
>    counters for a certain purpose, in this case  frequency invariance.
>    On its own AMU provides the counters but it does not mandate their
>    use.
>  - I could add something to disable the use of the core and cycle
>    counters for frequency invariance at runtime, but I doubt that
>    anyone would use it. Logically it makes sense to use the counters
>    order to have a more accurate view of the performance that the CPUs
>    are actually providing. Therefore, until anyone asks for this, I
>    thought it's better to keep it simple and not add extra switches,
>    until there is a use for them.
> 
> Does it make sense?

The comment is about addressing someone who must run an "AMU" enabled
kernel ("one kernel") on a system with potentially "broken firmware",
where there is no option to use the system as you mention above,
the firmware could panic. How common is the "broken firmware" ?
Right now there is no way to ensure "firmware" is sane and if
someone detects that firmware is broken, there is no way to
disable the AMU if they are running a standard distro kernel.
A kernel parameter could prevent the AMU capability from
being detected on a broken system and thus make it usable
(without the AMU of course). Now, if the "broken firmware"
is extremely rare, we could simply ignore this case and
ignore the suggestion.

Suzuki



> 
> P.S. I'll make all the other changes you've suggested in v3. 
> 
> Thank you,
> Ionela.
> 
> 
> 
> > > diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
> > > index 5d78a6f5b0ae..a3f1a47b6f1c 100644
> > > --- a/Documentation/arm64/booting.rst
> > > +++ b/Documentation/arm64/booting.rst
> > > @@ -248,6 +248,20 @@ Before jumping into the kernel, the following conditions must be met:
> > >       - HCR_EL2.APK (bit 40) must be initialised to 0b1
> > >       - HCR_EL2.API (bit 41) must be initialised to 0b1
> > > +  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
> > > +  - If EL3 is present:
> > > +    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
> > > +    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
> > > +    AMCNTENSET0_EL0 must be initialised to 0b1111
> > > +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> > > +    having 0b1 set for the corresponding bit for each of the auxiliary
> > > +    counters present.
> > > +  - If the kernel is entered at EL1:
> > > +    AMCNTENSET0_EL0 must be initialised to 0b1111
> > > +    AMCNTENSET1_EL0 must be initialised to a platform specific value
> > > +    having 0b1 set for the corresponding bit for each of the auxiliary
> > > +    counters present.
> > > +
> > >   The requirements described above for CPU mode, caches, MMUs, architected
> > >   timers, coherency and system registers apply to all CPUs.  All CPUs must
> > >   enter the kernel in the same exception level.
> > > diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
> > > index b6e44884e3ad..4770ae54032b 100644
> > > --- a/Documentation/arm64/cpu-feature-registers.rst
> > > +++ b/Documentation/arm64/cpu-feature-registers.rst
> > > @@ -150,6 +150,8 @@ infrastructure:
> > >        +------------------------------+---------+---------+
> > >        | DIT                          | [51-48] |    y    |
> > >        +------------------------------+---------+---------+
> > > +     | AMU                          | [47-44] |    n    |
> > > +     +------------------------------+---------+---------+
> > 
> > As mentioned above, please drop it.
> > 
> > 
> > Suzuki
Ionela Voinescu Jan. 31, 2020, 9:54 a.m. UTC | #7
On Thursday 30 Jan 2020 at 18:26:53 (+0000), Suzuki K Poulose wrote:
[..]
> > > > +Firmware (code running at higher exception levels, e.g. arm-tf) support is
> > > > +needed to:
> > > > + - Enable access for lower exception levels (EL2 and EL1) to the AMU
> > > > +   registers.
> > > > + - Enable the counters. If not enabled these will read as 0.
> > > > + - Save/restore the counters before/after the CPU is being put/brought up
> > > > +   from the 'off' power state.
> > > > +
> > > > +When using kernels that have this configuration enabled but boot with
> > > > +broken firmware the user may experience panics or lockups when accessing
> > > > +the counter registers. Even if these symptoms are not observed, the
> > > > +values returned by the register reads might not correctly reflect reality.
> > > > +Most commonly, the counters will read as 0, indicating that they are not
> > > > +enabled. If proper support is not provided in firmware it's best to disable
> > > > +CONFIG_ARM64_AMU_EXTN.
> > > 
> > > For the sake of one kernel runs everywhere, do we need some other
> > > mechanism to disable the AMU. e.g kernel parameter to disable amu
> > > at runtime ?
> > >
> > 
> > The reason I've not added this is twofold:
> >  - Even if we add this, it should be in order to disable the use of the
> >    counters for a certain purpose, in this case  frequency invariance.
> >    On its own AMU provides the counters but it does not mandate their
> >    use.
> >  - I could add something to disable the use of the core and cycle
> >    counters for frequency invariance at runtime, but I doubt that
> >    anyone would use it. Logically it makes sense to use the counters
> >    order to have a more accurate view of the performance that the CPUs
> >    are actually providing. Therefore, until anyone asks for this, I
> >    thought it's better to keep it simple and not add extra switches,
> >    until there is a use for them.
> > 
> > Does it make sense?
> 
> The comment is about addressing someone who must run an "AMU" enabled
> kernel ("one kernel") on a system with potentially "broken firmware",
> where there is no option to use the system as you mention above,
> the firmware could panic. How common is the "broken firmware" ?
> Right now there is no way to ensure "firmware" is sane and if
> someone detects that firmware is broken, there is no way to
> disable the AMU if they are running a standard distro kernel.
> A kernel parameter could prevent the AMU capability from
> being detected on a broken system and thus make it usable
> (without the AMU of course). Now, if the "broken firmware"
> is extremely rare, we could simply ignore this case and
> ignore the suggestion.
> 
> Suzuki
> 
>

Sorry Suzuki, I initially interpreted the question independently from
the context and only thought about cases where they are working
correctly but users might want to disable the use of them.

In this case, I don't see any harm in adding a command line parameter
to disable the use of the unit, even if it's only to support firmware
that does not support AMU at all, rather than the implementation being
broken.

I'm not really sure how common bad firmware would be. I suppose that
firmware as bad as to cause firmware panics and lockups would be quite
rare, but scenarios where firmware might not properly support AMU and
result in kernel lockups could be more often, and this would handle
both.

Thank you,
Ionela.
diff mbox series

Patch

diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
new file mode 100644
index 000000000000..62a6635522e1
--- /dev/null
+++ b/Documentation/arm64/amu.rst
@@ -0,0 +1,107 @@ 
+=======================================================
+Activity Monitors Unit (AMU) extension in AArch64 Linux
+=======================================================
+
+Author: Ionela Voinescu <ionela.voinescu@arm.com>
+
+Date: 2019-09-10
+
+This document briefly describes the provision of Activity Monitors Unit
+support in AArch64 Linux.
+
+
+Architecture overview
+---------------------
+
+The activity monitors extension is an optional extension introduced by the
+ARMv8.4 CPU architecture.
+
+The activity monitors unit, implemented in each CPU, provides performance
+counters intended for system management use. The AMU extension provides a
+system register interface to the counter registers and also supports an
+optional external memory-mapped interface.
+
+Version 1 of the Activity Monitors architecture implements a counter group
+of four fixed and architecturally defined 64-bit event counters.
+  - CPU cycle counter: increments at the frequency of the CPU.
+  - Constant counter: increments at the fixed frequency of the system
+    clock.
+  - Instructions retired: increments with every architecturally executed
+    instruction.
+  - Memory stall cycles: counts instruction dispatch stall cycles caused by
+    misses in the last level cache within the clock domain.
+
+When in WFI or WFE these counters do not increment.
+
+The Activity Monitors architecture provides space for up to 16 architected
+event counters. Future versions of the architecture may use this space to
+implement additional architected event counters.
+
+Additionally, version 1 implements a counter group of up to 16 auxiliary
+64-bit event counters.
+
+On cold reset all counters reset to 0.
+
+
+Basic support
+-------------
+
+The kernel can safely run a mix of CPUs with and without support for the
+activity monitors extension. Therefore, when CONFIG_ARM64_AMU_EXTN is
+selected we unconditionally enable the capability to allow any late CPU
+(secondary or hotplugged) to detect and use the feature.
+
+When the feature is detected on a CPU, a per-CPU variable (amu_feat) is
+set, but this does not guarantee the correct functionality of the
+counters, only the presence of the extension.
+
+Firmware (code running at higher exception levels, e.g. arm-tf) support is
+needed to:
+ - Enable access for lower exception levels (EL2 and EL1) to the AMU
+   registers.
+ - Enable the counters. If not enabled these will read as 0.
+ - Save/restore the counters before/after the CPU is being put/brought up
+   from the 'off' power state.
+
+When using kernels that have this configuration enabled but boot with
+broken firmware the user may experience panics or lockups when accessing
+the counter registers. Even if these symptoms are not observed, the
+values returned by the register reads might not correctly reflect reality.
+Most commonly, the counters will read as 0, indicating that they are not
+enabled. If proper support is not provided in firmware it's best to disable
+CONFIG_ARM64_AMU_EXTN.
+
+The fixed counters of AMUv1 are accessible though the following system
+register definitions:
+ - SYS_AMEVCNTR0_CORE_EL0
+ - SYS_AMEVCNTR0_CONST_EL0
+ - SYS_AMEVCNTR0_INST_RET_EL0
+ - SYS_AMEVCNTR0_MEM_STALL_EL0
+
+Auxiliary platform specific counters can be accessed using
+SYS_AMEVCNTR1_EL0(n), where n is a value between 0 and 15.
+
+Details can be found in: arch/arm64/include/asm/sysreg.h.
+
+
+Userspace access
+----------------
+
+Currently, access from userspace to the AMU registers is disabled due to:
+ - Security reasons: they might expose information about code executed in
+   secure mode.
+ - Purpose: AMU counters are intended for system management use.
+
+Also, the presence of the feature is not visible to userspace.
+
+
+Virtualization
+--------------
+
+Currently, access from userspace (EL0) and kernelspace (EL1) on the KVM
+guest side is disabled due to:
+ - Security reasons: they might expose information about code executed
+   by other guests or the host.
+
+Any attempt to access the AMU registers will result in an UNDEFINED
+exception being injected into the guest.
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index 5d78a6f5b0ae..a3f1a47b6f1c 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -248,6 +248,20 @@  Before jumping into the kernel, the following conditions must be met:
     - HCR_EL2.APK (bit 40) must be initialised to 0b1
     - HCR_EL2.API (bit 41) must be initialised to 0b1
 
+  For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
+  - If EL3 is present:
+    CPTR_EL3.TAM (bit 30) must be initialised to 0b0
+    CPTR_EL2.TAM (bit 30) must be initialised to 0b0
+    AMCNTENSET0_EL0 must be initialised to 0b1111
+    AMCNTENSET1_EL0 must be initialised to a platform specific value
+    having 0b1 set for the corresponding bit for each of the auxiliary
+    counters present.
+  - If the kernel is entered at EL1:
+    AMCNTENSET0_EL0 must be initialised to 0b1111
+    AMCNTENSET1_EL0 must be initialised to a platform specific value
+    having 0b1 set for the corresponding bit for each of the auxiliary
+    counters present.
+
 The requirements described above for CPU mode, caches, MMUs, architected
 timers, coherency and system registers apply to all CPUs.  All CPUs must
 enter the kernel in the same exception level.
diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index b6e44884e3ad..4770ae54032b 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -150,6 +150,8 @@  infrastructure:
      +------------------------------+---------+---------+
      | DIT                          | [51-48] |    y    |
      +------------------------------+---------+---------+
+     | AMU                          | [47-44] |    n    |
+     +------------------------------+---------+---------+
      | SVE                          | [35-32] |    y    |
      +------------------------------+---------+---------+
      | GIC                          | [27-24] |    n    |
diff --git a/Documentation/arm64/index.rst b/Documentation/arm64/index.rst
index 5c0c69dc58aa..09cbb4ed2237 100644
--- a/Documentation/arm64/index.rst
+++ b/Documentation/arm64/index.rst
@@ -6,6 +6,7 @@  ARM64 Architecture
     :maxdepth: 1
 
     acpi_object_usage
+    amu
     arm-acpi
     booting
     cpu-feature-registers