diff mbox series

[RFC] soc: fujitsu: Add cache driver code

Message ID 1614764303-34903-2-git-send-email-tan.shaopeng@jp.fujitsu.com (mailing list archive)
State Not Applicable
Headers show
Series [RFC] soc: fujitsu: Add cache driver code | expand

Commit Message

Shaopeng Tan March 3, 2021, 9:38 a.m. UTC
From: "tan.shaopeng" <tan.shaopeng@jp.fujitsu.com>

This driver offers cache functions(including hardware prefetch function
and sector cache function) for A64FX system.
When built as a module, this will be called as "fujitsu_cache".

Signed-off-by: tan.shaopeng <tan.shaopeng@jp.fujitsu.com>
---
 MAINTAINERS                         |   6 ++
 arch/arm64/Kconfig.platforms        |   5 +
 drivers/soc/Kconfig                 |   1 +
 drivers/soc/Makefile                |   1 +
 drivers/soc/fujitsu/Kconfig         |  26 +++++
 drivers/soc/fujitsu/fujitsu_cache.c | 183 ++++++++++++++++++++++++++++++++++++
 6 files changed, 222 insertions(+)
 create mode 100644 drivers/soc/fujitsu/Kconfig
 create mode 100644 drivers/soc/fujitsu/fujitsu_cache.c

Comments

Arnd Bergmann March 3, 2021, 11:24 a.m. UTC | #1
On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
<tan.shaopeng@jp.fujitsu.com> wrote:
> +
> +config FUJITSU_CACHE
> +        tristate "FUJITSU Cache Driver"
> +        depends on ARM64_VHE || COMPILE_TEST
> +        help
> +          FUJITSU Cache Driver
> +
> +          This driver offers cache functions for A64FX system.
> +         Loading this cache driver, control registers will be set to enable
> +         these functions, and advanced settings registers will be set by default
> +         values. After loading this driver, you can use the default values of the
> +         advanced settings registers or set the advanced settings registers
> +         from EL0. Unloading this driver, control registers will be clear to
> +         disable these functions.
> +          When built as a module, this will be called as "fujitsu_cache".

My feeling is that this code should be in arch/arm64/, as the cache
is generally considered part of the CPU, rather than part of the wider
SoC design, or something that can be controlled separately from the
core kernel and memory management code.

> +module_param(tagaddr_ctrl_reg, ulong, 0444);
> +MODULE_PARM_DESC(tagaddr_ctrl_reg, "HPC tag address override control register");
> +module_param(hwpf_ctrl_reg, ulong, 0444);
> +MODULE_PARM_DESC(hwpf_ctrl_reg, "hardware prefetch assist control register");
> +module_param(sec_ctrl_reg, ulong, 0444);
> +MODULE_PARM_DESC(sec_ctrl_reg, "sector cache control register");
> +module_param(sec_assign_reg, ulong, 0444);
> +MODULE_PARM_DESC(sec_assign_reg, "sector cache assign register");
> +module_param(sec_set0_l2_reg, ulong, 0444);
> +MODULE_PARM_DESC(sec_set0_l2_reg, "sector cache L2 way register(sector=0,1)");
> +module_param(sec_set1_l2_reg, ulong, 0444);
> +MODULE_PARM_DESC(sec_set1_l2_reg, "sector cache L2 way register(sector=2,3)");

My feeling is that the actual settings need to be on a higher level, not tied
to the specific register-level implementation of this chip. Normally,  the L2
cache is set up by the firmware according to local policy, and the settings
can either be read by the kernel from registers or passed down through the
device tree. It sounds like you want to control the policy at runtime in the
operating system rather than at boot time, so for each setting you wish to
override, there should be description of what the setting does and what
the purpose of overriding the firmware setting is.

Looking at the first one in the list, I see the specification mentions
multiple distinct features that can be enabled or disabled, so these
should probably get controlled individually. I also see that it is possible
to control this for TTBR1 and TTBR0 separately, and we probably
cannot allow user space (through module parameters or any other
interface) to control TTBR1, which is where the kernel resides.

The TTBR0 settings in turn would seem to interact with
CONFIG_ARM64_MTE, and should not be controlled independently
but through the same interfaces as that if we find that it does need
to be controlled at all.

I have not looked at any further details, but that should help get an
idea of what I think would happen with the other registers.

> +static int __init fujitsu_drv_init(void)
> +{
> +       int ret;
> +
> +       if (read_cpuid_implementor() != ARM_CPU_IMP_FUJITSU)
> +               return -ENODEV;
> +       if (read_cpuid_part_number() != FUJITSU_CPU_PART_A64FX)
> +               return -ENODEV;

The module_init function should not return an error when it is running on
incompatible hardware, please just change this to silently return success
to avoid warning about the failed initcall if the driver is built into a generic
kernel.

       Arnd
Shaopeng Tan (Fujitsu) March 4, 2021, 10:34 a.m. UTC | #2
Hi, 
 
Thanks for your comments

> On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> <tan.shaopeng@jp.fujitsu.com> wrote:
> > +
> > +config FUJITSU_CACHE
> > +        tristate "FUJITSU Cache Driver"
> > +        depends on ARM64_VHE || COMPILE_TEST
> > +        help
> > +          FUJITSU Cache Driver
> > +
> > +          This driver offers cache functions for A64FX system.
> > +         Loading this cache driver, control registers will be set to enable
> > +         these functions, and advanced settings registers will be set by
> default
> > +         values. After loading this driver, you can use the default values of
> the
> > +         advanced settings registers or set the advanced settings registers
> > +         from EL0. Unloading this driver, control registers will be clear to
> > +         disable these functions.
> > +          When built as a module, this will be called as "fujitsu_cache".
> 
> My feeling is that this code should be in arch/arm64/, as the cache
> is generally considered part of the CPU, rather than part of the wider
> SoC design, or something that can be controlled separately from the
> core kernel and memory management code.

Thanks for your advice. I also would like to hear the opinions from 
other soc&arm maintainers, and then consider whether to add this to 
arch/arm64/. 

> > +module_param(tagaddr_ctrl_reg, ulong, 0444);
> > +MODULE_PARM_DESC(tagaddr_ctrl_reg, "HPC tag address override
> control register");
> > +module_param(hwpf_ctrl_reg, ulong, 0444);
> > +MODULE_PARM_DESC(hwpf_ctrl_reg, "hardware prefetch assist control
> register");
> > +module_param(sec_ctrl_reg, ulong, 0444);
> > +MODULE_PARM_DESC(sec_ctrl_reg, "sector cache control register");
> > +module_param(sec_assign_reg, ulong, 0444);
> > +MODULE_PARM_DESC(sec_assign_reg, "sector cache assign register");
> > +module_param(sec_set0_l2_reg, ulong, 0444);
> > +MODULE_PARM_DESC(sec_set0_l2_reg, "sector cache L2 way
> register(sector=0,1)");
> > +module_param(sec_set1_l2_reg, ulong, 0444);
> > +MODULE_PARM_DESC(sec_set1_l2_reg, "sector cache L2 way
> register(sector=2,3)");
> 
> My feeling is that the actual settings need to be on a higher level, not tied
> to the specific register-level implementation of this chip. Normally,  the L2
> cache is set up by the firmware according to local policy, and the settings
> can either be read by the kernel from registers or passed down through the
> device tree. It sounds like you want to control the policy at runtime in the
> operating system rather than at boot time, so for each setting you wish to
> override, there should be description of what the setting does and what
> the purpose of overriding the firmware setting is.

OK, I will change module parameter from specific register-level to 
a higher level. And, I will modify the description of module parameters. 
To be clear, we don't suppose these parameters (EL1 registers) are 
often changed at runtime.

> Looking at the first one in the list, I see the specification mentions
> multiple distinct features that can be enabled or disabled, so these
> should probably get controlled individually.

It is not necessary to enable/disable every feature individually. 
There are no plans to use these features individually.  

> I also see that it is possible
> to control this for TTBR1 and TTBR0 separately, and we probably
> cannot allow user space (through module parameters or any other
> interface) to control TTBR1, which is where the kernel resides.

This driver does not change the values of TTBR0 and TTBR1, 
and the values of TCR_EL1.TBI0 and TCR_EL1.TBI1.
The cache functions can be used when TBI is already enabled.

> The TTBR0 settings in turn would seem to interact with
> CONFIG_ARM64_MTE, and should not be controlled independently
> but through the same interfaces as that if we find that it does need
> to be controlled at all.

MTE is not supported on A64FX. 
So, this function does not conflict with MTE.

> I have not looked at any further details, but that should help get an
> idea of what I think would happen with the other registers.
> 
> > +static int __init fujitsu_drv_init(void)
> > +{
> > +       int ret;
> > +
> > +       if (read_cpuid_implementor() != ARM_CPU_IMP_FUJITSU)
> > +               return -ENODEV;
> > +       if (read_cpuid_part_number() != FUJITSU_CPU_PART_A64FX)
> > +               return -ENODEV;
> 
> The module_init function should not return an error when it is running on
> incompatible hardware, please just change this to silently return success
> to avoid warning about the failed initcall if the driver is built into a generic
> kernel.

OK, I will change these codes to return success. 

Best regards, 
Tan Shaopeng
Will Deacon March 4, 2021, 10:46 a.m. UTC | #3
On Thu, Mar 04, 2021 at 10:34:43AM +0000, tan.shaopeng@fujitsu.com wrote:
> > On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> > <tan.shaopeng@jp.fujitsu.com> wrote:
> > > +
> > > +config FUJITSU_CACHE
> > > +        tristate "FUJITSU Cache Driver"
> > > +        depends on ARM64_VHE || COMPILE_TEST
> > > +        help
> > > +          FUJITSU Cache Driver
> > > +
> > > +          This driver offers cache functions for A64FX system.
> > > +         Loading this cache driver, control registers will be set to enable
> > > +         these functions, and advanced settings registers will be set by
> > default
> > > +         values. After loading this driver, you can use the default values of
> > the
> > > +         advanced settings registers or set the advanced settings registers
> > > +         from EL0. Unloading this driver, control registers will be clear to
> > > +         disable these functions.
> > > +          When built as a module, this will be called as "fujitsu_cache".
> > 
> > My feeling is that this code should be in arch/arm64/, as the cache
> > is generally considered part of the CPU, rather than part of the wider
> > SoC design, or something that can be controlled separately from the
> > core kernel and memory management code.
> 
> Thanks for your advice. I also would like to hear the opinions from 
> other soc&arm maintainers, and then consider whether to add this to 
> arch/arm64/.

Given that all of this is outside of the scope of the architecture, I don't
think that arch/arm64/ is the right place for it. Perhaps this would fit
into the resctrl rework that James has been doing for MPAM?

Will
Arnd Bergmann March 4, 2021, 10:54 a.m. UTC | #4
On Thu, Mar 4, 2021 at 11:34 AM tan.shaopeng@fujitsu.com
<tan.shaopeng@fujitsu.com> wrote:
> > On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> > <tan.shaopeng@jp.fujitsu.com> wrote:

> > > +module_param(tagaddr_ctrl_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(tagaddr_ctrl_reg, "HPC tag address override
> > control register");
> > > +module_param(hwpf_ctrl_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(hwpf_ctrl_reg, "hardware prefetch assist control
> > register");
> > > +module_param(sec_ctrl_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(sec_ctrl_reg, "sector cache control register");
> > > +module_param(sec_assign_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(sec_assign_reg, "sector cache assign register");
> > > +module_param(sec_set0_l2_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(sec_set0_l2_reg, "sector cache L2 way
> > register(sector=0,1)");
> > > +module_param(sec_set1_l2_reg, ulong, 0444);
> > > +MODULE_PARM_DESC(sec_set1_l2_reg, "sector cache L2 way
> > register(sector=2,3)");
> >
> > My feeling is that the actual settings need to be on a higher level, not tied
> > to the specific register-level implementation of this chip. Normally,  the L2
> > cache is set up by the firmware according to local policy, and the settings
> > can either be read by the kernel from registers or passed down through the
> > device tree. It sounds like you want to control the policy at runtime in the
> > operating system rather than at boot time, so for each setting you wish to
> > override, there should be description of what the setting does and what
> > the purpose of overriding the firmware setting is.
>
> OK, I will change module parameter from specific register-level to
> a higher level. And, I will modify the description of module parameters.
> To be clear, we don't suppose these parameters (EL1 registers) are
> often changed at runtime.
>
> > Looking at the first one in the list, I see the specification mentions
> > multiple distinct features that can be enabled or disabled, so these
> > should probably get controlled individually.
>
> It is not necessary to enable/disable every feature individually.
> There are no plans to use these features individually.

Ok, in that case you probably want a smaller number of global
settings can describe all the combinations one may need in
practice.

> > I also see that it is possible
> > to control this for TTBR1 and TTBR0 separately, and we probably
> > cannot allow user space (through module parameters or any other
> > interface) to control TTBR1, which is where the kernel resides.
>
> This driver does not change the values of TTBR0 and TTBR1,
> and the values of TCR_EL1.TBI0 and TCR_EL1.TBI1.
> The cache functions can be used when TBI is already enabled.
>
> > The TTBR0 settings in turn would seem to interact with
> > CONFIG_ARM64_MTE, and should not be controlled independently
> > but through the same interfaces as that if we find that it does need
> > to be controlled at all.
>
> MTE is not supported on A64FX.
> So, this function does not conflict with MTE.

But could you guarantee that no future generation might support
both features? In either case, it sounds like the kernel would need
to know when this is enabled in the same way: as far as I understand,
both of these put extra information into the upper eight bits of a
pointer, and any code that interacts with user addresses would need
to know about this.

       Arnd
Arnd Bergmann March 4, 2021, 10:58 a.m. UTC | #5
On Thu, Mar 4, 2021 at 11:46 AM Will Deacon <will@kernel.org> wrote:
> On Thu, Mar 04, 2021 at 10:34:43AM +0000, tan.shaopeng@fujitsu.com wrote:
> > > On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> > > <tan.shaopeng@jp.fujitsu.com> wrote:
> > > > +
> > > > +config FUJITSU_CACHE
> > > > +        tristate "FUJITSU Cache Driver"
> > > > +        depends on ARM64_VHE || COMPILE_TEST
> > > > +        help
> > > > +          FUJITSU Cache Driver
> > > > +
> > > > +          This driver offers cache functions for A64FX system.
> > > > +         Loading this cache driver, control registers will be set to enable
> > > > +         these functions, and advanced settings registers will be set by
> > > default
> > > > +         values. After loading this driver, you can use the default values of
> > > the
> > > > +         advanced settings registers or set the advanced settings registers
> > > > +         from EL0. Unloading this driver, control registers will be clear to
> > > > +         disable these functions.
> > > > +          When built as a module, this will be called as "fujitsu_cache".
> > >
> > > My feeling is that this code should be in arch/arm64/, as the cache
> > > is generally considered part of the CPU, rather than part of the wider
> > > SoC design, or something that can be controlled separately from the
> > > core kernel and memory management code.
> >
> > Thanks for your advice. I also would like to hear the opinions from
> > other soc&arm maintainers, and then consider whether to add this to
> > arch/arm64/.
>
> Given that all of this is outside of the scope of the architecture, I don't
> think that arch/arm64/ is the right place for it. Perhaps this would fit
> into the resctrl rework that James has been doing for MPAM?

Indeed, that sounds like a good starting point. I don't understand enough
about either of the two to be sure, but it sounds like there is some overlap
in functionality, and ideally we would have one user interface that can
deal with all the hardware implementations (intel, arm, fujitsu and any
future ones).

      Arnd
Shaopeng Tan (Fujitsu) March 5, 2021, 7:48 a.m. UTC | #6
> On Thu, Mar 4, 2021 at 11:46 AM Will Deacon <will@kernel.org> wrote:
> > On Thu, Mar 04, 2021 at 10:34:43AM +0000, tan.shaopeng@fujitsu.com
> wrote:
> > > > On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> > > > <tan.shaopeng@jp.fujitsu.com> wrote:
> > > > > +
> > > > > +config FUJITSU_CACHE
> > > > > +        tristate "FUJITSU Cache Driver"
> > > > > +        depends on ARM64_VHE || COMPILE_TEST
> > > > > +        help
> > > > > +          FUJITSU Cache Driver
> > > > > +
> > > > > +          This driver offers cache functions for A64FX system.
> > > > > +         Loading this cache driver, control registers will be set to
> enable
> > > > > +         these functions, and advanced settings registers will be set
> by
> > > > default
> > > > > +         values. After loading this driver, you can use the default
> values of
> > > > the
> > > > > +         advanced settings registers or set the advanced settings
> registers
> > > > > +         from EL0. Unloading this driver, control registers will be
> clear to
> > > > > +         disable these functions.
> > > > > +          When built as a module, this will be called as
> "fujitsu_cache".
> > > >
> > > > My feeling is that this code should be in arch/arm64/, as the cache
> > > > is generally considered part of the CPU, rather than part of the wider
> > > > SoC design, or something that can be controlled separately from the
> > > > core kernel and memory management code.
> > >
> > > Thanks for your advice. I also would like to hear the opinions from
> > > other soc&arm maintainers, and then consider whether to add this to
> > > arch/arm64/.
> >
> > Given that all of this is outside of the scope of the architecture, I don't
> > think that arch/arm64/ is the right place for it. Perhaps this would fit
> > into the resctrl rework that James has been doing for MPAM?
> 
> Indeed, that sounds like a good starting point. I don't understand enough
> about either of the two to be sure, but it sounds like there is some overlap
> in functionality, and ideally we would have one user interface that can
> deal with all the hardware implementations (intel, arm, fujitsu and any
> future ones).

I don't know resctrl well and I’m not sure if it can be used for sector cache now.
Also, it seems to me that resctrl has no control with hardware prefetch. 
I will learn more about resctrl.

Best regards, 
Tan Shaopeng
Shaopeng Tan (Fujitsu) March 5, 2021, 8:10 a.m. UTC | #7
> On Thu, Mar 4, 2021 at 11:34 AM tan.shaopeng@fujitsu.com
> <tan.shaopeng@fujitsu.com> wrote:
> > > On Wed, Mar 3, 2021 at 10:38 AM tan.shaopeng
> > > <tan.shaopeng@jp.fujitsu.com> wrote:
> 
> > > > +module_param(tagaddr_ctrl_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(tagaddr_ctrl_reg, "HPC tag address override
> > > control register");
> > > > +module_param(hwpf_ctrl_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(hwpf_ctrl_reg, "hardware prefetch assist
> control
> > > register");
> > > > +module_param(sec_ctrl_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(sec_ctrl_reg, "sector cache control register");
> > > > +module_param(sec_assign_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(sec_assign_reg, "sector cache assign
> register");
> > > > +module_param(sec_set0_l2_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(sec_set0_l2_reg, "sector cache L2 way
> > > register(sector=0,1)");
> > > > +module_param(sec_set1_l2_reg, ulong, 0444);
> > > > +MODULE_PARM_DESC(sec_set1_l2_reg, "sector cache L2 way
> > > register(sector=2,3)");
> > >
> > > My feeling is that the actual settings need to be on a higher level, not tied
> > > to the specific register-level implementation of this chip. Normally,  the L2
> > > cache is set up by the firmware according to local policy, and the settings
> > > can either be read by the kernel from registers or passed down through the
> > > device tree. It sounds like you want to control the policy at runtime in the
> > > operating system rather than at boot time, so for each setting you wish to
> > > override, there should be description of what the setting does and what
> > > the purpose of overriding the firmware setting is.
> >
> > OK, I will change module parameter from specific register-level to
> > a higher level. And, I will modify the description of module parameters.
> > To be clear, we don't suppose these parameters (EL1 registers) are
> > often changed at runtime.
> >
> > > Looking at the first one in the list, I see the specification mentions
> > > multiple distinct features that can be enabled or disabled, so these
> > > should probably get controlled individually.
> >
> > It is not necessary to enable/disable every feature individually.
> > There are no plans to use these features individually.
> 
> Ok, in that case you probably want a smaller number of global
> settings can describe all the combinations one may need in
> practice.

Thanks for your advice. 
I will consider the optimal module parameters if we end up make a driver.

> > > I also see that it is possible
> > > to control this for TTBR1 and TTBR0 separately, and we probably
> > > cannot allow user space (through module parameters or any other
> > > interface) to control TTBR1, which is where the kernel resides.
> >
> > This driver does not change the values of TTBR0 and TTBR1,
> > and the values of TCR_EL1.TBI0 and TCR_EL1.TBI1.
> > The cache functions can be used when TBI is already enabled.
> >
> > > The TTBR0 settings in turn would seem to interact with
> > > CONFIG_ARM64_MTE, and should not be controlled independently
> > > but through the same interfaces as that if we find that it does need
> > > to be controlled at all.
> >
> > MTE is not supported on A64FX.
> > So, this function does not conflict with MTE.
> 
> But could you guarantee that no future generation might support
> both features? In either case, it sounds like the kernel would need
> to know when this is enabled in the same way: as far as I understand,
> both of these put extra information into the upper eight bits of a
> pointer, and any code that interacts with user addresses would need
> to know about this.

I think these functions and MTE cannot be enabled 
at the same time on hardware level.

Best regards,
Tan Shaopeng
Shaopeng Tan (Fujitsu) March 31, 2021, 8:52 a.m. UTC | #8
Dear James Morse

> > Given that all of this is outside of the scope of the architecture, I
> > don't think that arch/arm64/ is the right place for it. Perhaps this
> > would fit into the resctrl rework that James has been doing for MPAM?
> 
> Indeed, that sounds like a good starting point. I don't understand enough about
> either of the two to be sure, but it sounds like there is some overlap in
> functionality, and ideally we would have one user interface that can deal with
> all the hardware implementations (intel, arm, fujitsu and any future ones).

According to my study, it seems possible to add A64FX's sector cache function
to resctrl. I heard you are working on resctrl rework for MPAM , 
but I cannot find related patches on ML archive. Therefore, 
I would like to know the status about MPAM support. 

(1) I think the first step is to support resctrl for ARM arch. 
   Have you finished the work of arm support and when will you release it? 
(2) When will you release MPAM patch?

Best regards
Tan Shaopeng
James Morse April 1, 2021, 4:15 p.m. UTC | #9
Hi Tan Shaopeng,

On 31/03/2021 09:52, tan.shaopeng@fujitsu.com wrote:
>>> Given that all of this is outside of the scope of the architecture, I
>>> don't think that arch/arm64/ is the right place for it. Perhaps this
>>> would fit into the resctrl rework that James has been doing for MPAM?
>>
>> Indeed, that sounds like a good starting point. I don't understand enough about
>> either of the two to be sure, but it sounds like there is some overlap in
>> functionality, and ideally we would have one user interface that can deal with
>> all the hardware implementations (intel, arm, fujitsu and any future ones).

> According to my study, it seems possible to add A64FX's sector cache function
> to resctrl.

I think this depends on whether it maps to one of Intel RDT's existing schema.
While adding new ones looks easy, it is bad for user-space as they are not portable
between machines that support resctrl.


> I heard you are working on resctrl rework for MPAM , 
> but I cannot find related patches on ML archive. Therefore, 
> I would like to know the status about MPAM support. 

Its about five or six series that refactor resctrl inside arch/x86 to have a clear
boundary between arch-specific code and the resctrl code that implements the filesystem,
then pull it out to /fs/ and build the MPAM driver to make it work for arm64.

Unfortunately, its rather large (>100 patches), so will take some time to be reviewed.


> (1) I think the first step is to support resctrl for ARM arch. 
>    Have you finished the work of arm support and when will you release it? 

No-one wants a second copy of the code to implement resctrl, as this will introduce subtle
bugs that user-space would have to work around, and make it harder to merge later.

The first step is to refactor the arch/x86 implementation of resctrl so that the parts
that are visible to user-space can be moved somewhere that is common to multiple
architectures.

I've posted the next chunk of that work here:
https://lore.kernel.org/lkml/20210312175849.8327-1-james.morse@arm.com/

> (2) When will you release MPAM patch?

The latest complete version of the tree is here:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpam/snapshot/feb

It is over a year old, and has plenty of bugs.

I hope to push a newer version of the branch once I have a complete copy of the tree that
has been tested, and is based on the current version of the ACPI table.



Thanks,

James
Shaopeng Tan (Fujitsu) April 2, 2021, 8:44 a.m. UTC | #10
Hi James Morse

> On 31/03/2021 09:52, tan.shaopeng@fujitsu.com wrote:
> >>> Given that all of this is outside of the scope of the architecture,
> >>> I don't think that arch/arm64/ is the right place for it. Perhaps
> >>> this would fit into the resctrl rework that James has been doing for
> MPAM?
> >>
> >> Indeed, that sounds like a good starting point. I don't understand
> >> enough about either of the two to be sure, but it sounds like there
> >> is some overlap in functionality, and ideally we would have one user
> >> interface that can deal with all the hardware implementations (intel, arm,
> fujitsu and any future ones).
> 
> > According to my study, it seems possible to add A64FX's sector cache
> > function to resctrl.
> 
> I think this depends on whether it maps to one of Intel RDT's existing schema.
> While adding new ones looks easy, it is bad for user-space as they are not
> portable between machines that support resctrl.

Thanks for your advice, I will try finding a way to add
the sector cache function in resctrl by using existing schema.

> > I heard you are working on resctrl rework for MPAM , but I cannot find
> > related patches on ML archive. Therefore, I would like to know the
> > status about MPAM support.
> 
> Its about five or six series that refactor resctrl inside arch/x86 to have a clear
> boundary between arch-specific code and the resctrl code that implements the
> filesystem, then pull it out to /fs/ and build the MPAM driver to make it work for
> arm64.
> 
> Unfortunately, its rather large (>100 patches), so will take some time to be
> reviewed.
> 
> 
> > (1) I think the first step is to support resctrl for ARM arch.
> >    Have you finished the work of arm support and when will you release it?
> 
> No-one wants a second copy of the code to implement resctrl, as this will
> introduce subtle bugs that user-space would have to work around, and make it
> harder to merge later.
> 
> The first step is to refactor the arch/x86 implementation of resctrl so that the
> parts that are visible to user-space can be moved somewhere that is common
> to multiple architectures.
> 
> I've posted the next chunk of that work here:
> https://lore.kernel.org/lkml/20210312175849.8327-1-james.morse@arm.com/
> 
> > (2) When will you release MPAM patch?
> 
> The latest complete version of the tree is here:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/log/?h=mpa
> m/snapshot/feb
> 
> It is over a year old, and has plenty of bugs.
> 
> I hope to push a newer version of the branch once I have a complete copy of the
> tree that has been tested, and is based on the current version of the ACPI table.

Thanks for your information. Let me see the patch.

Thanks, 
Tan Shaopeng
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 6eff4f7..93da2f9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1508,6 +1508,12 @@  T:	git git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git
 F:	arch/arm/mach-*/
 F:	arch/arm/plat-*/
 
+ARM/A64FX SOC SUPPORT
+M:	SHAOPENG TAN <tan.shaopeng@jp.fujitsu.com>
+L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
+S:	Maintained
+F:	drivers/soc/fujitsu/
+
 ARM/ACTIONS SEMI ARCHITECTURE
 M:	Andreas Färber <afaerber@suse.de>
 M:	Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
diff --git a/arch/arm64/Kconfig.platforms b/arch/arm64/Kconfig.platforms
index 6eecdef..41fb214 100644
--- a/arch/arm64/Kconfig.platforms
+++ b/arch/arm64/Kconfig.platforms
@@ -1,6 +1,11 @@ 
 # SPDX-License-Identifier: GPL-2.0-only
 menu "Platform selection"
 
+config ARCH_A64FX
+	bool "Fujitsu A64FX Platforms"
+	help
+	  This enables support for Fujitsu A64FX SoC family.
+
 config ARCH_ACTIONS
 	bool "Actions Semi Platforms"
 	select OWL_TIMER
diff --git a/drivers/soc/Kconfig b/drivers/soc/Kconfig
index d097d07..7a52b5d 100644
--- a/drivers/soc/Kconfig
+++ b/drivers/soc/Kconfig
@@ -7,6 +7,7 @@  source "drivers/soc/aspeed/Kconfig"
 source "drivers/soc/atmel/Kconfig"
 source "drivers/soc/bcm/Kconfig"
 source "drivers/soc/fsl/Kconfig"
+source "drivers/soc/fujitsu/Kconfig"
 source "drivers/soc/imx/Kconfig"
 source "drivers/soc/ixp4xx/Kconfig"
 source "drivers/soc/litex/Kconfig"
diff --git a/drivers/soc/Makefile b/drivers/soc/Makefile
index 699b758..57c0ddd 100644
--- a/drivers/soc/Makefile
+++ b/drivers/soc/Makefile
@@ -10,6 +10,7 @@  obj-y				+= bcm/
 obj-$(CONFIG_ARCH_DOVE)		+= dove/
 obj-$(CONFIG_MACH_DOVE)		+= dove/
 obj-y				+= fsl/
+obj-y				+= fujitsu/
 obj-$(CONFIG_ARCH_GEMINI)	+= gemini/
 obj-y				+= imx/
 obj-$(CONFIG_ARCH_IXP4XX)	+= ixp4xx/
diff --git a/drivers/soc/fujitsu/Kconfig b/drivers/soc/fujitsu/Kconfig
new file mode 100644
index 0000000..7754fb5
--- /dev/null
+++ b/drivers/soc/fujitsu/Kconfig
@@ -0,0 +1,26 @@ 
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# FUJITSU SoC drivers
+#
+menuconfig SOC_FUJITSU
+	bool "FUJITSU SoC drivers"
+	depends on ARCH_A64FX || COMPILE_TEST
+
+if SOC_FUJITSU
+
+config FUJITSU_CACHE
+        tristate "FUJITSU Cache Driver"
+        depends on ARM64_VHE || COMPILE_TEST
+        help
+          FUJITSU Cache Driver
+
+          This driver offers cache functions for A64FX system.
+	  Loading this cache driver, control registers will be set to enable
+	  these functions, and advanced settings registers will be set by default
+	  values. After loading this driver, you can use the default values of the
+	  advanced settings registers or set the advanced settings registers
+	  from EL0. Unloading this driver, control registers will be clear to
+	  disable these functions.
+          When built as a module, this will be called as "fujitsu_cache".
+
+endif # SOC_FUJITSU
diff --git a/drivers/soc/fujitsu/fujitsu_cache.c b/drivers/soc/fujitsu/fujitsu_cache.c
new file mode 100644
index 0000000..852fe1c
--- /dev/null
+++ b/drivers/soc/fujitsu/fujitsu_cache.c
@@ -0,0 +1,183 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2021 FUJITSU LIMITED
+ *
+ * This driver enables HPC tag address override function &
+ * hardware prefetch assist function & sector cache function on A64FX.
+ *
+ * After loading this driver, detail settings of these functions
+ * can be made from EL0
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/cpuhotplug.h>
+#include <asm/cputype.h>
+#include <asm/sysreg.h>
+
+#ifdef pr_fmt
+#undef pr_fmt
+#endif
+#define pr_fmt(fmt) "[%s:%s:%d] " fmt, KBUILD_MODNAME, __func__, __LINE__
+
+#define TAG_ADDRESS_CTRL_EL1_ENABLE	GENMASK(31, 0)
+#define HWPF_CTRL_EL1_ENABLE		GENMASK(63, 0)
+#define SEC_CTRL_EL1_ENABLE		GENMASK(63, 62)
+#define SEC_ASSIGN_EL1_DEFAULT		((u64)0)
+#define SEC_SET0_L2_EL1_DEFAULT		GENMASK(63, 0)
+#define SEC_SET1_L2_EL1_DEFAULT		GENMASK(63, 0)
+
+/* hpc tag address */
+#define IMP_FJ_TAG_ADDRESS_CTRL_EL1	sys_reg(3, 0, 11, 2, 0)
+
+/* hardware prefetch assist */
+#define IMP_PF_CTRL_EL1			sys_reg(3, 0, 11, 4, 0)
+#define IMP_PF_STREAM_DETECT_CTRL_EL0	sys_reg(3, 3, 11, 4, 0)
+#define IMP_PF_INJECTION_CTRL0_EL0	sys_reg(3, 3, 11, 6, 0)
+#define IMP_PF_INJECTION_CTRL1_EL0	sys_reg(3, 3, 11, 6, 1)
+#define IMP_PF_INJECTION_CTRL2_EL0	sys_reg(3, 3, 11, 6, 2)
+#define IMP_PF_INJECTION_CTRL3_EL0	sys_reg(3, 3, 11, 6, 3)
+#define IMP_PF_INJECTION_CTRL4_EL0	sys_reg(3, 3, 11, 6, 4)
+#define IMP_PF_INJECTION_CTRL5_EL0	sys_reg(3, 3, 11, 6, 5)
+#define IMP_PF_INJECTION_CTRL6_EL0	sys_reg(3, 3, 11, 6, 6)
+#define IMP_PF_INJECTION_CTRL7_EL0	sys_reg(3, 3, 11, 6, 7)
+#define IMP_PF_INJECTION_DISTANCE0_EL0	sys_reg(3, 3, 11, 7, 0)
+#define IMP_PF_INJECTION_DISTANCE1_EL0	sys_reg(3, 3, 11, 7, 1)
+#define IMP_PF_INJECTION_DISTANCE2_EL0	sys_reg(3, 3, 11, 7, 2)
+#define IMP_PF_INJECTION_DISTANCE3_EL0	sys_reg(3, 3, 11, 7, 3)
+#define IMP_PF_INJECTION_DISTANCE4_EL0	sys_reg(3, 3, 11, 7, 4)
+#define IMP_PF_INJECTION_DISTANCE5_EL0	sys_reg(3, 3, 11, 7, 5)
+#define IMP_PF_INJECTION_DISTANCE6_EL0	sys_reg(3, 3, 11, 7, 6)
+#define IMP_PF_INJECTION_DISTANCE7_EL0	sys_reg(3, 3, 11, 7, 7)
+
+/* sector cache */
+#define IMP_SCCR_CTRL_EL1		sys_reg(3, 0, 11, 8, 0)
+#define IMP_SCCR_ASSIGN_EL1		sys_reg(3, 0, 11, 8, 1)
+#define IMP_SCCR_L1_EL0			sys_reg(3, 3, 11, 8, 2)
+#define IMP_SCCR_SET0_L2_EL1		sys_reg(3, 0, 15, 8, 2)
+#define IMP_SCCR_SET1_L2_EL1		sys_reg(3, 0, 15, 8, 3)
+#define IMP_SCCR_VSCCR_L2_EL0		sys_reg(3, 3, 15, 8, 2)
+
+static unsigned long tagaddr_ctrl_reg = TAG_ADDRESS_CTRL_EL1_ENABLE;
+static unsigned long hwpf_ctrl_reg = HWPF_CTRL_EL1_ENABLE;
+static unsigned long sec_ctrl_reg = SEC_CTRL_EL1_ENABLE;
+static unsigned long sec_assign_reg = SEC_ASSIGN_EL1_DEFAULT;
+static unsigned long sec_set0_l2_reg = SEC_SET0_L2_EL1_DEFAULT;
+static unsigned long sec_set1_l2_reg = SEC_SET1_L2_EL1_DEFAULT;
+
+module_param(tagaddr_ctrl_reg, ulong, 0444);
+MODULE_PARM_DESC(tagaddr_ctrl_reg, "HPC tag address override control register");
+module_param(hwpf_ctrl_reg, ulong, 0444);
+MODULE_PARM_DESC(hwpf_ctrl_reg, "hardware prefetch assist control register");
+module_param(sec_ctrl_reg, ulong, 0444);
+MODULE_PARM_DESC(sec_ctrl_reg, "sector cache control register");
+module_param(sec_assign_reg, ulong, 0444);
+MODULE_PARM_DESC(sec_assign_reg, "sector cache assign register");
+module_param(sec_set0_l2_reg, ulong, 0444);
+MODULE_PARM_DESC(sec_set0_l2_reg, "sector cache L2 way register(sector=0,1)");
+module_param(sec_set1_l2_reg, ulong, 0444);
+MODULE_PARM_DESC(sec_set1_l2_reg, "sector cache L2 way register(sector=2,3)");
+
+static enum cpuhp_state _hp_state;
+
+static void fujitsu_hwpf_setting_regs_clear(void)
+{
+	write_sysreg_s(0, IMP_PF_STREAM_DETECT_CTRL_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL0_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL1_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL2_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL3_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL4_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL5_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL6_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_CTRL7_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE0_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE1_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE2_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE3_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE4_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE5_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE6_EL0);
+	write_sysreg_s(0, IMP_PF_INJECTION_DISTANCE7_EL0);
+}
+
+static void fujitsu_sec_setting_regs_clear(void)
+{
+	write_sysreg_s(0, IMP_SCCR_L1_EL0);
+	write_sysreg_s(0, IMP_SCCR_VSCCR_L2_EL0);
+}
+
+static void fujitsu_cache_destroy(void *none)
+{
+	/* just clear the values of all registers */
+	write_sysreg_s(0, IMP_FJ_TAG_ADDRESS_CTRL_EL1);
+
+	write_sysreg_s(0, IMP_PF_CTRL_EL1);
+	fujitsu_hwpf_setting_regs_clear();
+
+	write_sysreg_s(0, IMP_SCCR_CTRL_EL1);
+	write_sysreg_s(0, IMP_SCCR_ASSIGN_EL1);
+	write_sysreg_s(0, IMP_SCCR_SET0_L2_EL1);
+	write_sysreg_s(0, IMP_SCCR_SET1_L2_EL1);
+	fujitsu_sec_setting_regs_clear();
+}
+
+static int fujitsu_cache_init(unsigned int cpu)
+{
+	/* Enabled HPC tag address override,
+	 * and then hardware prefetch assist & sector cache can be customized.
+	 * Default values can be changed by module parameters.
+	 */
+	write_sysreg_s(tagaddr_ctrl_reg, IMP_FJ_TAG_ADDRESS_CTRL_EL1);
+
+	/* hardware prefetch assist */
+	write_sysreg_s(hwpf_ctrl_reg, IMP_PF_CTRL_EL1);
+	fujitsu_hwpf_setting_regs_clear();
+
+	/* sector cache */
+	write_sysreg_s(sec_ctrl_reg, IMP_SCCR_CTRL_EL1);
+	write_sysreg_s(sec_assign_reg, IMP_SCCR_ASSIGN_EL1);
+	write_sysreg_s(sec_set0_l2_reg, IMP_SCCR_SET0_L2_EL1);
+	write_sysreg_s(sec_set1_l2_reg, IMP_SCCR_SET1_L2_EL1);
+	fujitsu_sec_setting_regs_clear();
+
+	return 0;
+}
+
+static int __init fujitsu_drv_init(void)
+{
+	int ret;
+
+	if (read_cpuid_implementor() != ARM_CPU_IMP_FUJITSU)
+		return -ENODEV;
+	if (read_cpuid_part_number() != FUJITSU_CPU_PART_A64FX)
+		return -ENODEV;
+
+	/* register initialize */
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "soc/fujitsu_cache:online",
+				fujitsu_cache_init, NULL);
+	if (ret < 0) {
+		pr_err("cpuhp setup failed: %d\n", ret);
+		return ret;
+	}
+
+	_hp_state = ret;
+
+	return 0;
+}
+
+static void __exit fujitsu_drv_exit(void)
+{
+	cpuhp_remove_state(_hp_state);
+	/* register reset */
+	on_each_cpu_mask(cpu_online_mask, fujitsu_cache_destroy, NULL, 1);
+}
+
+module_init(fujitsu_drv_init);
+module_exit(fujitsu_drv_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("FUJITSU LIMITED");
+MODULE_DESCRIPTION("Fujitsu HPC HardWare Prefetch Assist, Hardware Prefetch, Sector Cache Driver");