| Message ID | 20230412185759.755408-3-rrendec@redhat.com (mailing list archive) |
|---|---|
| State | New, archived |
| Series | arch_topology: Pre-allocate cacheinfo from primary CPU |
Hi Will,

On Wed, Apr 12, 2023 at 02:57:58PM -0400, Radu Rendec wrote:
> This patch adds an architecture specific early cache level detection
> handler for arm64. This is basically the CLIDR_EL1 based detection that
> was previously done (only) in init_cache_level().
>
> This is part of a patch series that attempts to further the work in
> commit 5944ce092b97 ("arch_topology: Build cacheinfo from primary CPU").
> Previously, in the absence of any DT/ACPI cache info, architecture
> specific cache detection and info allocation for secondary CPUs would
> happen in non-preemptible context during early CPU initialization and
> trigger a "BUG: sleeping function called from invalid context" splat on
> an RT kernel.
>
> This patch does not solve the problem completely for RT kernels. It
> relies on the assumption that on most systems, the CPUs are symmetrical
> and therefore have the same number of cache leaves. The cacheinfo memory
> is allocated early (on the primary CPU), relying on the new handler. If
> later (when CLIDR_EL1 based detection runs again on the secondary CPU)
> the initial assumption proves to be wrong and the CPU has in fact more
> leaves, the cacheinfo memory is reallocated, and that still triggers a
> splat on an RT kernel.
>
> In other words, asymmetrical CPU systems *must* still provide cacheinfo
> data in DT/ACPI to avoid the splat on RT kernels (unless secondary CPUs
> happen to have fewer leaves than the primary CPU). But symmetrical CPU
> systems (the majority) can now get away without the additional DT/ACPI
> data and rely on CLIDR_EL1 based detection.

If you are okay with the change, can I have your Acked-by, so that I can
route this via Greg's tree?
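To make the pre-allocation scheme concrete: on a secondary CPU, the generic code compares the freshly detected leaf count against the early guess and only reallocates on a mismatch. A minimal sketch of that path, with names approximating drivers/base/cacheinfo.c (an illustration of the logic, not the exact merged code):

```c
/*
 * Sketch of the secondary-CPU path in the generic cacheinfo code.
 * Helper names approximate drivers/base/cacheinfo.c; treat this as an
 * illustration of the logic, not the exact merged implementation.
 */
static int init_level_allocate_ci(unsigned int cpu)
{
	/* Leaf count pre-allocated on the primary CPU via early_cache_level() */
	unsigned int early_leaves = cache_leaves(cpu);

	/* Re-run arch detection (CLIDR_EL1 on arm64) on this CPU */
	if (init_cache_level(cpu) || !cache_leaves(cpu))
		return -ENOENT;

	/* Symmetrical case: the early allocation is large enough; reuse it */
	if (cache_leaves(cpu) <= early_leaves)
		return 0;

	/*
	 * Asymmetrical case: this CPU has more leaves than the primary, so
	 * the memory must be reallocated here, in non-preemptible context.
	 * This is the path that still splats on an RT kernel.
	 */
	kfree(per_cpu_cacheinfo(cpu));
	return allocate_cache_info(cpu);
}
```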
On Thu, Apr 13, 2023 at 11:22:26AM +0100, Sudeep Holla wrote:
> Hi Will,
>
> On Wed, Apr 12, 2023 at 02:57:58PM -0400, Radu Rendec wrote:
> > This patch adds an architecture specific early cache level detection
> > handler for arm64. This is basically the CLIDR_EL1 based detection that
> > was previously done (only) in init_cache_level().
> >
> > [...]
>
> If you are okay with the change, can I have your Acked-by, so that I can
> route this via Greg's tree?

I really dislike the proliferation of __weak functions in this file,
rather than the usual approach of having arch-specific static inlines
in a header file, but it seems that nobody has the appetite to clean
that up :(

So I'm fine for Greg to queue this if he wants to, but I'd be a lot
more excited if somebody tidied things up a bit first.

Will
On Thu, Apr 13, 2023 at 03:45:22PM +0100, Will Deacon wrote:
> On Thu, Apr 13, 2023 at 11:22:26AM +0100, Sudeep Holla wrote:
> > Hi Will,
> >
> > On Wed, Apr 12, 2023 at 02:57:58PM -0400, Radu Rendec wrote:
> > > [...]
> >
> > If you are okay with the change, can I have your Acked-by, so that I can
> > route this via Greg's tree?
>
> I really dislike the proliferation of __weak functions in this file,

You mean in the generic cacheinfo.c, right? Because the arm64 version
must not have any, and that is the file in this patch.

> rather than the usual approach of having arch-specific static inlines
> in a header file, but it seems that nobody has the appetite to clean
> that up :(

Yes, I will try that when I get some time. I had not seen or touched
this for a long time, until recently when new requirements around it
started coming in.

> So I'm fine for Greg to queue this if he wants to, but I'd be a lot
> more excited if somebody tidied things up a bit first.

Agreed. One reason for the weak functions was to avoid conditional
compilation based on the arch at the time, with the aim of bringing in
a couple more arches. That hasn't happened, and perhaps it is time for
a refresh.

--
Regards,
Sudeep
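For readers unfamiliar with the alternative Will describes: instead of a __weak symbol that the arch silently overrides at link time, the arch header defines a static inline plus a same-named macro, and the generic header supplies the fallback only when the macro is absent. A hypothetical sketch of that idiom (neither file contains this code today; arm64_detect_cache_level() is a made-up helper):

```c
/* arch/arm64/include/asm/cacheinfo.h -- hypothetical arch override */
#define early_cache_level early_cache_level
static inline int early_cache_level(unsigned int cpu)
{
	return arm64_detect_cache_level(cpu);	/* hypothetical helper */
}

/* include/linux/cacheinfo.h -- hypothetical generic fallback */
#ifndef early_cache_level
static inline int early_cache_level(unsigned int cpu)
{
	return -ENOENT;	/* no arch-specific early detection */
}
#endif
```

The macro trick makes the override visible at compile time, so readers can grep for it instead of reasoning about weak-symbol link order.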
On Thu, Apr 13, 2023 at 04:05:05PM +0100, Sudeep Holla wrote:
> On Thu, Apr 13, 2023 at 03:45:22PM +0100, Will Deacon wrote:
> > On Thu, Apr 13, 2023 at 11:22:26AM +0100, Sudeep Holla wrote:
> > > [...]
> >
> > I really dislike the proliferation of __weak functions in this file,
>
> You mean in the generic cacheinfo.c, right? Because the arm64 version
> must not have any, and that is the file in this patch.

Right, but we're providing implementations of both early_cache_level()
and init_cache_level(), which are weak symbols in the core code.

Will
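For reference, the weak hooks Will is pointing at live in the generic code and are overridden by the strong arm64 definitions in the patch below. Their defaults look roughly like this (sketched from the series; the exact bodies may differ):

```c
/* drivers/base/cacheinfo.c (sketch): arch-overridable weak defaults */
int __weak early_cache_level(unsigned int cpu)
{
	return -ENOENT;	/* no early arch detection available */
}

int __weak init_cache_level(unsigned int cpu)
{
	return -ENOENT;	/* firmware/DT info is the only option */
}
```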
```diff
diff --git a/arch/arm64/kernel/cacheinfo.c b/arch/arm64/kernel/cacheinfo.c
index c307f69e9b55..d9c9218fa1fd 100644
--- a/arch/arm64/kernel/cacheinfo.c
+++ b/arch/arm64/kernel/cacheinfo.c
@@ -38,11 +38,9 @@ static void ci_leaf_init(struct cacheinfo *this_leaf,
 	this_leaf->type = type;
 }
 
-int init_cache_level(unsigned int cpu)
+static void detect_cache_level(unsigned int *level_p, unsigned int *leaves_p)
 {
 	unsigned int ctype, level, leaves;
-	int fw_level, ret;
-	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
 
 	for (level = 1, leaves = 0; level <= MAX_CACHE_LEVEL; level++) {
 		ctype = get_cache_type(level);
@@ -54,6 +52,27 @@ int init_cache_level(unsigned int cpu)
 		leaves += (ctype == CACHE_TYPE_SEPARATE) ? 2 : 1;
 	}
 
+	*level_p = level;
+	*leaves_p = leaves;
+}
+
+int early_cache_level(unsigned int cpu)
+{
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	detect_cache_level(&this_cpu_ci->num_levels, &this_cpu_ci->num_leaves);
+
+	return 0;
+}
+
+int init_cache_level(unsigned int cpu)
+{
+	unsigned int level, leaves;
+	int fw_level, ret;
+	struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
+
+	detect_cache_level(&level, &leaves);
+
 	if (acpi_disabled) {
 		fw_level = of_find_last_cache_level(cpu);
 	} else {
```
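The detect_cache_level() loop in the diff above walks CLIDR_EL1's per-level Ctype fields, stopping at the first level that reports no cache and counting two leaves for a level with separate I- and D-caches, one otherwise. A sketch of the field extraction that get_cache_type() performs (field layout per the Arm ARM; this is not the actual arm64 helper):

```c
/*
 * CLIDR_EL1 packs a 3-bit Ctype field per cache level:
 * 0b000 = no cache, 0b001 = I only, 0b010 = D only,
 * 0b011 = separate I+D, 0b100 = unified.
 */
static inline unsigned int clidr_ctype(unsigned long clidr, unsigned int level)
{
	return (clidr >> (3 * (level - 1))) & 0x7;	/* Ctype<level> */
}
```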