
[3/3] arm64: rebuild sched domains on invariance status changes

Message ID 20200924123937.20938-4-ionela.voinescu@arm.com (mailing list archive)
State Changes Requested, archived
Series: condition EAS enablement on FI support

Commit Message

Ionela Voinescu Sept. 24, 2020, 12:39 p.m. UTC
Task scheduler behavior depends on frequency invariance (FI) support and
the resulting invariant load tracking signals. For example, in order to
make accurate predictions across CPUs for all performance states, Energy
Aware Scheduling (EAS) needs frequency-invariant load tracking signals
and therefore it has a direct dependency on FI. If a platform is found
lacking FI support, EAS is disabled.

arch_scale_freq_invariant() reflects the current FI support status, so
it can return different values during system initialisation. This
happens on a system that does not support cpufreq-driven FI, but does
support counter-driven FI. For such a system,
arch_scale_freq_invariant() returns false if called before counter-based
FI initialisation, and true after it.
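
To make the ordering problem concrete, here is a minimal userspace sketch
of the status flip described above. It is not kernel code: struct fi_state,
model_scale_freq_invariant() and model_counter_fi_init() are hypothetical
stand-ins, and the only assumption carried over from the text is that FI
counts as supported when either the cpufreq-driven or the counter-driven
source is active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two FI sources; not the kernel's data layout. */
struct fi_state {
	bool cpufreq_fi;	/* cpufreq-driven FI is available */
	bool counter_fi;	/* counter-driven FI has been initialised */
};

/* Stand-in for arch_scale_freq_invariant(): either source is sufficient. */
static bool model_scale_freq_invariant(const struct fi_state *s)
{
	assert(s != NULL);
	return s->cpufreq_fi || s->counter_fi;
}

/* Stand-in for the late counter-based FI initialisation. */
static void model_counter_fi_init(struct fi_state *s)
{
	assert(s != NULL);
	s->counter_fi = true;
}
```

On a counter-only system (both fields initially false), a caller that
queries the model before model_counter_fi_init() sees false, and a caller
that queries it afterwards sees true.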

For arm64 this affects the task scheduler behavior which builds its
scheduling domain hierarchy well before the late counter-based FI init.
During that process it will disable EAS due to its dependency on FI.

Two early call sites that reach arch_scale_freq_invariant() and thereby
determine EAS enablement are:
 - (1) drivers/base/arch_topology.c:126 <<update_topology_flags_workfn>>
		rebuild_sched_domains();
       This will happen after CPU capacity initialisation.
 - (2) kernel/sched/cpufreq_schedutil.c:917 <<rebuild_sd_workfn>>
		rebuild_sched_domains_energy();
		-->rebuild_sched_domains();
       This will happen during sched_cpufreq_governor_change() for the
       schedutil cpufreq governor.

Therefore, if there is a change in FI support status after counter init,
use the existing rebuild_sched_domains_energy() function to trigger a
rebuild of the scheduling and performance domains that in turn determine
the enablement of EAS.

Signed-off-by: Ionela Voinescu <ionela.voinescu@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/topology.h |  1 +
 arch/arm64/kernel/topology.c      | 10 ++++++++++
 2 files changed, 11 insertions(+)

Comments

Quentin Perret Sept. 24, 2020, 1:39 p.m. UTC | #1
On Thursday 24 Sep 2020 at 13:39:37 (+0100), Ionela Voinescu wrote:
> For arm64 this affects the task scheduler behavior which builds its
> scheduling domain hierarchy well before the late counter-based FI init.
> During that process it will disable EAS due to its dependency on FI.

Does it mean we get a warn on every boot, even though this is a
perfectly normal scenario?

Thanks,
Quentin
Ionela Voinescu Sept. 24, 2020, 4:10 p.m. UTC | #2
On Thursday 24 Sep 2020 at 14:39:25 (+0100), Quentin Perret wrote:
> On Thursday 24 Sep 2020 at 13:39:37 (+0100), Ionela Voinescu wrote:
> > For arm64 this affects the task scheduler behavior which builds its
> > scheduling domain hierarchy well before the late counter-based FI init.
> > During that process it will disable EAS due to its dependency on FI.
> 
> Does it mean we get a warn on every boot, even though this is a
> perfectly normal scenario?
> 

Yes, we will get a few "Disabling EAS: frequency-invariant load tracking
not supported" warnings until the final rebuild of the sched domains
finds FI supported and enables EAS (silently this time, which possibly
makes things worse). We have the same behavior for removing and adding
the schedutil governor.

I'm not sure what is a good way of fixing this.. I could add more info
to the warning to suggest it might be temporary ("Disabling EAS:
frequency-invariant load tracking currently not supported"). For further
debugging there are the additional prints guarded by sched_debug().

I'll look over the code some more to see if other ideas pop out. Any
suggestions are appreciated.

Many thanks for the review,
Ionela.

> Thanks,
> Quentin
Quentin Perret Sept. 25, 2020, 1:59 p.m. UTC | #3
Hey Ionela,

On Thursday 24 Sep 2020 at 17:10:02 (+0100), Ionela Voinescu wrote:
> I'm not sure what is a good way of fixing this.. I could add more info
> to the warning to suggest it might be temporary ("Disabling EAS:
> frequency-invariant load tracking currently not supported"). For further
> debugging there are the additional prints guarded by sched_debug().
> 
> I'll look over the code some more to see if other ideas pop out. Any
> suggestions are appreciated.

Right, I'm not seeing anything perfect here, but I think I'd be
personally happy with this message being entirely guarded by
sched_debug(), like we do for asym CPU capacities for instance.

It's not easy to see if EAS has started at all w/o sched debug anyway,
so I expect folks who need it to enable the debug stuff during
bring-up. With a descriptive enough warn message, that should be just
fine. But that's my 2p, so I'm happy to hear if others disagree.

Thanks,
Quentin
Dietmar Eggemann Sept. 28, 2020, 11:55 a.m. UTC | #4
On 25/09/2020 15:59, Quentin Perret wrote:
> Hey Ionela,
> 
> On Thursday 24 Sep 2020 at 17:10:02 (+0100), Ionela Voinescu wrote:
>> I'm not sure what is a good way of fixing this.. I could add more info
>> to the warning to suggest it might be temporary ("Disabling EAS:
>> frequency-invariant load tracking currently not supported"). For further
>> debugging there are the additional prints guarded by sched_debug().
>>
>> I'll look over the code some more to see if other ideas pop out. Any
>> suggestions are appreciated.
> 
> Right, I'm not seeing anything perfect here, but I think I'd be
> personally happy with this message being entirely guarded by
> sched_debug(), like we do for asym CPU capacities for instance.
> 
> It's not easy to see if EAS has started at all w/o sched debug anyway,
> so I expect folks who need it to enable the debug stuff during
> bring-up. With a descriptive enough warn message, that should be just
> fine. But that's my 2p, so I'm happy to hear if others disagree.

Are you discussing a scenario where the system doesn't have FI via
CPUfreq but only via AMU? And then we would get the pr_warn

 "rd %*pbl: Disabling EAS: frequency-invariant load tracking not
 supported"

in (1)-(3)?

(1) initial sd build
(2) update_topology_flags_workfn()
(3) rebuild_sched_domains_energy()
(4) init_amu_fie()

Today (e.g. on Juno) we start EAS within (1):

root@juno:~# dmesg | grep "build_perf_domains\|EAS"
[    3.491304] *** build_perf_domains: rd 0-5
[    3.574226] sched_energy_set: starting EAS <--- !!!
[    3.847584] *** build_perf_domains: rd 0-5
[    3.928227] *** build_perf_domains: rd 0-5

And on a future AMU FI only system it would look like:

 Disabling EAS: frequency-invariant load tracking not supported
 Disabling EAS: frequency-invariant load tracking not supported
 Disabling EAS: frequency-invariant load tracking not supported
 sched_energy_set: starting EAS

I guess it's a good idea to put all those warnings which indicate why
EAS can't be started under sched_debug().

The warning "rd %*pbl: CPUs do not have asymmetric capacities" already
is. This one is actually very similar to the FI related one, since
'asymmetric capacities' could only exist starting with (2) (big.LITTLE
based entirely on CPUfreq diffs)
Ionela Voinescu Sept. 28, 2020, 2:23 p.m. UTC | #5
Hi guys,

On Monday 28 Sep 2020 at 13:55:49 (+0200), Dietmar Eggemann wrote:
> On 25/09/2020 15:59, Quentin Perret wrote:
> > Hey Ionela,
> > 
> > On Thursday 24 Sep 2020 at 17:10:02 (+0100), Ionela Voinescu wrote:
> >> I'm not sure what is a good way of fixing this.. I could add more info
> >> to the warning to suggest it might be temporary ("Disabling EAS:
> >> frequency-invariant load tracking currently not supported"). For further
> >> debugging there are the additional prints guarded by sched_debug().
> >>
> >> I'll look over the code some more to see if other ideas pop out. Any
> >> suggestions are appreciated.
> > 
> > Right, I'm not seeing anything perfect here, but I think I'd be
> > personally happy with this message being entirely guarded by
> > sched_debug(), like we do for asym CPU capacities for instance.
> > 
> > It's not easy to see if EAS has started at all w/o sched debug anyway,
> > so I expect folks who need it to enable the debug stuff during
> > bring-up. With a descriptive enough warn message, that should be just
> > fine. But that's my 2p, so I'm happy to hear if others disagree.
> 
> Are you discussing a scenario where the system doesn't have FI via
> CPUfreq but only via AMU? And then we would get the pr_warn
> 
>  "rd %*pbl: Disabling EAS: frequency-invariant load tracking not
>  supported"
> 

Yes, that is correct. Unfortunately for !sched_debug, even if we have
FI via AMUs, the EAS enablement message "sched_energy_set: starting EAS"
won't appear, and therefore one would only see the warnings above, giving
the wrong impression that EAS is disabled.

> in (1)-(3)?
> 
> (1) initial sd build
> (2) update_topology_flags_workfn()
> (3) rebuild_sched_domains_energy()
> (4) init_amu_fie()
> 
> Today (e.g. on Juno) we start EAS within (1):
> 
> root@juno:~# dmesg | grep "build_perf_domains\|EAS"
> [    3.491304] *** build_perf_domains: rd 0-5
> [    3.574226] sched_energy_set: starting EAS <--- !!!
> [    3.847584] *** build_perf_domains: rd 0-5
> [    3.928227] *** build_perf_domains: rd 0-5
> 
> And on a future AMU FI only system it would look like:
> 
>  Disabling EAS: frequency-invariant load tracking not supported
>  Disabling EAS: frequency-invariant load tracking not supported
>  Disabling EAS: frequency-invariant load tracking not supported
>  sched_energy_set: starting EAS
> 

Correct (with the same caveat that "sched_energy_set: starting EAS" is
guarded by sched_debug()).

> I guess it's a good idea to put all those warnings which indicate why
> EAS can't be started under sched_debug().
> 
> The warning "rd %*pbl: CPUs do not have asymmetric capacities" already
> is. This one is actually very similar to the FI related one, since
> 'asymmetric capacities' could only exist starting with (2) (big.LITTLE
> based entirely on CPUfreq diffs)

Yes, this seems the right solution, as suggested by Quentin as well.
I'll do this together with the other suggestions and submit v2.

Thank you both,
Ionela.
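
For readers following the thread, the behaviour agreed on for v2 (printing
the FI-related message only under sched_debug()) can be sketched in
userspace as below. This is an illustration under stated assumptions, not
the scheduler's code: model_try_enable_eas() and its parameters are
hypothetical, and the two message strings are the ones quoted in this
thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical model of one sched domain rebuild. Returns true if EAS
 * would be enabled. A message is written to 'log' only when 'debug' is
 * set; otherwise 'log' is left as an empty string.
 */
static bool model_try_enable_eas(bool fi_supported, bool debug,
				 char *log, size_t len)
{
	assert(log != NULL);
	if (len)
		log[0] = '\0';

	if (!fi_supported) {
		if (debug)
			snprintf(log, len, "Disabling EAS: frequency-invariant "
				 "load tracking not supported");
		return false;
	}

	if (debug)
		snprintf(log, len, "sched_energy_set: starting EAS");
	return true;
}
```

With debug unset, the early rebuilds on a counter-only system stay silent
instead of suggesting that EAS is permanently disabled. With debug set,
both the temporary disable messages and the final enable message appear.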

Patch

diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h
index 7cb519473fbd..9394101e3c08 100644
--- a/arch/arm64/include/asm/topology.h
+++ b/arch/arm64/include/asm/topology.h
@@ -16,6 +16,7 @@  int pcibus_to_node(struct pci_bus *bus);
 
 #include <linux/arch_topology.h>
 
+void rebuild_sched_domains_energy(void);
 #ifdef CONFIG_ARM64_AMU_EXTN
 /*
  * Replace task scheduler's default counter-based
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 543c67cae02f..2a9b69fdabc9 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -213,6 +213,7 @@  static DEFINE_STATIC_KEY_FALSE(amu_fie_key);
 
 static int __init init_amu_fie(void)
 {
+	bool invariance_status = topology_scale_freq_invariant();
 	cpumask_var_t valid_cpus;
 	bool have_policy = false;
 	int ret = 0;
@@ -255,6 +256,15 @@  static int __init init_amu_fie(void)
 	if (!topology_scale_freq_invariant())
 		static_branch_disable(&amu_fie_key);
 
+	/*
+	 * Task scheduler behavior depends on frequency invariance support,
+	 * either cpufreq or counter driven. If the support status changes as
+	 * a result of counter initialisation and use, retrigger the build of
+	 * scheduling domains to ensure the information is propagated properly.
+	 */
+	if (invariance_status != topology_scale_freq_invariant())
+		rebuild_sched_domains_energy();
+
 free_valid_mask:
 	free_cpumask_var(valid_cpus);
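
The pattern added to init_amu_fie() above (snapshot the FI status, run the
counter initialisation, rebuild only if the status changed) can be
exercised outside the kernel with a small model. Everything in this sketch
is a hypothetical stand-in: struct fi_model, fi_supported(),
rebuild_domains() and late_counter_init() are illustration-only names, and
the rebuild is reduced to a counter so its effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; not the kernel's data structures. */
struct fi_model {
	bool cpufreq_fi;	/* cpufreq-driven FI is available */
	bool counter_fi;	/* counter-driven FI is in use */
	int rebuilds;		/* number of domain rebuilds triggered */
};

/* Stand-in for topology_scale_freq_invariant(). */
static bool fi_supported(const struct fi_model *m)
{
	assert(m != NULL);
	return m->cpufreq_fi || m->counter_fi;
}

/* Stand-in for rebuild_sched_domains_energy(); only counts the calls. */
static void rebuild_domains(struct fi_model *m)
{
	m->rebuilds++;
}

/* Stand-in for the tail of init_amu_fie(). */
static void late_counter_init(struct fi_model *m, bool counters_usable)
{
	/* Snapshot the status before counters can change it. */
	bool before = fi_supported(m);

	if (counters_usable)
		m->counter_fi = true;

	/* Rebuild only when the overall FI status actually changed. */
	if (before != fi_supported(m))
		rebuild_domains(m);
}
```

In this model a counter-only system triggers exactly one rebuild, while a
system that already has cpufreq-driven FI, or one whose counters turn out
to be unusable, triggers none.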