mbox series

[00/13] Intel Hardware P-States (HWP) support

Message ID 20210503192810.36084-1-jandryuk@gmail.com (mailing list archive)
Headers show
Series Intel Hardware P-States (HWP) support | expand

Message

Jason Andryuk May 3, 2021, 7:27 p.m. UTC
Hi,

This patch series adds Hardware-Controlled Performance States (HWP) for
Intel processors to Xen.

With HWP, the processor makes its own determinations for frequency
selection depending, though users can set some parameters and
preferences.  There is also Turbo Boost which dynamically pushes the max
frequency if possible.

The existing governors don't work with HWP since they select frequencies
and HWP doesn't expose those.  Therefore a dummy hwp-interal governor is
used that doesn't do anything.

xenpm get-cpufreq-para is extended to show HWP parameters, and
set-cpufreq-hwp is added to set them.

A lightly loaded OpenXT laptop showed ~1W power savings according to
powertop.  A mostly idle Fedora system (dom0 only) showed a more modest
power savings.

This for for a 10th gen 6-core 1600 MHz base 4900 MHZ max cpu.  In the
default balance mode, Turbo Boost doesn't exceed 4GHz.  Tweaking the
energy_perf preference with `xenpm set-cpufreq-hwp balance ene:64`,
I've seen the CPU hit 4.7GHz before throttling down and bouncing around
between 4.3 and 4.5 GHz.  Curiously the other cores read ~4GHz when
turbo boost takes affect.  This was done after pinning all dom0 cores,
and using taskset to pin to vCPU/pCPU 11 and running a bash tightloop.

There are only minor changes since the RFC posting.  Typo fixes and a
few hunks have been folded into earlier patches.  I reordered "xenpm:
Factor out a non-fatal cpuid_parse variant" immediately before "xenpm:
Add set-cpufreq-hwp subcommand" where it is used.

Open questions:

HWP defaults to enabled and running in balanced mode.  This is useful
for testing, but should the old ACPI cpufreq driver remain the default?

This series unilaterally enables Hardware Duty Cycling (HDC) which is
another feature to autonomously powerdown things.  That is enabled if
HWP is enabled.  Maybe that want to be configurable?

I've only tested on an 8th gen and a 10th gen systems.  The don't have
fast MSR support, and they do have activity window and energy_perf
support.  So the respective other modes are untested.

This changes the systcl_pm_op hypercall, so that wants review.

I wanted to get this out since I know Qubes is also interested.

Regards,
Jason

Jason Andryuk (13):
  cpufreq: Allow restricting to internal governors only
  cpufreq: Add perf_freq to cpuinfo
  cpufreq: Export intel_feature_detect
  cpufreq: Add Hardware P-State (HWP) driver
  xenpm: Change get-cpufreq-para output for internal
  cpufreq: Export HWP parameters to userspace
  libxc: Include hwp_para in definitions
  xenpm: Print HWP parameters
  xen: Add SET_CPUFREQ_HWP xen_sysctl_pm_op
  libxc: Add xc_set_cpufreq_hwp
  xenpm: Factor out a non-fatal cpuid_parse variant
  xenpm: Add set-cpufreq-hwp subcommand
  CHANGELOG: Add Intel HWP entry

 CHANGELOG.md                              |   2 +
 docs/misc/xen-command-line.pandoc         |   9 +
 tools/include/xenctrl.h                   |   6 +
 tools/libs/ctrl/xc_pm.c                   |  18 +
 tools/misc/xenpm.c                        | 375 +++++++++++-
 xen/arch/x86/acpi/cpufreq/Makefile        |   1 +
 xen/arch/x86/acpi/cpufreq/cpufreq.c       |  15 +-
 xen/arch/x86/acpi/cpufreq/hwp.c           | 671 ++++++++++++++++++++++
 xen/drivers/acpi/pmstat.c                 |  30 +
 xen/drivers/cpufreq/cpufreq.c             |   4 +
 xen/drivers/cpufreq/utility.c             |   1 +
 xen/include/acpi/cpufreq/cpufreq.h        |   9 +
 xen/include/acpi/cpufreq/processor_perf.h |   5 +
 xen/include/asm-x86/cpufeature.h          |  11 +-
 xen/include/asm-x86/msr-index.h           |  17 +
 xen/include/public/sysctl.h               |  52 +-
 16 files changed, 1197 insertions(+), 29 deletions(-)
 create mode 100644 xen/arch/x86/acpi/cpufreq/hwp.c

Comments

Jan Beulich May 20, 2021, 2:57 p.m. UTC | #1
On 03.05.2021 21:27, Jason Andryuk wrote:
> Open questions:
> 
> HWP defaults to enabled and running in balanced mode.  This is useful
> for testing, but should the old ACPI cpufreq driver remain the default?

As long as this new code is experimental, yes. But once it's deemed
stable and fully supported, I think on HWP-capable hardware the new
driver should be used by default.

> This series unilaterally enables Hardware Duty Cycling (HDC) which is
> another feature to autonomously powerdown things.  That is enabled if
> HWP is enabled.  Maybe that want to be configurable?

Probably; not even sure what the default would want to be.

Jan