[3/3] xen/acpi: upload power and performance related data from a PVH dom0

Message ID 20221121102113.41893-4-roger.pau@citrix.com (mailing list archive)
State New, archived
Series xen: ACPI processor related fixes

Commit Message

Roger Pau Monné Nov. 21, 2022, 10:21 a.m. UTC
When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
report the correct number of vCPUs that dom0 has, so the host MADT is
not provided to dom0.  This creates issues when parsing the power and
performance related data from ACPI dynamic tables, as the ACPI
Processor UIDs found in the dynamic code are unlikely to match the
ones crafted by Xen in the dom0 MADT.

Xen would rely on Linux having filled at least the power and
performance related data of the vCPUs on the system, and would clone
that information in order to set up the remaining pCPUs on the system
if dom0 vCPUs < pCPUs.  However when running as PVH dom0 it's likely
that none of the dom0 CPUs will have the power and performance data
filled, and hence the Xen ACPI Processor driver needs to fetch that
information by itself.

In order to do so correctly, introduce a new helper to fetch the _CST
data without taking into account the system capabilities from the
CPUID output, as the capabilities reported to dom0 in CPUID might be
different from the ones on the host.

Note that the newly introduced code will only fetch the _CST, _PSS,
_PPC and _PCT from a single CPU, and clone that information for all the
other Processors.  This won't work on a heterogeneous system with
Processors having different power and performance related data between
them.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 arch/x86/include/asm/xen/hypervisor.h |   2 +-
 arch/x86/xen/enlighten.c              |   2 +-
 drivers/xen/xen-acpi-processor.c      | 225 ++++++++++++++++++++++++--
 3 files changed, 211 insertions(+), 18 deletions(-)
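
The "fetch from a single CPU and clone" approach described in the commit
message can be illustrated with a small userspace sketch. This is not the
code from the patch: the struct layout, the array limits and the name
clone_pm_info() are hypothetical, and _PCT is left out for brevity. It
only shows why the scheme assumes a homogeneous system, since every
processor ends up with the same tables.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the per-processor ACPI data. */
#define MAX_CSTATES 8
#define MAX_PSTATES 16

struct cstate { unsigned type; unsigned latency_us; };
struct pstate { unsigned freq_mhz; unsigned power_mw; };

struct pm_info {
    unsigned acpi_id;                 /* ACPI Processor UID */
    size_t nr_cstates;
    struct cstate cst[MAX_CSTATES];   /* contents of _CST */
    size_t nr_pstates;
    struct pstate pss[MAX_PSTATES];   /* contents of _PSS */
    unsigned ppc;                     /* value of _PPC */
};

/*
 * Copy the power and performance data of `src` into every entry of
 * `cpus`, preserving each entry's own ACPI ID.  All processors receive
 * identical data, so this is only correct on a homogeneous system.
 */
static void clone_pm_info(const struct pm_info *src,
                          struct pm_info *cpus, size_t nr_cpus)
{
    for (size_t i = 0; i < nr_cpus; i++) {
        unsigned id = cpus[i].acpi_id;

        cpus[i] = *src;
        cpus[i].acpi_id = id;
    }
}
```

A caller would fill one struct pm_info from whichever processor object
could be evaluated, then pass the array of remaining processors to
clone_pm_info() before uploading each entry.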

Comments

Josef Johansson Jan. 30, 2023, 9:10 a.m. UTC | #1
On 11/21/22 11:21, Roger Pau Monne wrote:
> When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
> report the correct numbers of vCPUs that dom0 has, so the host MADT is
> not provided to dom0.  This creates issues when parsing the power and
> performance related data from ACPI dynamic tables, as the ACPI
> Processor UIDs found on the dynamic code are likely to not match the
> ones crafted by Xen in the dom0 MADT.
>
> Xen would rely on Linux having filled at least the power and
> performance related data of the vCPUs on the system, and would clone
> that information in order to setup the remaining pCPUs on the system
> if dom0 vCPUs < pCPUs.  However when running as PVH dom0 it's likely
> that none of dom0 CPUs will have the power and performance data
> filled, and hence the Xen ACPI Processor driver needs to fetch that
> information by itself.
>
> In order to do so correctly, introduce a new helper to fetch the _CST
> data without taking into account the system capabilities from the
> CPUID output, as the capabilities reported to dom0 in CPUID might be
> different from the ones on the host.
>
>

Hi Roger,

First of all, thanks for doing the grunt work here to clear up the ACPI
situation.

A bit of background: I'm trying to get an AMD APU (CPUID 0x17 (23)) to
work properly under Xen. It works fine otherwise, but something is amiss
under Xen.

Just to make sure that the patch is working as intended: what I found
when trying it out is that the first 8 CPUs have C-states, but CPUs 0,
2, 4, 6, 8, 10, 12 and 14 have P-states, out of 16 CPUs. Xen tells Linux
that there are 8 CPUs since it's running with nosmt.

Regards
- Josef

xen_acpi_processor: Max ACPI ID: 30
xen_acpi_processor: Uploading Xen processor PM info
xen_acpi_processor: ACPI CPU0 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU0 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU1 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU2 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU2 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU3 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU4 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU4 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU5 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU6 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU6 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU7 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU0 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU0 w/ PST:coord_type = 254 domain = 0
xen_acpi_processor: CPU with ACPI ID 1 is unavailable
xen_acpi_processor: ACPI CPU2 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU2 w/ PST:coord_type = 254 domain = 1
xen_acpi_processor: CPU with ACPI ID 3 is unavailable
xen_acpi_processor: ACPI CPU4 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU4 w/ PST:coord_type = 254 domain = 2
xen_acpi_processor: CPU with ACPI ID 5 is unavailable
xen_acpi_processor: ACPI CPU6 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU6 w/ PST:coord_type = 254 domain = 3
xen_acpi_processor: CPU with ACPI ID 7 is unavailable
xen_acpi_processor: ACPI CPU8 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU8 w/ PST:coord_type = 254 domain = 4
xen_acpi_processor: CPU with ACPI ID 9 is unavailable
xen_acpi_processor: ACPI CPU10 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU10 w/ PST:coord_type = 254 domain = 5
xen_acpi_processor: CPU with ACPI ID 11 is unavailable
xen_acpi_processor: ACPI CPU12 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU12 w/ PST:coord_type = 254 domain = 6
xen_acpi_processor: CPU with ACPI ID 13 is unavailable
xen_acpi_processor: ACPI CPU14 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU14 w/ PST:coord_type = 254 domain = 7
xen_acpi_processor: CPU with ACPI ID 15 is unavailable
xen_acpi_processor: ACPI CPU8 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU10 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU12 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU14 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
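
A dump like the one above is easier to compare against another run when
it is reduced to two bitmaps, one for the ACPI IDs with uploaded
C-states and one for those with uploaded P-states. The following is a
minimal sketch of that idea; struct upload_summary and tally_line() are
names made up for this example, and it assumes ACPI IDs below 32 (the
log reports a maximum ACPI ID of 30).

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: bitmaps of ACPI IDs (0..31) seen in the log. */
struct upload_summary {
    uint32_t cstates;  /* bit n set: "ACPI CPUn - C-states uploaded." */
    uint32_t pstates;  /* bit n set: "ACPI CPUn - P-states uploaded." */
};

/* Update the summary from one log line; non-matching lines are ignored. */
static void tally_line(struct upload_summary *s, const char *line)
{
    unsigned id;
    char kind;

    if (sscanf(line, "xen_acpi_processor: ACPI CPU%u - %c-states uploaded.",
               &id, &kind) != 2 || id >= 32)
        return;

    if (kind == 'C')
        s->cstates |= UINT32_C(1) << id;
    else if (kind == 'P')
        s->pstates |= UINT32_C(1) << id;
}
```

Feeding every line of the dump above through tally_line() should give
cstates = 0xff (ACPI IDs 0 to 7) and pstates = 0x5555 (even ACPI IDs 0
to 14), which matches the mismatch described in the text.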

As a bonus, here is the previous debug output on the same machine.

xen_acpi_processor: Max ACPI ID: 30
xen_acpi_processor: Uploading Xen processor PM info
xen_acpi_processor: ACPI CPU0 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU0 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU1 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU2 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU2 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU3 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU4 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU4 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU5 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU6 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU6 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU7 - C-states uploaded.
xen_acpi_processor:      C1: ACPI HLT 1 uS
xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
xen_acpi_processor: ACPI CPU0 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU0 w/ PST:coord_type = 254 domain = 0
xen_acpi_processor: ACPI CPU1 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU1 w/ PST:coord_type = 254 domain = 0
xen_acpi_processor: ACPI CPU2 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU2 w/ PST:coord_type = 254 domain = 1
xen_acpi_processor: ACPI CPU3 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU3 w/ PST:coord_type = 254 domain = 1
xen_acpi_processor: ACPI CPU4 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU4 w/ PST:coord_type = 254 domain = 2
xen_acpi_processor: ACPI CPU5 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU5 w/ PST:coord_type = 254 domain = 2
xen_acpi_processor: ACPI CPU6 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU6 w/ PST:coord_type = 254 domain = 3
xen_acpi_processor: ACPI CPU7 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU7 w/ PST:coord_type = 254 domain = 3
xen_acpi_processor: ACPI CPU8 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU8 w/ PST:coord_type = 254 domain = 4
xen_acpi_processor: ACPI CPU9 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU9 w/ PST:coord_type = 254 domain = 4
xen_acpi_processor: ACPI CPU10 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU10 w/ PST:coord_type = 254 domain = 5
xen_acpi_processor: ACPI CPU11 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU11 w/ PST:coord_type = 254 domain = 5
xen_acpi_processor: ACPI CPU12 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU12 w/ PST:coord_type = 254 domain = 6
xen_acpi_processor: ACPI CPU13 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU13 w/ PST:coord_type = 254 domain = 6
xen_acpi_processor: ACPI CPU14 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU14 w/ PST:coord_type = 254 domain = 7
xen_acpi_processor: ACPI CPU15 w/ PBLK:0x0
xen_acpi_processor: ACPI CPU15 w/ PST:coord_type = 254 domain = 7
xen_acpi_processor: ACPI CPU8 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU10 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU12 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
xen_acpi_processor: ACPI CPU14 - P-states uploaded.
xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
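
For reference when reading the "PST:coord_type = 254" lines in both
dumps: the ACPI specification defines three coordination types for _PSD
(0xfc, 0xfd and 0xfe), and 254 is 0xfe, hardware coordination (HW_ALL).
The small decoder below is only a reading aid; the enum and function
names are invented for this example and are not taken from the driver.

```c
#include <stdint.h>

/* _PSD coordination types as defined by the ACPI specification. */
enum psd_coord_type {
    PSD_COORD_SW_ALL = 0xfc,
    PSD_COORD_SW_ANY = 0xfd,
    PSD_COORD_HW_ALL = 0xfe,
};

/* Map the numeric coord_type printed in the log to a readable name. */
static const char *psd_coord_name(uint64_t coord_type)
{
    switch (coord_type) {
    case PSD_COORD_SW_ALL: return "SW_ALL";
    case PSD_COORD_SW_ANY: return "SW_ANY";
    case PSD_COORD_HW_ALL: return "HW_ALL";
    default:               return "unknown";
    }
}
```

In the second dump each pair of consecutive ACPI IDs shares one domain
number (CPU0 and CPU1 in domain 0, CPU2 and CPU3 in domain 1, and so
on).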

Roger Pau Monné March 15, 2023, 11:39 a.m. UTC | #2
On Mon, Jan 30, 2023 at 10:10:05AM +0100, Josef Johansson wrote:
> 
> On 11/21/22 11:21, Roger Pau Monne wrote:
> > When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
> > report the correct numbers of vCPUs that dom0 has, so the host MADT is
> > not provided to dom0.  This creates issues when parsing the power and
> > performance related data from ACPI dynamic tables, as the ACPI
> > Processor UIDs found on the dynamic code are likely to not match the
> > ones crafted by Xen in the dom0 MADT.
> > 
> > Xen would rely on Linux having filled at least the power and
> > performance related data of the vCPUs on the system, and would clone
> > that information in order to setup the remaining pCPUs on the system
> > if dom0 vCPUs < pCPUs.  However when running as PVH dom0 it's likely
> > that none of dom0 CPUs will have the power and performance data
> > filled, and hence the Xen ACPI Processor driver needs to fetch that
> > information by itself.
> > 
> > In order to do so correctly, introduce a new helper to fetch the _CST
> > data without taking into account the system capabilities from the
> > CPUID output, as the capabilities reported to dom0 in CPUID might be
> > different from the ones on the host.
> > 
> > 
> 
> Hi Roger,
> 
> First of all, thanks for doing the grunt work here to clear up the ACPI
> situation.
> 
> A bit of background, I'm trying to get an AMD APU (CPUID 0x17 (23)) to work
> properly
> under Xen. It works fine otherwise but something is amiss under Xen.

Hello,

Sorry for the delay, I've been on paternity leave and am just catching
up on emails.

> Just to make sure that the patch is working as intended, what I found when
> trying it out
> is that the first 8 CPUs have C-States, but 0, 2, 4, 6, 8, 10, 12, 14 have
> P-States, out of
> 16 CPUs. Xen tells Linux that there are 8 CPUs since it's running with
> nosmt.
> 
> Regards
> - Josef
> 
> xen_acpi_processor: Max ACPI ID: 30
> xen_acpi_processor: Uploading Xen processor PM info
> xen_acpi_processor: ACPI CPU0 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU0 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU1 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU2 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU2 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU3 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU4 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU4 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU5 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU6 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU6 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU7 - C-states uploaded.
> xen_acpi_processor:      C1: ACPI HLT 1 uS
> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> xen_acpi_processor: ACPI CPU0 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU0 w/ PST:coord_type = 254 domain = 0
> xen_acpi_processor: CPU with ACPI ID 1 is unavailable

Hm, that's weird, do you think you could check why it reports the CPU
is unavailable?

Overall I don't like the situation of the ACPI handling when running
as dom0. It's fragile to rely on the data for dom0 CPUs to be
filled by the system (by adding some band aids here and there so that
the PV vCPUs are matched against the MADT objects) and then cloning
the data for any physical CPU exceeding the number of dom0 virtual
CPUs.

IMO it would be much better to just do the handling of ACPI processor
objects in a Xen specific driver (preventing the native driver from
attaching) in order to fetch the data and upload it to Xen.  This is
what I've attempted to do on FreeBSD, and resulted in a cleaner
implementation:

https://cgit.freebsd.org/src/tree/sys/dev/xen/cpu/xen_acpi_cpu.c

I however don't have time to do this right now for Linux.

> xen_acpi_processor: ACPI CPU2 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU2 w/ PST:coord_type = 254 domain = 1
> xen_acpi_processor: CPU with ACPI ID 3 is unavailable
> xen_acpi_processor: ACPI CPU4 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU4 w/ PST:coord_type = 254 domain = 2
> xen_acpi_processor: CPU with ACPI ID 5 is unavailable
> xen_acpi_processor: ACPI CPU6 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU6 w/ PST:coord_type = 254 domain = 3
> xen_acpi_processor: CPU with ACPI ID 7 is unavailable
> xen_acpi_processor: ACPI CPU8 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU8 w/ PST:coord_type = 254 domain = 4
> xen_acpi_processor: CPU with ACPI ID 9 is unavailable
> xen_acpi_processor: ACPI CPU10 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU10 w/ PST:coord_type = 254 domain = 5
> xen_acpi_processor: CPU with ACPI ID 11 is unavailable
> xen_acpi_processor: ACPI CPU12 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU12 w/ PST:coord_type = 254 domain = 6
> xen_acpi_processor: CPU with ACPI ID 13 is unavailable
> xen_acpi_processor: ACPI CPU14 w/ PBLK:0x0
> xen_acpi_processor: ACPI CPU14 w/ PST:coord_type = 254 domain = 7
> xen_acpi_processor: CPU with ACPI ID 15 is unavailable
> xen_acpi_processor: ACPI CPU8 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU10 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU12 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> xen_acpi_processor: ACPI CPU14 - P-states uploaded.
> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> 
> As a bonus, here are the previous debug output on the same machine.

I think the output below is with dom0 running as plain PV rather than
PVH?

Thanks, Roger.

Josef Johansson March 16, 2023, 7:54 a.m. UTC | #3
On 3/15/23 12:39, Roger Pau Monné wrote:
> On Mon, Jan 30, 2023 at 10:10:05AM +0100, Josef Johansson wrote:
>> On 11/21/22 11:21, Roger Pau Monne wrote:
>>> When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
>>> report the correct numbers of vCPUs that dom0 has, so the host MADT is
>>> not provided to dom0.  This creates issues when parsing the power and
>>> performance related data from ACPI dynamic tables, as the ACPI
>>> Processor UIDs found on the dynamic code are likely to not match the
>>> ones crafted by Xen in the dom0 MADT.
>>>
>>> Xen would rely on Linux having filled at least the power and
>>> performance related data of the vCPUs on the system, and would clone
>>> that information in order to setup the remaining pCPUs on the system
>>> if dom0 vCPUs < pCPUs.  However when running as PVH dom0 it's likely
>>> that none of dom0 CPUs will have the power and performance data
>>> filled, and hence the Xen ACPI Processor driver needs to fetch that
>>> information by itself.
>>>
>>> In order to do so correctly, introduce a new helper to fetch the _CST
>>> data without taking into account the system capabilities from the
>>> CPUID output, as the capabilities reported to dom0 in CPUID might be
>>> different from the ones on the host.
>>>
>>>
>> Hi Roger,
>>
>> First of all, thanks for doing the grunt work here to clear up the ACPI
>> situation.
>>
>> A bit of background, I'm trying to get an AMD APU (CPUID 0x17 (23)) to work
>> properly
>> under Xen. It works fine otherwise but something is amiss under Xen.
> Hello,
>
> Sorry for the delay, I've been on paternity leave and am just catching
> up on emails.
Hi Roger,

Congratulations! I hope you had time to really connect. It's the most
important thing we can do here in life.

I came into this to understand each and every error in my boot log. It
turns out that the latest kernel+xen+firmware fixes suspend/resume for
me, so this is not related. But as I pointed out, the output does not
make any sense (neither yours nor upstream's). I should check the debug
output now that suspend is working fine to see if there are any changes;
that would be quite interesting.

Also, I should mention that your patch broke some things on my system
and made it unstable. I don't remember exactly what, and I know you said
that this is more of a PoC. Just a heads up.
>> Just to make sure that the patch is working as intended, what I found when
>> trying it out
>> is that the first 8 CPUs have C-States, but 0, 2, 4, 6, 8, 10, 12, 14 have
>> P-States, out of
>> 16 CPUs. Xen tells Linux that there are 8 CPUs since it's running with
>> nosmt.
>>
>> Regards
>> - Josef
>>
>> xen_acpi_processor: Max ACPI ID: 30
>> xen_acpi_processor: Uploading Xen processor PM info
>> xen_acpi_processor: ACPI CPU0 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU0 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU1 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU2 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU2 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU3 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU4 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU4 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU5 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU6 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU6 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU7 - C-states uploaded.
>> xen_acpi_processor:      C1: ACPI HLT 1 uS
>> xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
>> xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
>> xen_acpi_processor: ACPI CPU0 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU0 w/ PST:coord_type = 254 domain = 0
>> xen_acpi_processor: CPU with ACPI ID 1 is unavailable
> Hm, that's weird, do you think you could check why it reports the CPU
> is unavailable?
Could you give me a hint on how to check?
Right now my guess is that the C-state and P-state data are not in sync,
so P-states are reported for every other CPU and C-states for the first
8. AFAIK AMD only has performance cores (unlike Intel).
>
> Overall I don't like the situation of the ACPI handling when running
> as dom0. It's fragile to rely on the data for dom0 CPUs to be
> filled by the system (by adding some band aids here and there so that
> the PV vCPUs are matched against the MADT objects) and then cloning
> the data for any physical CPU exceeding the number of dom0 virtual
> CPUs.
That's my understanding from the earlier implementation as well: nobody
actually likes it, but the current solution is something that works in a
bad environment.
>
> IMO it would be much better to just do the handling of ACPI processor
> objects in a Xen specific driver (preventing the native driver from
> attaching) in order to fetch the data and upload it to Xen.  This is
> what I've attempted to do on FreeBSD, and resulted in a cleaner
> implementation:
>
> <link>
>
> I however don't have time to do this right now for Linux.

Maybe I can take a stab at it. I very much like the climate of the
kernel, but everything seems so scary :) I've been trying to understand
things better, and how they're all connected.
>
>> xen_acpi_processor: ACPI CPU2 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU2 w/ PST:coord_type = 254 domain = 1
>> xen_acpi_processor: CPU with ACPI ID 3 is unavailable
>> xen_acpi_processor: ACPI CPU4 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU4 w/ PST:coord_type = 254 domain = 2
>> xen_acpi_processor: CPU with ACPI ID 5 is unavailable
>> xen_acpi_processor: ACPI CPU6 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU6 w/ PST:coord_type = 254 domain = 3
>> xen_acpi_processor: CPU with ACPI ID 7 is unavailable
>> xen_acpi_processor: ACPI CPU8 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU8 w/ PST:coord_type = 254 domain = 4
>> xen_acpi_processor: CPU with ACPI ID 9 is unavailable
>> xen_acpi_processor: ACPI CPU10 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU10 w/ PST:coord_type = 254 domain = 5
>> xen_acpi_processor: CPU with ACPI ID 11 is unavailable
>> xen_acpi_processor: ACPI CPU12 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU12 w/ PST:coord_type = 254 domain = 6
>> xen_acpi_processor: CPU with ACPI ID 13 is unavailable
>> xen_acpi_processor: ACPI CPU14 w/ PBLK:0x0
>> xen_acpi_processor: ACPI CPU14 w/ PST:coord_type = 254 domain = 7
>> xen_acpi_processor: CPU with ACPI ID 15 is unavailable
>> xen_acpi_processor: ACPI CPU8 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU10 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU12 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>> xen_acpi_processor: ACPI CPU14 - P-states uploaded.
>> xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
>> xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
>> xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
>>
>> As a bonus, here are the previous debug output on the same machine.
> I think the output below is with dom0 running as plain PV rather than
> PVH?
This is the upstream ACPI implementation vs. yours. What would plain PV
vs. PVH be in dom0?

Regards
- Josef
> Thanks, Roger.

Roger Pau Monné March 16, 2023, 10:06 a.m. UTC | #4
On Thu, Mar 16, 2023 at 08:54:57AM +0100, Josef Johansson wrote:
> On 3/15/23 12:39, Roger Pau Monné wrote:
> > On Mon, Jan 30, 2023 at 10:10:05AM +0100, Josef Johansson wrote:
> > > On 11/21/22 11:21, Roger Pau Monne wrote:
> > > > When running as a PVH dom0 the ACPI MADT is crafted by Xen in order to
> > > > report the correct numbers of vCPUs that dom0 has, so the host MADT is
> > > > not provided to dom0.  This creates issues when parsing the power and
> > > > performance related data from ACPI dynamic tables, as the ACPI
> > > > Processor UIDs found on the dynamic code are likely to not match the
> > > > ones crafted by Xen in the dom0 MADT.
> > > > 
> > > > Xen would rely on Linux having filled at least the power and
> > > > performance related data of the vCPUs on the system, and would clone
> > > > that information in order to setup the remaining pCPUs on the system
> > > > if dom0 vCPUs < pCPUs.  However when running as PVH dom0 it's likely
> > > > that none of dom0 CPUs will have the power and performance data
> > > > filled, and hence the Xen ACPI Processor driver needs to fetch that
> > > > information by itself.
> > > > 
> > > > In order to do so correctly, introduce a new helper to fetch the _CST
> > > > data without taking into account the system capabilities from the
> > > > CPUID output, as the capabilities reported to dom0 in CPUID might be
> > > > different from the ones on the host.
> > > > 
> > > > 
> > > Hi Roger,
> > > 
> > > First of all, thanks for doing the grunt work here to clear up the ACPI
> > > situation.
> > > 
> > > A bit of background: I'm trying to get an AMD APU (CPUID 0x17 (23)) to
> > > work properly under Xen. It works fine otherwise but something is amiss
> > > under Xen.
> > Hello,
> > 
> > Sorry for the delay, I've been on paternity leave and am just catching
> > up on emails.
> Hi Roger,
> 
> Congratulations! I hope you had time to really connect. It's the most
> important thing we can do here in life.
> 
> I came into this to understand each and every error in my boot log. It
> turns out that the latest kernel+xen+firmware fixes suspend/resume for me,
> so this is not related. But as I pointed out, the output does not make any
> sense (neither yours nor upstream's). I should check the debug output with
> suspend working fine now to see if there are any changes; that would be
> quite interesting.
> 
> Also, I should mention that your patch broke some things on my system and
> made it unstable. I don't remember exactly what, and I know you said that
> this is more of a PoC. Just a heads up.

Right, I don't plan to send the PVH part just now, and instead I'm
focusing on the first patch that should fix _PDC evaluation for PV
dom0.  I will Cc you on the last version so you can give it a try and
assert it is not regressing stuff for you.

> > > Just to make sure that the patch is working as intended, what I found
> > > when trying it out is that the first 8 CPUs have C-States, but 0, 2, 4,
> > > 6, 8, 10, 12, 14 have P-States, out of 16 CPUs. Xen tells Linux that
> > > there are 8 CPUs since it's running with nosmt.
> > > 
> > > Regards
> > > - Josef
> > > 
> > > xen_acpi_processor: Max ACPI ID: 30
> > > xen_acpi_processor: Uploading Xen processor PM info
> > > xen_acpi_processor: ACPI CPU0 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU0 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU1 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU2 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU2 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU3 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU4 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU4 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU5 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU6 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU6 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU7 - C-states uploaded.
> > > xen_acpi_processor:      C1: ACPI HLT 1 uS
> > > xen_acpi_processor:      C2: ACPI IOPORT 0x414 18 uS
> > > xen_acpi_processor:      C3: ACPI IOPORT 0x415 350 uS
> > > xen_acpi_processor: ACPI CPU0 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU0 w/ PST:coord_type = 254 domain = 0
> > > xen_acpi_processor: CPU with ACPI ID 1 is unavailable
> > Hm, that's weird, do you think you could check why it reports the CPU
> > is unavailable?
> If you would give me a hint at how I could check?

It likely requires you to add printk statements to the kernel in order
to figure out which conditional fails when running as a PVH dom0.
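
As a rough illustration of the kind of instrumentation meant here, a
user-space sketch (not kernel code, and not the actual check used by the
driver) could look like the following.  check_present(), enum
present_result and the bitmap are hypothetical stand-ins; in the kernel
the fprintf() calls would be pr_info() or printk() placed in front of
each conditional on the path that ends in the "is unavailable" message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical failure reasons; these names are not taken from the kernel. */
enum present_result {
	CPU_PRESENT = 0,
	CPU_ID_OUT_OF_RANGE,
	CPU_NOT_IN_MAP,
};

/*
 * Model of an "is this ACPI ID present?" check with one trace line per
 * conditional, so the log shows which test rejected the ID.  The bitmap
 * stands in for whatever data the real check consults.
 */
static enum present_result
check_present(uint32_t acpi_id, const uint8_t *map, uint32_t max_id)
{
	if (acpi_id > max_id) {
		fprintf(stderr, "ACPI ID %u: above max ID %u\n",
			(unsigned int)acpi_id, (unsigned int)max_id);
		return CPU_ID_OUT_OF_RANGE;
	}

	if (!(map[acpi_id / 8] & (1u << (acpi_id % 8)))) {
		fprintf(stderr, "ACPI ID %u: bit not set in present map\n",
			(unsigned int)acpi_id);
		return CPU_NOT_IN_MAP;
	}

	return CPU_PRESENT;
}
```

Calling check_present() for each ACPI ID found in the namespace, with a
map that only has the even IDs set, would print one line per rejected
ID together with the reason, which is the information needed to tell
the failing conditional apart.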

> Right now my guess is that C state and P state are not in sync, thus P
> states are for every other CPU and C states are for the first 8. AFAIK AMD
> only has performance cores (unlike Intel).

Linux thinking the CPU is not online is more likely to be due to the
ACPI ID differences when running as a PVH dom0.  Anyway, I will try to
revisit this and figure out what's wrong.

> > 
> > Overall I don't like the situation of the ACPI handling when running
> > as dom0. It's fragile to rely on the data for dom0 CPUs to be
> > filled by the system (by adding some band aids here and there so that
> > the PV vCPUs are matched against the MADT objects) and then cloning
> > the data for any physical CPU exceeding the number of dom0 virtual
> > CPUs.
> That's my understanding from the earlier implementation as well; nobody
> actually likes it, but the current solution is something that works in a
> bad environment.
> > 
> > IMO it would be much better to just do the handling of ACPI processor
> > objects in a Xen specific driver (preventing the native driver from
> > attaching) in order to fetch the data and upload it to Xen.  This is
> > what I've attempted to do on FreeBSD, and resulted in a cleaner
> > implementation:
> > 
> > <link>
> > 
> > I however don't have time to do this right now for Linux.
> 
> Maybe I can take a stab, I very much like the climate of the kernel but
> everything seems so scary :) I've been trying to understand things better,
> how they're all connected.
> > 
> > > xen_acpi_processor: ACPI CPU2 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU2 w/ PST:coord_type = 254 domain = 1
> > > xen_acpi_processor: CPU with ACPI ID 3 is unavailable
> > > xen_acpi_processor: ACPI CPU4 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU4 w/ PST:coord_type = 254 domain = 2
> > > xen_acpi_processor: CPU with ACPI ID 5 is unavailable
> > > xen_acpi_processor: ACPI CPU6 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU6 w/ PST:coord_type = 254 domain = 3
> > > xen_acpi_processor: CPU with ACPI ID 7 is unavailable
> > > xen_acpi_processor: ACPI CPU8 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU8 w/ PST:coord_type = 254 domain = 4
> > > xen_acpi_processor: CPU with ACPI ID 9 is unavailable
> > > xen_acpi_processor: ACPI CPU10 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU10 w/ PST:coord_type = 254 domain = 5
> > > xen_acpi_processor: CPU with ACPI ID 11 is unavailable
> > > xen_acpi_processor: ACPI CPU12 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU12 w/ PST:coord_type = 254 domain = 6
> > > xen_acpi_processor: CPU with ACPI ID 13 is unavailable
> > > xen_acpi_processor: ACPI CPU14 w/ PBLK:0x0
> > > xen_acpi_processor: ACPI CPU14 w/ PST:coord_type = 254 domain = 7
> > > xen_acpi_processor: CPU with ACPI ID 15 is unavailable
> > > xen_acpi_processor: ACPI CPU8 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU10 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU12 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > xen_acpi_processor: ACPI CPU14 - P-states uploaded.
> > > xen_acpi_processor:      *P0: 1700 MHz, 2071 mW, 0 uS
> > > xen_acpi_processor:       P1: 1600 MHz, 1520 mW, 0 uS
> > > xen_acpi_processor:       P2: 1400 MHz, 1277 mW, 0 uS
> > > 
> > > As a bonus, here is the previous debug output on the same machine.
> > I think the output below is with dom0 running as plain PV rather than
> > PVH?
> This is the upstream ACPI implementation vs yours. What would plain PV vs
> PVH be in dom0?

But that's always with Linux running as a dom0, or just running bare
metal?

Thanks, Roger.

Patch

diff --git a/arch/x86/include/asm/xen/hypervisor.h b/arch/x86/include/asm/xen/hypervisor.h
index b4ed90ef5e68..1ead5253bc6c 100644
--- a/arch/x86/include/asm/xen/hypervisor.h
+++ b/arch/x86/include/asm/xen/hypervisor.h
@@ -62,7 +62,7 @@  void __init mem_map_via_hcall(struct boot_params *boot_params_p);
 #endif
 
 #ifdef CONFIG_XEN_DOM0
-bool __init xen_processor_present(uint32_t acpi_id);
+bool xen_processor_present(uint32_t acpi_id);
 void xen_sanitize_pdc(uint32_t *buf);
 #else
 static inline bool xen_processor_present(uint32_t acpi_id)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 394dd6675113..a7b41103d3e5 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -348,7 +348,7 @@  EXPORT_SYMBOL(xen_arch_unregister_cpu);
 #endif
 
 #ifdef CONFIG_XEN_DOM0
-bool __init xen_processor_present(uint32_t acpi_id)
+bool xen_processor_present(uint32_t acpi_id)
 {
 	unsigned int i, maxid;
 	struct xen_platform_op op = {
diff --git a/drivers/xen/xen-acpi-processor.c b/drivers/xen/xen-acpi-processor.c
index 9cb61db67efd..b189ea69d557 100644
--- a/drivers/xen/xen-acpi-processor.c
+++ b/drivers/xen/xen-acpi-processor.c
@@ -48,6 +48,8 @@  static unsigned long *acpi_id_cst_present;
 /* Which ACPI P-State dependencies for a enumerated processor */
 static struct acpi_psd_package *acpi_psd;
 
+static bool pr_initialized;
+
 static int push_cxx_to_hypervisor(struct acpi_processor *_pr)
 {
 	struct xen_platform_op op = {
@@ -172,8 +174,13 @@  static int xen_copy_psd_data(struct acpi_processor *_pr,
 
 	/* 'acpi_processor_preregister_performance' does not parse if the
 	 * num_processors <= 1, but Xen still requires it. Do it manually here.
+	 *
+	 * Also init the field if not set, as that's possible if the physical
+	 * CPUs on the system don't match the data provided in the MADT when
+	 * running as a PVH dom0.
 	 */
-	if (pdomain->num_processors <= 1) {
+	if (pdomain->num_processors <= 1 ||
+	    dst->shared_type == CPUFREQ_SHARED_TYPE_NONE) {
 		if (pdomain->coord_type == DOMAIN_COORD_TYPE_SW_ALL)
 			dst->shared_type = CPUFREQ_SHARED_TYPE_ALL;
 		else if (pdomain->coord_type == DOMAIN_COORD_TYPE_HW_ALL)
@@ -313,6 +320,155 @@  static unsigned int __init get_max_acpi_id(void)
 	pr_debug("Max ACPI ID: %u\n", max_acpi_id);
 	return max_acpi_id;
 }
+
+/*
+ * Custom version of the native acpi_processor_evaluate_cst() function, to
+ * avoid some sanity checks done based on the CPUID data.  When running as a
+ * Xen domain the CPUID data provided to dom0 is not the native one, so C
+ * states cannot be sanity checked.  Leave it to the hypervisor which is also
+ * the entity running the driver.
+ */
+static int xen_acpi_processor_evaluate_cst(acpi_handle handle,
+					   struct acpi_processor_power *info)
+{
+	struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
+	union acpi_object *cst;
+	acpi_status status;
+	u64 count;
+	int last_index = 0;
+	int i, ret = 0;
+
+	status = acpi_evaluate_object(handle, "_CST", NULL, &buffer);
+	if (ACPI_FAILURE(status)) {
+		acpi_handle_debug(handle, "No _CST\n");
+		return -ENODEV;
+	}
+
+	cst = buffer.pointer;
+
+	/* There must be at least 2 elements. */
+	if (!cst || cst->type != ACPI_TYPE_PACKAGE || cst->package.count < 2) {
+		acpi_handle_warn(handle, "Invalid _CST output\n");
+		ret = -EFAULT;
+		goto end;
+	}
+
+	count = cst->package.elements[0].integer.value;
+
+	/* Validate the number of C-states. */
+	if (count < 1 || count != cst->package.count - 1) {
+		acpi_handle_warn(handle, "Inconsistent _CST data\n");
+		ret = -EFAULT;
+		goto end;
+	}
+
+	for (i = 1; i <= count; i++) {
+		union acpi_object *element;
+		union acpi_object *obj;
+		struct acpi_power_register *reg;
+		struct acpi_processor_cx cx;
+
+		/*
+		 * If there is not enough space for all C-states, skip the
+		 * excess ones and log a warning.
+		 */
+		if (last_index >= ACPI_PROCESSOR_MAX_POWER - 1) {
+			acpi_handle_warn(handle, "No room for more idle states (limit: %d)\n",
+					 ACPI_PROCESSOR_MAX_POWER - 1);
+			break;
+		}
+
+		memset(&cx, 0, sizeof(cx));
+
+		element = &cst->package.elements[i];
+		if (element->type != ACPI_TYPE_PACKAGE) {
+			acpi_handle_info(handle, "_CST C%d type(%x) is not package, skip...\n",
+					 i, element->type);
+			continue;
+		}
+
+		if (element->package.count != 4) {
+			acpi_handle_info(handle, "_CST C%d package count(%d) is not 4, skip...\n",
+				i, element->package.count);
+			continue;
+		}
+
+		obj = &element->package.elements[0];
+
+		if (obj->type != ACPI_TYPE_BUFFER) {
+			acpi_handle_info(handle, "_CST C%d package element[0] type(%x) is not buffer, skip...\n",
+					 i, obj->type);
+			continue;
+		}
+
+		reg = (struct acpi_power_register *)obj->buffer.pointer;
+
+		obj = &element->package.elements[1];
+		if (obj->type != ACPI_TYPE_INTEGER) {
+			acpi_handle_info(handle, "_CST C[%d] package element[1] type(%x) is not integer, skip...\n",
+					 i, obj->type);
+			continue;
+		}
+
+		cx.type = obj->integer.value;
+		/*
+		 * There are known cases in which the _CST output does not
+		 * contain C1, so if the type of the first state found is not
+		 * C1, leave an empty slot for C1 to be filled in later.
+		 */
+		if (i == 1 && cx.type != ACPI_STATE_C1)
+			last_index = 1;
+
+		cx.address = reg->address;
+		cx.index = last_index + 1;
+
+		switch (reg->space_id) {
+		case ACPI_ADR_SPACE_FIXED_HARDWARE:
+			cx.entry_method = ACPI_CSTATE_FFH;
+			break;
+
+		case ACPI_ADR_SPACE_SYSTEM_IO:
+			cx.entry_method = ACPI_CSTATE_SYSTEMIO;
+			break;
+
+		default:
+			acpi_handle_info(handle, "_CST C%d space_id(%x) neither FIXED_HARDWARE nor SYSTEM_IO, skip...\n",
+					 i, reg->space_id);
+			continue;
+		}
+
+		if (cx.type == ACPI_STATE_C1)
+			cx.valid = 1;
+
+		obj = &element->package.elements[2];
+		if (obj->type != ACPI_TYPE_INTEGER) {
+			acpi_handle_info(handle, "_CST C%d package element[2] type(%x) not integer, skip...\n",
+					 i, obj->type);
+			continue;
+		}
+
+		cx.latency = obj->integer.value;
+
+		obj = &element->package.elements[3];
+		if (obj->type != ACPI_TYPE_INTEGER) {
+			acpi_handle_info(handle, "_CST C%d package element[3] type(%x) not integer, skip...\n",
+					 i, obj->type);
+			continue;
+		}
+
+		memcpy(&info->states[++last_index], &cx, sizeof(cx));
+	}
+
+	acpi_handle_info(handle, "Found %d idle states\n", last_index);
+
+	info->count = last_index;
+
+end:
+	kfree(buffer.pointer);
+
+	return ret;
+}
+
 /*
  * The read_acpi_id and check_acpi_ids are there to support the Xen
  * oddity of virtual CPUs != physical CPUs in the initial domain.
@@ -354,24 +510,44 @@  read_acpi_id(acpi_handle handle, u32 lvl, void *context, void **rv)
 	default:
 		return AE_OK;
 	}
-	if (invalid_phys_cpuid(acpi_get_phys_id(handle,
-						acpi_type == ACPI_TYPE_DEVICE,
-						acpi_id))) {
+
+	if (!xen_processor_present(acpi_id)) {
 		pr_debug("CPU with ACPI ID %u is unavailable\n", acpi_id);
 		return AE_OK;
 	}
-	/* There are more ACPI Processor objects than in x2APIC or MADT.
-	 * This can happen with incorrect ACPI SSDT declerations. */
-	if (acpi_id >= nr_acpi_bits) {
-		pr_debug("max acpi id %u, trying to set %u\n",
-			 nr_acpi_bits - 1, acpi_id);
-		return AE_OK;
-	}
+
 	/* OK, There is a ACPI Processor object */
 	__set_bit(acpi_id, acpi_id_present);
 
 	pr_debug("ACPI CPU%u w/ PBLK:0x%lx\n", acpi_id, (unsigned long)pblk);
 
+	if (!pr_initialized) {
+		struct acpi_processor *pr = context;
+		int rc;
+
+		/*
+		 * There's no CPU on the system that has any performance or
+		 * power related data, initialize all the required fields by
+		 * fetching that info here.
+		 *
+		 * Note such information is only fetched once, and then reused
+		 * for all pCPUs.  This won't work on heterogeneous systems
+		 * with different Cx and/or Px states between CPUs.
+		 */
+
+		pr->handle = handle;
+
+		rc = acpi_processor_get_performance_info(pr);
+		if (rc)
+			pr_debug("ACPI CPU%u failed to get performance data\n",
+				 acpi_id);
+		rc = xen_acpi_processor_evaluate_cst(handle, &pr->power);
+		if (rc)
+			pr_debug("ACPI CPU%u failed to get _CST data\n", acpi_id);
+
+		pr_initialized = true;
+	}
+
 	/* It has P-state dependencies */
 	if (!acpi_processor_get_psd(handle, &acpi_psd[acpi_id])) {
 		pr_debug("ACPI CPU%u w/ PST:coord_type = %llu domain = %llu\n",
@@ -392,8 +568,7 @@  read_acpi_id(acpi_handle handle, u32 lvl, void *context, void **rv)
 static int check_acpi_ids(struct acpi_processor *pr_backup)
 {
 
-	if (!pr_backup)
-		return -ENODEV;
+	BUG_ON(!pr_backup);
 
 	if (acpi_id_present && acpi_id_cst_present)
 		/* OK, done this once .. skip to uploading */
@@ -422,8 +597,8 @@  static int check_acpi_ids(struct acpi_processor *pr_backup)
 
 	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
 			    ACPI_UINT32_MAX,
-			    read_acpi_id, NULL, NULL, NULL);
-	acpi_get_devices(ACPI_PROCESSOR_DEVICE_HID, read_acpi_id, NULL, NULL);
+			    read_acpi_id, NULL, pr_backup, NULL);
+	acpi_get_devices(ACPI_PROCESSOR_DEVICE_HID, read_acpi_id, pr_backup, NULL);
 
 upload:
 	if (!bitmap_equal(acpi_id_present, acpi_ids_done, nr_acpi_bits)) {
@@ -464,6 +639,7 @@  static int xen_upload_processor_pm_data(void)
 	struct acpi_processor *pr_backup = NULL;
 	int i;
 	int rc = 0;
+	bool free_perf = false;
 
 	pr_info("Uploading Xen processor PM info\n");
 
@@ -475,13 +651,30 @@  static int xen_upload_processor_pm_data(void)
 
 		if (!pr_backup) {
 			pr_backup = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
-			if (pr_backup)
+			if (pr_backup) {
 				memcpy(pr_backup, _pr, sizeof(struct acpi_processor));
+				pr_initialized = true;
+			}
 		}
 		(void)upload_pm_data(_pr);
 	}
 
+	if (!pr_backup) {
+		pr_backup = kzalloc(sizeof(struct acpi_processor), GFP_KERNEL);
+		if (!pr_backup)
+			return -ENOMEM;
+		pr_backup->performance = kzalloc(sizeof(struct acpi_processor_performance),
+						 GFP_KERNEL);
+		if (!pr_backup->performance) {
+			kfree(pr_backup);
+			return -ENOMEM;
+		}
+		free_perf = true;
+	}
+
 	rc = check_acpi_ids(pr_backup);
+	if (free_perf)
+		kfree(pr_backup->performance);
 	kfree(pr_backup);
 
 	return rc;