[v2,02/11] acpi: Define ACPI IO registers for PVH guests

Message ID	1478702399-14538-3-git-send-email-boris.ostrovsky@oracle.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <xen-devel-bounces@lists.xen.org> From: Boris Ostrovsky <boris.ostrovsky@oracle.com> To: xen-devel@lists.xen.org Date: Wed, 9 Nov 2016 09:39:50 -0500 Message-Id: <1478702399-14538-3-git-send-email-boris.ostrovsky@oracle.com> In-Reply-To: <1478702399-14538-1-git-send-email-boris.ostrovsky@oracle.com> References: <1478702399-14538-1-git-send-email-boris.ostrovsky@oracle.com> Cc: wei.liu2@citrix.com, andrew.cooper3@citrix.com, ian.jackson@eu.citrix.com, Julien Grall <julien.grall@arm.com>, Paul Durrant <paul.durrant@citrix.com>, jbeulich@suse.com, Boris Ostrovsky <boris.ostrovsky@oracle.com>, roger.pau@citrix.com Subject: [Xen-devel] [PATCH v2 02/11] acpi: Define ACPI IO registers for PVH guests Precedence: list MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" <xen-devel-bounces@lists.xen.org>

Boris Ostrovsky Nov. 9, 2016, 2:39 p.m. UTC

ACPI hotplug-related IO accesses (to GPE0 block) are handled
by qemu for HVM guests. Since PVH guests don't have qemu these
accesses will need to be procesed by the hypervisor.

Because ACPI event model expects pm1a block to be present we
need to have the hypervisor emulate it as well.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
---
CC: Julien Grall <julien.grall@arm.com>
CC: Paul Durrant <paul.durrant@citrix.com>
---
Changes in v2:
* Added public macros for ACPI CPU bitmap (at 0xaf00). Note: PRST
  region shrinks from 32 bytes to 16 (when 128 VCPUs are
  supported).


 tools/libacpi/mk_dsdt.c          |  4 +++-
 tools/libacpi/static_tables.c    | 28 +++++++++++-----------------
 xen/include/asm-x86/hvm/domain.h |  6 ++++++
 xen/include/public/arch-arm.h    | 11 ++++++++---
 xen/include/public/hvm/ioreq.h   | 13 +++++++++++++
 5 files changed, 41 insertions(+), 21 deletions(-)

Julien Grall Nov. 9, 2016, 2:48 p.m. UTC | #1

Hi Boris,

On 09/11/16 14:39, Boris Ostrovsky wrote:
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index bd974fb..b793774 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -383,6 +383,9 @@ typedef uint64_t xen_callback_t;
>   * should instead use the FDT.
>   */
>
> +/* Current supported guest VCPUs */
> +#define GUEST_MAX_VCPUS 128
> +
>  /* Physical Address Space */
>
>  /*
> @@ -410,6 +413,11 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_ACPI_BASE 0x20000000ULL
>  #define GUEST_ACPI_SIZE 0x02000000ULL
>
> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((GUEST_MAX_VCPUS / 8) + \
> +                                      ((GUEST_MAX_VCPUS & 7) ? 1 : 0))
> +

I don't understand this change. PRST is not generated for ARM (see the 
return 0 in mk_dsdt a bit before).

In any case, I am expecting CPU hotplug to be handle via PSCI on ARM.

Regards,

Paul Durrant Nov. 9, 2016, 2:48 p.m. UTC | #2

> -----Original Message-----
> From: Boris Ostrovsky [mailto:boris.ostrovsky@oracle.com]
> Sent: 09 November 2016 14:40
> To: xen-devel@lists.xen.org
> Cc: jbeulich@suse.com; Andrew Cooper <Andrew.Cooper3@citrix.com>;
> Wei Liu <wei.liu2@citrix.com>; Ian Jackson <Ian.Jackson@citrix.com>; Roger
> Pau Monne <roger.pau@citrix.com>; Boris Ostrovsky
> <boris.ostrovsky@oracle.com>; Julien Grall <julien.grall@arm.com>; Paul
> Durrant <Paul.Durrant@citrix.com>
> Subject: [PATCH v2 02/11] acpi: Define ACPI IO registers for PVH guests
> 
> ACPI hotplug-related IO accesses (to GPE0 block) are handled
> by qemu for HVM guests. Since PVH guests don't have qemu these
> accesses will need to be procesed by the hypervisor.
> 
> Because ACPI event model expects pm1a block to be present we
> need to have the hypervisor emulate it as well.
> 
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
> CC: Julien Grall <julien.grall@arm.com>
> CC: Paul Durrant <paul.durrant@citrix.com>

Reviewed-by: Paul Durrant <paul.durrant@citrix.com>

> ---
> Changes in v2:
> * Added public macros for ACPI CPU bitmap (at 0xaf00). Note: PRST
>   region shrinks from 32 bytes to 16 (when 128 VCPUs are
>   supported).
> 
> 
>  tools/libacpi/mk_dsdt.c          |  4 +++-
>  tools/libacpi/static_tables.c    | 28 +++++++++++-----------------
>  xen/include/asm-x86/hvm/domain.h |  6 ++++++
>  xen/include/public/arch-arm.h    | 11 ++++++++---
>  xen/include/public/hvm/ioreq.h   | 13 +++++++++++++
>  5 files changed, 41 insertions(+), 21 deletions(-)
> 
> diff --git a/tools/libacpi/mk_dsdt.c b/tools/libacpi/mk_dsdt.c
> index 4ae68bc..2b8234d 100644
> --- a/tools/libacpi/mk_dsdt.c
> +++ b/tools/libacpi/mk_dsdt.c
> @@ -19,6 +19,7 @@
>  #include <stdbool.h>
>  #if defined(__i386__) || defined(__x86_64__)
>  #include <xen/hvm/hvm_info_table.h>
> +#include <xen/hvm/ioreq.h>
>  #elif defined(__aarch64__)
>  #include <xen/arch-arm.h>
>  #endif
> @@ -244,7 +245,8 @@ int main(int argc, char **argv)
>  #endif
> 
>      /* Operation Region 'PRST': bitmask of online CPUs. */
> -    stmt("OperationRegion", "PRST, SystemIO, 0xaf00, 32");
> +    stmt("OperationRegion", "PRST, SystemIO, 0x%x, %d",
> +        ACPI_CPU_MAP, ACPI_CPU_MAP_LEN);
>      push_block("Field", "PRST, ByteAcc, NoLock, Preserve");
>      indent(); printf("PRS, %u\n", max_cpus);
>      pop_block();
> diff --git a/tools/libacpi/static_tables.c b/tools/libacpi/static_tables.c
> index 617bf68..413abcc 100644
> --- a/tools/libacpi/static_tables.c
> +++ b/tools/libacpi/static_tables.c
> @@ -20,6 +20,8 @@
>   * Firmware ACPI Control Structure (FACS).
>   */
> 
> +#define ACPI_REG_BIT_OFFSET    0
> +
>  struct acpi_20_facs Facs = {
>      .signature = ACPI_2_0_FACS_SIGNATURE,
>      .length    = sizeof(struct acpi_20_facs),
> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>  /*
>   * Fixed ACPI Description Table (FADT).
>   */
> -
> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
> -
>  struct acpi_20_fadt Fadt = {
>      .header = {
>          .signature    = ACPI_2_0_FADT_SIGNATURE,
> @@ -56,9 +50,9 @@ struct acpi_20_fadt Fadt = {
>      .pm1a_cnt_blk = ACPI_PM1A_CNT_BLK_ADDRESS_V1,
>      .pm_tmr_blk = ACPI_PM_TMR_BLK_ADDRESS_V1,
>      .gpe0_blk = ACPI_GPE0_BLK_ADDRESS_V1,
> -    .pm1_evt_len = ACPI_PM1A_EVT_BLK_BIT_WIDTH / 8,
> -    .pm1_cnt_len = ACPI_PM1A_CNT_BLK_BIT_WIDTH / 8,
> -    .pm_tmr_len = ACPI_PM_TMR_BLK_BIT_WIDTH / 8,
> +    .pm1_evt_len = ACPI_PM1A_EVT_BLK_LEN,
> +    .pm1_cnt_len = ACPI_PM1A_CNT_BLK_LEN,
> +    .pm_tmr_len = ACPI_PM_TMR_BLK_LEN,
>      .gpe0_blk_len = ACPI_GPE0_BLK_LEN_V1,
> 
>      .p_lvl2_lat = 0x0fff, /* >100,  means we do not support C2 state */
> @@ -79,22 +73,22 @@ struct acpi_20_fadt Fadt = {
> 
>      .x_pm1a_evt_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM1A_EVT_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM1A_EVT_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM1A_EVT_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM1A_EVT_BLK_ADDRESS_V1,
>      },
> 
>      .x_pm1a_cnt_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM1A_CNT_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM1A_CNT_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM1A_CNT_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM1A_CNT_BLK_ADDRESS_V1,
>      },
> 
>      .x_pm_tmr_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM_TMR_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM_TMR_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM_TMR_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM_TMR_BLK_ADDRESS_V1,
>      }
>  };
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-
> x86/hvm/domain.h
> index f34d784..f492a2b 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -87,6 +87,12 @@ struct hvm_domain {
>      } ioreq_server;
>      struct hvm_ioreq_server *default_ioreq_server;
> 
> +    /* PVH guests */
> +    struct {
> +        uint8_t pm1a[ACPI_PM1A_EVT_BLK_LEN];
> +        uint8_t gpe[ACPI_GPE0_BLK_LEN_V1];
> +    } acpi_io;
> +
>      /* Cached CF8 for guest PCI config cycles */
>      uint32_t                pci_cf8;
> 
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index bd974fb..b793774 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -383,6 +383,9 @@ typedef uint64_t xen_callback_t;
>   * should instead use the FDT.
>   */
> 
> +/* Current supported guest VCPUs */
> +#define GUEST_MAX_VCPUS 128
> +
>  /* Physical Address Space */
> 
>  /*
> @@ -410,6 +413,11 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_ACPI_BASE 0x20000000ULL
>  #define GUEST_ACPI_SIZE 0x02000000ULL
> 
> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((GUEST_MAX_VCPUS / 8) + \
> +                                      ((GUEST_MAX_VCPUS & 7) ? 1 : 0))
> +
>  /*
>   * 16MB == 4096 pages reserved for guest to use as a region to map its
>   * grant table in.
> @@ -435,9 +443,6 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE,
> GUEST_RAM1_BASE }
>  #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE,
> GUEST_RAM1_SIZE }
> 
> -/* Current supported guest VCPUs */
> -#define GUEST_MAX_VCPUS 128
> -
>  /* Interrupts */
>  #define GUEST_TIMER_VIRT_PPI    27
>  #define GUEST_TIMER_PHYS_S_PPI  29
> diff --git a/xen/include/public/hvm/ioreq.h
> b/xen/include/public/hvm/ioreq.h
> index 2e5809b..e3fa704 100644
> --- a/xen/include/public/hvm/ioreq.h
> +++ b/xen/include/public/hvm/ioreq.h
> @@ -24,6 +24,8 @@
>  #ifndef _IOREQ_H_
>  #define _IOREQ_H_
> 
> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
> +
>  #define IOREQ_READ      1
>  #define IOREQ_WRITE     0
> 
> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
> 
> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
> +#define ACPI_PM_TMR_BLK_LEN          0x04
> +
> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >=
> ACPI_GPE0_BLK_ADDRESS_V1
> +#error "ACPI_CPU_MAP is too big"
> +#endif
> 
>  #endif /* _IOREQ_H_ */
> 
> --
> 2.7.4

Boris Ostrovsky Nov. 9, 2016, 2:54 p.m. UTC | #3

On 11/09/2016 09:48 AM, Julien Grall wrote:
> Hi Boris,
>
> On 09/11/16 14:39, Boris Ostrovsky wrote:
>> diff --git a/xen/include/public/arch-arm.h
>> b/xen/include/public/arch-arm.h
>> index bd974fb..b793774 100644
>> --- a/xen/include/public/arch-arm.h
>> +++ b/xen/include/public/arch-arm.h
>> @@ -383,6 +383,9 @@ typedef uint64_t xen_callback_t;
>>   * should instead use the FDT.
>>   */
>>
>> +/* Current supported guest VCPUs */
>> +#define GUEST_MAX_VCPUS 128
>> +
>>  /* Physical Address Space */
>>
>>  /*
>> @@ -410,6 +413,11 @@ typedef uint64_t xen_callback_t;
>>  #define GUEST_ACPI_BASE 0x20000000ULL
>>  #define GUEST_ACPI_SIZE 0x02000000ULL
>>
>> +/* Location of online VCPU bitmap. */
>> +#define ACPI_CPU_MAP                 0xaf00
>> +#define ACPI_CPU_MAP_LEN             ((GUEST_MAX_VCPUS / 8) + \
>> +                                      ((GUEST_MAX_VCPUS & 7) ? 1 : 0))
>> +
>
> I don't understand this change. PRST is not generated for ARM (see the
> return 0 in mk_dsdt a bit before).

Oh, right --- I missed it. Then this change needs to be dropped.

-boris

>
> In any case, I am expecting CPU hotplug to be handle via PSCI on ARM.
>
> Regards,
>

Andrew Cooper Nov. 9, 2016, 2:59 p.m. UTC | #4

On 09/11/16 14:39, Boris Ostrovsky wrote:
> ACPI hotplug-related IO accesses (to GPE0 block) are handled
> by qemu for HVM guests. Since PVH guests don't have qemu these
> accesses will need to be procesed by the hypervisor.
>
> Because ACPI event model expects pm1a block to be present we
> need to have the hypervisor emulate it as well.
>
> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> ---
> CC: Julien Grall <julien.grall@arm.com>
> CC: Paul Durrant <paul.durrant@citrix.com>
> ---
> Changes in v2:
> * Added public macros for ACPI CPU bitmap (at 0xaf00). Note: PRST
>   region shrinks from 32 bytes to 16 (when 128 VCPUs are
>   supported).
>
>
>  tools/libacpi/mk_dsdt.c          |  4 +++-
>  tools/libacpi/static_tables.c    | 28 +++++++++++-----------------
>  xen/include/asm-x86/hvm/domain.h |  6 ++++++
>  xen/include/public/arch-arm.h    | 11 ++++++++---
>  xen/include/public/hvm/ioreq.h   | 13 +++++++++++++
>  5 files changed, 41 insertions(+), 21 deletions(-)
>
> diff --git a/tools/libacpi/mk_dsdt.c b/tools/libacpi/mk_dsdt.c
> index 4ae68bc..2b8234d 100644
> --- a/tools/libacpi/mk_dsdt.c
> +++ b/tools/libacpi/mk_dsdt.c
> @@ -19,6 +19,7 @@
>  #include <stdbool.h>
>  #if defined(__i386__) || defined(__x86_64__)
>  #include <xen/hvm/hvm_info_table.h>
> +#include <xen/hvm/ioreq.h>
>  #elif defined(__aarch64__)
>  #include <xen/arch-arm.h>
>  #endif
> @@ -244,7 +245,8 @@ int main(int argc, char **argv)
>  #endif
>  
>      /* Operation Region 'PRST': bitmask of online CPUs. */
> -    stmt("OperationRegion", "PRST, SystemIO, 0xaf00, 32");
> +    stmt("OperationRegion", "PRST, SystemIO, 0x%x, %d",
> +        ACPI_CPU_MAP, ACPI_CPU_MAP_LEN);
>      push_block("Field", "PRST, ByteAcc, NoLock, Preserve");
>      indent(); printf("PRS, %u\n", max_cpus);
>      pop_block();
> diff --git a/tools/libacpi/static_tables.c b/tools/libacpi/static_tables.c
> index 617bf68..413abcc 100644
> --- a/tools/libacpi/static_tables.c
> +++ b/tools/libacpi/static_tables.c
> @@ -20,6 +20,8 @@
>   * Firmware ACPI Control Structure (FACS).
>   */
>  
> +#define ACPI_REG_BIT_OFFSET    0
> +
>  struct acpi_20_facs Facs = {
>      .signature = ACPI_2_0_FACS_SIGNATURE,
>      .length    = sizeof(struct acpi_20_facs),
> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>  /*
>   * Fixed ACPI Description Table (FADT).
>   */
> -
> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
> -
>  struct acpi_20_fadt Fadt = {
>      .header = {
>          .signature    = ACPI_2_0_FADT_SIGNATURE,
> @@ -56,9 +50,9 @@ struct acpi_20_fadt Fadt = {
>      .pm1a_cnt_blk = ACPI_PM1A_CNT_BLK_ADDRESS_V1,
>      .pm_tmr_blk = ACPI_PM_TMR_BLK_ADDRESS_V1,
>      .gpe0_blk = ACPI_GPE0_BLK_ADDRESS_V1,
> -    .pm1_evt_len = ACPI_PM1A_EVT_BLK_BIT_WIDTH / 8,
> -    .pm1_cnt_len = ACPI_PM1A_CNT_BLK_BIT_WIDTH / 8,
> -    .pm_tmr_len = ACPI_PM_TMR_BLK_BIT_WIDTH / 8,
> +    .pm1_evt_len = ACPI_PM1A_EVT_BLK_LEN,
> +    .pm1_cnt_len = ACPI_PM1A_CNT_BLK_LEN,
> +    .pm_tmr_len = ACPI_PM_TMR_BLK_LEN,
>      .gpe0_blk_len = ACPI_GPE0_BLK_LEN_V1,
>  
>      .p_lvl2_lat = 0x0fff, /* >100,  means we do not support C2 state */
> @@ -79,22 +73,22 @@ struct acpi_20_fadt Fadt = {
>  
>      .x_pm1a_evt_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM1A_EVT_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM1A_EVT_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM1A_EVT_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM1A_EVT_BLK_ADDRESS_V1,
>      },
>  
>      .x_pm1a_cnt_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM1A_CNT_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM1A_CNT_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM1A_CNT_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM1A_CNT_BLK_ADDRESS_V1,
>      },
>  
>      .x_pm_tmr_blk = {
>          .address_space_id    = ACPI_SYSTEM_IO,
> -        .register_bit_width  = ACPI_PM_TMR_BLK_BIT_WIDTH,
> -        .register_bit_offset = ACPI_PM_TMR_BLK_BIT_OFFSET,
> +        .register_bit_width  = ACPI_PM_TMR_BLK_LEN * 8,
> +        .register_bit_offset = ACPI_REG_BIT_OFFSET,
>          .address             = ACPI_PM_TMR_BLK_ADDRESS_V1,
>      }
>  };
> diff --git a/xen/include/asm-x86/hvm/domain.h b/xen/include/asm-x86/hvm/domain.h
> index f34d784..f492a2b 100644
> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -87,6 +87,12 @@ struct hvm_domain {
>      } ioreq_server;
>      struct hvm_ioreq_server *default_ioreq_server;
>  
> +    /* PVH guests */
> +    struct {
> +        uint8_t pm1a[ACPI_PM1A_EVT_BLK_LEN];
> +        uint8_t gpe[ACPI_GPE0_BLK_LEN_V1];
> +    } acpi_io;
> +
>      /* Cached CF8 for guest PCI config cycles */
>      uint32_t                pci_cf8;
>  
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index bd974fb..b793774 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -383,6 +383,9 @@ typedef uint64_t xen_callback_t;
>   * should instead use the FDT.
>   */
>  
> +/* Current supported guest VCPUs */
> +#define GUEST_MAX_VCPUS 128
> +
>  /* Physical Address Space */
>  
>  /*
> @@ -410,6 +413,11 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_ACPI_BASE 0x20000000ULL
>  #define GUEST_ACPI_SIZE 0x02000000ULL
>  
> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((GUEST_MAX_VCPUS / 8) + \
> +                                      ((GUEST_MAX_VCPUS & 7) ? 1 : 0))
> +
>  /*
>   * 16MB == 4096 pages reserved for guest to use as a region to map its
>   * grant table in.
> @@ -435,9 +443,6 @@ typedef uint64_t xen_callback_t;
>  #define GUEST_RAM_BANK_BASES   { GUEST_RAM0_BASE, GUEST_RAM1_BASE }
>  #define GUEST_RAM_BANK_SIZES   { GUEST_RAM0_SIZE, GUEST_RAM1_SIZE }
>  
> -/* Current supported guest VCPUs */
> -#define GUEST_MAX_VCPUS 128
> -
>  /* Interrupts */
>  #define GUEST_TIMER_VIRT_PPI    27
>  #define GUEST_TIMER_PHYS_S_PPI  29
> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
> index 2e5809b..e3fa704 100644
> --- a/xen/include/public/hvm/ioreq.h
> +++ b/xen/include/public/hvm/ioreq.h
> @@ -24,6 +24,8 @@
>  #ifndef _IOREQ_H_
>  #define _IOREQ_H_
>  
> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
> +
>  #define IOREQ_READ      1
>  #define IOREQ_WRITE     0
>  
> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>  
> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
> +#define ACPI_PM_TMR_BLK_LEN          0x04
> +
> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
> +#error "ACPI_CPU_MAP is too big"
> +#endif

Why is this in ioreq.h?  It has nothing to do with ioreq's.

The current ACPI bits in here are to do with the qemu ACPI interface,
not the Xen ACPI interface.

Also, please can we avoid hard-coding the location of the map in the
hypervisor ABI.  These constants make it impossible to ever extend the
number of HVM vcpus at a future date.

~Andrew

Boris Ostrovsky Nov. 9, 2016, 3:14 p.m. UTC | #5

On 11/09/2016 09:59 AM, Andrew Cooper wrote:
>
>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>> index 2e5809b..e3fa704 100644
>> --- a/xen/include/public/hvm/ioreq.h
>> +++ b/xen/include/public/hvm/ioreq.h
>> @@ -24,6 +24,8 @@
>>  #ifndef _IOREQ_H_
>>  #define _IOREQ_H_
>>  
>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>> +
>>  #define IOREQ_READ      1
>>  #define IOREQ_WRITE     0
>>  
>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>  
>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>> +
>> +/* Location of online VCPU bitmap. */
>> +#define ACPI_CPU_MAP                 0xaf00
>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
>> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
>> +#error "ACPI_CPU_MAP is too big"
>> +#endif
> Why is this in ioreq.h?  It has nothing to do with ioreq's.
>
> The current ACPI bits in here are to do with the qemu ACPI interface,
> not the Xen ACPI interface.
>
> Also, please can we avoid hard-coding the location of the map in the
> hypervisor ABI.  These constants make it impossible to ever extend the
> number of HVM vcpus at a future date.

The first three logically belong here because corresponding blocks'
addresses are defined right above.

ACPI_CPU_MAP has to be seen by both the toolstack (libacpi) and the
hypervisor (and qemu as well, although it is defined as
PIIX4_CPU_HOTPLUG_IO_BASE).

Where do you think it should go then?

-boris

Andrew Cooper Nov. 9, 2016, 7:58 p.m. UTC | #6

On 09/11/16 15:14, Boris Ostrovsky wrote:
> On 11/09/2016 09:59 AM, Andrew Cooper wrote:
>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>> index 2e5809b..e3fa704 100644
>>> --- a/xen/include/public/hvm/ioreq.h
>>> +++ b/xen/include/public/hvm/ioreq.h
>>> @@ -24,6 +24,8 @@
>>>  #ifndef _IOREQ_H_
>>>  #define _IOREQ_H_
>>>  
>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>> +
>>>  #define IOREQ_READ      1
>>>  #define IOREQ_WRITE     0
>>>  
>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>  
>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>> +
>>> +/* Location of online VCPU bitmap. */
>>> +#define ACPI_CPU_MAP                 0xaf00
>>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
>>> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
>>> +#error "ACPI_CPU_MAP is too big"
>>> +#endif
>> Why is this in ioreq.h?  It has nothing to do with ioreq's.
>>
>> The current ACPI bits in here are to do with the qemu ACPI interface,
>> not the Xen ACPI interface.
>>
>> Also, please can we avoid hard-coding the location of the map in the
>> hypervisor ABI.  These constants make it impossible to ever extend the
>> number of HVM vcpus at a future date.
> The first three logically belong here because corresponding blocks'
> addresses are defined right above.

They have no relationship to the ones above, other than their name.

>
> ACPI_CPU_MAP has to be seen by both the toolstack (libacpi) and the
> hypervisor (and qemu as well, although it is defined as
> PIIX4_CPU_HOTPLUG_IO_BASE).
>
> Where do you think it should go then?

This highlights a reoccurring problem in Xen which desperately needs
fixing, but still isn't high enough on my TODO list to tackle yet.

There is no central registration of claims on domain resources.  This is
the root cause of memory accounting problems for HVM guests.

The way I planned to fix this was to have Xen maintain a registry of
domains physical resources which ends up looking very much like
/proc/io{mem,ports}.  There will be a hypercall interface for querying
this information, and for a toolstack and device model to modify it.

The key point is that Xen needs to be authoritative source of
information pertaining to layout, rather than the current fiasco we have
of the toolstack, qemu and hvmloader all thinking they know and control
what's going on.  This fixes several current unknowns which have caused
real problems, such as whether a domain was told about certain RMRRs
when it booted, or how many PXEROMs qemu tried to fit into the physmap.

This information (eventually, when I get Xen-level migration v2 sorted)
needs to move at the head of the migration stream.

The way I would envisage this working is that on domain create, Xen
makes a blank map indicating that all space is free.  By selecting
X86_EMUL_APCI_*, Xen takes out an allocation when it wires up the ioport
handler.

Later, when constructing the ACPI tables, the toolstack reads the
current ioport allocations and can see exactly which ports are reserved
for what.

Now, I understand that lumbering you with this work as a prerequisite
would be unfair.

Therefore, I will accept an alternative of hiding all these definitions
behind __XEN_TOOLS__ so the longterm fix can be introduced in a
compatible manner in the future.

That said, I am still certain that they shouldn't live in ioreq.h, as
they have nothing to do with Qemu.

~Andrew

Boris Ostrovsky Nov. 9, 2016, 9:01 p.m. UTC | #7

On 11/09/2016 02:58 PM, Andrew Cooper wrote:
> On 09/11/16 15:14, Boris Ostrovsky wrote:
>> On 11/09/2016 09:59 AM, Andrew Cooper wrote:
>>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>>> index 2e5809b..e3fa704 100644
>>>> --- a/xen/include/public/hvm/ioreq.h
>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>> @@ -24,6 +24,8 @@
>>>>  #ifndef _IOREQ_H_
>>>>  #define _IOREQ_H_
>>>>  
>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>> +
>>>>  #define IOREQ_READ      1
>>>>  #define IOREQ_WRITE     0
>>>>  
>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>  
>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>> +
>>>> +/* Location of online VCPU bitmap. */
>>>> +#define ACPI_CPU_MAP                 0xaf00
>>>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>>>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
>>>> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
>>>> +#error "ACPI_CPU_MAP is too big"
>>>> +#endif
>>> Why is this in ioreq.h?  It has nothing to do with ioreq's.
>>>
>>> The current ACPI bits in here are to do with the qemu ACPI interface,
>>> not the Xen ACPI interface.
>>>
>>> Also, please can we avoid hard-coding the location of the map in the
>>> hypervisor ABI.  These constants make it impossible to ever extend the
>>> number of HVM vcpus at a future date.
>> The first three logically belong here because corresponding blocks'
>> addresses are defined right above.
> They have no relationship to the ones above, other than their name.

They describe the same object --- for example
ACPI_PM1A_CNT_BLK_ADDRESS_V1 and (new) ACPI_PM1A_CNT_BLK_LEN describe
pm1a control.

As far as definitions being there for qemu interface only ---
ACPI_PM1A_CNT_BLK_ADDRESS_V1, for example, is used only by hvmloader and
libacpi.


>
>> ACPI_CPU_MAP has to be seen by both the toolstack (libacpi) and the
>> hypervisor (and qemu as well, although it is defined as
>> PIIX4_CPU_HOTPLUG_IO_BASE).
>>
>> Where do you think it should go then?
> This highlights a reoccurring problem in Xen which desperately needs
> fixing, but still isn't high enough on my TODO list to tackle yet.
>
> There is no central registration of claims on domain resources.  This is
> the root cause of memory accounting problems for HVM guests.
>
>
> The way I planned to fix this was to have Xen maintain a registry of
> domains physical resources which ends up looking very much like
> /proc/io{mem,ports}.  There will be a hypercall interface for querying
> this information, and for a toolstack and device model to modify it.
>
> The key point is that Xen needs to be authoritative source of
> information pertaining to layout, rather than the current fiasco we have
> of the toolstack, qemu and hvmloader all thinking they know and control
> what's going on.  This fixes several current unknowns which have caused
> real problems, such as whether a domain was told about certain RMRRs
> when it booted, or how many PXEROMs qemu tried to fit into the physmap.
>
> This information (eventually, when I get Xen-level migration v2 sorted)
> needs to move at the head of the migration stream.
>
> The way I would envisage this working is that on domain create, Xen
> makes a blank map indicating that all space is free.  By selecting
> X86_EMUL_APCI_*, Xen takes out an allocation when it wires up the ioport
> handler.
>
> Later, when constructing the ACPI tables, the toolstack reads the
> current ioport allocations and can see exactly which ports are reserved
> for what.
>
>
> Now, I understand that lumbering you with this work as a prerequisite
> would be unfair.
>
> Therefore, I will accept an alternative of hiding all these definitions
> behind __XEN_TOOLS__ so the longterm fix can be introduced in a
> compatible manner in the future.


__XEN_TOOLS__ or (__XEN__ || __XEN_TOOLS__) ? Because both the toolstack
and the hypervisor want to see them.


>
> That said, I am still certain that they shouldn't live in ioreq.h, as
> they have nothing to do with Qemu.

None of the existing files looks (to me) much better in terms of being
more appropriate. include/public/arch-x86/xen.h?


-boris

Boris Ostrovsky Nov. 14, 2016, 3:01 p.m. UTC | #8

On 11/09/2016 04:01 PM, Boris Ostrovsky wrote:
> On 11/09/2016 02:58 PM, Andrew Cooper wrote:
>> On 09/11/16 15:14, Boris Ostrovsky wrote:
>>> On 11/09/2016 09:59 AM, Andrew Cooper wrote:
>>>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>>>> index 2e5809b..e3fa704 100644
>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>> @@ -24,6 +24,8 @@
>>>>>  #ifndef _IOREQ_H_
>>>>>  #define _IOREQ_H_
>>>>>  
>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>> +
>>>>>  #define IOREQ_READ      1
>>>>>  #define IOREQ_WRITE     0
>>>>>  
>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>  
>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>> +
>>>>> +/* Location of online VCPU bitmap. */
>>>>> +#define ACPI_CPU_MAP                 0xaf00
>>>>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>>>>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
>>>>> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
>>>>> +#error "ACPI_CPU_MAP is too big"
>>>>> +#endif
>>>> Why is this in ioreq.h?  It has nothing to do with ioreq's.
>>>>
>>>> The current ACPI bits in here are to do with the qemu ACPI interface,
>>>> not the Xen ACPI interface.
>>>>
>>>> Also, please can we avoid hard-coding the location of the map in the
>>>> hypervisor ABI.  These constants make it impossible to ever extend the
>>>> number of HVM vcpus at a future date.
>>> The first three logically belong here because corresponding blocks'
>>> addresses are defined right above.
>> They have no relationship to the ones above, other than their name.
> They describe the same object --- for example
> ACPI_PM1A_CNT_BLK_ADDRESS_V1 and (new) ACPI_PM1A_CNT_BLK_LEN describe
> pm1a control.
>
> As far as definitions being there for qemu interface only ---
> ACPI_PM1A_CNT_BLK_ADDRESS_V1, for example, is used only by hvmloader and
> libacpi.
>
>
>>> ACPI_CPU_MAP has to be seen by both the toolstack (libacpi) and the
>>> hypervisor (and qemu as well, although it is defined as
>>> PIIX4_CPU_HOTPLUG_IO_BASE).
>>>
>>> Where do you think it should go then?
>> This highlights a reoccurring problem in Xen which desperately needs
>> fixing, but still isn't high enough on my TODO list to tackle yet.
>>
>> There is no central registration of claims on domain resources.  This is
>> the root cause of memory accounting problems for HVM guests.
>>
>>
>> The way I planned to fix this was to have Xen maintain a registry of
>> domains physical resources which ends up looking very much like
>> /proc/io{mem,ports}.  There will be a hypercall interface for querying
>> this information, and for a toolstack and device model to modify it.
>>
>> The key point is that Xen needs to be authoritative source of
>> information pertaining to layout, rather than the current fiasco we have
>> of the toolstack, qemu and hvmloader all thinking they know and control
>> what's going on.  This fixes several current unknowns which have caused
>> real problems, such as whether a domain was told about certain RMRRs
>> when it booted, or how many PXEROMs qemu tried to fit into the physmap.
>>
>> This information (eventually, when I get Xen-level migration v2 sorted)
>> needs to move at the head of the migration stream.
>>
>> The way I would envisage this working is that on domain create, Xen
>> makes a blank map indicating that all space is free.  By selecting
>> X86_EMUL_APCI_*, Xen takes out an allocation when it wires up the ioport
>> handler.
>>
>> Later, when constructing the ACPI tables, the toolstack reads the
>> current ioport allocations and can see exactly which ports are reserved
>> for what.
>>
>>
>> Now, I understand that lumbering you with this work as a prerequisite
>> would be unfair.
>>
>> Therefore, I will accept an alternative of hiding all these definitions
>> behind __XEN_TOOLS__ so the longterm fix can be introduced in a
>> compatible manner in the future.
>
> __XEN_TOOLS__ or (__XEN__ || __XEN_TOOLS__) ? Because both the toolstack
> and the hypervisor want to see them.
>
>
>> That said, I am still certain that they shouldn't live in ioreq.h, as
>> they have nothing to do with Qemu.
> None of the existing files looks (to me) much better in terms of being
> more appropriate. include/public/arch-x86/xen.h?

Andrew, ping on these two questions.

-boris

Andrew Cooper Nov. 14, 2016, 5:19 p.m. UTC | #9

On 14/11/16 15:01, Boris Ostrovsky wrote:
> On 11/09/2016 04:01 PM, Boris Ostrovsky wrote:
>> On 11/09/2016 02:58 PM, Andrew Cooper wrote:
>>> On 09/11/16 15:14, Boris Ostrovsky wrote:
>>>> On 11/09/2016 09:59 AM, Andrew Cooper wrote:
>>>>>> diff --git a/xen/include/public/hvm/ioreq.h b/xen/include/public/hvm/ioreq.h
>>>>>> index 2e5809b..e3fa704 100644
>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>> @@ -24,6 +24,8 @@
>>>>>>  #ifndef _IOREQ_H_
>>>>>>  #define _IOREQ_H_
>>>>>>  
>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>> +
>>>>>>  #define IOREQ_READ      1
>>>>>>  #define IOREQ_WRITE     0
>>>>>>  
>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>  
>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>>> +
>>>>>> +/* Location of online VCPU bitmap. */
>>>>>> +#define ACPI_CPU_MAP                 0xaf00
>>>>>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>>>>>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
>>>>>> +#if ACPI_CPU_MAP + ACPI_CPU_MAP_LEN >= ACPI_GPE0_BLK_ADDRESS_V1
>>>>>> +#error "ACPI_CPU_MAP is too big"
>>>>>> +#endif
>>>>> Why is this in ioreq.h?  It has nothing to do with ioreq's.
>>>>>
>>>>> The current ACPI bits in here are to do with the qemu ACPI interface,
>>>>> not the Xen ACPI interface.
>>>>>
>>>>> Also, please can we avoid hard-coding the location of the map in the
>>>>> hypervisor ABI.  These constants make it impossible to ever extend the
>>>>> number of HVM vcpus at a future date.
>>>> The first three logically belong here because corresponding blocks'
>>>> addresses are defined right above.
>>> They have no relationship to the ones above, other than their name.
>> They describe the same object --- for example
>> ACPI_PM1A_CNT_BLK_ADDRESS_V1 and (new) ACPI_PM1A_CNT_BLK_LEN describe
>> pm1a control.
>>
>> As far as definitions being there for qemu interface only ---
>> ACPI_PM1A_CNT_BLK_ADDRESS_V1, for example, is used only by hvmloader and
>> libacpi.
>>
>>
>>>> ACPI_CPU_MAP has to be seen by both the toolstack (libacpi) and the
>>>> hypervisor (and qemu as well, although it is defined as
>>>> PIIX4_CPU_HOTPLUG_IO_BASE).
>>>>
>>>> Where do you think it should go then?
>>> This highlights a reoccurring problem in Xen which desperately needs
>>> fixing, but still isn't high enough on my TODO list to tackle yet.
>>>
>>> There is no central registration of claims on domain resources.  This is
>>> the root cause of memory accounting problems for HVM guests.
>>>
>>>
>>> The way I planned to fix this was to have Xen maintain a registry of
>>> domains physical resources which ends up looking very much like
>>> /proc/io{mem,ports}.  There will be a hypercall interface for querying
>>> this information, and for a toolstack and device model to modify it.
>>>
>>> The key point is that Xen needs to be authoritative source of
>>> information pertaining to layout, rather than the current fiasco we have
>>> of the toolstack, qemu and hvmloader all thinking they know and control
>>> what's going on.  This fixes several current unknowns which have caused
>>> real problems, such as whether a domain was told about certain RMRRs
>>> when it booted, or how many PXEROMs qemu tried to fit into the physmap.
>>>
>>> This information (eventually, when I get Xen-level migration v2 sorted)
>>> needs to move at the head of the migration stream.
>>>
>>> The way I would envisage this working is that on domain create, Xen
>>> makes a blank map indicating that all space is free.  By selecting
>>> X86_EMUL_APCI_*, Xen takes out an allocation when it wires up the ioport
>>> handler.
>>>
>>> Later, when constructing the ACPI tables, the toolstack reads the
>>> current ioport allocations and can see exactly which ports are reserved
>>> for what.
>>>
>>>
>>> Now, I understand that lumbering you with this work as a prerequisite
>>> would be unfair.
>>>
>>> Therefore, I will accept an alternative of hiding all these definitions
>>> behind __XEN_TOOLS__ so the longterm fix can be introduced in a
>>> compatible manner in the future.
>> __XEN_TOOLS__ or (__XEN__ || __XEN_TOOLS__) ? Because both the toolstack
>> and the hypervisor want to see them.

(__XEN__ || __XEN_TOOLS__) is fine.

>>
>>
>>> That said, I am still certain that they shouldn't live in ioreq.h, as
>>> they have nothing to do with Qemu.
>> None of the existing files looks (to me) much better in terms of being
>> more appropriate. include/public/arch-x86/xen.h?
> Andrew, ping on these two questions.

Sorry for letting this slip through the cracks.

Leave them here for now.  xen.h is not a better place for them to live.

~Andrew

Jan Beulich Nov. 15, 2016, 8:47 a.m. UTC | #10

>>> On 09.11.16 at 15:39, <boris.ostrovsky@oracle.com> wrote:
> @@ -244,7 +245,8 @@ int main(int argc, char **argv)
>  #endif
>  
>      /* Operation Region 'PRST': bitmask of online CPUs. */
> -    stmt("OperationRegion", "PRST, SystemIO, 0xaf00, 32");
> +    stmt("OperationRegion", "PRST, SystemIO, 0x%x, %d",

%#x

> --- a/tools/libacpi/static_tables.c
> +++ b/tools/libacpi/static_tables.c
> @@ -20,6 +20,8 @@
>   * Firmware ACPI Control Structure (FACS).
>   */
>  
> +#define ACPI_REG_BIT_OFFSET    0

Can you completely exclude us ever wanting to support something
that's not on a byte boundary? I think there was a good reason ...

> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>  /*
>   * Fixed ACPI Description Table (FADT).
>   */
> -
> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00

... these specified both width and offset.

> --- a/xen/include/asm-x86/hvm/domain.h
> +++ b/xen/include/asm-x86/hvm/domain.h
> @@ -87,6 +87,12 @@ struct hvm_domain {
>      } ioreq_server;
>      struct hvm_ioreq_server *default_ioreq_server;
>  
> +    /* PVH guests */
> +    struct {
> +        uint8_t pm1a[ACPI_PM1A_EVT_BLK_LEN];
> +        uint8_t gpe[ACPI_GPE0_BLK_LEN_V1];
> +    } acpi_io;

This is left unused in this patch - please add it once it gets used.

> --- a/xen/include/public/hvm/ioreq.h
> +++ b/xen/include/public/hvm/ioreq.h
> @@ -24,6 +24,8 @@
>  #ifndef _IOREQ_H_
>  #define _IOREQ_H_
>  
> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
> +
>  #define IOREQ_READ      1
>  #define IOREQ_WRITE     0
>  
> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>  
> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
> +#define ACPI_PM_TMR_BLK_LEN          0x04

Just like ACPI_GPE0_BLK_LEN these should really go next to their
address definitions. Provided we really want to hard code further
values here in the first place, which I don't think we should. The
goal should rather be for all these hard coded values to go away
(which really should have happened when the V1 variants had
been added).

> +/* Location of online VCPU bitmap. */
> +#define ACPI_CPU_MAP                 0xaf00
> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))

((HVM_MAX_VCPUS + 7) / 8)

But as indicated elsewhere, this should be made less static anyway.

Also, no matter that there are (many, or should I say only) bad
examples in this file, please don't add new possible name space
conflicts: Any new constants should start with XEN_.

Jan

Boris Ostrovsky Nov. 15, 2016, 2:47 p.m. UTC | #11

On 11/15/2016 03:47 AM, Jan Beulich wrote:
>
>> --- a/tools/libacpi/static_tables.c
>> +++ b/tools/libacpi/static_tables.c
>> @@ -20,6 +20,8 @@
>>   * Firmware ACPI Control Structure (FACS).
>>   */
>>  
>> +#define ACPI_REG_BIT_OFFSET    0
> Can you completely exclude us ever wanting to support something
> that's not on a byte boundary? I think there was a good reason ...
>
>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>  /*
>>   * Fixed ACPI Description Table (FADT).
>>   */
>> -
>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
> ... these specified both width and offset.

Since OFFSET is not used anywhere I kept it local to static_tables.c. I
can restore these macros per block and move them to public header but...

>> --- a/xen/include/public/hvm/ioreq.h
>> +++ b/xen/include/public/hvm/ioreq.h
>> @@ -24,6 +24,8 @@
>>  #ifndef _IOREQ_H_
>>  #define _IOREQ_H_
>>  
>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>> +
>>  #define IOREQ_READ      1
>>  #define IOREQ_WRITE     0
>>  
>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>  
>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>> +#define ACPI_PM_TMR_BLK_LEN          0x04
> Just like ACPI_GPE0_BLK_LEN these should really go next to their
> address definitions. 

... together with this, it will make it rather messy/unsightly to go
with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.

> Provided we really want to hard code further
> values here in the first place, which I don't think we should. The
> goal should rather be for all these hard coded values to go away
> (which really should have happened when the V1 variants had
> been added).

How can we not hardcode this if the values should match those in FADT
(i.e. static_tables.c)?

>
>> +/* Location of online VCPU bitmap. */
>> +#define ACPI_CPU_MAP                 0xaf00
>> +#define ACPI_CPU_MAP_LEN             ((HVM_MAX_VCPUS / 8) + \
>> +                                      ((HVM_MAX_VCPUS & 7) ? 1 : 0))
> ((HVM_MAX_VCPUS + 7) / 8)
>
> But as indicated elsewhere, this should be made less static anyway.

Right, we could size it by domain.max_vcpus.

-boris

Jan Beulich Nov. 15, 2016, 3:13 p.m. UTC | #12

>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>
>>> --- a/tools/libacpi/static_tables.c
>>> +++ b/tools/libacpi/static_tables.c
>>> @@ -20,6 +20,8 @@
>>>   * Firmware ACPI Control Structure (FACS).
>>>   */
>>>  
>>> +#define ACPI_REG_BIT_OFFSET    0
>> Can you completely exclude us ever wanting to support something
>> that's not on a byte boundary? I think there was a good reason ...
>>
>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>  /*
>>>   * Fixed ACPI Description Table (FADT).
>>>   */
>>> -
>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>> ... these specified both width and offset.
> 
> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
> can restore these macros per block and move them to public header but...
> 
>>> --- a/xen/include/public/hvm/ioreq.h
>>> +++ b/xen/include/public/hvm/ioreq.h
>>> @@ -24,6 +24,8 @@
>>>  #ifndef _IOREQ_H_
>>>  #define _IOREQ_H_
>>>  
>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>> +
>>>  #define IOREQ_READ      1
>>>  #define IOREQ_WRITE     0
>>>  
>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>  
>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>> address definitions. 
> 
> ... together with this, it will make it rather messy/unsightly to go
> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.

Well, framing them that way is a good excuse for having them
separate from the others. In fact, however, the others also
should get hidden in the same way, just that we would need to
be more careful there (read: make the condition also check
__XEN_INTERFACE_VERSION__).

>> Provided we really want to hard code further
>> values here in the first place, which I don't think we should. The
>> goal should rather be for all these hard coded values to go away
>> (which really should have happened when the V1 variants had
>> been added).
> 
> How can we not hardcode this if the values should match those in FADT
> (i.e. static_tables.c)?

By having the loading entity obtain the dynamic values and adjust
the table(s) accordingly.

Jan

Boris Ostrovsky Nov. 15, 2016, 3:41 p.m. UTC | #13

On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>> --- a/tools/libacpi/static_tables.c
>>>> +++ b/tools/libacpi/static_tables.c
>>>> @@ -20,6 +20,8 @@
>>>>   * Firmware ACPI Control Structure (FACS).
>>>>   */
>>>>  
>>>> +#define ACPI_REG_BIT_OFFSET    0
>>> Can you completely exclude us ever wanting to support something
>>> that's not on a byte boundary? I think there was a good reason ...
>>>
>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>  /*
>>>>   * Fixed ACPI Description Table (FADT).
>>>>   */
>>>> -
>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>> ... these specified both width and offset.
>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>> can restore these macros per block and move them to public header but...
>>
>>>> --- a/xen/include/public/hvm/ioreq.h
>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>> @@ -24,6 +24,8 @@
>>>>  #ifndef _IOREQ_H_
>>>>  #define _IOREQ_H_
>>>>  
>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>> +
>>>>  #define IOREQ_READ      1
>>>>  #define IOREQ_WRITE     0
>>>>  
>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>  
>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>> address definitions. 
>> ... together with this, it will make it rather messy/unsightly to go
>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
> Well, framing them that way is a good excuse for having them
> separate from the others. In fact, however, the others also
> should get hidden in the same way, just that we would need to
> be more careful there (read: make the condition also check
> __XEN_INTERFACE_VERSION__).

Sorry, I don't follow this. How can interface version help here?

>
>>> Provided we really want to hard code further
>>> values here in the first place, which I don't think we should. The
>>> goal should rather be for all these hard coded values to go away
>>> (which really should have happened when the V1 variants had
>>> been added).
>> How can we not hardcode this if the values should match those in FADT
>> (i.e. static_tables.c)?
> By having the loading entity obtain the dynamic values and adjust
> the table(s) accordingly.

And this. Which loading entity (ACPI builder?) and how would it adjust
the addresses? It still needs those addresses defined somewhere. And the
the hypervisor, which can't parse guest FADT, needs to get those addresses.

-boris

Jan Beulich Nov. 15, 2016, 3:53 p.m. UTC | #14

>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>> --- a/tools/libacpi/static_tables.c
>>>>> +++ b/tools/libacpi/static_tables.c
>>>>> @@ -20,6 +20,8 @@
>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>   */
>>>>>  
>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>> Can you completely exclude us ever wanting to support something
>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>
>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>  /*
>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>   */
>>>>> -
>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>> ... these specified both width and offset.
>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>> can restore these macros per block and move them to public header but...
>>>
>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>> @@ -24,6 +24,8 @@
>>>>>  #ifndef _IOREQ_H_
>>>>>  #define _IOREQ_H_
>>>>>  
>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>> +
>>>>>  #define IOREQ_READ      1
>>>>>  #define IOREQ_WRITE     0
>>>>>  
>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>  
>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>> address definitions. 
>>> ... together with this, it will make it rather messy/unsightly to go
>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>> Well, framing them that way is a good excuse for having them
>> separate from the others. In fact, however, the others also
>> should get hidden in the same way, just that we would need to
>> be more careful there (read: make the condition also check
>> __XEN_INTERFACE_VERSION__).
> 
> Sorry, I don't follow this. How can interface version help here?

We can't outright remove existing definitions from the public interface,
but we can limit their exposure to old consumers.

>>>> Provided we really want to hard code further
>>>> values here in the first place, which I don't think we should. The
>>>> goal should rather be for all these hard coded values to go away
>>>> (which really should have happened when the V1 variants had
>>>> been added).
>>> How can we not hardcode this if the values should match those in FADT
>>> (i.e. static_tables.c)?
>> By having the loading entity obtain the dynamic values and adjust
>> the table(s) accordingly.
> 
> And this. Which loading entity (ACPI builder?) and how would it adjust
> the addresses? It still needs those addresses defined somewhere. And the
> the hypervisor, which can't parse guest FADT, needs to get those addresses.

Didn't Andrew make quite clear that there needs to be a central
authority assigning guest resources? That's where the values
would come from, and they would need to be suitably propagated
to however is in need of knowing them.

Jan

Boris Ostrovsky Nov. 15, 2016, 4:23 p.m. UTC | #15

On 11/15/2016 10:53 AM, Jan Beulich wrote:
>>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
>> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>>> --- a/tools/libacpi/static_tables.c
>>>>>> +++ b/tools/libacpi/static_tables.c
>>>>>> @@ -20,6 +20,8 @@
>>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>>   */
>>>>>>  
>>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>>> Can you completely exclude us ever wanting to support something
>>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>>
>>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>>  /*
>>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>>   */
>>>>>> -
>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>>> ... these specified both width and offset.
>>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>>> can restore these macros per block and move them to public header but...
>>>>
>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>> @@ -24,6 +24,8 @@
>>>>>>  #ifndef _IOREQ_H_
>>>>>>  #define _IOREQ_H_
>>>>>>  
>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>> +
>>>>>>  #define IOREQ_READ      1
>>>>>>  #define IOREQ_WRITE     0
>>>>>>  
>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>  
>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>>> address definitions. 
>>>> ... together with this, it will make it rather messy/unsightly to go
>>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>>> Well, framing them that way is a good excuse for having them
>>> separate from the others. In fact, however, the others also
>>> should get hidden in the same way, just that we would need to
>>> be more careful there (read: make the condition also check
>>> __XEN_INTERFACE_VERSION__).
>> Sorry, I don't follow this. How can interface version help here?
> We can't outright remove existing definitions from the public interface,
> but we can limit their exposure to old consumers.

But don't we need to support both V0 and V1 as long as qemu-trad is
supported? In other words, checking interface version won't limit the
scope at this point.

>
>>>>> Provided we really want to hard code further
>>>>> values here in the first place, which I don't think we should. The
>>>>> goal should rather be for all these hard coded values to go away
>>>>> (which really should have happened when the V1 variants had
>>>>> been added).
>>>> How can we not hardcode this if the values should match those in FADT
>>>> (i.e. static_tables.c)?
>>> By having the loading entity obtain the dynamic values and adjust
>>> the table(s) accordingly.
>> And this. Which loading entity (ACPI builder?) and how would it adjust
>> the addresses? It still needs those addresses defined somewhere. And the
>> the hypervisor, which can't parse guest FADT, needs to get those addresses.
> Didn't Andrew make quite clear that there needs to be a central
> authority assigning guest resources? That's where the values
> would come from, and they would need to be suitably propagated
> to however is in need of knowing them.

Oh, but that is still (way?) off at this point. From what I understood
about Andrew's proposal this will require fairly significant update of
how regions are registered.

-boris

Jan Beulich Nov. 15, 2016, 4:33 p.m. UTC | #16

>>> On 15.11.16 at 17:23, <boris.ostrovsky@oracle.com> wrote:
> On 11/15/2016 10:53 AM, Jan Beulich wrote:
>>>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
>>> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>>>> --- a/tools/libacpi/static_tables.c
>>>>>>> +++ b/tools/libacpi/static_tables.c
>>>>>>> @@ -20,6 +20,8 @@
>>>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>>>   */
>>>>>>>  
>>>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>>>> Can you completely exclude us ever wanting to support something
>>>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>>>
>>>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>>>  /*
>>>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>>>   */
>>>>>>> -
>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>>>> ... these specified both width and offset.
>>>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>>>> can restore these macros per block and move them to public header but...
>>>>>
>>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>>> @@ -24,6 +24,8 @@
>>>>>>>  #ifndef _IOREQ_H_
>>>>>>>  #define _IOREQ_H_
>>>>>>>  
>>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>>> +
>>>>>>>  #define IOREQ_READ      1
>>>>>>>  #define IOREQ_WRITE     0
>>>>>>>  
>>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>>  
>>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>>>> address definitions. 
>>>>> ... together with this, it will make it rather messy/unsightly to go
>>>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>>>> Well, framing them that way is a good excuse for having them
>>>> separate from the others. In fact, however, the others also
>>>> should get hidden in the same way, just that we would need to
>>>> be more careful there (read: make the condition also check
>>>> __XEN_INTERFACE_VERSION__).
>>> Sorry, I don't follow this. How can interface version help here?
>> We can't outright remove existing definitions from the public interface,
>> but we can limit their exposure to old consumers.
> 
> But don't we need to support both V0 and V1 as long as qemu-trad is
> supported? In other words, checking interface version won't limit the
> scope at this point.

Doesn't qemu-trad set __XEN_TOOLS__?

>>>>>> Provided we really want to hard code further
>>>>>> values here in the first place, which I don't think we should. The
>>>>>> goal should rather be for all these hard coded values to go away
>>>>>> (which really should have happened when the V1 variants had
>>>>>> been added).
>>>>> How can we not hardcode this if the values should match those in FADT
>>>>> (i.e. static_tables.c)?
>>>> By having the loading entity obtain the dynamic values and adjust
>>>> the table(s) accordingly.
>>> And this. Which loading entity (ACPI builder?) and how would it adjust
>>> the addresses? It still needs those addresses defined somewhere. And the
>>> the hypervisor, which can't parse guest FADT, needs to get those addresses.
>> Didn't Andrew make quite clear that there needs to be a central
>> authority assigning guest resources? That's where the values
>> would come from, and they would need to be suitably propagated
>> to however is in need of knowing them.
> 
> Oh, but that is still (way?) off at this point. From what I understood
> about Andrew's proposal this will require fairly significant update of
> how regions are registered.

Well, perhaps. Yet I question whether it's a good idea to add another
fixed address right now, instead of switching over first.

Jan

Boris Ostrovsky Nov. 15, 2016, 4:58 p.m. UTC | #17

On 11/15/2016 11:33 AM, Jan Beulich wrote:
>>>> On 15.11.16 at 17:23, <boris.ostrovsky@oracle.com> wrote:
>> On 11/15/2016 10:53 AM, Jan Beulich wrote:
>>>>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
>>>> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>>>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>>>>> --- a/tools/libacpi/static_tables.c
>>>>>>>> +++ b/tools/libacpi/static_tables.c
>>>>>>>> @@ -20,6 +20,8 @@
>>>>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>>>>   */
>>>>>>>>  
>>>>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>>>>> Can you completely exclude us ever wanting to support something
>>>>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>>>>
>>>>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>>>>  /*
>>>>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>>>>   */
>>>>>>>> -
>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>>>>> ... these specified both width and offset.
>>>>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>>>>> can restore these macros per block and move them to public header but...
>>>>>>
>>>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>>>> @@ -24,6 +24,8 @@
>>>>>>>>  #ifndef _IOREQ_H_
>>>>>>>>  #define _IOREQ_H_
>>>>>>>>  
>>>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>>>> +
>>>>>>>>  #define IOREQ_READ      1
>>>>>>>>  #define IOREQ_WRITE     0
>>>>>>>>  
>>>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>>>  
>>>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>>>>> address definitions. 
>>>>>> ... together with this, it will make it rather messy/unsightly to go
>>>>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>>>>> Well, framing them that way is a good excuse for having them
>>>>> separate from the others. In fact, however, the others also
>>>>> should get hidden in the same way, just that we would need to
>>>>> be more careful there (read: make the condition also check
>>>>> __XEN_INTERFACE_VERSION__).
>>>> Sorry, I don't follow this. How can interface version help here?
>>> We can't outright remove existing definitions from the public interface,
>>> but we can limit their exposure to old consumers.
>> But don't we need to support both V0 and V1 as long as qemu-trad is
>> supported? In other words, checking interface version won't limit the
>> scope at this point.
> Doesn't qemu-trad set __XEN_TOOLS__?

Oh, so you meant that interface version would be an OR, in addition to
__XEN__ and __XEN_TOOLS__?

>
>>>>>>> Provided we really want to hard code further
>>>>>>> values here in the first place, which I don't think we should. The
>>>>>>> goal should rather be for all these hard coded values to go away
>>>>>>> (which really should have happened when the V1 variants had
>>>>>>> been added).
>>>>>> How can we not hardcode this if the values should match those in FADT
>>>>>> (i.e. static_tables.c)?
>>>>> By having the loading entity obtain the dynamic values and adjust
>>>>> the table(s) accordingly.
>>>> And this. Which loading entity (ACPI builder?) and how would it adjust
>>>> the addresses? It still needs those addresses defined somewhere. And the
>>>> the hypervisor, which can't parse guest FADT, needs to get those addresses.
>>> Didn't Andrew make quite clear that there needs to be a central
>>> authority assigning guest resources? That's where the values
>>> would come from, and they would need to be suitably propagated
>>> to however is in need of knowing them.
>> Oh, but that is still (way?) off at this point. From what I understood
>> about Andrew's proposal this will require fairly significant update of
>> how regions are registered.
> Well, perhaps. Yet I question whether it's a good idea to add another
> fixed address right now, instead of switching over first.


I think getting that framework in order would be out of the scope of
what this series is trying to achieve.

-boris

Jan Beulich Nov. 15, 2016, 4:59 p.m. UTC | #18

>>> On 15.11.16 at 17:58, <boris.ostrovsky@oracle.com> wrote:
> On 11/15/2016 11:33 AM, Jan Beulich wrote:
>>>>> On 15.11.16 at 17:23, <boris.ostrovsky@oracle.com> wrote:
>>> On 11/15/2016 10:53 AM, Jan Beulich wrote:
>>>>>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
>>>>> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>>>>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>>>>>> --- a/tools/libacpi/static_tables.c
>>>>>>>>> +++ b/tools/libacpi/static_tables.c
>>>>>>>>> @@ -20,6 +20,8 @@
>>>>>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>>>>>   */
>>>>>>>>>  
>>>>>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>>>>>> Can you completely exclude us ever wanting to support something
>>>>>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>>>>>
>>>>>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>>>>>  /*
>>>>>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>>>>>   */
>>>>>>>>> -
>>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>>>>>> ... these specified both width and offset.
>>>>>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>>>>>> can restore these macros per block and move them to public header but...
>>>>>>>
>>>>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>>>>> @@ -24,6 +24,8 @@
>>>>>>>>>  #ifndef _IOREQ_H_
>>>>>>>>>  #define _IOREQ_H_
>>>>>>>>>  
>>>>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>>>>> +
>>>>>>>>>  #define IOREQ_READ      1
>>>>>>>>>  #define IOREQ_WRITE     0
>>>>>>>>>  
>>>>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>>>>  
>>>>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>>>>>> address definitions. 
>>>>>>> ... together with this, it will make it rather messy/unsightly to go
>>>>>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>>>>>> Well, framing them that way is a good excuse for having them
>>>>>> separate from the others. In fact, however, the others also
>>>>>> should get hidden in the same way, just that we would need to
>>>>>> be more careful there (read: make the condition also check
>>>>>> __XEN_INTERFACE_VERSION__).
>>>>> Sorry, I don't follow this. How can interface version help here?
>>>> We can't outright remove existing definitions from the public interface,
>>>> but we can limit their exposure to old consumers.
>>> But don't we need to support both V0 and V1 as long as qemu-trad is
>>> supported? In other words, checking interface version won't limit the
>>> scope at this point.
>> Doesn't qemu-trad set __XEN_TOOLS__?
> 
> Oh, so you meant that interface version would be an OR, in addition to
> __XEN__ and __XEN_TOOLS__?

Yes.

Jan

Andrew Cooper Nov. 15, 2016, 6:31 p.m. UTC | #19

On 15/11/16 16:58, Boris Ostrovsky wrote:
> On 11/15/2016 11:33 AM, Jan Beulich wrote:
>>>>> On 15.11.16 at 17:23, <boris.ostrovsky@oracle.com> wrote:
>>> On 11/15/2016 10:53 AM, Jan Beulich wrote:
>>>>>>> On 15.11.16 at 16:41, <boris.ostrovsky@oracle.com> wrote:
>>>>> On 11/15/2016 10:13 AM, Jan Beulich wrote:
>>>>>>>>> On 15.11.16 at 15:47, <boris.ostrovsky@oracle.com> wrote:
>>>>>>> On 11/15/2016 03:47 AM, Jan Beulich wrote:
>>>>>>>>> --- a/tools/libacpi/static_tables.c
>>>>>>>>> +++ b/tools/libacpi/static_tables.c
>>>>>>>>> @@ -20,6 +20,8 @@
>>>>>>>>>   * Firmware ACPI Control Structure (FACS).
>>>>>>>>>   */
>>>>>>>>>  
>>>>>>>>> +#define ACPI_REG_BIT_OFFSET    0
>>>>>>>> Can you completely exclude us ever wanting to support something
>>>>>>>> that's not on a byte boundary? I think there was a good reason ...
>>>>>>>>
>>>>>>>>> @@ -30,14 +32,6 @@ struct acpi_20_facs Facs = {
>>>>>>>>>  /*
>>>>>>>>>   * Fixed ACPI Description Table (FADT).
>>>>>>>>>   */
>>>>>>>>> -
>>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_WIDTH         0x20
>>>>>>>>> -#define ACPI_PM1A_EVT_BLK_BIT_OFFSET        0x00
>>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_WIDTH         0x10
>>>>>>>>> -#define ACPI_PM1A_CNT_BLK_BIT_OFFSET        0x00
>>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_WIDTH           0x20
>>>>>>>>> -#define ACPI_PM_TMR_BLK_BIT_OFFSET          0x00
>>>>>>>> ... these specified both width and offset.
>>>>>>> Since OFFSET is not used anywhere I kept it local to static_tables.c. I
>>>>>>> can restore these macros per block and move them to public header but...
>>>>>>>
>>>>>>>>> --- a/xen/include/public/hvm/ioreq.h
>>>>>>>>> +++ b/xen/include/public/hvm/ioreq.h
>>>>>>>>> @@ -24,6 +24,8 @@
>>>>>>>>>  #ifndef _IOREQ_H_
>>>>>>>>>  #define _IOREQ_H_
>>>>>>>>>  
>>>>>>>>> +#include "hvm_info_table.h" /* HVM_MAX_VCPUS */
>>>>>>>>> +
>>>>>>>>>  #define IOREQ_READ      1
>>>>>>>>>  #define IOREQ_WRITE     0
>>>>>>>>>  
>>>>>>>>> @@ -124,6 +126,17 @@ typedef struct buffered_iopage buffered_iopage_t;
>>>>>>>>>  #define ACPI_GPE0_BLK_ADDRESS        ACPI_GPE0_BLK_ADDRESS_V0
>>>>>>>>>  #define ACPI_GPE0_BLK_LEN            ACPI_GPE0_BLK_LEN_V0
>>>>>>>>>  
>>>>>>>>> +#define ACPI_PM1A_EVT_BLK_LEN        0x04
>>>>>>>>> +#define ACPI_PM1A_CNT_BLK_LEN        0x02
>>>>>>>>> +#define ACPI_PM_TMR_BLK_LEN          0x04
>>>>>>>> Just like ACPI_GPE0_BLK_LEN these should really go next to their
>>>>>>>> address definitions. 
>>>>>>> ... together with this, it will make it rather messy/unsightly to go
>>>>>>> with Andrew's request to ifdef this with __XEN__/__XEN_TOOLS__.
>>>>>> Well, framing them that way is a good excuse for having them
>>>>>> separate from the others. In fact, however, the others also
>>>>>> should get hidden in the same way, just that we would need to
>>>>>> be more careful there (read: make the condition also check
>>>>>> __XEN_INTERFACE_VERSION__).
>>>>> Sorry, I don't follow this. How can interface version help here?
>>>> We can't outright remove existing definitions from the public interface,
>>>> but we can limit their exposure to old consumers.
>>> But don't we need to support both V0 and V1 as long as qemu-trad is
>>> supported? In other words, checking interface version won't limit the
>>> scope at this point.
>> Doesn't qemu-trad set __XEN_TOOLS__?
> Oh, so you meant that interface version would be an OR, in addition to
> __XEN__ and __XEN_TOOLS__?
>
>>>>>>>> Provided we really want to hard code further
>>>>>>>> values here in the first place, which I don't think we should. The
>>>>>>>> goal should rather be for all these hard coded values to go away
>>>>>>>> (which really should have happened when the V1 variants had
>>>>>>>> been added).
>>>>>>> How can we not hardcode this if the values should match those in FADT
>>>>>>> (i.e. static_tables.c)?
>>>>>> By having the loading entity obtain the dynamic values and adjust
>>>>>> the table(s) accordingly.
>>>>> And this. Which loading entity (ACPI builder?) and how would it adjust
>>>>> the addresses? It still needs those addresses defined somewhere. And the
>>>>> the hypervisor, which can't parse guest FADT, needs to get those addresses.
>>>> Didn't Andrew make quite clear that there needs to be a central
>>>> authority assigning guest resources? That's where the values
>>>> would come from, and they would need to be suitably propagated
>>>> to however is in need of knowing them.
>>> Oh, but that is still (way?) off at this point. From what I understood
>>> about Andrew's proposal this will require fairly significant update of
>>> how regions are registered.
>> Well, perhaps. Yet I question whether it's a good idea to add another
>> fixed address right now, instead of switching over first.
>
> I think getting that framework in order would be out of the scope of
> what this series is trying to achieve.

Yes - adding the central authority framework is out of scope, because it
is a lot of work on its own to do.

All I want to ensure is that no hardcoded numbers get published in a
stable place in the API.  i.e. once the central management work is done,
I want these defines to disappear completely from the header file. 
Stuff behind __TOOLS__ is safe to modify later.

~Andrew

[v2,02/11] acpi: Define ACPI IO registers for PVH guests

Commit Message

Comments

Patch