[RFC,00/21] kvm/arm: Introduce a customizable aarch64 KVM host model

Message ID 20241025101959.601048-1-eric.auger@redhat.com (mailing list archive)

Message

Eric Auger Oct. 25, 2024, 10:17 a.m. UTC
This RFC series introduces a KVM host "custom" model.

Since kernel v6.7, KVM/arm allows userspace to overwrite the values
of a subset of ID regs. The list of writable fields continues to grow.
The feature ID range is defined as the AArch64 System register space
with op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}.
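
As an illustration (not code from this series), the feature ID range
above can be flattened into a 3 * 8 * 8 = 192 entry index. The sketch
below is a Python transcription of how the kernel's
KVM_ARM_FEATURE_ID_RANGE_IDX uapi macro is understood to compute that
index; the function name is hypothetical and the arithmetic should be
checked against the uapi header before relying on it.

```python
FEATURE_ID_RANGE_SIZE = 3 * 8 * 8  # three op1 values, eight CRm, eight op2


def feature_id_range_idx(op0: int, op1: int, crn: int, crm: int, op2: int) -> int:
    """Flatten a feature ID register encoding into an index in [0, 192)."""
    if op0 != 3 or crn != 0 or op1 not in (0, 1, 3):
        raise ValueError("encoding is outside the feature ID range")
    if not (0 <= crm <= 7 and 0 <= op2 <= 7):
        raise ValueError("CRm and op2 must both be in 0..7")
    # op1 values {0, 1, 3} are packed into slots {0, 1, 2}.
    op1_slot = {0: 0, 1: 1, 3: 2}[op1]
    return (op1_slot << 6) | (crm << 3) | op2


# ID_AA64ISAR0_EL1 is encoded as op0=3, op1=0, CRn=0, CRm=6, op2=0.
ID_AA64ISAR0_EL1_IDX = feature_id_range_idx(3, 0, 0, 6, 0)
```

With this packing, every register of the range gets a distinct slot,
whether or not the architecture currently allocates it.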

The custom model uses this capability and allows tuning the host
passthrough model by overriding some of its ID regs.

The end goal is to get more flexibility when migrating guests
between different machines. We would like the upper software layer
to be able to detect how tunable the vcpu is on both source and destination
and accordingly define a customized KVM host model that can fit
both ends. With the legacy host passthrough model, this migration
use case would fail.

QEMU queries the host kernel to get the list of writable ID reg
fields and exposes all the writable fields as uint64 properties. Those
are named "SYSREG_<REG>_<FIELD>". The REG and FIELD names are those
described in the Arm Architecture Reference Manual (Arm ARM) and in the
Linux arch/arm64/tools/sysreg file. Some awk scripts introduced in this
series help parse the sysreg file and generate some code. Those scripts
are used in a similar way to scripts/update-linux-headers.sh. In case the
ABI gets broken, it is still possible to manually edit the generated code.
However, the REG and FIELD names are generally expected to be stable.
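
To make the naming scheme concrete, here is a small sketch that builds
property names and a -cpu argument from (REG, FIELD) pairs. The helper
names are hypothetical; only the "SYSREG_<REG>_<FIELD>" pattern and the
"custom" model name come from this series.

```python
def sysreg_prop_name(reg: str, field: str) -> str:
    """Build the property name for one ID register field."""
    return f"SYSREG_{reg}_{field}"


def cpu_option(model: str, overrides: dict[tuple[str, str], int]) -> str:
    """Build the value passed to -cpu from a model name and field overrides."""
    parts = [model]
    for (reg, field), value in overrides.items():
        parts.append(f"{sysreg_prop_name(reg, field)}={value:#x}")
    return ",".join(parts)


option = cpu_option(
    "custom",
    {("ID_AA64ISAR0_EL1", "DP"): 0x0, ("ID_AA64ISAR1_EL1", "DPB"): 0x0},
)
```

Here `option` reproduces the -cpu string used in the TESTS section.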

The list of SYSREG_ID properties can be retrieved through the qmp
monitor using query-cpu-model-expansion [2].
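
For readers who prefer a script over qmp-shell, the following is a
minimal sketch of the same query over a raw QMP unix socket. It assumes
QEMU was started with a -qmp unix:...,server,nowait socket as in [2];
the function names are hypothetical, and the filter simply assumes that
the properties of interest are reported under names starting with
"SYSREG_".

```python
import json
import socket


def expansion_command(model_name: str, expansion_type: str = "full") -> dict:
    """Build the QMP command equivalent to the qmp-shell line in [2]."""
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": {"name": model_name}},
    }


def sysreg_props(reply: dict) -> dict:
    """Keep only the SYSREG_* properties of a successful expansion reply."""
    props = reply["return"]["model"].get("props", {})
    return {name: val for name, val in props.items() if name.startswith("SYSREG_")}


def qmp_query(sock_path: str, command: dict) -> dict:
    """Negotiate capabilities, send one command and return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        stream = sock.makefile("rw", encoding="utf-8")
        json.loads(stream.readline())  # QMP greeting banner
        reply = {}
        for cmd in ({"execute": "qmp_capabilities"}, command):
            stream.write(json.dumps(cmd) + "\n")
            stream.flush()
            while True:
                reply = json.loads(stream.readline())
                if "event" not in reply:  # skip asynchronous events
                    break
        return reply
```

A call such as sysreg_props(qmp_query(path, expansion_command("custom")))
would then list the tunable fields; an error reply carries an "error"
key instead of "return" and needs to be handled by the caller.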

The first part of the series mostly consists of migrating ID reg
storage from named fields in ARMISARegisters to anonymous, index-ordered
storage in an IdRegMap struct array. The goal is to have
a generic way to store all ID registers that is also compatible with the
way we retrieve their writable capability at kernel level through
the KVM_ARM_GET_REG_WRITABLE_MASKS ioctl. Having named fields
prevented us from getting this scalability/genericity. Although the
change is invasive, it is quite straightforward and should be easy
to review.
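
The storage idea can be sketched as follows. This is a simplified Python
model, not the C definition from the series: the class layout, the
helper name and the sample register values are assumptions made for
illustration. The point is that register values and writable masks can
share one flat, index-ordered layout.

```python
NR_ID_REGS = 3 * 8 * 8  # three op1 values x eight CRm x eight op2


def idreg_idx(op1: int, crm: int, op2: int) -> int:
    """Index of an ID reg (op0 == 3 and CRn == 0 implied) in the flat array."""
    return ({0: 0, 1: 1, 3: 2}[op1] << 6) | (crm << 3) | op2


class IdRegMap:
    """Anonymous, index-ordered storage for 64-bit ID register values."""

    def __init__(self) -> None:
        self.regs = [0] * NR_ID_REGS

    def get(self, op1: int, crm: int, op2: int) -> int:
        return self.regs[idreg_idx(op1, crm, op2)]

    def set(self, op1: int, crm: int, op2: int, value: int) -> None:
        self.regs[idreg_idx(op1, crm, op2)] = value & (2**64 - 1)


ID_AA64ISAR0_EL1 = (0, 6, 0)  # (op1, CRm, op2)

# The same layout holds host values and per-register writable masks.
host_values = IdRegMap()
writable_masks = IdRegMap()
host_values.set(*ID_AA64ISAR0_EL1, 0x0000_1000_0000_0020)  # made-up value
writable_masks.set(*ID_AA64ISAR0_EL1, 0xF << 44)           # made-up mask
```

Because nothing in the container is named after a register, adding a
newly writable register needs no new struct member.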

Then the bulk of the job is to retrieve the writable ID fields and
match them against a "human readable" description of those fields.
We use awk scripts, derived from the kernel's arch/arm64/tools/gen-sysreg.awk
(so all the credit to Mark Rutland), that populate a data structure
describing all the ID regs in sysreg and their fields. We match the
writable ID reg fields against the latter and dynamically create a
uint64 property for each of them.
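
The matching step can be pictured with the sketch below. The field
positions are a subset of ID_AA64ISAR0_EL1 as described in the Arm ARM;
the data structure, the function names and the policy of requiring every
bit of a field to be writable are assumptions for illustration, not
necessarily what the series implements.

```python
from typing import NamedTuple


class Field(NamedTuple):
    name: str
    lsb: int
    width: int


# Subset of ID_AA64ISAR0_EL1 fields (name, least significant bit, width).
ID_AA64ISAR0_EL1_FIELDS = [
    Field("AES", 4, 4),
    Field("SHA1", 8, 4),
    Field("DP", 44, 4),
    Field("RNDR", 60, 4),
]


def field_mask(field: Field) -> int:
    """Bit mask covering one field inside the 64-bit register."""
    return ((1 << field.width) - 1) << field.lsb


def writable_props(reg: str, fields: list[Field], writable_mask: int) -> list[str]:
    """Names of the properties to create for fully writable fields."""
    return [
        f"SYSREG_{reg}_{f.name}"
        for f in fields
        if (writable_mask & field_mask(f)) == field_mask(f)
    ]


# Made-up mask in which only AES and DP are writable.
props = writable_props(
    "ID_AA64ISAR0_EL1", ID_AA64ISAR0_EL1_FIELDS, (0xF << 44) | (0xF << 4)
)
```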

Then we need to extend the list of ID regs read from the host
so that their values can be overridden and written back into KVM.
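
The override itself is a plain bit-field deposit into the 64-bit value
read from the host. The sketch below shows the arithmetic only; the
helper names echo QEMU's extract64/deposit64 bitops but are
reimplemented here in Python, and the host value is made up.

```python
MASK64 = (1 << 64) - 1


def extract64(value: int, lsb: int, width: int) -> int:
    """Read a field of the given width starting at bit lsb."""
    return (value >> lsb) & ((1 << width) - 1)


def deposit64(value: int, lsb: int, width: int, field_value: int) -> int:
    """Return value with the field at [lsb, lsb + width) replaced."""
    mask = ((1 << width) - 1) << lsb
    return ((value & ~mask) | ((field_value << lsb) & mask)) & MASK64


# Made-up host ID_AA64ISAR0_EL1 value with DP (bits 47:44) = 1, AES = 2.
host_isar0 = 0x0000_1000_0000_0020

# Equivalent of SYSREG_ID_AA64ISAR0_EL1_DP=0x0: clear DP, keep the rest.
guest_isar0 = deposit64(host_isar0, 44, 4, 0x0)
```

The resulting value is what would be written back to KVM for that
register; fields that were not overridden keep their host value.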

The expectation is that this custom KVM host model can prepare for
the advent of named models. Introducing named models with reduced
and explicitly defined features is the next step.

Obviously this series is not able to cope with non-writable ID regs.
For instance, the problem of setting MIDR/REVIDR is not handled
at the moment.

Connie & Eric

This series can be found at:
https://github.com/eauger/qemu/tree/custom-cpu-model-rfc

TESTS:
- with a few IDREG fields that can be easily examined from guest
  userspace:
  -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0,SYSREG_ID_AA64ISAR1_EL1_DPB=0x0
- migration between custom models
- TCG A57 non-regression. Only light testing for TCG though. Deep
  review may detect some mistakes made when migrating from named fields
  to IdRegMap storage
- light testing of introspection. Testing a given writable ID field
  value with query-cpu-model-expansion is not supported yet.
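
One way to examine the first test from guest userspace is to look at the
"Features" line of /proc/cpuinfo. The sketch below assumes the usual
Linux arm64 hwcap names, namely that ID_AA64ISAR0_EL1.DP is reported as
"asimddp" and ID_AA64ISAR1_EL1.DPB as "dcpop"; that mapping is stated
here from memory of the kernel's hwcap documentation and is not part of
this series.

```python
def cpu_features(cpuinfo_text: str) -> set[str]:
    """Return the feature flags of the first CPU listed in cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            return set(value.split())
    return set()


def overrides_visible(cpuinfo_text: str) -> bool:
    """True if neither asimddp nor dcpop is advertised to the guest."""
    return not ({"asimddp", "dcpop"} & cpu_features(cpuinfo_text))


def check_guest(path: str = "/proc/cpuinfo") -> bool:
    """Run the check against the live guest."""
    with open(path, encoding="utf-8") as cpuinfo:
        return overrides_visible(cpuinfo.read())
```

With DP and DPB both set to 0x0 on the command line, check_guest() would
be expected to return True inside the guest.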

TODO/QUESTIONS:
- Some idreg named fields are not yet migrated to array storage.
  Some of them are not in the isar struct either. Maybe we could have
  handled TCG and KVM separately, and it may turn out that this
  conversion is unneeded. As it is quite cumbersome, I preferred
  to keep it for a later stage.
- The custom model does not come with the legacy host properties
  such as SVE and MTE, especially those that induce some KVM
  settings. This needs to be fixed.
- The custom model and its exposed properties depend on the host
  capabilities. More and more IDREG fields become writable, meaning that
  the custom model gains more properties over time and is
  host Linux dependent. At the moment there is no versioning in
  place. By default the custom model is a host passthrough model
  (besides the legacy functions). So if the end user tries to set
  a field that is not writable from the kernel's point of view, it will
  fail. Nevertheless a versioned custom model could constrain the props
  exposed, independently of the host Linux capabilities.
- The QEMU layer does not take care of IDREG field value consistency.
  Neither does the kernel. I imagine it could be the role of the upper
  layer to implement a vcpu profile that makes sure settings are
  consistent. Here we come to "named" models. What should they look
  like on ARM?
- Implementation details:
  - There seems to be a lot of duplication in the code. ID regs are
    described in different manners, with different data structs, for
    TCG and now for KVM.
  - IdRegMap->regs is sparsely populated. Maybe a better data
    struct could be used, although this is the one chosen for the kernel
    uapi.

References:

[1] [PATCH v12 00/11] Support writable CPU ID registers from userspace
https://lore.kernel.org/all/20230609190054.1542113-1-oliver.upton@linux.dev/

[2]
qemu-system-aarch64 -qmp unix:/home/augere/TEST/QEMU/qmp-sock,server,nowait -M virt --enable-kvm -cpu custom
scripts/qmp/qmp-shell /home/augere/TEST/QEMU/qmp-sock
Welcome to the QMP low-level shell!
Connected to QEMU 9.0.50
(QEMU) query-cpu-model-expansion type=full model={"name":"custom"}

[3]
KVM_CAP_ARM_SUPPORTED_REG_MASK_RANGES
KVM_ARM_GET_REG_WRITABLE_MASKS
Documentation/virt/kvm/api.rst

[4] linux "sysreg" file
linux/arch/arm64/tools/sysreg and gen-sysreg.awk
./tools/include/generated/asm/sysreg-defs.h


Cornelia Huck (4):
  kvm: kvm_get_writable_id_regs
  virt: Allow custom vcpu model in arm virt
  arm-qmp-cmds: introspection for custom model
  arm/cpu-features: Document custom vcpu model

Eric Auger (17):
  arm/cpu: Add sysreg definitions in cpu-sysegs.h
  arm/cpu: Store aa64isar0 into the idregs arrays
  arm/cpu: Store aa64isar1/2 into the idregs array
  arm/cpu: Store aa64drf0/1 into the idregs array
  arm/cpu: Store aa64mmfr0-3 into the idregs array
  arm/cpu: Store aa64drf0/1 into the idregs array
  arm/cpu: Store aa64smfr0 into the idregs array
  arm/cpu: Store id_isar0-7 into the idregs array
  arm/cpu: Store id_mfr0/1 into the idregs array
  arm/cpu: Store id_dfr0/1 into the idregs array
  arm/cpu: Store id_mmfr0-5 into the idregs array
  arm/cpu: Add infra to handle generated ID register definitions
  arm/cpu: Add sysreg generation scripts
  arm/cpu: Add generated files
  arm/kvm: Allow reading all the writable ID registers
  arm/kvm: write back modified ID regs to KVM
  arm/cpu: Introduce a customizable kvm host cpu model

 docs/system/arm/cpu-features.rst      |  55 ++-
 target/arm/cpu-custom.h               |  58 +++
 target/arm/cpu-features.h             | 307 ++++++------
 target/arm/cpu-sysregs.h              | 152 ++++++
 target/arm/cpu.h                      | 120 +++--
 target/arm/internals.h                |   6 +-
 target/arm/kvm_arm.h                  |  16 +-
 hw/arm/virt.c                         |   3 +
 hw/intc/armv7m_nvic.c                 |  27 +-
 target/arm/arm-qmp-cmds.c             |  56 ++-
 target/arm/cpu-sysreg-properties.c    | 682 ++++++++++++++++++++++++++
 target/arm/cpu.c                      | 124 +++--
 target/arm/cpu64.c                    | 265 +++++++---
 target/arm/helper.c                   |  68 +--
 target/arm/kvm.c                      | 253 +++++++---
 target/arm/ptw.c                      |   6 +-
 target/arm/tcg/cpu-v7m.c              | 174 +++----
 target/arm/tcg/cpu32.c                | 320 ++++++------
 target/arm/tcg/cpu64.c                | 460 ++++++++---------
 scripts/gen-cpu-sysreg-properties.awk | 325 ++++++++++++
 scripts/gen-cpu-sysregs-header.awk    |  47 ++
 scripts/update-aarch64-sysreg-code.sh |  27 +
 target/arm/meson.build                |   1 +
 target/arm/trace-events               |   8 +
 24 files changed, 2646 insertions(+), 914 deletions(-)
 create mode 100644 target/arm/cpu-custom.h
 create mode 100644 target/arm/cpu-sysregs.h
 create mode 100644 target/arm/cpu-sysreg-properties.c
 create mode 100755 scripts/gen-cpu-sysreg-properties.awk
 create mode 100755 scripts/gen-cpu-sysregs-header.awk
 create mode 100755 scripts/update-aarch64-sysreg-code.sh

Comments

Cornelia Huck Oct. 25, 2024, 12:49 p.m. UTC | #1
On Fri, Oct 25 2024, Eric Auger <eric.auger@redhat.com> wrote:

> TODO/QUESTIONS:
> - Some idreg named fields are not yet migrated to array storage.
>   Some of them are not in the isar struct either. Maybe we could have
>   handled TCG and KVM separately, and it may turn out that this
>   conversion is unneeded. As it is quite cumbersome, I preferred
>   to keep it for a later stage.
> - The custom model does not come with the legacy host properties
>   such as SVE and MTE, especially those that induce some KVM
>   settings. This needs to be fixed.

One idea would be to convert these to sugar and configure the relevant
features instead. The accelerator-specific properties would need to
either stay or be deprecated and later removed, I think; I'd hope there
are as few as possible of those. The ones requiring to actively do
something (e.g. enable MTE with KVM) are the fun ones.

> - The custom model and its exposed properties depend on the host
>   capabilities. More and more IDREG fields become writable, meaning that
>   the custom model gains more properties over time and is
>   host Linux dependent. At the moment there is no versioning in
>   place. By default the custom model is a host passthrough model
>   (besides the legacy functions). So if the end user tries to set
>   a field that is not writable from the kernel's point of view, it will
>   fail. Nevertheless a versioned custom model could constrain the props
>   exposed, independently of the host Linux capabilities.

Alternatively (or additionally?), provide a way to discover the set of
writable id regs, e.g. via a qmp command that returns what QEMU got from
KVM_ARM_GET_REG_WRITABLE_MASKS.

> - The QEMU layer does not take care of IDREG field value consistency.
>   Neither does the kernel. I imagine it could be the role of the upper
>   layer to implement a vcpu profile that makes sure settings are
>   consistent. Here we come to "named" models. What should they look
>   like on ARM?

I would go with Armv8.6, Armv9.2, etc. that include the mandatory
features, and additionally the features that can go on top (base model,
can be extended; this is similar to what we already have e.g. for s390
z<whatever> models, but allows for the complication that we have many
vendors doing many different designs.) Not sure if we'd also want models
for specific cpus, but that could go on top, I guess.

> - Implementation details:
>   - There seems to be a lot of duplication in the code. ID regs are
>     described in different manners, with different data structs, for
>     TCG and now for KVM.

Base data struct for all accelerators and used for accelerator-agnostic
code, accelerator-specific structures on top (so that we can add new
accelerators, if wanted?) Once the cpu is up and running, we're not
manipulating the regs anyway.

>   - IdRegMap->regs is sparsely populated. Maybe a better data
>     struct could be used, although this is the one chosen for the kernel
>     uapi.
Kashyap Chamarthy Oct. 25, 2024, 2:51 p.m. UTC | #2
On Fri, Oct 25, 2024 at 12:17:19PM +0200, Eric Auger wrote:

Hi Eric,

I'm new to Arm, so please bear with my questions :)

> This RFC series introduces a KVM host "custom" model.

(a) On terminology: as we know, in the x86 world, QEMU uses these
    terms[1]:

    - Host passthrough
    - Named CPU models
    - Then there's the libvirt abstraction, "host-model", that aims to
      provide the best of 'host-passthrough' + named CPU models.

    Now I see the term "host 'custom' model" here.  Most
    management-layer tools and libvirt users are familiar with the
    classic terms "host-model" or "custom".  If we now say "host
    'custom' model", it can create confusion.  I hope we can settle on
    one of the existing terms, or create a new term if need be.

    (I'll share one more thought on how layers above libvirt tend to use
    the term "custom", as a reply to patch 21/21, "arm/cpu-features:
    Document custom vcpu model".)

(b) The current CPU features doc[2] for Arm doesn't mention "host
    passthrough" at all.  It is only implied by the last part of this
    paragraph, from the section titled "A note about CPU models and
    KVM"[3]:

      "Named CPU models generally do not work with KVM. There are a few
      cases that do work [...] but mostly if KVM is enabled the 'host'
      CPU type must be used."

    Related: in your reply[4] to Dan in this series, you write: "Having
    named models is the next thing".  So named CPU models will be a
    thing in Arm, too?  Then the above statement in the Arm
    'cpu-features' will need updating :-)

[...]

> - The QEMU layer does not take care of IDREG field value consistency.
>   Neither does the kernel. I imagine it could be the role of the upper
>   layer to implement a vcpu profile that makes sure settings are
>   consistent. Here we come to "named" models. What should they look
>   like on ARM?

Are there reasons why they can't be similar to how x86 reports in
`qemu-system-x86 -cpu help`?  

E.g. If it's an NVIDIA "Grace A02" (Neoverse-V2) host, it can report:

    [gracehopper] $> qemu-kvm -cpu help
    Available CPUs:
      gracehopper-neoverse-v2
      cortex-a57 (deprecated)
      host
      max

Or whatever is the preferred nomenclature for ARM.  It also gives users
of both x86 and ARM deployments a consistent expectation.  

Currently on a "Grace A02" ("Neoverse-V2") machine, it reports:

    [gracehopper] $> qemu-kvm -cpu help
    Available CPUs:
      cortex-a57 (deprecated)
      host
      max

I see it's because there are no named models yet on ARM :-)

[...]

[1] https://www.qemu.org/docs/master/system/i386/cpu.html
[2] https://www.qemu.org/docs/master/system/arm/cpu-features.html
[3] https://www.qemu.org/docs/master/system/arm/cpu-features.html#a-note-about-cpu-models-and-kvm
[4] https://lists.nongnu.org/archive/html/qemu-arm/2024-10/msg00891.html