diff mbox series

[v2] hw/misc/vmfwupdate: Introduce hypervisor fw-cfg interface support

Message ID 20241216114841.1025070-1-anisinha@redhat.com (mailing list archive)
State New
Headers show
Series [v2] hw/misc/vmfwupdate: Introduce hypervisor fw-cfg interface support | expand

Commit Message

Ani Sinha Dec. 16, 2024, 11:48 a.m. UTC
VM firmware update is a mechanism where the virtual machines can use their
preferred and trusted firmware image in their execution environment without
having to depend on a untrusted party to provide the firmware bundle. This is
particularly useful for confidential virtual machines that are deployed in the
cloud where the tenant and the cloud provider are two different entities. In
this scenario, virtual machines can bring their own trusted firmware image
bundled as a part of their filesystem (using UKIs for example[1]) and then use
this hypervisor interface to update to their trusted firmware image. This also
allows the guests to have a consistent measurements on the firmware image.

This change introduces basic support for the fw-cfg based hypervisor interface
and the corresponding device. The change also includes the
specification document for this interface. The interface is made generic
enough so that guests are free to use their own ABI to pass required
information between initial and trusted execution contexts (where they are
running their own trusted firmware image) without the hypervisor getting
involved in between. In subsequent patches, we will introduce other minimal
changes on the hypervisor that are required to make the mechanism work.

[1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
and https://github.com/systemd/systemd/pull/35281 for some discussions on
how we can bundle firmware image within an UKI.

CC: Alex Graf <graf@amazon.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Gerd Hoffman <kraxel@redhat.com>
CC: Igor Mammedov <imammedo@redhat.com>
CC: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
 MAINTAINERS                  |   9 ++
 docs/specs/index.rst         |   1 +
 docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
 hw/misc/meson.build          |   2 +
 hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
 include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
 6 files changed, 381 insertions(+)
 create mode 100644 docs/specs/vmfwupdate.rst
 create mode 100644 hw/misc/vmfwupdate.c
 create mode 100644 include/hw/misc/vmfwupdate.h

changelogs:
v2: do not allow changing bios region if advertized size is 0 (non-pc
platforms).

Comments

Philippe Mathieu-Daudé Dec. 16, 2024, 3:05 p.m. UTC | #1
Hi Ani,

On 16/12/24 12:48, Ani Sinha wrote:
> VM firmware update is a mechanism where the virtual machines can use their
> preferred and trusted firmware image in their execution environment without
> having to depend on a untrusted party to provide the firmware bundle. This is
> particularly useful for confidential virtual machines that are deployed in the
> cloud where the tenant and the cloud provider are two different entities. In
> this scenario, virtual machines can bring their own trusted firmware image
> bundled as a part of their filesystem (using UKIs for example[1]) and then use
> this hypervisor interface to update to their trusted firmware image. This also
> allows the guests to have a consistent measurements on the firmware image.
> 
> This change introduces basic support for the fw-cfg based hypervisor interface
> and the corresponding device. The change also includes the
> specification document for this interface. The interface is made generic
> enough so that guests are free to use their own ABI to pass required
> information between initial and trusted execution contexts (where they are
> running their own trusted firmware image) without the hypervisor getting
> involved in between. In subsequent patches, we will introduce other minimal
> changes on the hypervisor that are required to make the mechanism work.
> 
> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
> and https://github.com/systemd/systemd/pull/35281 for some discussions on
> how we can bundle firmware image within an UKI.
> 
> CC: Alex Graf <graf@amazon.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Gerd Hoffman <kraxel@redhat.com>
> CC: Igor Mammedov <imammedo@redhat.com>
> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>   MAINTAINERS                  |   9 ++
>   docs/specs/index.rst         |   1 +
>   docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>   hw/misc/meson.build          |   2 +
>   hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>   include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>   6 files changed, 381 insertions(+)
>   create mode 100644 docs/specs/vmfwupdate.rst
>   create mode 100644 hw/misc/vmfwupdate.c
>   create mode 100644 include/hw/misc/vmfwupdate.h

Can we have a test?

> diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> index d02d96e403..4c5bdb0de2 100644
> --- a/hw/misc/meson.build
> +++ b/hw/misc/meson.build
> @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
>   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
>   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
>   
> +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true: files('vmfwupdate.c'))
> +
>   system_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
>   
>   # HPPA devices
> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> new file mode 100644
> index 0000000000..1e29a610c0
> --- /dev/null
> +++ b/hw/misc/vmfwupdate.c
> @@ -0,0 +1,157 @@
> +/*
> + * Guest driven VM boot component update device
> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
> + *
> + * Copyright (C) 2024 Red Hat, Inc.
> + *
> + * Authors: Ani Sinha <anisinha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/module.h"
> +#include "sysemu/reset.h"
> +#include "hw/nvram/fw_cfg.h"
> +#include "hw/i386/pc.h"
> +#include "hw/qdev-properties.h"
> +#include "hw/misc/vmfwupdate.h"
> +#include "qemu/error-report.h"
> +
> +static void fw_update_reset(void *dev)
> +{
> +    /* a NOOP at present */
> +    return;
> +}
> +
> +
> +static uint64_t get_max_fw_size(void)
> +{
> +    Object *m_obj = qdev_get_machine();
> +    PCMachineState *pcms = PC_MACHINE(m_obj);
> +
> +    if (pcms) {
> +        return pcms->max_fw_size;
> +    } else {
> +        return 0;
> +    }
> +}
> +
> +static void fw_blob_write(void *dev, off_t offset, size_t len)
> +{
> +    VMFwUpdateState *s = VMFWUPDATE(dev);
> +
> +    /* for non-pc platform, we do not allow changing bios_size yet */
> +    if (!s->plat_bios_size) {
> +        return;
> +    }
> +
> +    /*
> +     * in order to change the bios size, appropriate capability
> +     * must be enabled
> +     */
> +    if (s->fw_blob.bios_size &&
> +        !(s->capability & VMFWUPDATE_CAP_BIOS_RESIZE)) {
> +        warn_report("vmfwupdate: VMFWUPDATE_CAP_BIOS_RESIZE not enabled");
> +        return;
> +    }
> +
> +    s->plat_bios_size = s->fw_blob.bios_size;
> +
> +    return;
> +}
> +
> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
> +{
> +    VMFwUpdateState *s = VMFWUPDATE(dev);
> +    FWCfgState *fw_cfg = fw_cfg_find();
> +
> +    /* multiple devices are not supported */
> +    if (!vmfwupdate_find()) {
> +        error_setg(errp, "at most one %s device is permitted",
> +                   TYPE_VMFWUPDATE);
> +        return;
> +    }
> +
> +    /* fw_cfg with DMA support is necessary to support this device */
> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
> +        error_setg(errp, "%s device requires fw_cfg",
> +                   TYPE_VMFWUPDATE);
> +        return;
> +    }
> +
> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
> +
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
> +                             NULL, NULL, s,
> +                             &s->opaque_blobs,
> +                             sizeof(s->opaque_blobs),
> +                             false);
> +
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
> +                             NULL, fw_blob_write, s,
> +                             &s->fw_blob,
> +                             sizeof(s->fw_blob),
> +                             false);
> +
> +    /*
> +     * Add global capability fw_cfg file. This will be used by the guest to
> +     * check capability of the hypervisor.
> +     */
> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
> +                    &s->capability, sizeof(s->capability));
> +
> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
> +    /* size of bios region for the platform - read only by the guest */
> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
> +    /*
> +     * add fw cfg control file to disable the hypervisor interface.
> +     */
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
> +                             NULL, NULL, s,
> +                             &s->disable,
> +                             sizeof(s->disable),
> +                             false);
> +    /*
> +     * This device requires to register a global reset because it is
> +     * not plugged to a bus (which, as its QOM parent, would reset it).
> +     */
> +    qemu_register_reset(fw_update_reset, dev);
> +}
> +
> +static Property vmfwupdate_properties[] = {
> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    /* we are not interested in migration - so no need to populate dc->vmsd */
> +    dc->desc = "VM firmware blob update device";
> +    dc->realize = vmfwupdate_realize;
> +    dc->hotpluggable = false;
> +    device_class_set_props(dc, vmfwupdate_properties);
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);

How is this device instantiated?

> +}
> +
> +static const TypeInfo vmfwupdate_device_info = {
> +    .name          = TYPE_VMFWUPDATE,
> +    .parent        = TYPE_DEVICE,
> +    .instance_size = sizeof(VMFwUpdateState),
> +    .class_init    = vmfwupdate_device_class_init,
> +};
> +
> +static void vmfwupdate_register_types(void)
> +{
> +    type_register_static(&vmfwupdate_device_info);

New models should use DEFINE_TYPES().

> +}
> +
> +type_init(vmfwupdate_register_types);
> diff --git a/include/hw/misc/vmfwupdate.h b/include/hw/misc/vmfwupdate.h
> new file mode 100644
> index 0000000000..e9229d807b
> --- /dev/null
> +++ b/include/hw/misc/vmfwupdate.h
> @@ -0,0 +1,103 @@
> +/*
> + * Guest driven VM boot component update device
> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
> + *
> + * Copyright (C) 2024 Red Hat, Inc.
> + *
> + * Authors: Ani Sinha <anisinha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +#ifndef VMFWUPDATE_H
> +#define VMFWUPDATE_H
> +
> +#include "hw/qdev-core.h"
> +#include "qom/object.h"
> +#include "qemu/units.h"
> +
> +#define TYPE_VMFWUPDATE "vmfwupdate"
> +
> +#define VMFWUPDCAPMSK  0xffff /* least significant 16 capability bits */
> +
> +#define VMFWUPDATE_CAP_EDKROM 0x08 /* bit 4 represents support for EDKROM */
> +#define VMFWUPDATE_CAP_BIOS_RESIZE 0x04 /* guests may resize bios region */
> +#define CAP_VMFWUPD_MASK 0x80
> +
> +#define VMFWUPDATE_OPAQUE_SIZE (1024 * MiB)
> +
> +/* fw_cfg file definitions */
> +#define FILE_VMFWUPDATE_OBLOB "etc/vmfwupdate/opaque-blob"
> +#define FILE_VMFWUPDATE_FWBLOB "etc/vmfwupdate/fw-blob"
> +#define FILE_VMFWUPDATE_CAP "etc/vmfwupdate/cap"
> +#define FILE_VMFWUPDATE_BIOS_SIZE "etc/vmfwupdate/bios-size"
> +#define FILE_VMFWUPDATE_CONTROL "etc/vmfwupdate/disable"
> +
> +/*
> + * Address and length of the guest provided firmware blob.
> + * The blob itself is passed using the guest shared memory to QEMU.
> + * This is then copied to the guest private memeory in the secure vm
> + * by the hypervisor.
> + */
> +typedef struct {
> +    uint32_t bios_size; /*
> +                         * this is used by the guest to update plat_bios_size
> +                         * when VMFWUPDATE_CAP_BIOS_RESIZE is set.
> +                         */
> +    uint64_t bios_paddr; /*
> +                          * starting gpa where the blob is in shared guest
> +                          * memory. Cleared upon system reset.
> +                          */
> +} VMFwUpdateFwBlob;
> +
> +typedef struct VMFwUpdateState {
> +    DeviceState parent_obj;
> +
> +    /*
> +     * capabilities - 64 bits.
> +     * Little endian format.
> +     */
> +    uint64_t capability;
> +
> +    /*
> +     * size of the bios region - architecture dependent.
> +     * Read-only by the guest unless VMFWUPDATE_CAP_BIOS_RESIZE
> +     * capability is set.
> +     */
> +    uint32_t plat_bios_size;
> +
> +    /*
> +     * disable - disables the interface when non-zero value is written to it.
> +     * Writing 0 to this file enables the interface.
> +     */
> +    uint8_t disable;
> +
> +    /*
> +     * The first stage boot uses this opaque blob to convey to the next stage
> +     * where the next stage components are loaded. The exact structure and
> +     * number of entries are unknown to the hypervisor and the hypervisor
> +     * does not touch this memory or do any validations.
> +     * The contents of this memory needs to be validated by the guest and
> +     * must be ABI compatible between the first and second stages.
> +     */
> +    unsigned char opaque_blobs[VMFWUPDATE_OPAQUE_SIZE];
> +
> +    /*
> +     * firmware blob addresses and sizes. These are moved to guest
> +     * private memory.
> +     */
> +    VMFwUpdateFwBlob fw_blob;
> +} VMFwUpdateState;
> +
> +OBJECT_DECLARE_SIMPLE_TYPE(VMFwUpdateState, VMFWUPDATE);
> +
> +/* returns NULL unless there is exactly one device */
> +static inline VMFwUpdateState *vmfwupdate_find(void)

What is the point of this helper? Why inline it in header?

> +{
> +    Object *o = object_resolve_path_type("", TYPE_VMFWUPDATE, NULL);
> +
> +    return o ? VMFWUPDATE(o) : NULL;
> +}
> +
> +#endif
Ani Sinha Dec. 17, 2024, 10:06 a.m. UTC | #2
> On 16 Dec 2024, at 8:35 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
> Hi Ani,
> 
> On 16/12/24 12:48, Ani Sinha wrote:
>> VM firmware update is a mechanism where the virtual machines can use their
>> preferred and trusted firmware image in their execution environment without
>> having to depend on a untrusted party to provide the firmware bundle. This is
>> particularly useful for confidential virtual machines that are deployed in the
>> cloud where the tenant and the cloud provider are two different entities. In
>> this scenario, virtual machines can bring their own trusted firmware image
>> bundled as a part of their filesystem (using UKIs for example[1]) and then use
>> this hypervisor interface to update to their trusted firmware image. This also
>> allows the guests to have a consistent measurements on the firmware image.
>> This change introduces basic support for the fw-cfg based hypervisor interface
>> and the corresponding device. The change also includes the
>> specification document for this interface. The interface is made generic
>> enough so that guests are free to use their own ABI to pass required
>> information between initial and trusted execution contexts (where they are
>> running their own trusted firmware image) without the hypervisor getting
>> involved in between. In subsequent patches, we will introduce other minimal
>> changes on the hypervisor that are required to make the mechanism work.
>> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
>> and https://github.com/systemd/systemd/pull/35281 for some discussions on
>> how we can bundle firmware image within an UKI.
>> CC: Alex Graf <graf@amazon.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Gerd Hoffman <kraxel@redhat.com>
>> CC: Igor Mammedov <imammedo@redhat.com>
>> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>>  MAINTAINERS                  |   9 ++
>>  docs/specs/index.rst         |   1 +
>>  docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>>  hw/misc/meson.build          |   2 +
>>  hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>>  include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>>  6 files changed, 381 insertions(+)
>>  create mode 100644 docs/specs/vmfwupdate.rst
>>  create mode 100644 hw/misc/vmfwupdate.c
>>  create mode 100644 include/hw/misc/vmfwupdate.h
> 
> Can we have a test?

This interface requires guest side support which is being worked on atm. So if you are thinking of a full integration test, it might take a while.
Ofcourse, I am open to ideas here.

> 
>> diff --git a/hw/misc/meson.build b/hw/misc/meson.build
>> index d02d96e403..4c5bdb0de2 100644
>> --- a/hw/misc/meson.build
>> +++ b/hw/misc/meson.build
>> @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
>>  specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
>>  specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
>>  +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true: files('vmfwupdate.c'))
>> +
>>  system_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
>>    # HPPA devices
>> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
>> new file mode 100644
>> index 0000000000..1e29a610c0
>> --- /dev/null
>> +++ b/hw/misc/vmfwupdate.c
>> @@ -0,0 +1,157 @@
>> +/*
>> + * Guest driven VM boot component update device
>> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
>> + *
>> + * Copyright (C) 2024 Red Hat, Inc.
>> + *
>> + * Authors: Ani Sinha <anisinha@redhat.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +
>> +#include "qemu/osdep.h"
>> +#include "qapi/error.h"
>> +#include "qemu/module.h"
>> +#include "sysemu/reset.h"
>> +#include "hw/nvram/fw_cfg.h"
>> +#include "hw/i386/pc.h"
>> +#include "hw/qdev-properties.h"
>> +#include "hw/misc/vmfwupdate.h"
>> +#include "qemu/error-report.h"
>> +
>> +static void fw_update_reset(void *dev)
>> +{
>> +    /* a NOOP at present */
>> +    return;
>> +}
>> +
>> +
>> +static uint64_t get_max_fw_size(void)
>> +{
>> +    Object *m_obj = qdev_get_machine();
>> +    PCMachineState *pcms = PC_MACHINE(m_obj);
>> +
>> +    if (pcms) {
>> +        return pcms->max_fw_size;
>> +    } else {
>> +        return 0;
>> +    }
>> +}
>> +
>> +static void fw_blob_write(void *dev, off_t offset, size_t len)
>> +{
>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>> +
>> +    /* for non-pc platform, we do not allow changing bios_size yet */
>> +    if (!s->plat_bios_size) {
>> +        return;
>> +    }
>> +
>> +    /*
>> +     * in order to change the bios size, appropriate capability
>> +     * must be enabled
>> +     */
>> +    if (s->fw_blob.bios_size &&
>> +        !(s->capability & VMFWUPDATE_CAP_BIOS_RESIZE)) {
>> +        warn_report("vmfwupdate: VMFWUPDATE_CAP_BIOS_RESIZE not enabled");
>> +        return;
>> +    }
>> +
>> +    s->plat_bios_size = s->fw_blob.bios_size;
>> +
>> +    return;
>> +}
>> +
>> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
>> +{
>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>> +    FWCfgState *fw_cfg = fw_cfg_find();
>> +
>> +    /* multiple devices are not supported */
>> +    if (!vmfwupdate_find()) {
>> +        error_setg(errp, "at most one %s device is permitted",
>> +                   TYPE_VMFWUPDATE);
>> +        return;
>> +    }
>> +
>> +    /* fw_cfg with DMA support is necessary to support this device */
>> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
>> +        error_setg(errp, "%s device requires fw_cfg",
>> +                   TYPE_VMFWUPDATE);
>> +        return;
>> +    }
>> +
>> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
>> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
>> +
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
>> +                             NULL, NULL, s,
>> +                             &s->opaque_blobs,
>> +                             sizeof(s->opaque_blobs),
>> +                             false);
>> +
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
>> +                             NULL, fw_blob_write, s,
>> +                             &s->fw_blob,
>> +                             sizeof(s->fw_blob),
>> +                             false);
>> +
>> +    /*
>> +     * Add global capability fw_cfg file. This will be used by the guest to
>> +     * check capability of the hypervisor.
>> +     */
>> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
>> +                    &s->capability, sizeof(s->capability));
>> +
>> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
>> +    /* size of bios region for the platform - read only by the guest */
>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
>> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
>> +    /*
>> +     * add fw cfg control file to disable the hypervisor interface.
>> +     */
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
>> +                             NULL, NULL, s,
>> +                             &s->disable,
>> +                             sizeof(s->disable),
>> +                             false);
>> +    /*
>> +     * This device requires to register a global reset because it is
>> +     * not plugged to a bus (which, as its QOM parent, would reset it).
>> +     */
>> +    qemu_register_reset(fw_update_reset, dev);
>> +}
>> +
>> +static Property vmfwupdate_properties[] = {
>> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>> +    dc->desc = "VM firmware blob update device";
>> +    dc->realize = vmfwupdate_realize;
>> +    dc->hotpluggable = false;
>> +    device_class_set_props(dc, vmfwupdate_properties);
>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> 
> How is this device instantiated?

Something like this:
$ ./qemu-system-x86_64 -device vmfwupdate
VNC server running on ::1:5900

And we can maybe add a basic test for this scenario:

$ ./qemu-system-x86_64 -device vmfwupdate -device vmfwupdate
qemu-system-x86_64: -device vmfwupdate: at most one vmfwupdate device is permitted

To exercise the fwcfg files, guest support is needed as I said above.

> 
>> +}
>> +
>> +static const TypeInfo vmfwupdate_device_info = {
>> +    .name          = TYPE_VMFWUPDATE,
>> +    .parent        = TYPE_DEVICE,
>> +    .instance_size = sizeof(VMFwUpdateState),
>> +    .class_init    = vmfwupdate_device_class_init,
>> +};
>> +
>> +static void vmfwupdate_register_types(void)
>> +{
>> +    type_register_static(&vmfwupdate_device_info);
> 
> New models should use DEFINE_TYPES().
> 

Will fix.

>> +}
>> +
>> +type_init(vmfwupdate_register_types);
>> diff --git a/include/hw/misc/vmfwupdate.h b/include/hw/misc/vmfwupdate.h
>> new file mode 100644
>> index 0000000000..e9229d807b
>> --- /dev/null
>> +++ b/include/hw/misc/vmfwupdate.h
>> @@ -0,0 +1,103 @@
>> +/*
>> + * Guest driven VM boot component update device
>> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
>> + *
>> + * Copyright (C) 2024 Red Hat, Inc.
>> + *
>> + * Authors: Ani Sinha <anisinha@redhat.com>
>> + *
>> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
>> + * See the COPYING file in the top-level directory.
>> + *
>> + */
>> +#ifndef VMFWUPDATE_H
>> +#define VMFWUPDATE_H
>> +
>> +#include "hw/qdev-core.h"
>> +#include "qom/object.h"
>> +#include "qemu/units.h"
>> +
>> +#define TYPE_VMFWUPDATE "vmfwupdate"
>> +
>> +#define VMFWUPDCAPMSK  0xffff /* least significant 16 capability bits */
>> +
>> +#define VMFWUPDATE_CAP_EDKROM 0x08 /* bit 4 represents support for EDKROM */
>> +#define VMFWUPDATE_CAP_BIOS_RESIZE 0x04 /* guests may resize bios region */
>> +#define CAP_VMFWUPD_MASK 0x80
>> +
>> +#define VMFWUPDATE_OPAQUE_SIZE (1024 * MiB)
>> +
>> +/* fw_cfg file definitions */
>> +#define FILE_VMFWUPDATE_OBLOB "etc/vmfwupdate/opaque-blob"
>> +#define FILE_VMFWUPDATE_FWBLOB "etc/vmfwupdate/fw-blob"
>> +#define FILE_VMFWUPDATE_CAP "etc/vmfwupdate/cap"
>> +#define FILE_VMFWUPDATE_BIOS_SIZE "etc/vmfwupdate/bios-size"
>> +#define FILE_VMFWUPDATE_CONTROL "etc/vmfwupdate/disable"
>> +
>> +/*
>> + * Address and length of the guest provided firmware blob.
>> + * The blob itself is passed using the guest shared memory to QEMU.
>> + * This is then copied to the guest private memeory in the secure vm
>> + * by the hypervisor.
>> + */
>> +typedef struct {
>> +    uint32_t bios_size; /*
>> +                         * this is used by the guest to update plat_bios_size
>> +                         * when VMFWUPDATE_CAP_BIOS_RESIZE is set.
>> +                         */
>> +    uint64_t bios_paddr; /*
>> +                          * starting gpa where the blob is in shared guest
>> +                          * memory. Cleared upon system reset.
>> +                          */
>> +} VMFwUpdateFwBlob;
>> +
>> +typedef struct VMFwUpdateState {
>> +    DeviceState parent_obj;
>> +
>> +    /*
>> +     * capabilities - 64 bits.
>> +     * Little endian format.
>> +     */
>> +    uint64_t capability;
>> +
>> +    /*
>> +     * size of the bios region - architecture dependent.
>> +     * Read-only by the guest unless VMFWUPDATE_CAP_BIOS_RESIZE
>> +     * capability is set.
>> +     */
>> +    uint32_t plat_bios_size;
>> +
>> +    /*
>> +     * disable - disables the interface when non-zero value is written to it.
>> +     * Writing 0 to this file enables the interface.
>> +     */
>> +    uint8_t disable;
>> +
>> +    /*
>> +     * The first stage boot uses this opaque blob to convey to the next stage
>> +     * where the next stage components are loaded. The exact structure and
>> +     * number of entries are unknown to the hypervisor and the hypervisor
>> +     * does not touch this memory or do any validations.
>> +     * The contents of this memory needs to be validated by the guest and
>> +     * must be ABI compatible between the first and second stages.
>> +     */
>> +    unsigned char opaque_blobs[VMFWUPDATE_OPAQUE_SIZE];
>> +
>> +    /*
>> +     * firmware blob addresses and sizes. These are moved to guest
>> +     * private memory.
>> +     */
>> +    VMFwUpdateFwBlob fw_blob;
>> +} VMFwUpdateState;
>> +
>> +OBJECT_DECLARE_SIMPLE_TYPE(VMFwUpdateState, VMFWUPDATE);
>> +
>> +/* returns NULL unless there is exactly one device */
>> +static inline VMFwUpdateState *vmfwupdate_find(void)
> 
> What is the point of this helper? Why inline it in header?

Ok will move this into vmfwupdate.c


> 
>> +{
>> +    Object *o = object_resolve_path_type("", TYPE_VMFWUPDATE, NULL);
>> +
>> +    return o ? VMFWUPDATE(o) : NULL;
>> +}
>> +
>> +#endif
Philippe Mathieu-Daudé Dec. 17, 2024, 10:41 a.m. UTC | #3
On 17/12/24 11:06, Ani Sinha wrote:
> 
> 
>> On 16 Dec 2024, at 8:35 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>
>> Hi Ani,
>>
>> On 16/12/24 12:48, Ani Sinha wrote:
>>> VM firmware update is a mechanism where the virtual machines can use their
>>> preferred and trusted firmware image in their execution environment without
>>> having to depend on a untrusted party to provide the firmware bundle. This is
>>> particularly useful for confidential virtual machines that are deployed in the
>>> cloud where the tenant and the cloud provider are two different entities. In
>>> this scenario, virtual machines can bring their own trusted firmware image
>>> bundled as a part of their filesystem (using UKIs for example[1]) and then use
>>> this hypervisor interface to update to their trusted firmware image. This also
>>> allows the guests to have a consistent measurements on the firmware image.
>>> This change introduces basic support for the fw-cfg based hypervisor interface
>>> and the corresponding device. The change also includes the
>>> specification document for this interface. The interface is made generic
>>> enough so that guests are free to use their own ABI to pass required
>>> information between initial and trusted execution contexts (where they are
>>> running their own trusted firmware image) without the hypervisor getting
>>> involved in between. In subsequent patches, we will introduce other minimal
>>> changes on the hypervisor that are required to make the mechanism work.
>>> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
>>> and https://github.com/systemd/systemd/pull/35281 for some discussions on
>>> how we can bundle firmware image within an UKI.
>>> CC: Alex Graf <graf@amazon.com>
>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>>> CC: Gerd Hoffman <kraxel@redhat.com>
>>> CC: Igor Mammedov <imammedo@redhat.com>
>>> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>>> ---
>>>   MAINTAINERS                  |   9 ++
>>>   docs/specs/index.rst         |   1 +
>>>   docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>>>   hw/misc/meson.build          |   2 +
>>>   hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>>>   include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>>>   6 files changed, 381 insertions(+)
>>>   create mode 100644 docs/specs/vmfwupdate.rst
>>>   create mode 100644 hw/misc/vmfwupdate.c
>>>   create mode 100644 include/hw/misc/vmfwupdate.h


>>> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
>>> +{
>>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>>> +    FWCfgState *fw_cfg = fw_cfg_find();
>>> +
>>> +    /* multiple devices are not supported */
>>> +    if (!vmfwupdate_find()) {
>>> +        error_setg(errp, "at most one %s device is permitted",
>>> +                   TYPE_VMFWUPDATE);
>>> +        return;
>>> +    }
>>> +
>>> +    /* fw_cfg with DMA support is necessary to support this device */
>>> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
>>> +        error_setg(errp, "%s device requires fw_cfg",
>>> +                   TYPE_VMFWUPDATE);
>>> +        return;
>>> +    }
>>> +
>>> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
>>> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
>>> +
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
>>> +                             NULL, NULL, s,
>>> +                             &s->opaque_blobs,
>>> +                             sizeof(s->opaque_blobs),
>>> +                             false);
>>> +
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
>>> +                             NULL, fw_blob_write, s,
>>> +                             &s->fw_blob,
>>> +                             sizeof(s->fw_blob),
>>> +                             false);
>>> +
>>> +    /*
>>> +     * Add global capability fw_cfg file. This will be used by the guest to
>>> +     * check capability of the hypervisor.
>>> +     */
>>> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
>>> +                    &s->capability, sizeof(s->capability));
>>> +
>>> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
>>> +    /* size of bios region for the platform - read only by the guest */
>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
>>> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
>>> +    /*
>>> +     * add fw cfg control file to disable the hypervisor interface.
>>> +     */
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
>>> +                             NULL, NULL, s,
>>> +                             &s->disable,
>>> +                             sizeof(s->disable),
>>> +                             false);
>>> +    /*
>>> +     * This device requires to register a global reset because it is
>>> +     * not plugged to a bus (which, as its QOM parent, would reset it).
>>> +     */
>>> +    qemu_register_reset(fw_update_reset, dev);
>>> +}
>>> +
>>> +static Property vmfwupdate_properties[] = {
>>> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
>>> +    DEFINE_PROP_END_OF_LIST(),
>>> +};
>>> +
>>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>>> +{
>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>> +
>>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>>> +    dc->desc = "VM firmware blob update device";
>>> +    dc->realize = vmfwupdate_realize;
>>> +    dc->hotpluggable = false;
>>> +    device_class_set_props(dc, vmfwupdate_properties);
>>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>
>> How is this device instantiated?
> 
> Something like this:
> $ ./qemu-system-x86_64 -device vmfwupdate
> VNC server running on ::1:5900

But this device is not marked as allowed to be created on the
command line with:

     dc->user_creatable = true;

Am I missing something?

> 
> And we can maybe add a basic test for this scenario:
> 
> $ ./qemu-system-x86_64 -device vmfwupdate -device vmfwupdate
> qemu-system-x86_64: -device vmfwupdate: at most one vmfwupdate device is permitted
> 
> To exercise the fwcfg files, guest support is needed as I said above.
Ani Sinha Dec. 17, 2024, 11:09 a.m. UTC | #4
> On 17 Dec 2024, at 4:11 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
> On 17/12/24 11:06, Ani Sinha wrote:
>>> On 16 Dec 2024, at 8:35 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>> 
>>> Hi Ani,
>>> 
>>> On 16/12/24 12:48, Ani Sinha wrote:
>>>> VM firmware update is a mechanism where the virtual machines can use their
>>>> preferred and trusted firmware image in their execution environment without
>>>> having to depend on a untrusted party to provide the firmware bundle. This is
>>>> particularly useful for confidential virtual machines that are deployed in the
>>>> cloud where the tenant and the cloud provider are two different entities. In
>>>> this scenario, virtual machines can bring their own trusted firmware image
>>>> bundled as a part of their filesystem (using UKIs for example[1]) and then use
>>>> this hypervisor interface to update to their trusted firmware image. This also
>>>> allows the guests to have a consistent measurements on the firmware image.
>>>> This change introduces basic support for the fw-cfg based hypervisor interface
>>>> and the corresponding device. The change also includes the
>>>> specification document for this interface. The interface is made generic
>>>> enough so that guests are free to use their own ABI to pass required
>>>> information between initial and trusted execution contexts (where they are
>>>> running their own trusted firmware image) without the hypervisor getting
>>>> involved in between. In subsequent patches, we will introduce other minimal
>>>> changes on the hypervisor that are required to make the mechanism work.
>>>> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
>>>> and https://github.com/systemd/systemd/pull/35281 for some discussions on
>>>> how we can bundle firmware image within an UKI.
>>>> CC: Alex Graf <graf@amazon.com>
>>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>>>> CC: Gerd Hoffman <kraxel@redhat.com>
>>>> CC: Igor Mammedov <imammedo@redhat.com>
>>>> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
>>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>>>> ---
>>>>  MAINTAINERS                  |   9 ++
>>>>  docs/specs/index.rst         |   1 +
>>>>  docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>>>>  hw/misc/meson.build          |   2 +
>>>>  hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>>>>  include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>>>>  6 files changed, 381 insertions(+)
>>>>  create mode 100644 docs/specs/vmfwupdate.rst
>>>>  create mode 100644 hw/misc/vmfwupdate.c
>>>>  create mode 100644 include/hw/misc/vmfwupdate.h
> 
> 
>>>> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
>>>> +{
>>>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>>>> +    FWCfgState *fw_cfg = fw_cfg_find();
>>>> +
>>>> +    /* multiple devices are not supported */
>>>> +    if (!vmfwupdate_find()) {
>>>> +        error_setg(errp, "at most one %s device is permitted",
>>>> +                   TYPE_VMFWUPDATE);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    /* fw_cfg with DMA support is necessary to support this device */
>>>> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
>>>> +        error_setg(errp, "%s device requires fw_cfg",
>>>> +                   TYPE_VMFWUPDATE);
>>>> +        return;
>>>> +    }
>>>> +
>>>> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
>>>> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
>>>> +
>>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
>>>> +                             NULL, NULL, s,
>>>> +                             &s->opaque_blobs,
>>>> +                             sizeof(s->opaque_blobs),
>>>> +                             false);
>>>> +
>>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
>>>> +                             NULL, fw_blob_write, s,
>>>> +                             &s->fw_blob,
>>>> +                             sizeof(s->fw_blob),
>>>> +                             false);
>>>> +
>>>> +    /*
>>>> +     * Add global capability fw_cfg file. This will be used by the guest to
>>>> +     * check capability of the hypervisor.
>>>> +     */
>>>> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
>>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
>>>> +                    &s->capability, sizeof(s->capability));
>>>> +
>>>> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
>>>> +    /* size of bios region for the platform - read only by the guest */
>>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
>>>> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
>>>> +    /*
>>>> +     * add fw cfg control file to disable the hypervisor interface.
>>>> +     */
>>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
>>>> +                             NULL, NULL, s,
>>>> +                             &s->disable,
>>>> +                             sizeof(s->disable),
>>>> +                             false);
>>>> +    /*
>>>> +     * This device requires to register a global reset because it is
>>>> +     * not plugged to a bus (which, as its QOM parent, would reset it).
>>>> +     */
>>>> +    qemu_register_reset(fw_update_reset, dev);
>>>> +}
>>>> +
>>>> +static Property vmfwupdate_properties[] = {
>>>> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
>>>> +    DEFINE_PROP_END_OF_LIST(),
>>>> +};
>>>> +
>>>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>>>> +{
>>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>>> +
>>>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>>>> +    dc->desc = "VM firmware blob update device";
>>>> +    dc->realize = vmfwupdate_realize;
>>>> +    dc->hotpluggable = false;
>>>> +    device_class_set_props(dc, vmfwupdate_properties);
>>>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>> 
>>> How is this device instantiated?
>> Something like this:
>> $ ./qemu-system-x86_64 -device vmfwupdate
>> VNC server running on ::1:5900
> 
> But this device is not marked as allowed to be created on the
> command line with:
> 
>    dc->user_creatable = true;
> 
> Am I missing something?

Isnt’ it true by default? See device_class_init(). Only when it’s a private device we need to set it explicitly to false.

Let me know if its me who is missing something :-)

> 
>> And we can maybe add a basic test for this scenario:
>> $ ./qemu-system-x86_64 -device vmfwupdate -device vmfwupdate
>> qemu-system-x86_64: -device vmfwupdate: at most one vmfwupdate device is permitted
>> To exercise the fwcfg files, guest support is needed as I said above.
Philippe Mathieu-Daudé Dec. 17, 2024, 1:45 p.m. UTC | #5
On 17/12/24 12:09, Ani Sinha wrote:
> 
> 
>> On 17 Dec 2024, at 4:11 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>
>> On 17/12/24 11:06, Ani Sinha wrote:
>>>> On 16 Dec 2024, at 8:35 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>>>>
>>>> Hi Ani,


>>>>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>>>>> +{
>>>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>>>> +
>>>>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>>>>> +    dc->desc = "VM firmware blob update device";
>>>>> +    dc->realize = vmfwupdate_realize;
>>>>> +    dc->hotpluggable = false;
>>>>> +    device_class_set_props(dc, vmfwupdate_properties);
>>>>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>>>
>>>> How is this device instantiated?
>>> Something like this:
>>> $ ./qemu-system-x86_64 -device vmfwupdate
>>> VNC server running on ::1:5900
>>
>> But this device is not marked as allowed to be created on the
>> command line with:
>>
>>     dc->user_creatable = true;
>>
>> Am I missing something?
> 
> Isnt’ it true by default? See device_class_init(). Only when it’s a private device we need to set it explicitly to false.
> 
> Let me know if its me who is missing something :-)

Indeed, you are correct! I forgot about that default.
Gerd Hoffmann Dec. 17, 2024, 3:21 p.m. UTC | #6
Hi,

> +Fw-cfg Files
> +************
> +
> +Guests drive vmfwupdate through special ``fw-cfg`` files that control its flow
> +followed by a standard system reset operation. When vmfwupdate is available,
> +it provides the following ``fw-cfg`` files:
> +
> +* ``vmfwupdate/cap`` (``u64``) - Read-only Little Endian encoded bitmap of additional
> +  capabilities the interface supports. List of available capabilities:
> +
> +     ``VMFWUPDATE_CAP_BIOS_RESIZE        0x0000000000000001``
> +
> +* ``vmfwupdate/bios-size`` (``u32``) - Little Endian encoded size of the BIOS region.
> +  Read-only by default. Optionally Read-write if ``vmfwupdate/cap`` contains
> +  ``VMFWUPDATE_CAP_BIOS_RESIZE``. On write, the BIOS region may resize. Guests are
> +  required to read the value after writing and compare it with the requested size
> +  to determine whether the resize was successful. Note, x86 BIOS regions always
> +  start at 4GiB - bios-size.
> +
> +* ``vmfwupdate/opaque`` (``1024 bytes``) - A 1KiB buffer that survives the BIOS replacement
> +  flow. Can be used by the guest to propagate guest physical addresses of payloads
> +  to its BIOS stage. It’s recommended to make the new BIOS clear this file on boot
> +  if it exists. Contents of this file are under control by the hypervisor. In an
> +  environment that considers the hypervisor outside of its trust boundary, guests
> +  are advised to validate its contents before consumption.
> +
> +* ``vmfwupdate/disable`` (``u8``) - Indicates whether the interface is disabled.
> +  Returns 0 for enabled, 1 for disabled. Writing any value disables it. Writing is
> +  only allowed if the value is 0. When the interface is disabled, the replace file
> +  is ignored on reset. This value resets to 0 on system reset.
> +
> +* ``vmfwupdate/bios-addr`` (``u64``) - A 64bit Little Endian encoded guest physical address
> +  at the beginning of the replacement BIOS region. The provided payload must reside
> +  in shared memory. 0 on system reset.

I'd suggest to make all integers u64 (little endian).

Any specific reason why vmfwupdate/opaque is 1024 bytes?
How about using 4096 (aka PAGE_SIZE) bytes?

> +On reset, the hypervisor evaluates whether ``vmfwupdate/disable`` is ``1``. If it is, it ignores
> +any other vmfwupdate values and performs a standard system reset.
> +
> +If ``vmfwupdate/disable`` is ``0``, the hypervisor checks if bios-addr is ``0``. If it is, it
> +performs a standard system reset.

What is 'standard system reset' in SEV-SNP mode?  I think qemu does not
support it right now and will simply poweroff the guest instead.  Will
that continue to be the case?  Or should qemu re-create the VM context
in that case, simliar to the firmware update case, except that qemu
would have to use the default firmware image then?

Will the vmfwupdate be supported without SEV-SNP too?  Might be useful
for development + testing.  If so the spec should clarify what happens
in that case, because the concept of "shared memory" does not exist
then.

take care,
  Gerd
Philippe Mathieu-Daudé Dec. 18, 2024, 5:43 p.m. UTC | #7
On 16/12/24 12:48, Ani Sinha wrote:
> VM firmware update is a mechanism where the virtual machines can use their
> preferred and trusted firmware image in their execution environment without
> having to depend on a untrusted party to provide the firmware bundle. This is
> particularly useful for confidential virtual machines that are deployed in the
> cloud where the tenant and the cloud provider are two different entities. In
> this scenario, virtual machines can bring their own trusted firmware image
> bundled as a part of their filesystem (using UKIs for example[1]) and then use
> this hypervisor interface to update to their trusted firmware image. This also
> allows the guests to have a consistent measurements on the firmware image.
> 
> This change introduces basic support for the fw-cfg based hypervisor interface
> and the corresponding device. The change also includes the
> specification document for this interface. The interface is made generic
> enough so that guests are free to use their own ABI to pass required
> information between initial and trusted execution contexts (where they are
> running their own trusted firmware image) without the hypervisor getting
> involved in between. In subsequent patches, we will introduce other minimal
> changes on the hypervisor that are required to make the mechanism work.
> 
> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
> and https://github.com/systemd/systemd/pull/35281 for some discussions on
> how we can bundle firmware image within an UKI.
> 
> CC: Alex Graf <graf@amazon.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Gerd Hoffman <kraxel@redhat.com>
> CC: Igor Mammedov <imammedo@redhat.com>
> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> ---
>   MAINTAINERS                  |   9 ++
>   docs/specs/index.rst         |   1 +
>   docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>   hw/misc/meson.build          |   2 +
>   hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>   include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>   6 files changed, 381 insertions(+)
>   create mode 100644 docs/specs/vmfwupdate.rst
>   create mode 100644 hw/misc/vmfwupdate.c
>   create mode 100644 include/hw/misc/vmfwupdate.h


> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
> +{
> +    VMFwUpdateState *s = VMFWUPDATE(dev);
> +    FWCfgState *fw_cfg = fw_cfg_find();
> +
> +    /* multiple devices are not supported */
> +    if (!vmfwupdate_find()) {
> +        error_setg(errp, "at most one %s device is permitted",
> +                   TYPE_VMFWUPDATE);
> +        return;
> +    }
> +
> +    /* fw_cfg with DMA support is necessary to support this device */
> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
> +        error_setg(errp, "%s device requires fw_cfg",
> +                   TYPE_VMFWUPDATE);
> +        return;
> +    }
> +
> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
> +
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
> +                             NULL, NULL, s,
> +                             &s->opaque_blobs,
> +                             sizeof(s->opaque_blobs),
> +                             false);
> +
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
> +                             NULL, fw_blob_write, s,
> +                             &s->fw_blob,
> +                             sizeof(s->fw_blob),
> +                             false);
> +
> +    /*
> +     * Add global capability fw_cfg file. This will be used by the guest to
> +     * check capability of the hypervisor.
> +     */
> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
> +                    &s->capability, sizeof(s->capability));
> +
> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
> +    /* size of bios region for the platform - read only by the guest */
> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
> +    /*
> +     * add fw cfg control file to disable the hypervisor interface.
> +     */
> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
> +                             NULL, NULL, s,
> +                             &s->disable,
> +                             sizeof(s->disable),
> +                             false);
> +    /*
> +     * This device requires to register a global reset because it is
> +     * not plugged to a bus (which, as its QOM parent, would reset it).
> +     */
> +    qemu_register_reset(fw_update_reset, dev);

Shouldn't we use qemu_register_resettable() instead?

> +}
> +
> +static Property vmfwupdate_properties[] = {
> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
> +    DEFINE_PROP_END_OF_LIST(),
> +};
> +
> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +
> +    /* we are not interested in migration - so no need to populate dc->vmsd */
> +    dc->desc = "VM firmware blob update device";
> +    dc->realize = vmfwupdate_realize;
> +    dc->hotpluggable = false;
> +    device_class_set_props(dc, vmfwupdate_properties);
> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> +}
> +
> +static const TypeInfo vmfwupdate_device_info = {
> +    .name          = TYPE_VMFWUPDATE,
> +    .parent        = TYPE_DEVICE,

What is the qdev API used here? Why not use a plain object?

> +    .instance_size = sizeof(VMFwUpdateState),
> +    .class_init    = vmfwupdate_device_class_init,
> +};
Ani Sinha Dec. 19, 2024, 9:35 a.m. UTC | #8
> On 18 Dec 2024, at 11:13 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
> 
> On 16/12/24 12:48, Ani Sinha wrote:
>> VM firmware update is a mechanism where the virtual machines can use their
>> preferred and trusted firmware image in their execution environment without
>> having to depend on a untrusted party to provide the firmware bundle. This is
>> particularly useful for confidential virtual machines that are deployed in the
>> cloud where the tenant and the cloud provider are two different entities. In
>> this scenario, virtual machines can bring their own trusted firmware image
>> bundled as a part of their filesystem (using UKIs for example[1]) and then use
>> this hypervisor interface to update to their trusted firmware image. This also
>> allows the guests to have a consistent measurements on the firmware image.
>> This change introduces basic support for the fw-cfg based hypervisor interface
>> and the corresponding device. The change also includes the
>> specification document for this interface. The interface is made generic
>> enough so that guests are free to use their own ABI to pass required
>> information between initial and trusted execution contexts (where they are
>> running their own trusted firmware image) without the hypervisor getting
>> involved in between. In subsequent patches, we will introduce other minimal
>> changes on the hypervisor that are required to make the mechanism work.
>> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
>> and https://github.com/systemd/systemd/pull/35281 for some discussions on
>> how we can bundle firmware image within an UKI.
>> CC: Alex Graf <graf@amazon.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Gerd Hoffman <kraxel@redhat.com>
>> CC: Igor Mammedov <imammedo@redhat.com>
>> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>> ---
>>  MAINTAINERS                  |   9 ++
>>  docs/specs/index.rst         |   1 +
>>  docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>>  hw/misc/meson.build          |   2 +
>>  hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>>  include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>>  6 files changed, 381 insertions(+)
>>  create mode 100644 docs/specs/vmfwupdate.rst
>>  create mode 100644 hw/misc/vmfwupdate.c
>>  create mode 100644 include/hw/misc/vmfwupdate.h
> 
> 
>> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
>> +{
>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>> +    FWCfgState *fw_cfg = fw_cfg_find();
>> +
>> +    /* multiple devices are not supported */
>> +    if (!vmfwupdate_find()) {
>> +        error_setg(errp, "at most one %s device is permitted",
>> +                   TYPE_VMFWUPDATE);
>> +        return;
>> +    }
>> +
>> +    /* fw_cfg with DMA support is necessary to support this device */
>> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
>> +        error_setg(errp, "%s device requires fw_cfg",
>> +                   TYPE_VMFWUPDATE);
>> +        return;
>> +    }
>> +
>> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
>> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
>> +
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
>> +                             NULL, NULL, s,
>> +                             &s->opaque_blobs,
>> +                             sizeof(s->opaque_blobs),
>> +                             false);
>> +
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
>> +                             NULL, fw_blob_write, s,
>> +                             &s->fw_blob,
>> +                             sizeof(s->fw_blob),
>> +                             false);
>> +
>> +    /*
>> +     * Add global capability fw_cfg file. This will be used by the guest to
>> +     * check capability of the hypervisor.
>> +     */
>> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
>> +                    &s->capability, sizeof(s->capability));
>> +
>> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
>> +    /* size of bios region for the platform - read only by the guest */
>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
>> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
>> +    /*
>> +     * add fw cfg control file to disable the hypervisor interface.
>> +     */
>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
>> +                             NULL, NULL, s,
>> +                             &s->disable,
>> +                             sizeof(s->disable),
>> +                             false);
>> +    /*
>> +     * This device requires to register a global reset because it is
>> +     * not plugged to a bus (which, as its QOM parent, would reset it).
>> +     */
>> +    qemu_register_reset(fw_update_reset, dev);
> 
> Shouldn't we use qemu_register_resettable() instead?

OK will do in v3.

> 
>> +}
>> +
>> +static Property vmfwupdate_properties[] = {
>> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
>> +    DEFINE_PROP_END_OF_LIST(),
>> +};
>> +
>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>> +{
>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>> +
>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>> +    dc->desc = "VM firmware blob update device";
>> +    dc->realize = vmfwupdate_realize;
>> +    dc->hotpluggable = false;
>> +    device_class_set_props(dc, vmfwupdate_properties);
>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>> +}
>> +
>> +static const TypeInfo vmfwupdate_device_info = {
>> +    .name          = TYPE_VMFWUPDATE,
>> +    .parent        = TYPE_DEVICE,
> 
> What is the qdev API used here? Why not use a plain object?

I wrote this taking vmcoreinfo device as starting point. I will leave this as is for now unless anyone has strong opinions.

> 
>> +    .instance_size = sizeof(VMFwUpdateState),
>> +    .class_init    = vmfwupdate_device_class_init,
>> +};
Ani Sinha Dec. 19, 2024, 9:37 a.m. UTC | #9
> On 17 Dec 2024, at 8:51 PM, Gerd Hoffman <kraxel@redhat.com> wrote:
> 
>  Hi,
> 
>> +Fw-cfg Files
>> +************
>> +
>> +Guests drive vmfwupdate through special ``fw-cfg`` files that control its flow
>> +followed by a standard system reset operation. When vmfwupdate is available,
>> +it provides the following ``fw-cfg`` files:
>> +
>> +* ``vmfwupdate/cap`` (``u64``) - Read-only Little Endian encoded bitmap of additional
>> +  capabilities the interface supports. List of available capabilities:
>> +
>> +     ``VMFWUPDATE_CAP_BIOS_RESIZE        0x0000000000000001``
>> +
>> +* ``vmfwupdate/bios-size`` (``u32``) - Little Endian encoded size of the BIOS region.
>> +  Read-only by default. Optionally Read-write if ``vmfwupdate/cap`` contains
>> +  ``VMFWUPDATE_CAP_BIOS_RESIZE``. On write, the BIOS region may resize. Guests are
>> +  required to read the value after writing and compare it with the requested size
>> +  to determine whether the resize was successful. Note, x86 BIOS regions always
>> +  start at 4GiB - bios-size.
>> +
>> +* ``vmfwupdate/opaque`` (``1024 bytes``) - A 1KiB buffer that survives the BIOS replacement
>> +  flow. Can be used by the guest to propagate guest physical addresses of payloads
>> +  to its BIOS stage. It’s recommended to make the new BIOS clear this file on boot
>> +  if it exists. Contents of this file are under control by the hypervisor. In an
>> +  environment that considers the hypervisor outside of its trust boundary, guests
>> +  are advised to validate its contents before consumption.
>> +
>> +* ``vmfwupdate/disable`` (``u8``) - Indicates whether the interface is disabled.
>> +  Returns 0 for enabled, 1 for disabled. Writing any value disables it. Writing is
>> +  only allowed if the value is 0. When the interface is disabled, the replace file
>> +  is ignored on reset. This value resets to 0 on system reset.
>> +
>> +* ``vmfwupdate/bios-addr`` (``u64``) - A 64bit Little Endian encoded guest physical address
>> +  at the beginning of the replacement BIOS region. The provided payload must reside
>> +  in shared memory. 0 on system reset.
> 
> I'd suggest to make all integers u64 (little endian).
> 
> Any specific reason why vmfwupdate/opaque is 1024 bytes?
> How about using 4096 (aka PAGE_SIZE) bytes?
> 
>> +On reset, the hypervisor evaluates whether ``vmfwupdate/disable`` is ``1``. If it is, it ignores
>> +any other vmfwupdate values and performs a standard system reset.
>> +
>> +If ``vmfwupdate/disable`` is ``0``, the hypervisor checks if bios-addr is ``0``. If it is, it
>> +performs a standard system reset.
> 
> What is 'standard system reset' in SEV-SNP mode?  I think qemu does not
> support it right now and will simply poweroff the guest instead.  Will
> that continue to be the case?  

No. Work is in progress to support VM resets even in SEV-SNP.

> Or should qemu re-create the VM context
> in that case, simliar to the firmware update case,
> except that qemu
> would have to use the default firmware image then?

Fw update works both in coco and non-coco cases. Will update the doc in v3.

> 
> Will the vmfwupdate be supported without SEV-SNP too?  Might be useful
> for development + testing.  If so the spec should clarify what happens
> in that case, because the concept of "shared memory" does not exist
> then.

Will update the doc in v3.
Philippe Mathieu-Daudé Dec. 19, 2024, 10:03 a.m. UTC | #10
On 19/12/24 10:35, Ani Sinha wrote:
>> On 18 Dec 2024, at 11:13 PM, Philippe Mathieu-Daudé <philmd@linaro.org> wrote:
>> On 16/12/24 12:48, Ani Sinha wrote:
>>> VM firmware update is a mechanism where the virtual machines can use their
>>> preferred and trusted firmware image in their execution environment without
>>> having to depend on a untrusted party to provide the firmware bundle. This is
>>> particularly useful for confidential virtual machines that are deployed in the
>>> cloud where the tenant and the cloud provider are two different entities. In
>>> this scenario, virtual machines can bring their own trusted firmware image
>>> bundled as a part of their filesystem (using UKIs for example[1]) and then use
>>> this hypervisor interface to update to their trusted firmware image. This also
>>> allows the guests to have a consistent measurements on the firmware image.
>>> This change introduces basic support for the fw-cfg based hypervisor interface
>>> and the corresponding device. The change also includes the
>>> specification document for this interface. The interface is made generic
>>> enough so that guests are free to use their own ABI to pass required
>>> information between initial and trusted execution contexts (where they are
>>> running their own trusted firmware image) without the hypervisor getting
>>> involved in between. In subsequent patches, we will introduce other minimal
>>> changes on the hypervisor that are required to make the mechanism work.
>>> [1] See systemd pull requests https://github.com/systemd/systemd/pull/35091
>>> and https://github.com/systemd/systemd/pull/35281 for some discussions on
>>> how we can bundle firmware image within an UKI.
>>> CC: Alex Graf <graf@amazon.com>
>>> CC: Paolo Bonzini <pbonzini@redhat.com>
>>> CC: Gerd Hoffman <kraxel@redhat.com>
>>> CC: Igor Mammedov <imammedo@redhat.com>
>>> CC: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>>> ---
>>>   MAINTAINERS                  |   9 ++
>>>   docs/specs/index.rst         |   1 +
>>>   docs/specs/vmfwupdate.rst    | 109 ++++++++++++++++++++++++
>>>   hw/misc/meson.build          |   2 +
>>>   hw/misc/vmfwupdate.c         | 157 +++++++++++++++++++++++++++++++++++
>>>   include/hw/misc/vmfwupdate.h | 103 +++++++++++++++++++++++
>>>   6 files changed, 381 insertions(+)
>>>   create mode 100644 docs/specs/vmfwupdate.rst
>>>   create mode 100644 hw/misc/vmfwupdate.c
>>>   create mode 100644 include/hw/misc/vmfwupdate.h
>>
>>
>>> +static void vmfwupdate_realize(DeviceState *dev, Error **errp)
>>> +{
>>> +    VMFwUpdateState *s = VMFWUPDATE(dev);
>>> +    FWCfgState *fw_cfg = fw_cfg_find();
>>> +
>>> +    /* multiple devices are not supported */
>>> +    if (!vmfwupdate_find()) {
>>> +        error_setg(errp, "at most one %s device is permitted",
>>> +                   TYPE_VMFWUPDATE);
>>> +        return;
>>> +    }
>>> +
>>> +    /* fw_cfg with DMA support is necessary to support this device */
>>> +    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
>>> +        error_setg(errp, "%s device requires fw_cfg",
>>> +                   TYPE_VMFWUPDATE);
>>> +        return;
>>> +    }
>>> +
>>> +    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
>>> +    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
>>> +
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
>>> +                             NULL, NULL, s,
>>> +                             &s->opaque_blobs,
>>> +                             sizeof(s->opaque_blobs),
>>> +                             false);
>>> +
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
>>> +                             NULL, fw_blob_write, s,
>>> +                             &s->fw_blob,
>>> +                             sizeof(s->fw_blob),
>>> +                             false);
>>> +
>>> +    /*
>>> +     * Add global capability fw_cfg file. This will be used by the guest to
>>> +     * check capability of the hypervisor.
>>> +     */
>>> +    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
>>> +                    &s->capability, sizeof(s->capability));
>>> +
>>> +    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
>>> +    /* size of bios region for the platform - read only by the guest */
>>> +    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
>>> +                    &s->plat_bios_size, sizeof(s->plat_bios_size));
>>> +    /*
>>> +     * add fw cfg control file to disable the hypervisor interface.
>>> +     */
>>> +    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
>>> +                             NULL, NULL, s,
>>> +                             &s->disable,
>>> +                             sizeof(s->disable),
>>> +                             false);
>>> +    /*
>>> +     * This device requires to register a global reset because it is
>>> +     * not plugged to a bus (which, as its QOM parent, would reset it).
>>> +     */
>>> +    qemu_register_reset(fw_update_reset, dev);
>>
>> Shouldn't we use qemu_register_resettable() instead?
> 
> OK will do in v3.
> 
>>
>>> +}
>>> +
>>> +static Property vmfwupdate_properties[] = {
>>> +    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
>>> +    DEFINE_PROP_END_OF_LIST(),
>>> +};
>>> +
>>> +static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
>>> +{
>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>> +
>>> +    /* we are not interested in migration - so no need to populate dc->vmsd */
>>> +    dc->desc = "VM firmware blob update device";
>>> +    dc->realize = vmfwupdate_realize;
>>> +    dc->hotpluggable = false;
>>> +    device_class_set_props(dc, vmfwupdate_properties);
>>> +    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>>> +}
>>> +
>>> +static const TypeInfo vmfwupdate_device_info = {
>>> +    .name          = TYPE_VMFWUPDATE,
>>> +    .parent        = TYPE_DEVICE,
>>
>> What is the qdev API used here? Why not use a plain object?
> 
> I wrote this taking vmcoreinfo device as starting point. I will leave this as is for now unless anyone has strong opinions.

We shouldn't blindly copy/paste & spread possible design mistakes.

Marc-André, any particular reason to implement vmcoreinfo using qdev
and not plain object?

> 
>>
>>> +    .instance_size = sizeof(VMFwUpdateState),
>>> +    .class_init    = vmfwupdate_device_class_init,
>>> +};
> 
>
Marc-André Lureau Dec. 19, 2024, 12:55 p.m. UTC | #11
Hi

On Thu, Dec 19, 2024 at 2:03 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
> >>> +static const TypeInfo vmfwupdate_device_info = {
> >>> +    .name          = TYPE_VMFWUPDATE,
> >>> +    .parent        = TYPE_DEVICE,
> >>
> >> What is the qdev API used here? Why not use a plain object?
> >
> > I wrote this taking vmcoreinfo device as starting point. I will leave this as is for now unless anyone has strong opinions.
>
> We shouldn't blindly copy/paste & spread possible design mistakes.
>
> Marc-André, any particular reason to implement vmcoreinfo using qdev
> and not plain object?
>

I don't remember (damn 8y ago..). It seems the design changed over
time during review, qdev might have been necessary and stayed this
way.
Ani Sinha Dec. 19, 2024, 2:07 p.m. UTC | #12
On Thu, Dec 19, 2024 at 6:25 PM Marc-André Lureau
<marcandre.lureau@redhat.com> wrote:
>
> Hi
>
> On Thu, Dec 19, 2024 at 2:03 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
> > >>> +static const TypeInfo vmfwupdate_device_info = {
> > >>> +    .name          = TYPE_VMFWUPDATE,
> > >>> +    .parent        = TYPE_DEVICE,
> > >>
> > >> What is the qdev API used here? Why not use a plain object?
> > >
> > > I wrote this taking vmcoreinfo device as starting point. I will leave this as is for now unless anyone has strong opinions.
> >
> > We shouldn't blindly copy/paste & spread possible design mistakes.
> >
> > Marc-André, any particular reason to implement vmcoreinfo using qdev
> > and not plain object?
> >
>
> I don't remember (damn 8y ago..). It seems the design changed over
> time during review, qdev might have been necessary and stayed this
> way.

I changed it to TYPE_OBJECT and we get a crash here:

#3  0x0000aaaaab207a48 [PAC] in object_class_dynamic_cast_assert
    (class=0xaaaaac608880, typename=typename@entry=0xaaaaab4b9630
"device", file=file@entry=0xaaaaab4300d0
"/workspace/qemu-ani/include/hw/qdev-core.h", line=line@entry=77,
func=func@entry=0xaaaaab595a90 <__func__.0> "DEVICE_CLASS") at
../qom/object.c:1021
#4  0x0000aaaaaaec2d74 in DEVICE_CLASS (klass=<optimized out>) at
/workspace/qemu-ani/include/hw/qdev-core.h:77
#5  vmcoreinfo_device_class_init (klass=<optimized out>,
data=<optimized out>) at ../hw/misc/vmcoreinfo.c:88

Basically doing this would be illegal for vmcoreinfo and we need to
adjust the code :

   DeviceClass *dc = DEVICE_CLASS(klass);

    dc->vmsd = &vmstate_vmcoreinfo;
    dc->realize = vmcoreinfo_realize;
    dc->hotpluggable = false;
    set_bit(DEVICE_CATEGORY_MISC, dc->categories);

Anyway, for vmfwupdate, it is actually like a device with device properties:

+    device_class_set_props(dc, vmfwupdate_properties);

So I prefer to make it qdev type for now.
Philippe Mathieu-Daudé Dec. 19, 2024, 3:46 p.m. UTC | #13
On 19/12/24 15:07, Ani Sinha wrote:
> On Thu, Dec 19, 2024 at 6:25 PM Marc-André Lureau
> <marcandre.lureau@redhat.com> wrote:
>>
>> Hi
>>
>> On Thu, Dec 19, 2024 at 2:03 PM Philippe Mathieu-Daudé
>> <philmd@linaro.org> wrote:
>>>>>> +static const TypeInfo vmfwupdate_device_info = {
>>>>>> +    .name          = TYPE_VMFWUPDATE,
>>>>>> +    .parent        = TYPE_DEVICE,
>>>>>
>>>>> What is the qdev API used here? Why not use a plain object?
>>>>
>>>> I wrote this taking vmcoreinfo device as starting point. I will leave this as is for now unless anyone has strong opinions.
>>>
>>> We shouldn't blindly copy/paste & spread possible design mistakes.
>>>
>>> Marc-André, any particular reason to implement vmcoreinfo using qdev
>>> and not plain object?
>>>
>>
>> I don't remember (damn 8y ago..). It seems the design changed over
>> time during review, qdev might have been necessary and stayed this
>> way.
> 
> I changed it to TYPE_OBJECT and we get a crash here:
> 
> #3  0x0000aaaaab207a48 [PAC] in object_class_dynamic_cast_assert
>      (class=0xaaaaac608880, typename=typename@entry=0xaaaaab4b9630
> "device", file=file@entry=0xaaaaab4300d0
> "/workspace/qemu-ani/include/hw/qdev-core.h", line=line@entry=77,
> func=func@entry=0xaaaaab595a90 <__func__.0> "DEVICE_CLASS") at
> ../qom/object.c:1021
> #4  0x0000aaaaaaec2d74 in DEVICE_CLASS (klass=<optimized out>) at
> /workspace/qemu-ani/include/hw/qdev-core.h:77
> #5  vmcoreinfo_device_class_init (klass=<optimized out>,
> data=<optimized out>) at ../hw/misc/vmcoreinfo.c:88

I believe you have enough knowledge to understand the concepts you
are mixing here. You can not change a type signature without
implementing its interface (which as you noticed, for QEMU is checked
at runtime).

> Basically doing this would be illegal for vmcoreinfo and we need to
> adjust the code :
> 
>     DeviceClass *dc = DEVICE_CLASS(klass);
> 
>      dc->vmsd = &vmstate_vmcoreinfo;
>      dc->realize = vmcoreinfo_realize;
>      dc->hotpluggable = false;
>      set_bit(DEVICE_CATEGORY_MISC, dc->categories);

See the conversion:
https://lore.kernel.org/qemu-devel/20241219153857.57450-1-philmd@linaro.org/

> Anyway, for vmfwupdate, it is actually like a device with device properties:
> 
> +    device_class_set_props(dc, vmfwupdate_properties);
> 
> So I prefer to make it qdev type for now.

We have the opportunity to start with the correct model.
Consider simplifying our future (see what is required in
the suggested vmcoreinfo conversion). Except if you insist
and commit to do the vmfwupdate later.
Philippe Mathieu-Daudé Dec. 19, 2024, 3:52 p.m. UTC | #14
On 16/12/24 12:48, Ani Sinha wrote:

> diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> index d02d96e403..4c5bdb0de2 100644
> --- a/hw/misc/meson.build
> +++ b/hw/misc/meson.build
> @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
>   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
>   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
>   
> +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true: files('vmfwupdate.c'))

FW_CFG_DMA is offered by multiple targets ...:

$ git grep -w FW_CFG_DMA
hw/arm/Kconfig:19:    select FW_CFG_DMA
hw/i386/Kconfig:82:    select FW_CFG_DMA
hw/i386/Kconfig:113:    select FW_CFG_DMA
hw/loongarch/Kconfig:22:    select FW_CFG_DMA
hw/riscv/Kconfig:59:    select FW_CFG_DMA

> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> new file mode 100644
> index 0000000000..1e29a610c0
> --- /dev/null
> +++ b/hw/misc/vmfwupdate.c
> @@ -0,0 +1,157 @@
> +/*
> + * Guest driven VM boot component update device
> + * For details and specification, please look at docs/specs/vmfwupdate.rst.
> + *
> + * Copyright (C) 2024 Red Hat, Inc.
> + *
> + * Authors: Ani Sinha <anisinha@redhat.com>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "qemu/module.h"
> +#include "sysemu/reset.h"
> +#include "hw/nvram/fw_cfg.h"
> +#include "hw/i386/pc.h"

... however ...

> +#include "hw/qdev-properties.h"
> +#include "hw/misc/vmfwupdate.h"
> +#include "qemu/error-report.h"
> +
> +static void fw_update_reset(void *dev)
> +{
> +    /* a NOOP at present */
> +    return;
> +}
> +
> +
> +static uint64_t get_max_fw_size(void)
> +{
> +    Object *m_obj = qdev_get_machine();
> +    PCMachineState *pcms = PC_MACHINE(m_obj);
> +
> +    if (pcms) {
> +        return pcms->max_fw_size;

... this code depends on x86/PC.

Could it be wiser to add a new VM_FWUPDATE Kconfig
symbol, having it depending on FW_CFG_DMA && I386?

> +    } else {
> +        return 0;
> +    }
> +}
Ani Sinha Dec. 19, 2024, 4:11 p.m. UTC | #15
On Thu, 19 Dec, 2024, 9:16 pm Philippe Mathieu-Daudé, <philmd@linaro.org>
wrote:

> On 19/12/24 15:07, Ani Sinha wrote:
> > On Thu, Dec 19, 2024 at 6:25 PM Marc-André Lureau
> > <marcandre.lureau@redhat.com> wrote:
> >>
> >> Hi
> >>
> >> On Thu, Dec 19, 2024 at 2:03 PM Philippe Mathieu-Daudé
> >> <philmd@linaro.org> wrote:
> >>>>>> +static const TypeInfo vmfwupdate_device_info = {
> >>>>>> +    .name          = TYPE_VMFWUPDATE,
> >>>>>> +    .parent        = TYPE_DEVICE,
> >>>>>
> >>>>> What is the qdev API used here? Why not use a plain object?
> >>>>
> >>>> I wrote this taking vmcoreinfo device as starting point. I will leave
> this as is for now unless anyone has strong opinions.
> >>>
> >>> We shouldn't blindly copy/paste & spread possible design mistakes.
> >>>
> >>> Marc-André, any particular reason to implement vmcoreinfo using qdev
> >>> and not plain object?
> >>>
> >>
> >> I don't remember (damn 8y ago..). It seems the design changed over
> >> time during review, qdev might have been necessary and stayed this
> >> way.
> >
> > I changed it to TYPE_OBJECT and we get a crash here:
> >
> > #3  0x0000aaaaab207a48 [PAC] in object_class_dynamic_cast_assert
> >      (class=0xaaaaac608880, typename=typename@entry=0xaaaaab4b9630
> > "device", file=file@entry=0xaaaaab4300d0
> > "/workspace/qemu-ani/include/hw/qdev-core.h", line=line@entry=77,
> > func=func@entry=0xaaaaab595a90 <__func__.0> "DEVICE_CLASS") at
> > ../qom/object.c:1021
> > #4  0x0000aaaaaaec2d74 in DEVICE_CLASS (klass=<optimized out>) at
> > /workspace/qemu-ani/include/hw/qdev-core.h:77
> > #5  vmcoreinfo_device_class_init (klass=<optimized out>,
> > data=<optimized out>) at ../hw/misc/vmcoreinfo.c:88
>
> I believe you have enough knowledge to understand the concepts you
> are mixing here. You can not change a type signature without
> implementing its interface (which as you noticed, for QEMU is checked
> at runtime).
>

Yes the point was to quickly try and see changing to DEVICE works. Turned
out that more changes would be required and therefore I left it for the
maintained of that device.


> > Basically doing this would be illegal for vmcoreinfo and we need to
> > adjust the code :
> >
> >     DeviceClass *dc = DEVICE_CLASS(klass);
> >
> >      dc->vmsd = &vmstate_vmcoreinfo;
> >      dc->realize = vmcoreinfo_realize;
> >      dc->hotpluggable = false;
> >      set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>
> See the conversion:
>
> https://lore.kernel.org/qemu-devel/20241219153857.57450-1-philmd@linaro.org/


Yes I see you sent a patch and Dan's response. That was exactly also my
opinion. Vmfwupdate, like vmcoreinfo is like a device not a generic object.
So device type is more appropriate.


> > Anyway, for vmfwupdate, it is actually like a device with device
> properties:
> >
> > +    device_class_set_props(dc, vmfwupdate_properties);
> >
> > So I prefer to make it qdev type for now.
>
> We have the opportunity to start with the correct model.
> Consider simplifying our future (see what is required in
> the suggested vmcoreinfo conversion). Except if you insist
> and commit to do the vmfwupdate later.
>
>
Ani Sinha Dec. 19, 2024, 4:16 p.m. UTC | #16
On Thu, 19 Dec, 2024, 9:22 pm Philippe Mathieu-Daudé, <philmd@linaro.org>
wrote:

> On 16/12/24 12:48, Ani Sinha wrote:
>
> > diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> > index d02d96e403..4c5bdb0de2 100644
> > --- a/hw/misc/meson.build
> > +++ b/hw/misc/meson.build
> > @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA', if_true:
> files('mac_via.c'))
> >   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true:
> files('mips_cmgcr.c', 'mips_cpc.c'))
> >   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
> >
> > +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true:
> files('vmfwupdate.c'))
>
> FW_CFG_DMA is offered by multiple targets ...:
>
> $ git grep -w FW_CFG_DMA
> hw/arm/Kconfig:19:    select FW_CFG_DMA
> hw/i386/Kconfig:82:    select FW_CFG_DMA
> hw/i386/Kconfig:113:    select FW_CFG_DMA
> hw/loongarch/Kconfig:22:    select FW_CFG_DMA
> hw/riscv/Kconfig:59:    select FW_CFG_DMA
>
> > diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> > new file mode 100644
> > index 0000000000..1e29a610c0
> > --- /dev/null
> > +++ b/hw/misc/vmfwupdate.c
> > @@ -0,0 +1,157 @@
> > +/*
> > + * Guest driven VM boot component update device
> > + * For details and specification, please look at
> docs/specs/vmfwupdate.rst.
> > + *
> > + * Copyright (C) 2024 Red Hat, Inc.
> > + *
> > + * Authors: Ani Sinha <anisinha@redhat.com>
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or
> later.
> > + * See the COPYING file in the top-level directory.
> > + *
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "qemu/module.h"
> > +#include "sysemu/reset.h"
> > +#include "hw/nvram/fw_cfg.h"
> > +#include "hw/i386/pc.h"
>
> ... however ...
>
> > +#include "hw/qdev-properties.h"
> > +#include "hw/misc/vmfwupdate.h"
> > +#include "qemu/error-report.h"
> > +
> > +static void fw_update_reset(void *dev)
> > +{
> > +    /* a NOOP at present */
> > +    return;
> > +}
> > +
> > +
> > +static uint64_t get_max_fw_size(void)
> > +{
> > +    Object *m_obj = qdev_get_machine();
> > +    PCMachineState *pcms = PC_MACHINE(m_obj);
> > +
> > +    if (pcms) {
> > +        return pcms->max_fw_size;
>
> ... this code depends on x86/PC.
>
> Could it be wiser to add a new VM_FWUPDATE Kconfig
> symbol, having it depending on FW_CFG_DMA && I386?
>

There is no reason why vmfwupdate would be limited to x86 only. There is
minimal support needed from hypervisor side for this mechanism. That
mechanism has little dependency on specific platform.


> +    } else {
> > +        return 0;
> > +    }
> > +}
>
>
Philippe Mathieu-Daudé Dec. 19, 2024, 4:51 p.m. UTC | #17
On 19/12/24 17:16, Ani Sinha wrote:
> 
> 
> On Thu, 19 Dec, 2024, 9:22 pm Philippe Mathieu-Daudé, <philmd@linaro.org 
> <mailto:philmd@linaro.org>> wrote:
> 
>     On 16/12/24 12:48, Ani Sinha wrote:
> 
>      > diff --git a/hw/misc/meson.build b/hw/misc/meson.build
>      > index d02d96e403..4c5bdb0de2 100644
>      > --- a/hw/misc/meson.build
>      > +++ b/hw/misc/meson.build
>      > @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA',
>     if_true: files('mac_via.c'))
>      >   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true:
>     files('mips_cmgcr.c', 'mips_cpc.c'))
>      >   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true:
>     files('mips_itu.c'))
>      >
>      > +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true:
>     files('vmfwupdate.c'))
> 
>     FW_CFG_DMA is offered by multiple targets ...:
> 
>     $ git grep -w FW_CFG_DMA
>     hw/arm/Kconfig:19:    select FW_CFG_DMA
>     hw/i386/Kconfig:82:    select FW_CFG_DMA
>     hw/i386/Kconfig:113:    select FW_CFG_DMA
>     hw/loongarch/Kconfig:22:    select FW_CFG_DMA
>     hw/riscv/Kconfig:59:    select FW_CFG_DMA
> 
>      > diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
>      > new file mode 100644
>      > index 0000000000..1e29a610c0
>      > --- /dev/null
>      > +++ b/hw/misc/vmfwupdate.c
>      > @@ -0,0 +1,157 @@
>      > +/*
>      > + * Guest driven VM boot component update device
>      > + * For details and specification, please look at docs/specs/
>     vmfwupdate.rst.
>      > + *
>      > + * Copyright (C) 2024 Red Hat, Inc.
>      > + *
>      > + * Authors: Ani Sinha <anisinha@redhat.com
>     <mailto:anisinha@redhat.com>>
>      > + *
>      > + * This work is licensed under the terms of the GNU GPL, version
>     2 or later.
>      > + * See the COPYING file in the top-level directory.
>      > + *
>      > + */
>      > +
>      > +#include "qemu/osdep.h"
>      > +#include "qapi/error.h"
>      > +#include "qemu/module.h"
>      > +#include "sysemu/reset.h"
>      > +#include "hw/nvram/fw_cfg.h"
>      > +#include "hw/i386/pc.h"
> 
>     ... however ...
> 
>      > +#include "hw/qdev-properties.h"
>      > +#include "hw/misc/vmfwupdate.h"
>      > +#include "qemu/error-report.h"
>      > +
>      > +static void fw_update_reset(void *dev)
>      > +{
>      > +    /* a NOOP at present */
>      > +    return;
>      > +}
>      > +
>      > +
>      > +static uint64_t get_max_fw_size(void)
>      > +{
>      > +    Object *m_obj = qdev_get_machine();
>      > +    PCMachineState *pcms = PC_MACHINE(m_obj);
>      > +
>      > +    if (pcms) {
>      > +        return pcms->max_fw_size;
> 
>     ... this code depends on x86/PC.
> 
>     Could it be wiser to add a new VM_FWUPDATE Kconfig
>     symbol, having it depending on FW_CFG_DMA && I386?
> 
> 
> There is no reason why vmfwupdate would be limited to x86 only. There is 
> minimal support needed from hypervisor side for this mechanism. That 
> mechanism has little dependency on specific platform.

OK, then please remove that PCMachineState use.

What about the FW_CFG_DMA dependency?
Ani Sinha Dec. 19, 2024, 5:08 p.m. UTC | #18
On Thu, 19 Dec, 2024, 10:21 pm Philippe Mathieu-Daudé, <philmd@linaro.org>
wrote:

> On 19/12/24 17:16, Ani Sinha wrote:
> >
> >
> > On Thu, 19 Dec, 2024, 9:22 pm Philippe Mathieu-Daudé, <philmd@linaro.org
> > <mailto:philmd@linaro.org>> wrote:
> >
> >     On 16/12/24 12:48, Ani Sinha wrote:
> >
> >      > diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> >      > index d02d96e403..4c5bdb0de2 100644
> >      > --- a/hw/misc/meson.build
> >      > +++ b/hw/misc/meson.build
> >      > @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA',
> >     if_true: files('mac_via.c'))
> >      >   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true:
> >     files('mips_cmgcr.c', 'mips_cpc.c'))
> >      >   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true:
> >     files('mips_itu.c'))
> >      >
> >      > +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true:
> >     files('vmfwupdate.c'))
> >
> >     FW_CFG_DMA is offered by multiple targets ...:
> >
> >     $ git grep -w FW_CFG_DMA
> >     hw/arm/Kconfig:19:    select FW_CFG_DMA
> >     hw/i386/Kconfig:82:    select FW_CFG_DMA
> >     hw/i386/Kconfig:113:    select FW_CFG_DMA
> >     hw/loongarch/Kconfig:22:    select FW_CFG_DMA
> >     hw/riscv/Kconfig:59:    select FW_CFG_DMA
> >
> >      > diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> >      > new file mode 100644
> >      > index 0000000000..1e29a610c0
> >      > --- /dev/null
> >      > +++ b/hw/misc/vmfwupdate.c
> >      > @@ -0,0 +1,157 @@
> >      > +/*
> >      > + * Guest driven VM boot component update device
> >      > + * For details and specification, please look at docs/specs/
> >     vmfwupdate.rst.
> >      > + *
> >      > + * Copyright (C) 2024 Red Hat, Inc.
> >      > + *
> >      > + * Authors: Ani Sinha <anisinha@redhat.com
> >     <mailto:anisinha@redhat.com>>
> >      > + *
> >      > + * This work is licensed under the terms of the GNU GPL, version
> >     2 or later.
> >      > + * See the COPYING file in the top-level directory.
> >      > + *
> >      > + */
> >      > +
> >      > +#include "qemu/osdep.h"
> >      > +#include "qapi/error.h"
> >      > +#include "qemu/module.h"
> >      > +#include "sysemu/reset.h"
> >      > +#include "hw/nvram/fw_cfg.h"
> >      > +#include "hw/i386/pc.h"
> >
> >     ... however ...
> >
> >      > +#include "hw/qdev-properties.h"
> >      > +#include "hw/misc/vmfwupdate.h"
> >      > +#include "qemu/error-report.h"
> >      > +
> >      > +static void fw_update_reset(void *dev)
> >      > +{
> >      > +    /* a NOOP at present */
> >      > +    return;
> >      > +}
> >      > +
> >      > +
> >      > +static uint64_t get_max_fw_size(void)
> >      > +{
> >      > +    Object *m_obj = qdev_get_machine();
> >      > +    PCMachineState *pcms = PC_MACHINE(m_obj);
> >      > +
> >      > +    if (pcms) {
> >      > +        return pcms->max_fw_size;
> >
> >     ... this code depends on x86/PC.
> >
> >     Could it be wiser to add a new VM_FWUPDATE Kconfig
> >     symbol, having it depending on FW_CFG_DMA && I386?
> >
> >
> > There is no reason why vmfwupdate would be limited to x86 only. There is
> > minimal support needed from hypervisor side for this mechanism. That
> > mechanism has little dependency on specific platform.
>
> OK, then please remove that PCMachineState use


That is used because the max_fw_size property only exists for pc machines.
Do you have a better idea to get this value in a platform agnostic way?
Like I said in the previous reply I do not know how to get this value
reliably for all machines. If anyone has better ideas, I'm all ears.

.
>
> What about the FW_CFG_DMA dependency?
>
>
Philippe Mathieu-Daudé Dec. 19, 2024, 5:16 p.m. UTC | #19
On 19/12/24 18:08, Ani Sinha wrote:
> 
> 
> On Thu, 19 Dec, 2024, 10:21 pm Philippe Mathieu-Daudé, 
> <philmd@linaro.org <mailto:philmd@linaro.org>> wrote:
> 
>     On 19/12/24 17:16, Ani Sinha wrote:
>      >
>      >
>      > On Thu, 19 Dec, 2024, 9:22 pm Philippe Mathieu-Daudé,
>     <philmd@linaro.org <mailto:philmd@linaro.org>
>      > <mailto:philmd@linaro.org <mailto:philmd@linaro.org>>> wrote:
>      >
>      >     On 16/12/24 12:48, Ani Sinha wrote:
>      >
>      >      > diff --git a/hw/misc/meson.build b/hw/misc/meson.build
>      >      > index d02d96e403..4c5bdb0de2 100644
>      >      > --- a/hw/misc/meson.build
>      >      > +++ b/hw/misc/meson.build
>      >      > @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA',
>      >     if_true: files('mac_via.c'))
>      >      >   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true:
>      >     files('mips_cmgcr.c', 'mips_cpc.c'))
>      >      >   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true:
>      >     files('mips_itu.c'))
>      >      >
>      >      > +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true:
>      >     files('vmfwupdate.c'))
>      >
>      >     FW_CFG_DMA is offered by multiple targets ...:
>      >
>      >     $ git grep -w FW_CFG_DMA
>      >     hw/arm/Kconfig:19:    select FW_CFG_DMA
>      >     hw/i386/Kconfig:82:    select FW_CFG_DMA
>      >     hw/i386/Kconfig:113:    select FW_CFG_DMA
>      >     hw/loongarch/Kconfig:22:    select FW_CFG_DMA
>      >     hw/riscv/Kconfig:59:    select FW_CFG_DMA
>      >
>      >      > diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
>      >      > new file mode 100644
>      >      > index 0000000000..1e29a610c0
>      >      > --- /dev/null
>      >      > +++ b/hw/misc/vmfwupdate.c
>      >      > @@ -0,0 +1,157 @@
>      >      > +/*
>      >      > + * Guest driven VM boot component update device
>      >      > + * For details and specification, please look at docs/specs/
>      >     vmfwupdate.rst.
>      >      > + *
>      >      > + * Copyright (C) 2024 Red Hat, Inc.
>      >      > + *
>      >      > + * Authors: Ani Sinha <anisinha@redhat.com
>     <mailto:anisinha@redhat.com>
>      >     <mailto:anisinha@redhat.com <mailto:anisinha@redhat.com>>>
>      >      > + *
>      >      > + * This work is licensed under the terms of the GNU GPL,
>     version
>      >     2 or later.
>      >      > + * See the COPYING file in the top-level directory.
>      >      > + *
>      >      > + */
>      >      > +
>      >      > +#include "qemu/osdep.h"
>      >      > +#include "qapi/error.h"
>      >      > +#include "qemu/module.h"
>      >      > +#include "sysemu/reset.h"
>      >      > +#include "hw/nvram/fw_cfg.h"
>      >      > +#include "hw/i386/pc.h"
>      >
>      >     ... however ...
>      >
>      >      > +#include "hw/qdev-properties.h"
>      >      > +#include "hw/misc/vmfwupdate.h"
>      >      > +#include "qemu/error-report.h"
>      >      > +
>      >      > +static void fw_update_reset(void *dev)
>      >      > +{
>      >      > +    /* a NOOP at present */
>      >      > +    return;
>      >      > +}
>      >      > +
>      >      > +
>      >      > +static uint64_t get_max_fw_size(void)
>      >      > +{
>      >      > +    Object *m_obj = qdev_get_machine();
>      >      > +    PCMachineState *pcms = PC_MACHINE(m_obj);
>      >      > +
>      >      > +    if (pcms) {
>      >      > +        return pcms->max_fw_size;
>      >
>      >     ... this code depends on x86/PC.
>      >
>      >     Could it be wiser to add a new VM_FWUPDATE Kconfig
>      >     symbol, having it depending on FW_CFG_DMA && I386?
>      >
>      >
>      > There is no reason why vmfwupdate would be limited to x86 only.
>     There is
>      > minimal support needed from hypervisor side for this mechanism. That
>      > mechanism has little dependency on specific platform.
> 
>     OK, then please remove that PCMachineState use
> 
> 
> That is used because the max_fw_size property only exists for pc 
> machines. Do you have a better idea to get this value in a platform 
> agnostic way? Like I said in the previous reply I do not know how to get 
> this value reliably for all machines. If anyone has better ideas, I'm 
> all ears.

Either add the I386 dependency or don't use PC_MACHINE, because on
non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.

>     What about the FW_CFG_DMA dependency?
> 

I wasted enough time on this so I'll stop reviewing this work.

Regards,

Phil.
Ani Sinha Dec. 20, 2024, 1:32 a.m. UTC | #20
On Thu, Dec 19, 2024 at 10:46 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 19/12/24 18:08, Ani Sinha wrote:
> >
> >
> > On Thu, 19 Dec, 2024, 10:21 pm Philippe Mathieu-Daudé,
> > <philmd@linaro.org <mailto:philmd@linaro.org>> wrote:
> >
> >     On 19/12/24 17:16, Ani Sinha wrote:
> >      >
> >      >
> >      > On Thu, 19 Dec, 2024, 9:22 pm Philippe Mathieu-Daudé,
> >     <philmd@linaro.org <mailto:philmd@linaro.org>
> >      > <mailto:philmd@linaro.org <mailto:philmd@linaro.org>>> wrote:
> >      >
> >      >     On 16/12/24 12:48, Ani Sinha wrote:
> >      >
> >      >      > diff --git a/hw/misc/meson.build b/hw/misc/meson.build
> >      >      > index d02d96e403..4c5bdb0de2 100644
> >      >      > --- a/hw/misc/meson.build
> >      >      > +++ b/hw/misc/meson.build
> >      >      > @@ -148,6 +148,8 @@ specific_ss.add(when: 'CONFIG_MAC_VIA',
> >      >     if_true: files('mac_via.c'))
> >      >      >   specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true:
> >      >     files('mips_cmgcr.c', 'mips_cpc.c'))
> >      >      >   specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true:
> >      >     files('mips_itu.c'))
> >      >      >
> >      >      > +specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true:
> >      >     files('vmfwupdate.c'))
> >      >
> >      >     FW_CFG_DMA is offered by multiple targets ...:
> >      >
> >      >     $ git grep -w FW_CFG_DMA
> >      >     hw/arm/Kconfig:19:    select FW_CFG_DMA
> >      >     hw/i386/Kconfig:82:    select FW_CFG_DMA
> >      >     hw/i386/Kconfig:113:    select FW_CFG_DMA
> >      >     hw/loongarch/Kconfig:22:    select FW_CFG_DMA
> >      >     hw/riscv/Kconfig:59:    select FW_CFG_DMA
> >      >
> >      >      > diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> >      >      > new file mode 100644
> >      >      > index 0000000000..1e29a610c0
> >      >      > --- /dev/null
> >      >      > +++ b/hw/misc/vmfwupdate.c
> >      >      > @@ -0,0 +1,157 @@
> >      >      > +/*
> >      >      > + * Guest driven VM boot component update device
> >      >      > + * For details and specification, please look at docs/specs/
> >      >     vmfwupdate.rst.
> >      >      > + *
> >      >      > + * Copyright (C) 2024 Red Hat, Inc.
> >      >      > + *
> >      >      > + * Authors: Ani Sinha <anisinha@redhat.com
> >     <mailto:anisinha@redhat.com>
> >      >     <mailto:anisinha@redhat.com <mailto:anisinha@redhat.com>>>
> >      >      > + *
> >      >      > + * This work is licensed under the terms of the GNU GPL,
> >     version
> >      >     2 or later.
> >      >      > + * See the COPYING file in the top-level directory.
> >      >      > + *
> >      >      > + */
> >      >      > +
> >      >      > +#include "qemu/osdep.h"
> >      >      > +#include "qapi/error.h"
> >      >      > +#include "qemu/module.h"
> >      >      > +#include "sysemu/reset.h"
> >      >      > +#include "hw/nvram/fw_cfg.h"
> >      >      > +#include "hw/i386/pc.h"
> >      >
> >      >     ... however ...
> >      >
> >      >      > +#include "hw/qdev-properties.h"
> >      >      > +#include "hw/misc/vmfwupdate.h"
> >      >      > +#include "qemu/error-report.h"
> >      >      > +
> >      >      > +static void fw_update_reset(void *dev)
> >      >      > +{
> >      >      > +    /* a NOOP at present */
> >      >      > +    return;
> >      >      > +}
> >      >      > +
> >      >      > +
> >      >      > +static uint64_t get_max_fw_size(void)
> >      >      > +{
> >      >      > +    Object *m_obj = qdev_get_machine();
> >      >      > +    PCMachineState *pcms = PC_MACHINE(m_obj);
> >      >      > +
> >      >      > +    if (pcms) {
> >      >      > +        return pcms->max_fw_size;
> >      >
> >      >     ... this code depends on x86/PC.
> >      >
> >      >     Could it be wiser to add a new VM_FWUPDATE Kconfig
> >      >     symbol, having it depending on FW_CFG_DMA && I386?
> >      >
> >      >
> >      > There is no reason why vmfwupdate would be limited to x86 only.
> >     There is
> >      > minimal support needed from hypervisor side for this mechanism. That
> >      > mechanism has little dependency on specific platform.
> >
> >     OK, then please remove that PCMachineState use
> >
> >
> > That is used because the max_fw_size property only exists for pc
> > machines. Do you have a better idea to get this value in a platform
> > agnostic way? Like I said in the previous reply I do not know how to get
> > this value reliably for all machines. If anyone has better ideas, I'm
> > all ears.
>
> Either add the I386 dependency or don't use PC_MACHINE, because on
> non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.

Ah this is where we have a disconnect. I assumed that
> pcms = PC_MACHINE(m_obj)

would return NULL on non-x86.

Seems a better way to do this (as is done in vga.c) is to use
object_dynamic_cast()
How about

diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
index 0e90bd10e1..19d042b929 100644
--- a/hw/misc/vmfwupdate.c
+++ b/hw/misc/vmfwupdate.c
@@ -32,9 +32,11 @@ static inline VMFwUpdateState *vmfwupdate_find(void)
 static uint64_t get_max_fw_size(void)
 {
     Object *m_obj = qdev_get_machine();
-    PCMachineState *pcms = PC_MACHINE(m_obj);
+    MachineState *ms = MACHINE(m_obj);
+    PCMachineState *pcms;
-    if (pcms) {
+    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
+        pcms = PC_MACHINE(m_obj);
         return pcms->max_fw_size;
     } else {
         return 0;

As I said before, if this is not the best way, please suggest an
alternative to get max_fw_size for any platform.

>
> >     What about the FW_CFG_DMA dependency?

If you read the spec,

+ The ``fw-cfg`` interface must support the
+     DMA interface. It may only support the DMA interface for write operations.

So we need that.

> >
>
> I wasted enough time on this so I'll stop reviewing this work.

Well up to you but I have tried to incorporate most of your
suggestions here (as it has been with other patches that you review).
I am off for the year-end break so won't be sending any more versions
of this patch in the next few days. So if you change your mind, feel
free to comment.
Have a good year-end break.
Ani Sinha Dec. 20, 2024, 10 a.m. UTC | #21
> > Either add the I386 dependency or don't use PC_MACHINE, because on
> > non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.
>
> Ah this is where we have a disconnect. I assumed that
> > pcms = PC_MACHINE(m_obj)
>
> would return NULL on non-x86.
>
> Seems a better way to do this (as is done in vga.c) is to use
> object_dynamic_cast()
> How about
>
> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> index 0e90bd10e1..19d042b929 100644
> --- a/hw/misc/vmfwupdate.c
> +++ b/hw/misc/vmfwupdate.c
> @@ -32,9 +32,11 @@ static inline VMFwUpdateState *vmfwupdate_find(void)
>  static uint64_t get_max_fw_size(void)
>  {
>      Object *m_obj = qdev_get_machine();
> -    PCMachineState *pcms = PC_MACHINE(m_obj);
> +    MachineState *ms = MACHINE(m_obj);
> +    PCMachineState *pcms;
> -    if (pcms) {
> +    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
> +        pcms = PC_MACHINE(m_obj);
>          return pcms->max_fw_size;
>      } else {
>          return 0;
>

For the records, I tested this with arm and the following command line
does not crash QEMU;

./qemu-system-arm -machine virt-9.2 -device vmfwupdate

I have also added a separate functional test to exercise basic device
creation which will be part of v5 when I send it out.
Alexander Graf Dec. 20, 2024, 11:32 a.m. UTC | #22
On 20.12.24 11:00, Ani Sinha wrote:
>>> Either add the I386 dependency or don't use PC_MACHINE, because on
>>> non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.
>> Ah this is where we have a disconnect. I assumed that
>>> pcms = PC_MACHINE(m_obj)
>> would return NULL on non-x86.
>>
>> Seems a better way to do this (as is done in vga.c) is to use
>> object_dynamic_cast()
>> How about
>>
>> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
>> index 0e90bd10e1..19d042b929 100644
>> --- a/hw/misc/vmfwupdate.c
>> +++ b/hw/misc/vmfwupdate.c
>> @@ -32,9 +32,11 @@ static inline VMFwUpdateState *vmfwupdate_find(void)
>>   static uint64_t get_max_fw_size(void)
>>   {
>>       Object *m_obj = qdev_get_machine();
>> -    PCMachineState *pcms = PC_MACHINE(m_obj);
>> +    MachineState *ms = MACHINE(m_obj);
>> +    PCMachineState *pcms;
>> -    if (pcms) {
>> +    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
>> +        pcms = PC_MACHINE(m_obj);
>>           return pcms->max_fw_size;
>>       } else {
>>           return 0;
>>
> For the records, I tested this with arm and the following command line
> does not crash QEMU;
>
> ./qemu-system-arm -machine virt-9.2 -device vmfwupdate
>
> I have also added a separate functional test to exercise basic device
> creation which will be part of v5 when I send it out.


You are currently not implementing the reset logic required to actually 
make vmfwupdate work. That means technically, the device should not be 
instantiable on *any* platform at the moment. For example with the 
command line above you would advertise the capability to update firmware 
in fw-cfg, but then not back it by functionality. If QEMU were to merge 
this patch, it would just create a broken device.

Please make sure that the vmfwupdate device can only be instantiated on 
machine types that it has full support for.


Alex




Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
Ani Sinha Dec. 20, 2024, 1:31 p.m. UTC | #23
On Fri, Dec 20, 2024 at 5:03 PM Alexander Graf <graf@amazon.com> wrote:
>
>
> On 20.12.24 11:00, Ani Sinha wrote:
> >>> Either add the I386 dependency or don't use PC_MACHINE, because on
> >>> non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.
> >> Ah this is where we have a disconnect. I assumed that
> >>> pcms = PC_MACHINE(m_obj)
> >> would return NULL on non-x86.
> >>
> >> Seems a better way to do this (as is done in vga.c) is to use
> >> object_dynamic_cast()
> >> How about
> >>
> >> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> >> index 0e90bd10e1..19d042b929 100644
> >> --- a/hw/misc/vmfwupdate.c
> >> +++ b/hw/misc/vmfwupdate.c
> >> @@ -32,9 +32,11 @@ static inline VMFwUpdateState *vmfwupdate_find(void)
> >>   static uint64_t get_max_fw_size(void)
> >>   {
> >>       Object *m_obj = qdev_get_machine();
> >> -    PCMachineState *pcms = PC_MACHINE(m_obj);
> >> +    MachineState *ms = MACHINE(m_obj);
> >> +    PCMachineState *pcms;
> >> -    if (pcms) {
> >> +    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
> >> +        pcms = PC_MACHINE(m_obj);
> >>           return pcms->max_fw_size;
> >>       } else {
> >>           return 0;
> >>
> > For the records, I tested this with arm and the following command line
> > does not crash QEMU;
> >
> > ./qemu-system-arm -machine virt-9.2 -device vmfwupdate
> >
> > I have also added a separate functional test to exercise basic device
> > creation which will be part of v5 when I send it out.
>
>
> You are currently not implementing the reset logic required to actually
> make vmfwupdate work.

Yes that is correct and that is by design. The reset logic on CoCo
depends on the larger piece of work on how to enable reset and
re-instantiation of the VM without simply terminating. I did not want
to wait until all those complicated bits were sorted out first. I
wanted to make sure that the hypervisor/guest interface is put in
place.

That means technically, the device should not be
> instantiable on *any* platform at the moment. For example with the
> command line above you would advertise the capability to update firmware
> in fw-cfg, but then not back it by functionality.

OK so if we wanted to put this work peace meal in smaller chunks, can
we have an additional capability bit that actually advertizes this
functionality on certain machine types/platforms?

If QEMU were to merge
> this patch, it would just create a broken device.

Are you talking about CoCo or non-CoCo?

>
> Please make sure that the vmfwupdate device can only be instantiated on
> machine types that it has full support for.
>
>
> Alex
>
>
>
>
> Amazon Web Services Development Center Germany GmbH
> Krausenstr. 38
> 10117 Berlin
> Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
> Sitz: Berlin
> Ust-ID: DE 365 538 597
Ani Sinha Dec. 20, 2024, 5:24 p.m. UTC | #24
On Fri, 20 Dec, 2024, 7:01 pm Ani Sinha, <anisinha@redhat.com> wrote:

> On Fri, Dec 20, 2024 at 5:03 PM Alexander Graf <graf@amazon.com> wrote:
> >
> >
> > On 20.12.24 11:00, Ani Sinha wrote:
> > >>> Either add the I386 dependency or don't use PC_MACHINE, because on
> > >>> non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.
> > >> Ah this is where we have a disconnect. I assumed that
> > >>> pcms = PC_MACHINE(m_obj)
> > >> would return NULL on non-x86.
> > >>
> > >> Seems a better way to do this (as is done in vga.c) is to use
> > >> object_dynamic_cast()
> > >> How about
> > >>
> > >> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
> > >> index 0e90bd10e1..19d042b929 100644
> > >> --- a/hw/misc/vmfwupdate.c
> > >> +++ b/hw/misc/vmfwupdate.c
> > >> @@ -32,9 +32,11 @@ static inline VMFwUpdateState
> *vmfwupdate_find(void)
> > >>   static uint64_t get_max_fw_size(void)
> > >>   {
> > >>       Object *m_obj = qdev_get_machine();
> > >> -    PCMachineState *pcms = PC_MACHINE(m_obj);
> > >> +    MachineState *ms = MACHINE(m_obj);
> > >> +    PCMachineState *pcms;
> > >> -    if (pcms) {
> > >> +    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
> > >> +        pcms = PC_MACHINE(m_obj);
> > >>           return pcms->max_fw_size;
> > >>       } else {
> > >>           return 0;
> > >>
> > > For the records, I tested this with arm and the following command line
> > > does not crash QEMU;
> > >
> > > ./qemu-system-arm -machine virt-9.2 -device vmfwupdate
> > >
> > > I have also added a separate functional test to exercise basic device
> > > creation which will be part of v5 when I send it out.
> >
> >
> > You are currently not implementing the reset logic required to actually
> > make vmfwupdate work.
>
> Yes that is correct and that is by design. The reset logic on CoCo
> depends on the larger piece of work on how to enable reset and
> re-instantiation of the VM without simply terminating. I did not want
> to wait until all those complicated bits were sorted out first. I
> wanted to make sure that the hypervisor/guest interface is put in
> place.
>
> That means technically, the device should not be
> > instantiable on *any* platform at the moment. For example with the
> > command line above you would advertise the capability to update firmware
> > in fw-cfg, but then not back it by functionality.
>
> OK so if we wanted to put this work peace meal in smaller chunks, can
> we have an additional capability bit that actually advertizes this
> functionality on certain machine types/platforms?
>

Note that having a capability bit also helps us to test each machine /
platforms separately as we roll this out. The capability will be disabled
for all platforms and machines initially and incrementally enabled for
those machines and platforms for which we have tested the feature to work.


> If QEMU were to merge
> > this patch, it would just create a broken device.
>
> Are you talking about CoCo or non-CoCo?
>

Maybe for non coco case which is a lot simpler, we can add the reset bits
as a part of this patchset.


>
> > Please make sure that the vmfwupdate device can only be instantiated on
> > machine types that it has full support for.
> >
> >
> > Alex
> >
> >
> >
> >
> > Amazon Web Services Development Center Germany GmbH
> > Krausenstr. 38
> > 10117 Berlin
> > Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
> > Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
> > Sitz: Berlin
> > Ust-ID: DE 365 538 597
>
Alexander Graf Dec. 20, 2024, 6:19 p.m. UTC | #25
On 20.12.24 14:31, Ani Sinha wrote:
> On Fri, Dec 20, 2024 at 5:03 PM Alexander Graf <graf@amazon.com> wrote:
>>
>> On 20.12.24 11:00, Ani Sinha wrote:
>>>>> Either add the I386 dependency or don't use PC_MACHINE, because on
>>>>> non-x86 targets PC_MACHINE(qdev_get_machine()) will crash.
>>>> Ah this is where we have a disconnect. I assumed that
>>>>> pcms = PC_MACHINE(m_obj)
>>>> would return NULL on non-x86.
>>>>
>>>> Seems a better way to do this (as is done in vga.c) is to use
>>>> object_dynamic_cast()
>>>> How about
>>>>
>>>> diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
>>>> index 0e90bd10e1..19d042b929 100644
>>>> --- a/hw/misc/vmfwupdate.c
>>>> +++ b/hw/misc/vmfwupdate.c
>>>> @@ -32,9 +32,11 @@ static inline VMFwUpdateState *vmfwupdate_find(void)
>>>>    static uint64_t get_max_fw_size(void)
>>>>    {
>>>>        Object *m_obj = qdev_get_machine();
>>>> -    PCMachineState *pcms = PC_MACHINE(m_obj);
>>>> +    MachineState *ms = MACHINE(m_obj);
>>>> +    PCMachineState *pcms;
>>>> -    if (pcms) {
>>>> +    if (object_dynamic_cast(OBJECT(ms), TYPE_X86_MACHINE)) {
>>>> +        pcms = PC_MACHINE(m_obj);
>>>>            return pcms->max_fw_size;
>>>>        } else {
>>>>            return 0;
>>>>
>>> For the records, I tested this with arm and the following command line
>>> does not crash QEMU;
>>>
>>> ./qemu-system-arm -machine virt-9.2 -device vmfwupdate
>>>
>>> I have also added a separate functional test to exercise basic device
>>> creation which will be part of v5 when I send it out.
>>
>> You are currently not implementing the reset logic required to actually
>> make vmfwupdate work.
> Yes that is correct and that is by design. The reset logic on CoCo
> depends on the larger piece of work on how to enable reset and
> re-instantiation of the VM without simply terminating. I did not want
> to wait until all those complicated bits were sorted out first. I
> wanted to make sure that the hypervisor/guest interface is put in
> place.


This logic has no dependency on CoCo.


>
> That means technically, the device should not be
>> instantiable on *any* platform at the moment. For example with the
>> command line above you would advertise the capability to update firmware
>> in fw-cfg, but then not back it by functionality.
> OK so if we wanted to put this work peace meal in smaller chunks, can
> we have an additional capability bit that actually advertizes this
> functionality on certain machine types/platforms?


Maybe. The property we need is that spawning this device on a machine 
type that does not support reset enlightenment for it should fail 
launching. I don't really care how you build that. But today, adding 
-device vmfwupdate should fail on any target system.


>
> If QEMU were to merge
>> this patch, it would just create a broken device.
> Are you talking about CoCo or non-CoCo?


Always. The guest would see the interface, try to drive it and then we 
ignore its inputs. We have to prevent that from happening if we want to 
merge the base logic without board backend code.

I personally think the board backend code for pc is simple enough that 
it makes sense to fold into this patch though so that you have at least 
one working board.


Alex





Amazon Web Services Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
diff mbox series

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 822f34344b..131a38659b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2539,6 +2539,15 @@  F: include/hw/acpi/vmgenid.h
 F: docs/specs/vmgenid.rst
 F: tests/qtest/vmgenid-test.c
 
+VM Firmware Update
+M: Ani Sinha <anisinha@redhat.com>
+M: Alex Graf <graf@amazon.com>
+M: Paolo Bonzini <pbonzini@redhat.com>
+S: Maintained
+F: hw/misc/vmfwupdate.c
+F: include/hw/misc/vmfwupdate.h
+F: docs/specs/vmfwupdate.rst
+
 LED
 M: Philippe Mathieu-Daudé <philmd@linaro.org>
 S: Maintained
diff --git a/docs/specs/index.rst b/docs/specs/index.rst
index ff5a1f03da..cbda7e0398 100644
--- a/docs/specs/index.rst
+++ b/docs/specs/index.rst
@@ -34,6 +34,7 @@  guest hardware that is specific to QEMU.
    virt-ctlr
    vmcoreinfo
    vmgenid
+   vmfwupdate
    rapl-msr
    rocker
    riscv-iommu
diff --git a/docs/specs/vmfwupdate.rst b/docs/specs/vmfwupdate.rst
new file mode 100644
index 0000000000..3a36ca14c7
--- /dev/null
+++ b/docs/specs/vmfwupdate.rst
@@ -0,0 +1,109 @@ 
+VMFWUPDATE INTERFACE SPECIFICATION
+##################################
+
+Introduction
+************
+
+``Vmfwupdate`` is an extension to ``fw-cfg`` that allows guests to replace early boot
+code in their virtual machine. Through a combination of vmfwupdate and
+hypervisor stack knowledge, guests can deterministically replace the launch
+payload for guests. This is useful for environments like SEV-SNP where the
+launch payload becomes the launch digest. Guests can use vmfwupdate to provide
+a measured, full guest payload (BIOS image, kernel, initramfs, kernel
+command line) to the virtual machine which enables them to easily reason about
+integrity of the resulting system.
+For more information, please see the `KVM Forum 2024 presentation <KVMFORUM_>`__
+about this work from the authors [1]_.
+
+
+.. _KVMFORUM: https://www.youtube.com/watch?v=VCMBxU6tAto
+
+Base Requirements
+*****************
+
+#. **fw-cfg**:
+     The target system must provide a ``fw-cfg`` interface. For x86 based
+     environments, this ``fw-cfg`` interface must be accessible through PIO ports
+     0x510 and 0x511. The ``fw-cfg`` interface does not need to be announced as part
+     of system device tables such as DSDT. The ``fw-cfg`` interface must support the
+     DMA interface. It may only support the DMA interface for write operations.
+
+#. **BIOS region**:
+     The hypervisor must provide a BIOS region which may be
+     statically sized. Through vmfwupdate, the guest is able to atomically replace
+     its contents. The BIOS region must be mapped as read-write memory. In a
+     SEV-SNP environment, the BIOS region must be mapped as private memory at
+     launch time.
+
+Fw-cfg Files
+************
+
+Guests drive vmfwupdate through special ``fw-cfg`` files that control its flow
+followed by a standard system reset operation. When vmfwupdate is available,
+it provides the following ``fw-cfg`` files:
+
+* ``vmfwupdate/cap`` (``u64``) - Read-only Little Endian encoded bitmap of additional
+  capabilities the interface supports. List of available capabilities:
+
+     ``VMFWUPDATE_CAP_BIOS_RESIZE        0x0000000000000001``
+
+* ``vmfwupdate/bios-size`` (``u32``) - Little Endian encoded size of the BIOS region.
+  Read-only by default. Optionally Read-write if ``vmfwupdate/cap`` contains
+  ``VMFWUPDATE_CAP_BIOS_RESIZE``. On write, the BIOS region may resize. Guests are
+  required to read the value after writing and compare it with the requested size
+  to determine whether the resize was successful. Note, x86 BIOS regions always
+  start at 4GiB - bios-size.
+
+* ``vmfwupdate/opaque`` (``1024 bytes``) - A 1KiB buffer that survives the BIOS replacement
+  flow. Can be used by the guest to propagate guest physical addresses of payloads
+  to its BIOS stage. It’s recommended to make the new BIOS clear this file on boot
+  if it exists. Contents of this file are under control by the hypervisor. In an
+  environment that considers the hypervisor outside of its trust boundary, guests
+  are advised to validate its contents before consumption.
+
+* ``vmfwupdate/disable`` (``u8``) - Indicates whether the interface is disabled.
+  Returns 0 for enabled, 1 for disabled. Writing any value disables it. Writing is
+  only allowed if the value is 0. When the interface is disabled, the replace file
+  is ignored on reset. This value resets to 0 on system reset.
+
+* ``vmfwupdate/bios-addr`` (``u64``) - A 64bit Little Endian encoded guest physical address
+  at the beginning of the replacement BIOS region. The provided payload must reside
+  in shared memory. 0 on system reset.
+
+
+Triggering the Firmware Update
+******************************
+
+To initiate the firmware update process, the guest issues a standard system reset
+operation through any of the means implemented by the machine model.
+
+On reset, the hypervisor evaluates whether ``vmfwupdate/disable`` is ``1``. If it is, it ignores
+any other vmfwupdate values and performs a standard system reset.
+
+If ``vmfwupdate/disable`` is ``0``, the hypervisor checks if bios-addr is ``0``. If it is, it
+performs a standard system reset.
+
+If ``vmfwupdate/bios-addr`` is ``non-0``, the hypervisor replaces the contents of the system’s
+BIOS region with the guest physically contiguous ``vmfwupdate/bios-size`` sized payload at the
+guest physical address address vmfwupdate/bios-addr.
+
+As part of the reset operation, all existing guest shared memory as well as the
+``vmfwupdate/opaque`` file are preserved. CPU and device state are reset to the default
+hypervisor specific reset states. In SEV-SNP environments, the reset causes recreation
+of the VM context which triggers a fresh measurement of the replaced BIOS region and
+reset CPU state. The guest always resumes operation in the highest privileged mode
+available to it (VMPL0 in SEV-SNP).
+
+Closing Remarks
+***************
+The handover protocol (format of the ``vmwupdate/opaque`` file etc.) will be implemented by
+the firmware loader and firmware image, both provided by the guest.  The hypervisor does
+not need to know these details, so it is not included in this specification.
+
+
+
+Footnotes:
+^^^^^^^^^^
+.. [1] Original author of the specification: *Alex Graf <graf@amazon.com>*,
+       converted to re-structured-text (rst format) and slightly edited
+       by *Ani Sinha <anisinha@redhat.com>*.
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index d02d96e403..4c5bdb0de2 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -148,6 +148,8 @@  specific_ss.add(when: 'CONFIG_MAC_VIA', if_true: files('mac_via.c'))
 specific_ss.add(when: 'CONFIG_MIPS_CPS', if_true: files('mips_cmgcr.c', 'mips_cpc.c'))
 specific_ss.add(when: 'CONFIG_MIPS_ITU', if_true: files('mips_itu.c'))
 
+specific_ss.add(when: 'CONFIG_FW_CFG_DMA', if_true: files('vmfwupdate.c'))
+
 system_ss.add(when: 'CONFIG_SBSA_REF', if_true: files('sbsa_ec.c'))
 
 # HPPA devices
diff --git a/hw/misc/vmfwupdate.c b/hw/misc/vmfwupdate.c
new file mode 100644
index 0000000000..1e29a610c0
--- /dev/null
+++ b/hw/misc/vmfwupdate.c
@@ -0,0 +1,157 @@ 
+/*
+ * Guest driven VM boot component update device
+ * For details and specification, please look at docs/specs/vmfwupdate.rst.
+ *
+ * Copyright (C) 2024 Red Hat, Inc.
+ *
+ * Authors: Ani Sinha <anisinha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+#include "sysemu/reset.h"
+#include "hw/nvram/fw_cfg.h"
+#include "hw/i386/pc.h"
+#include "hw/qdev-properties.h"
+#include "hw/misc/vmfwupdate.h"
+#include "qemu/error-report.h"
+
+static void fw_update_reset(void *dev)
+{
+    /* a NOOP at present */
+    return;
+}
+
+
+static uint64_t get_max_fw_size(void)
+{
+    Object *m_obj = qdev_get_machine();
+    PCMachineState *pcms = PC_MACHINE(m_obj);
+
+    if (pcms) {
+        return pcms->max_fw_size;
+    } else {
+        return 0;
+    }
+}
+
+static void fw_blob_write(void *dev, off_t offset, size_t len)
+{
+    VMFwUpdateState *s = VMFWUPDATE(dev);
+
+    /* for non-pc platform, we do not allow changing bios_size yet */
+    if (!s->plat_bios_size) {
+        return;
+    }
+
+    /*
+     * in order to change the bios size, appropriate capability
+     * must be enabled
+     */
+    if (s->fw_blob.bios_size &&
+        !(s->capability & VMFWUPDATE_CAP_BIOS_RESIZE)) {
+        warn_report("vmfwupdate: VMFWUPDATE_CAP_BIOS_RESIZE not enabled");
+        return;
+    }
+
+    s->plat_bios_size = s->fw_blob.bios_size;
+
+    return;
+}
+
+static void vmfwupdate_realize(DeviceState *dev, Error **errp)
+{
+    VMFwUpdateState *s = VMFWUPDATE(dev);
+    FWCfgState *fw_cfg = fw_cfg_find();
+
+    /* multiple devices are not supported */
+    if (!vmfwupdate_find()) {
+        error_setg(errp, "at most one %s device is permitted",
+                   TYPE_VMFWUPDATE);
+        return;
+    }
+
+    /* fw_cfg with DMA support is necessary to support this device */
+    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
+        error_setg(errp, "%s device requires fw_cfg",
+                   TYPE_VMFWUPDATE);
+        return;
+    }
+
+    memset(&s->fw_blob, 0, sizeof(s->fw_blob));
+    memset(&s->opaque_blobs, 0, sizeof(s->opaque_blobs));
+
+    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_OBLOB,
+                             NULL, NULL, s,
+                             &s->opaque_blobs,
+                             sizeof(s->opaque_blobs),
+                             false);
+
+    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_FWBLOB,
+                             NULL, fw_blob_write, s,
+                             &s->fw_blob,
+                             sizeof(s->fw_blob),
+                             false);
+
+    /*
+     * Add global capability fw_cfg file. This will be used by the guest to
+     * check capability of the hypervisor.
+     */
+    s->capability = cpu_to_le16(CAP_VMFWUPD_MASK | VMFWUPDATE_CAP_EDKROM);
+    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_CAP,
+                    &s->capability, sizeof(s->capability));
+
+    s->plat_bios_size = get_max_fw_size(); /* for non-pc, this is 0 */
+    /* size of bios region for the platform - read only by the guest */
+    fw_cfg_add_file(fw_cfg, FILE_VMFWUPDATE_BIOS_SIZE,
+                    &s->plat_bios_size, sizeof(s->plat_bios_size));
+    /*
+     * add fw cfg control file to disable the hypervisor interface.
+     */
+    fw_cfg_add_file_callback(fw_cfg, FILE_VMFWUPDATE_CONTROL,
+                             NULL, NULL, s,
+                             &s->disable,
+                             sizeof(s->disable),
+                             false);
+    /*
+     * This device requires to register a global reset because it is
+     * not plugged to a bus (which, as its QOM parent, would reset it).
+     */
+    qemu_register_reset(fw_update_reset, dev);
+}
+
+static Property vmfwupdate_properties[] = {
+    DEFINE_PROP_UINT8("disable", VMFwUpdateState, disable, 0),
+    DEFINE_PROP_END_OF_LIST(),
+};
+
+static void vmfwupdate_device_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+
+    /* we are not interested in migration - so no need to populate dc->vmsd */
+    dc->desc = "VM firmware blob update device";
+    dc->realize = vmfwupdate_realize;
+    dc->hotpluggable = false;
+    device_class_set_props(dc, vmfwupdate_properties);
+    set_bit(DEVICE_CATEGORY_MISC, dc->categories);
+}
+
+static const TypeInfo vmfwupdate_device_info = {
+    .name          = TYPE_VMFWUPDATE,
+    .parent        = TYPE_DEVICE,
+    .instance_size = sizeof(VMFwUpdateState),
+    .class_init    = vmfwupdate_device_class_init,
+};
+
+static void vmfwupdate_register_types(void)
+{
+    type_register_static(&vmfwupdate_device_info);
+}
+
+type_init(vmfwupdate_register_types);
diff --git a/include/hw/misc/vmfwupdate.h b/include/hw/misc/vmfwupdate.h
new file mode 100644
index 0000000000..e9229d807b
--- /dev/null
+++ b/include/hw/misc/vmfwupdate.h
@@ -0,0 +1,103 @@ 
+/*
+ * Guest driven VM boot component update device
+ * For details and specification, please look at docs/specs/vmfwupdate.rst.
+ *
+ * Copyright (C) 2024 Red Hat, Inc.
+ *
+ * Authors: Ani Sinha <anisinha@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+#ifndef VMFWUPDATE_H
+#define VMFWUPDATE_H
+
+#include "hw/qdev-core.h"
+#include "qom/object.h"
+#include "qemu/units.h"
+
+#define TYPE_VMFWUPDATE "vmfwupdate"
+
+#define VMFWUPDCAPMSK  0xffff /* least significant 16 capability bits */
+
+#define VMFWUPDATE_CAP_EDKROM 0x08 /* bit 4 represents support for EDKROM */
+#define VMFWUPDATE_CAP_BIOS_RESIZE 0x04 /* guests may resize bios region */
+#define CAP_VMFWUPD_MASK 0x80
+
+#define VMFWUPDATE_OPAQUE_SIZE (1024 * MiB)
+
+/* fw_cfg file definitions */
+#define FILE_VMFWUPDATE_OBLOB "etc/vmfwupdate/opaque-blob"
+#define FILE_VMFWUPDATE_FWBLOB "etc/vmfwupdate/fw-blob"
+#define FILE_VMFWUPDATE_CAP "etc/vmfwupdate/cap"
+#define FILE_VMFWUPDATE_BIOS_SIZE "etc/vmfwupdate/bios-size"
+#define FILE_VMFWUPDATE_CONTROL "etc/vmfwupdate/disable"
+
+/*
+ * Address and length of the guest provided firmware blob.
+ * The blob itself is passed using the guest shared memory to QEMU.
+ * This is then copied to the guest private memeory in the secure vm
+ * by the hypervisor.
+ */
+typedef struct {
+    uint32_t bios_size; /*
+                         * this is used by the guest to update plat_bios_size
+                         * when VMFWUPDATE_CAP_BIOS_RESIZE is set.
+                         */
+    uint64_t bios_paddr; /*
+                          * starting gpa where the blob is in shared guest
+                          * memory. Cleared upon system reset.
+                          */
+} VMFwUpdateFwBlob;
+
+typedef struct VMFwUpdateState {
+    DeviceState parent_obj;
+
+    /*
+     * capabilities - 64 bits.
+     * Little endian format.
+     */
+    uint64_t capability;
+
+    /*
+     * size of the bios region - architecture dependent.
+     * Read-only by the guest unless VMFWUPDATE_CAP_BIOS_RESIZE
+     * capability is set.
+     */
+    uint32_t plat_bios_size;
+
+    /*
+     * disable - disables the interface when non-zero value is written to it.
+     * Writing 0 to this file enables the interface.
+     */
+    uint8_t disable;
+
+    /*
+     * The first stage boot uses this opaque blob to convey to the next stage
+     * where the next stage components are loaded. The exact structure and
+     * number of entries are unknown to the hypervisor and the hypervisor
+     * does not touch this memory or do any validations.
+     * The contents of this memory needs to be validated by the guest and
+     * must be ABI compatible between the first and second stages.
+     */
+    unsigned char opaque_blobs[VMFWUPDATE_OPAQUE_SIZE];
+
+    /*
+     * firmware blob addresses and sizes. These are moved to guest
+     * private memory.
+     */
+    VMFwUpdateFwBlob fw_blob;
+} VMFwUpdateState;
+
+OBJECT_DECLARE_SIMPLE_TYPE(VMFwUpdateState, VMFWUPDATE);
+
+/* returns NULL unless there is exactly one device */
+static inline VMFwUpdateState *vmfwupdate_find(void)
+{
+    Object *o = object_resolve_path_type("", TYPE_VMFWUPDATE, NULL);
+
+    return o ? VMFWUPDATE(o) : NULL;
+}
+
+#endif