[v3,2/2] hw/arm/virt: Support for virtio-mem-pci

Message ID 20211203233404.37313-3-gshan@redhat.com (mailing list archive)
State New, archived
Series hw/arm/virt: Support for virtio-mem-pci

Commit Message

Gavin Shan Dec. 3, 2021, 11:34 p.m. UTC
This adds support for the virtio-mem-pci device on the "virt" platform,
following the implementation on x86.

   * This implements the hotplug handlers to support virtio-mem-pci
     device hot-add. Hot-remove isn't supported, which matches x86.

   * The block size is 512MB on ARM64 instead of 128MB on x86.

   * It has passed testing with various combinations, such as 64KB
     and 4KB page sizes on host and guest, and different memory device
     backends such as normal pages, transparent huge pages and HugeTLB,
     plus migration.

Co-developed-by: David Hildenbrand <david@redhat.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
---
 hw/arm/Kconfig         |  1 +
 hw/arm/virt.c          | 68 +++++++++++++++++++++++++++++++++++++++++-
 hw/virtio/virtio-mem.c |  4 ++-
 3 files changed, 71 insertions(+), 2 deletions(-)

Comments

Peter Maydell Jan. 7, 2022, 4:40 p.m. UTC | #1
On Fri, 3 Dec 2021 at 23:34, Gavin Shan <gshan@redhat.com> wrote:
>
> This supports virtio-mem-pci device on "virt" platform, by simply
> following the implementation on x86.
>
>    * This implements the hotplug handlers to support virtio-mem-pci
>      device hot-add, while the hot-remove isn't supported as we have
>      on x86.
>
>    * The block size is 512MB on ARM64 instead of 128MB on x86.
>
>    * It has been passing the tests with various combinations like 64KB
>      and 4KB page sizes on host and guest, different memory device
>      backends like normal, transparent huge page and HugeTLB, plus
>      migration.
>
> Co-developed-by: David Hildenbrand <david@redhat.com>
> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Signed-off-by: Gavin Shan <gshan@redhat.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Reviewed-by: David Hildenbrand <david@redhat.com>


> +static void virt_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev,
> +                                        DeviceState *dev, Error **errp)
> +{
> +    HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
> +    Error *local_err = NULL;
> +
> +    if (!hotplug_dev2 && dev->hotplugged) {
> +        /*
> +         * Without a bus hotplug handler, we cannot control the plug/unplug
> +         * order. We should never reach this point when hotplugging on x86,
> +         * however, better add a safety net.
> +         */

This comment looks like it was cut-n-pasted from x86 -- is whatever
it is that prevents us from reaching this point also true for arm ?
(What is the thing that prevents us reaching this point?)

> +        error_setg(errp, "hotplug of virtio based memory devices not supported"
> +                   " on this bus.");
> +        return;
> +    }
> +    /*
> +     * First, see if we can plug this memory device at all. If that
> +     * succeeds, branch off to the actual hotplug handler.
> +     */
> +    memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev), NULL,
> +                           &local_err);
> +    if (!local_err && hotplug_dev2) {
> +        hotplug_handler_pre_plug(hotplug_dev2, dev, &local_err);
> +    }
> +    error_propagate(errp, local_err);
> +}



> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index b20595a496..21e4d572ab 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -125,7 +125,7 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>   * The memory block size corresponds mostly to the section size.
>   *
>   * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and
> - * a section size of 1GB on arm64 (as long as the start address is properly
> + * a section size of 512MB on arm64 (as long as the start address is properly
>   * aligned, similar to ordinary DIMMs).
>   *
>   * We can change this at any time and maybe even make it configurable if
> @@ -134,6 +134,8 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>   */
>  #if defined(TARGET_X86_64) || defined(TARGET_I386)
>  #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
> +#elif defined(TARGET_ARM)
> +#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
>  #else
>  #error VIRTIO_MEM_USABLE_EXTENT not defined
>  #endif

Could this comment explain where the 128MB and 512MB come from
and why the value is different for different architectures ?

thanks
-- PMM
Gavin Shan Jan. 8, 2022, 7:21 a.m. UTC | #2
Hi Peter,

On 1/8/22 12:40 AM, Peter Maydell wrote:
> On Fri, 3 Dec 2021 at 23:34, Gavin Shan <gshan@redhat.com> wrote:
>>
>> This supports virtio-mem-pci device on "virt" platform, by simply
>> following the implementation on x86.
>>
>>     * This implements the hotplug handlers to support virtio-mem-pci
>>       device hot-add, while the hot-remove isn't supported as we have
>>       on x86.
>>
>>     * The block size is 512MB on ARM64 instead of 128MB on x86.
>>
>>     * It has been passing the tests with various combinations like 64KB
>>       and 4KB page sizes on host and guest, different memory device
>>       backends like normal, transparent huge page and HugeTLB, plus
>>       migration.
>>
>> Co-developed-by: David Hildenbrand <david@redhat.com>
>> Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> Signed-off-by: Gavin Shan <gshan@redhat.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
>> Reviewed-by: David Hildenbrand <david@redhat.com>
> 
> 
>> +static void virt_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev,
>> +                                        DeviceState *dev, Error **errp)
>> +{
>> +    HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
>> +    Error *local_err = NULL;
>> +
>> +    if (!hotplug_dev2 && dev->hotplugged) {
>> +        /*
>> +         * Without a bus hotplug handler, we cannot control the plug/unplug
>> +         * order. We should never reach this point when hotplugging on x86,
>> +         * however, better add a safety net.
>> +         */
> 
> This comment looks like it was cut-n-pasted from x86 -- is whatever
> it is that prevents us from reaching this point also true for arm ?
> (What is the thing that prevents us reaching this point?)
> 

Yeah, the comment was copied from x86. It's also true for ARM as a hotplug
controller on the parent bus is required for virtio-mem-pci device hot-add,
according to the following commit log.

commit a0a49813f7f2fc23bfe8a4fc6760e2a60c9a3e59
Author: David Hildenbrand <david@redhat.com>
Date:   Wed Jun 19 15:19:07 2019 +0530

     pc: Support for virtio-pmem-pci
     
     Override the device hotplug handler to properly handle the memory device
     part via virtio-pmem-pci callbacks from the machine hotplug handler and
     forward to the actual PCI bus hotplug handler.
     
     As PCI hotplug has not been properly factored out into hotplug handlers,
     most magic is performed in the (un)realize functions. Also some PCI host
     buses don't have a PCI hotplug handler at all yet, just to be sure that
     we always have a hotplug handler on x86, add a simple error check.
     
     Unlocking virtio-pmem will unlock virtio-pmem-pci.
     
     Signed-off-by: David Hildenbrand <david@redhat.com>

However, I don't think the comment we have for ARM is precise enough, because
x86 is irrelevant there. I will change it to something like the following in v4:

	/*
	 * Without a bus hotplug handler, we cannot control the plug/unplug
	 * order. We should never reach this point when hotplugging on ARM.
	 * However, it's nice to add a safety net, similar to what we have
	 * on x86.
	 */


>> +        error_setg(errp, "hotplug of virtio based memory devices not supported"
>> +                   " on this bus.");
>> +        return;
>> +    }
>> +    /*
>> +     * First, see if we can plug this memory device at all. If that
>> +     * succeeds, branch off to the actual hotplug handler.
>> +     */
>> +    memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev), NULL,
>> +                           &local_err);
>> +    if (!local_err && hotplug_dev2) {
>> +        hotplug_handler_pre_plug(hotplug_dev2, dev, &local_err);
>> +    }
>> +    error_propagate(errp, local_err);
>> +}
> 
> 
> 
>> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
>> index b20595a496..21e4d572ab 100644
>> --- a/hw/virtio/virtio-mem.c
>> +++ b/hw/virtio/virtio-mem.c
>> @@ -125,7 +125,7 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>>    * The memory block size corresponds mostly to the section size.
>>    *
>>    * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and
>> - * a section size of 1GB on arm64 (as long as the start address is properly
>> + * a section size of 512MB on arm64 (as long as the start address is properly
>>    * aligned, similar to ordinary DIMMs).
>>    *
>>    * We can change this at any time and maybe even make it configurable if
>> @@ -134,6 +134,8 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>>    */
>>   #if defined(TARGET_X86_64) || defined(TARGET_I386)
>>   #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
>> +#elif defined(TARGET_ARM)
>> +#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
>>   #else
>>   #error VIRTIO_MEM_USABLE_EXTENT not defined
>>   #endif
> 
> Could this comment explain where the 128MB and 512MB come from
> and why the value is different for different architectures ?
> 

Yes, the comment already explains it via the "section size", which is the
minimal hotpluggable unit. It's defined by the Linux guest kernel as shown
below. On ARM64, we pick the larger section size regardless of the base
page size. Besides, virtio-mem is, or will be, enabled only on x86_64 and
ARM64 guest kernels.

#define SECTION_SIZE_BITS  29      /* ARM64:  64KB base page size        */
#define SECTION_SIZE_BITS  27      /* ARM64:  16KB or 4KB base page size */
#define SECTION_SIZE_BITS  27      /* x86_64                             */
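
To make the numbers concrete, here is a minimal standalone sketch of the
arithmetic. It is an illustration only, not QEMU or kernel code: the helper
names section_size_bytes() and usable_extent_bytes() and the local MiB
macro are made up for this example. It shows how SECTION_SIZE_BITS of 27
and 29 give the 128MB and 512MB sections, and how the patch's factor of two
gives VIRTIO_MEM_USABLE_EXTENT.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition for this sketch only (assumption, not QEMU's header). */
#define MiB (1024ULL * 1024ULL)

/* Section size in bytes for a given SECTION_SIZE_BITS value. */
uint64_t section_size_bytes(unsigned int section_size_bits)
{
    assert(section_size_bits < 64);
    return UINT64_C(1) << section_size_bits;
}

/*
 * Usable extent as defined in the patch: twice the section size.
 * Hypothetical helper; the patch uses a compile-time macro instead.
 */
uint64_t usable_extent_bytes(unsigned int section_size_bits)
{
    return 2 * section_size_bytes(section_size_bits);
}
```

With these helpers, section_size_bytes(27) is 128 MiB and
section_size_bytes(29) is 512 MiB, so the arm64 usable extent comes out
as 1 GiB.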

Thanks,
Gavin
Peter Maydell Jan. 10, 2022, 10:50 a.m. UTC | #3
On Sat, 8 Jan 2022 at 07:22, Gavin Shan <gshan@redhat.com> wrote:
>
> Hi Peter,
>
> On 1/8/22 12:40 AM, Peter Maydell wrote:
> > On Fri, 3 Dec 2021 at 23:34, Gavin Shan <gshan@redhat.com> wrote:
> >> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> >> index b20595a496..21e4d572ab 100644
> >> --- a/hw/virtio/virtio-mem.c
> >> +++ b/hw/virtio/virtio-mem.c
> >> @@ -125,7 +125,7 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
> >>    * The memory block size corresponds mostly to the section size.
> >>    *
> >>    * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and
> >> - * a section size of 1GB on arm64 (as long as the start address is properly
> >> + * a section size of 512MB on arm64 (as long as the start address is properly
> >>    * aligned, similar to ordinary DIMMs).
> >>    *
> >>    * We can change this at any time and maybe even make it configurable if
> >> @@ -134,6 +134,8 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
> >>    */
> >>   #if defined(TARGET_X86_64) || defined(TARGET_I386)
> >>   #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
> >> +#elif defined(TARGET_ARM)
> >> +#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
> >>   #else
> >>   #error VIRTIO_MEM_USABLE_EXTENT not defined
> >>   #endif
> >
> > Could this comment explain where the 128MB and 512MB come from
> > and why the value is different for different architectures ?
> >
>
> Yes, the comment already explained it by "section size", which is the
> minimal hotpluggable unit. It's defined by the linux guest kernel as
> below. On ARM64, we pick the larger section size without considering
> the base page size. Besides, the virtio-mem is/will-be enabled on
> x86_64 and ARM64 guest kernel only.

Oh, so "section" is a Linux kernel concept? It wasn't clear to me
that that was a fixed value rather than something we were arbitrarily
defining in QEMU.

-- PMM
David Hildenbrand Jan. 10, 2022, 10:59 a.m. UTC | #4
On 10.01.22 11:50, Peter Maydell wrote:
> On Sat, 8 Jan 2022 at 07:22, Gavin Shan <gshan@redhat.com> wrote:
>>
>> Hi Peter,
>>
>> On 1/8/22 12:40 AM, Peter Maydell wrote:
>>> On Fri, 3 Dec 2021 at 23:34, Gavin Shan <gshan@redhat.com> wrote:
>>>> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
>>>> index b20595a496..21e4d572ab 100644
>>>> --- a/hw/virtio/virtio-mem.c
>>>> +++ b/hw/virtio/virtio-mem.c
>>>> @@ -125,7 +125,7 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>>>>    * The memory block size corresponds mostly to the section size.
>>>>    *
>>>>    * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and
>>>> - * a section size of 1GB on arm64 (as long as the start address is properly
>>>> + * a section size of 512MB on arm64 (as long as the start address is properly
>>>>    * aligned, similar to ordinary DIMMs).
>>>>    *
>>>>    * We can change this at any time and maybe even make it configurable if
>>>> @@ -134,6 +134,8 @@ static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
>>>>    */
>>>>   #if defined(TARGET_X86_64) || defined(TARGET_I386)
>>>>   #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
>>>> +#elif defined(TARGET_ARM)
>>>> +#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
>>>>   #else
>>>>   #error VIRTIO_MEM_USABLE_EXTENT not defined
>>>>   #endif
>>>
>>> Could this comment explain where the 128MB and 512MB come from
>>> and why the value is different for different architectures ?
>>>
>>
>> Yes, the comment already explained it by "section size", which is the
>> minimal hotpluggable unit. It's defined by the linux guest kernel as
>> below. On ARM64, we pick the larger section size without considering
>> the base page size. Besides, the virtio-mem is/will-be enabled on
>> x86_64 and ARM64 guest kernel only.
> 
> Oh, so "section" is a Linux kernel concept? It wasn't clear to me
> that that was a fixed value rather than something we were arbitrarily
> defining in QEMU.

It's a somewhat arbitrary value, as the section size can change in the
future, and there are other memory hotplug granularity restrictions on
some architectures (e.g., x86 with a boot memory size of >64GiB can
usually only hotplug in 2 GiB granularity, not 128 MiB granularity). So
what we're doing here is really best-effort for the Linux guests we expect.
As the comment states, we can always change that magic value in the
future if there is a need to (e.g., increase it to the granularity we expect).

If our guesstimate is wrong, the guest won't be able to hotplug all
requested device memory until we actually increase the requested size
such that it becomes possible again for the VM.

We'd be running into similar issues when trying to hotplug a 128MiB DIMM
to an arm64 64k guest: Linux guests can currently only hotplug in 512 MiB
granularity in such a setup, and smaller DIMMs simply cannot be exposed
to the page allocator and remain essentially unusable. But in contrast to
DIMMs, with virtio-mem we can actually detect that the guest cannot make
use of that memory, figure out why, and optimize.
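
The granularity effect described above can be sketched roughly as follows.
This is a simplified model and an assumption on my side, not what the Linux
guest or QEMU actually implements: it treats memory as usable only in whole,
naturally aligned chunks of the hotplug granularity. The function names and
the MiB macro are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Local definition for this sketch only. */
#define MiB (1024ULL * 1024ULL)

/* Round value up to the next multiple of granularity. */
uint64_t align_up(uint64_t value, uint64_t granularity)
{
    assert(granularity != 0);
    return (value + granularity - 1) / granularity * granularity;
}

/* Round value down to the previous multiple of granularity. */
uint64_t align_down(uint64_t value, uint64_t granularity)
{
    assert(granularity != 0);
    return value / granularity * granularity;
}

/*
 * Bytes of [start, start + size) that fall into whole, aligned chunks.
 * Overflow of start + size is not handled in this sketch.
 */
uint64_t hotpluggable_bytes(uint64_t start, uint64_t size,
                            uint64_t granularity)
{
    uint64_t first = align_up(start, granularity);
    uint64_t last = align_down(start + size, granularity);

    return last > first ? last - first : 0;
}
```

In this model, a 128 MiB region with a 512 MiB granularity yields zero
usable bytes, which corresponds to the unusable small DIMM case, while a
1 GiB region starting at a 512 MiB boundary is fully usable.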

Patch

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 2d37d29f02..15aff8efb8 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -27,6 +27,7 @@  config ARM_VIRT
     select DIMM
     select ACPI_HW_REDUCED
     select ACPI_APEI
+    select VIRTIO_MEM_SUPPORTED
 
 config CHEETAH
     bool
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 30da05dfe0..db1544760d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -72,9 +72,11 @@ 
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
 #include "target/arm/internals.h"
+#include "hw/mem/memory-device.h"
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/generic_event_device.h"
+#include "hw/virtio/virtio-mem-pci.h"
 #include "hw/virtio/virtio-iommu.h"
 #include "hw/char/pl011.h"
 #include "qemu/guest-random.h"
@@ -2483,6 +2485,63 @@  static void virt_memory_plug(HotplugHandler *hotplug_dev,
                          dev, &error_abort);
 }
 
+static void virt_virtio_md_pci_pre_plug(HotplugHandler *hotplug_dev,
+                                        DeviceState *dev, Error **errp)
+{
+    HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
+    Error *local_err = NULL;
+
+    if (!hotplug_dev2 && dev->hotplugged) {
+        /*
+         * Without a bus hotplug handler, we cannot control the plug/unplug
+         * order. We should never reach this point when hotplugging on x86,
+         * however, better add a safety net.
+         */
+        error_setg(errp, "hotplug of virtio based memory devices not supported"
+                   " on this bus.");
+        return;
+    }
+    /*
+     * First, see if we can plug this memory device at all. If that
+     * succeeds, branch off to the actual hotplug handler.
+     */
+    memory_device_pre_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev), NULL,
+                           &local_err);
+    if (!local_err && hotplug_dev2) {
+        hotplug_handler_pre_plug(hotplug_dev2, dev, &local_err);
+    }
+    error_propagate(errp, local_err);
+}
+
+static void virt_virtio_md_pci_plug(HotplugHandler *hotplug_dev,
+                                    DeviceState *dev, Error **errp)
+{
+    HotplugHandler *hotplug_dev2 = qdev_get_bus_hotplug_handler(dev);
+    Error *local_err = NULL;
+
+    /*
+     * Plug the memory device first and then branch off to the actual
+     * hotplug handler. If that one fails, we can easily undo the memory
+     * device bits.
+     */
+    memory_device_plug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev));
+    if (hotplug_dev2) {
+        hotplug_handler_plug(hotplug_dev2, dev, &local_err);
+        if (local_err) {
+            memory_device_unplug(MEMORY_DEVICE(dev), MACHINE(hotplug_dev));
+        }
+    }
+    error_propagate(errp, local_err);
+}
+
+static void virt_virtio_md_pci_unplug_request(HotplugHandler *hotplug_dev,
+                                              DeviceState *dev, Error **errp)
+{
+    /* We don't support hot unplug of virtio based memory devices */
+    error_setg(errp, "virtio based memory devices cannot be unplugged.");
+}
+
+
 static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
                                             DeviceState *dev, Error **errp)
 {
@@ -2516,6 +2575,8 @@  static void virt_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
         qdev_prop_set_uint32(dev, "len-reserved-regions", 1);
         qdev_prop_set_string(dev, "reserved-regions[0]", resv_prop_str);
         g_free(resv_prop_str);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+        virt_virtio_md_pci_pre_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2541,6 +2602,8 @@  static void virt_machine_device_plug_cb(HotplugHandler *hotplug_dev,
         vms->iommu = VIRT_IOMMU_VIRTIO;
         vms->virtio_iommu_bdf = pci_get_bdf(pdev);
         create_virtio_iommu_dt_bindings(vms);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+        virt_virtio_md_pci_plug(hotplug_dev, dev, errp);
     }
 }
 
@@ -2591,6 +2654,8 @@  static void virt_machine_device_unplug_request_cb(HotplugHandler *hotplug_dev,
 {
     if (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM)) {
         virt_dimm_unplug_request(hotplug_dev, dev, errp);
+    } else if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
+        virt_virtio_md_pci_unplug_request(hotplug_dev, dev, errp);
     } else {
         error_setg(errp, "device unplug request for unsupported device"
                    " type: %s", object_get_typename(OBJECT(dev)));
@@ -2614,7 +2679,8 @@  static HotplugHandler *virt_machine_get_hotplug_handler(MachineState *machine,
     MachineClass *mc = MACHINE_GET_CLASS(machine);
 
     if (device_is_dynamic_sysbus(mc, dev) ||
-       (object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM))) {
+        object_dynamic_cast(OBJECT(dev), TYPE_PC_DIMM) ||
+        object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_MEM_PCI)) {
         return HOTPLUG_HANDLER(machine);
     }
     if (object_dynamic_cast(OBJECT(dev), TYPE_VIRTIO_IOMMU_PCI)) {
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index b20595a496..21e4d572ab 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -125,7 +125,7 @@  static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
  * The memory block size corresponds mostly to the section size.
  *
  * This allows e.g., to add 20MB with a section size of 128MB on x86_64, and
- * a section size of 1GB on arm64 (as long as the start address is properly
+ * a section size of 512MB on arm64 (as long as the start address is properly
  * aligned, similar to ordinary DIMMs).
  *
  * We can change this at any time and maybe even make it configurable if
@@ -134,6 +134,8 @@  static uint64_t virtio_mem_default_block_size(RAMBlock *rb)
  */
 #if defined(TARGET_X86_64) || defined(TARGET_I386)
 #define VIRTIO_MEM_USABLE_EXTENT (2 * (128 * MiB))
+#elif defined(TARGET_ARM)
+#define VIRTIO_MEM_USABLE_EXTENT (2 * (512 * MiB))
 #else
 #error VIRTIO_MEM_USABLE_EXTENT not defined
 #endif
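
The ordering used by virt_virtio_md_pci_plug() above (plug the memory device
part first, forward to the bus hotplug handler, undo on failure) can be
sketched in isolation as below. This is a mock under stated assumptions, not
QEMU code: MockMachine, MockBusPlugFn, mock_plug() and the two mock bus
handlers are hypothetical stand-ins for the machine state, the bus hotplug
handler and the memory_device_plug()/memory_device_unplug() pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the machine's memory-device bookkeeping. */
typedef struct {
    int plugged_devices;
} MockMachine;

/* Hypothetical stand-in for a bus hotplug handler; false means failure. */
typedef bool (*MockBusPlugFn)(void *opaque);

bool mock_plug(MockMachine *m, MockBusPlugFn bus_plug, void *opaque)
{
    assert(m != NULL);

    /* Step 1: plug the memory device part (cannot fail in this mock). */
    m->plugged_devices++;

    /* Step 2: forward to the bus handler, if there is one. */
    if (bus_plug && !bus_plug(opaque)) {
        /* Step 3: the bus handler failed, so undo step 1. */
        m->plugged_devices--;
        return false;
    }
    return true;
}

/* Mock bus handlers used to exercise both paths. */
bool mock_bus_ok(void *opaque)
{
    (void)opaque;
    return true;
}

bool mock_bus_fail(void *opaque)
{
    (void)opaque;
    return false;
}
```

Calling mock_plug() with mock_bus_fail leaves plugged_devices unchanged,
which is the property the rollback in the real handler is meant to provide.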