diff mbox series

[PROTOTYPE,1/6] memory: Introduce sparse RAM handler for memory regions

Message ID 20200924160423.106747-2-david@redhat.com (mailing list archive)
State New, archived
Headers show
Series virtio-mem: vfio support | expand

Commit Message

David Hildenbrand Sept. 24, 2020, 4:04 p.m. UTC
We have some special memory ram regions (managed by paravirtualized memory
devices - virtio-mem), whereby the guest agreed to only use selected memory
ranges. This results in "sparse" mmaps, "sparse" RAMBlocks and "sparse"
memory ram regions.

In most cases, we don't currently care about that - e.g., in KVM, we simply
have a single KVM memory slot (and as the number is fairly limited, we'll
have to keep it like that). However, in case of vfio, registering the
whole region with the kernel results in all pages getting pinned, and
therefore an unexpected high memory consumption. This is the main
reason why vfio is incompatible with memory ballooning.

Let's introduce a way to communicate the actual accessible/mapped (meaning,
not discarded) pieces for such a sparse memory region, and get notified on
changes (e.g., a virito-mem device plugging/unplugging memory).

We expect that the SparseRAMHandler is set for a memory region before it
is mapped into guest physical address space (so before any memory
listeners get notified about the addition), and the SparseRAMHandler isn't
unset before the memory region was unmapped from guest physical address
space (so after any memory listener got notified about the removal).

This is somewhat similar to the iommu memory region notifier mechanism.

TODO:
- Better documentation.
- Better Naming?
- Handle it on RAMBlocks?
- SPAPR spacial handling required (virtio-mem only supports x86-64 for now)?
- Catch mapping errors during hotplug in a nice way
- Fail early when a certain number of mappings would be exceeded
  (instead of eventually consuming too many, leaving none for others)
- Resizeable memory region handling (future).
- Callback to check the state of a block.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/exec/memory.h | 115 ++++++++++++++++++++++++++++++++++++++++++
 softmmu/memory.c      |   7 +++
 2 files changed, 122 insertions(+)

Comments

Peter Xu Oct. 20, 2020, 7:24 p.m. UTC | #1
On Thu, Sep 24, 2020 at 06:04:18PM +0200, David Hildenbrand wrote:
> +static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
> +                                                        SparseRAMHandler *srh)
> +{
> +    g_assert(memory_region_is_ram(mr));

Nit: Maybe assert mr->srh==NULL here?  If sparse ram handler is exclusive,
which afaiu, is a yes.

> +    mr->srh = srh;
> +}
> +
> +static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
> +                                                           SparseRAMNotifier *n)
> +{
> +    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
> +    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
> +
> +    srhc->register_listener(srh, mr, n);

I feel like you need to check srhc!=NULL first or vfio may start crash without
virtio-mem...  Same question to the other ones (at least unregister).
David Hildenbrand Oct. 20, 2020, 8:13 p.m. UTC | #2
On 20.10.20 21:24, Peter Xu wrote:
> On Thu, Sep 24, 2020 at 06:04:18PM +0200, David Hildenbrand wrote:
>> +static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
>> +                                                        SparseRAMHandler *srh)
>> +{
>> +    g_assert(memory_region_is_ram(mr));
> 
> Nit: Maybe assert mr->srh==NULL here?  If sparse ram handler is exclusive,
> which afaiu, is a yes.

Indeed. The owner of the memory region.

> 
>> +    mr->srh = srh;
>> +}
>> +
>> +static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
>> +                                                           SparseRAMNotifier *n)
>> +{
>> +    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
>> +    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
>> +
>> +    srhc->register_listener(srh, mr, n);
> 
> I feel like you need to check srhc!=NULL first or vfio may start crash without
> virtio-mem...  Same question to the other ones (at least unregister).

I think nobody should be calling this function unless
memory_region_is_sparse_ram() returns true.

I'm still considering moving it down to RAMBlock level. Feels more
natural, and would allow other RAMBlock users figure out what's going
on. If only my list of TODO items wouldn't be that long ... :)

Thanks Peter!
diff mbox series

Patch

diff --git a/include/exec/memory.h b/include/exec/memory.h
index f1bb2a7df5..2931ead730 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -42,6 +42,12 @@  typedef struct IOMMUMemoryRegionClass IOMMUMemoryRegionClass;
 DECLARE_OBJ_CHECKERS(IOMMUMemoryRegion, IOMMUMemoryRegionClass,
                      IOMMU_MEMORY_REGION, TYPE_IOMMU_MEMORY_REGION)
 
+#define TYPE_SPARSE_RAM_HANDLER "sparse-ram-handler"
+typedef struct SparseRAMHandlerClass SparseRAMHandlerClass;
+typedef struct SparseRAMHandler SparseRAMHandler;
+DECLARE_OBJ_CHECKERS(SparseRAMHandler, SparseRAMHandlerClass,
+                     SPARSE_RAM_HANDLER, TYPE_SPARSE_RAM_HANDLER)
+
 extern bool global_dirty_log;
 
 typedef struct MemoryRegionOps MemoryRegionOps;
@@ -136,6 +142,28 @@  static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
     n->iommu_idx = iommu_idx;
 }
 
+struct SparseRAMNotifier;
+typedef int (*SparseRAMNotifyMap)(struct SparseRAMNotifier *notifier,
+                                  const MemoryRegion *mr, uint64_t mr_offset,
+                                  uint64_t size);
+typedef void (*SparseRAMNotifyUnmap)(struct SparseRAMNotifier *notifier,
+                                     const MemoryRegion *mr, uint64_t mr_offset,
+                                     uint64_t size);
+
+typedef struct SparseRAMNotifier {
+    SparseRAMNotifyMap notify_map;
+    SparseRAMNotifyUnmap notify_unmap;
+    QLIST_ENTRY(SparseRAMNotifier) next;
+} SparseRAMNotifier;
+
+static inline void sparse_ram_notifier_init(SparseRAMNotifier *notifier,
+                                            SparseRAMNotifyMap map_fn,
+                                            SparseRAMNotifyUnmap unmap_fn)
+{
+    notifier->notify_map = map_fn;
+    notifier->notify_unmap = unmap_fn;
+}
+
 /*
  * Memory region callbacks
  */
@@ -352,6 +380,36 @@  struct IOMMUMemoryRegionClass {
     int (*num_indexes)(IOMMUMemoryRegion *iommu);
 };
 
+struct SparseRAMHandlerClass {
+    /* private */
+    InterfaceClass parent_class;
+
+    /*
+     * Returns the minimum granularity in which (granularity-aligned pieces
+     * within the memory region) can become either be mapped or unmapped.
+     */
+    uint64_t (*get_granularity)(const SparseRAMHandler *srh,
+                                const MemoryRegion *mr);
+
+    /*
+     * Register a listener for mapping changes.
+     */
+    void (*register_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                              SparseRAMNotifier *notifier);
+
+    /*
+     * Unregister a listener for mapping changes.
+     */
+    void (*unregister_listener)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                                SparseRAMNotifier *notifier);
+
+    /*
+     * Replay notifications for mapped RAM.
+     */
+    int (*replay_mapped)(SparseRAMHandler *srh, const MemoryRegion *mr,
+                         SparseRAMNotifier *notifier);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
@@ -399,6 +457,7 @@  struct MemoryRegion {
     const char *name;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    SparseRAMHandler *srh; /* For RAM only */
 };
 
 struct IOMMUMemoryRegion {
@@ -1889,6 +1948,62 @@  bool memory_region_present(MemoryRegion *container, hwaddr addr);
  */
 bool memory_region_is_mapped(MemoryRegion *mr);
 
+
+static inline SparseRAMHandler* memory_region_get_sparse_ram_handler(
+                                                               MemoryRegion *mr)
+{
+    return mr->srh;
+}
+
+static inline bool memory_region_is_sparse_ram(MemoryRegion *mr)
+{
+    return memory_region_get_sparse_ram_handler(mr) != NULL;
+}
+
+static inline void memory_region_set_sparse_ram_handler(MemoryRegion *mr,
+                                                        SparseRAMHandler *srh)
+{
+    g_assert(memory_region_is_ram(mr));
+    mr->srh = srh;
+}
+
+static inline void memory_region_register_sparse_ram_notifier(MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->register_listener(srh, mr, n);
+}
+
+static inline void memory_region_unregister_sparse_ram_notifier(
+                                                               MemoryRegion *mr,
+                                                           SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    srhc->unregister_listener(srh, mr, n);
+}
+
+static inline uint64_t memory_region_sparse_ram_get_granularity(
+                                                               MemoryRegion *mr)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->get_granularity(srh, mr);
+}
+
+static inline int memory_region_sparse_ram_replay_mapped(MemoryRegion *mr,
+                                                         SparseRAMNotifier *n)
+{
+    SparseRAMHandler *srh = memory_region_get_sparse_ram_handler(mr);
+    SparseRAMHandlerClass *srhc = SPARSE_RAM_HANDLER_GET_CLASS(srh);
+
+    return srhc->replay_mapped(srh, mr, n);
+}
+
 /**
  * memory_region_find: translate an address/size relative to a
  * MemoryRegion into a #MemoryRegionSection.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index d030eb6f7c..89649f52f7 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -3241,10 +3241,17 @@  static const TypeInfo iommu_memory_region_info = {
     .abstract           = true,
 };
 
+static const TypeInfo sparse_ram_handler_info = {
+    .parent             = TYPE_INTERFACE,
+    .name               = TYPE_SPARSE_RAM_HANDLER,
+    .class_size         = sizeof(SparseRAMHandlerClass),
+};
+
 static void memory_register_types(void)
 {
     type_register_static(&memory_region_info);
     type_register_static(&iommu_memory_region_info);
+    type_register_static(&sparse_ram_handler_info);
 }
 
 type_init(memory_register_types)