Message ID: 20210224094910.44986-1-david@redhat.com (mailing list archive)
Series: virtio-mem: vfio support
On 24.02.21 10:48, David Hildenbrand wrote:
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU,
> mapped into system memory address space. Before the guest is allowed to
> use memory blocks, it must coordinate with the hypervisor (plug blocks).
> After a reboot, all memory is usually unplugged - when the guest comes
> up, it detects the virtio-mem device and selects memory blocks to plug
> (based on resize requests from the hypervisor).
>
> Memory hot(un)plug consists of (un)plugging memory blocks via a
> virtio-mem device (triggered by the guest). When unplugging blocks, we
> discard the memory - similar to memory balloon inflation. In contrast to
> memory ballooning, we always know which memory blocks a guest may
> actually use - especially during a reboot, after a crash, or after kexec
> (and during hibernation as well). Guests agree not to access unplugged
> memory again, especially not via DMA.
>
> The issue with vfio is that it cannot deal with random discards - for
> this reason, virtio-mem and vfio can currently only run mutually
> exclusively. In particular, vfio would currently map the whole memory
> region (with possibly only few or no plugged blocks), resulting in all
> pages getting pinned and therefore in a higher memory consumption than
> expected (rendering virtio-mem basically useless in these environments).
>
> To make vfio work nicely with virtio-mem, we have to map only the
> plugged blocks, and map/unmap properly when plugging/unplugging blocks
> (including discarding of RAM when unplugging). We achieve that by using
> a new notifier mechanism that communicates changes.
>
> It's important to map memory at the granularity in which we could see
> unmaps again (-> the virtio-mem block size) - so when, e.g., plugging
> 100 MB of consecutive memory with a block size of 2 MB, we need 50
> mappings. When unmapping, we can use a single vfio_unmap call for the
> applicable range. We expect that the block size of virtio-mem devices
> will be fairly large in the future (to not run out of mappings and to
> improve hot(un)plug performance), configured by the user when used with
> vfio (e.g., 128MB, 1G, ...), but it will depend on the setup.
>
> More info regarding virtio-mem can be found at:
> https://virtio-mem.gitlab.io/
>
> v7 is located at:
> git@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v7

Gentle ping.
On 02.03.21 17:46, David Hildenbrand wrote:
> On 24.02.21 10:48, David Hildenbrand wrote:
>> A virtio-mem device manages a memory region in guest physical address
>> space, represented as a single (currently large) memory region in QEMU,
>> mapped into system memory address space.
>> [...]
>
> Gentle ping.

@Paolo, can you have another look? Thanks.
On 24.02.21 10:48, David Hildenbrand wrote:
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU,
> mapped into system memory address space.
> [...]
>
> More info regarding virtio-mem can be found at:
> https://virtio-mem.gitlab.io/
>
> v7 is located at:
> git@github.com:davidhildenbrand/qemu.git virtio-mem-vfio-v7
>
> v6 -> v7:
> - s/RamDiscardMgr/RamDiscardManager/
> - "memory: Introduce RamDiscardManager for RAM memory regions"
> -- Make RamDiscardManager/RamDiscardListener eat MemoryRegionSections
> -- Replace notify_discard_all callback by double_discard_supported
> -- Reshuffle the individual hunks in memory.h
> -- Provide function wrappers for RamDiscardManager calls
> - "memory: Helpers to copy/free a MemoryRegionSection"
> -- Added
> - "virtio-mem: Implement RamDiscardManager interface"
> -- Work on MemoryRegionSections instead of ranges
> -- Minor optimizations
> - "vfio: Support for RamDiscardManager in the !vIOMMU case"
> -- Simplify based on new interfaces / MemoryRegionSections
> -- Minor cleanups and optimizations
> -- Add a comment regarding dirty bitmap sync.
> -- Don't store "offset_within_region" in VFIORamDiscardListener
> - "vfio: Support for RamDiscardManager in the vIOMMU case"
> -- Adjust to new interface
> - "softmmu/physmem: Don't use atomic operations in ..."
> -- Rename variables
> - "softmmu/physmem: Extend ram_block_discard_(require|disable) ..."
> -- Rename variables
> - Rebased and retested
>
> v5 -> v6:
> - "memory: Introduce RamDiscardMgr for RAM memory regions"
> -- Fix variable names in one prototype.
> - "virtio-mem: Don't report errors when ram_block_discard_range() fails"
> -- Added
> - "virtio-mem: Implement RamDiscardMgr interface"
> -- Don't report an error if discarding fails
> - Rebased and retested
>
> v4 -> v5:
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Added more assertions for granularity vs. iommu supported pagesize
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Fix accounting of mappings
> - "vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus"
> -- Fence off SPAPR and add some comments regarding future support.
> -- Tweak patch description
> - Rebase and retest
>
> v3 -> v4:
> - "vfio: Query and store the maximum number of DMA mappings"
> -- Limit the patch to querying and storing only
> -- Renamed to "vfio: Query and store the maximum number of possible DMA
>    mappings"
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Remove sanity checks / warning the user
> - "vfio: Sanity check maximum number of DMA mappings with RamDiscardMgr"
> -- Perform sanity checks by looking at the number of memslots and all
>    registered RamDiscardMgr sections
> - Rebase and retest
> - Reshuffled the patches slightly
>
> v2 -> v3:
> - Rebased + retested
> - Fixed some typos
> - Added RB's
>
> v1 -> v2:
> - "memory: Introduce RamDiscardMgr for RAM memory regions"
> -- Fix some errors in the documentation
> -- Make register_listener() notify about populated parts and
>    unregister_listener() notify about discarding populated parts, to
>    simplify future locking inside virtio-mem, when handling requests via
>    a separate thread.
> - "vfio: Query and store the maximum number of DMA mappings"
> -- Query number of mappings and track mappings (except for vIOMMU)
> - "vfio: Support for RamDiscardMgr in the !vIOMMU case"
> -- Adapt to RamDiscardMgr changes and warn via generic DMA reservation
> - "vfio: Support for RamDiscardMgr in the vIOMMU case"
> -- Use vmstate priority to handle migration dependencies
>
> RFC -> v1:
> - VFIO migration code. Due to missing kernel support, I cannot really
>   test if that part works.
> - Understand/test/document vIOMMU implications, also regarding migration
> - Nicer ram_block_discard_disable/require handling.
> - s/SparseRAMHandler/RamDiscardMgr/, refactorings, cleanups,
>   documentation, testing, ...
>
> David Hildenbrand (13):
>   memory: Introduce RamDiscardManager for RAM memory regions
>   memory: Helpers to copy/free a MemoryRegionSection
>   virtio-mem: Factor out traversing unplugged ranges
>   virtio-mem: Don't report errors when ram_block_discard_range() fails
>   virtio-mem: Implement RamDiscardManager interface
>   vfio: Support for RamDiscardManager in the !vIOMMU case
>   vfio: Query and store the maximum number of possible DMA mappings
>   vfio: Sanity check maximum number of DMA mappings with
>     RamDiscardManager
>   vfio: Support for RamDiscardManager in the vIOMMU case
>   softmmu/physmem: Don't use atomic operations in
>     ram_block_discard_(disable|require)
>   softmmu/physmem: Extend ram_block_discard_(require|disable) by two
>     discard types
>   virtio-mem: Require only coordinated discards
>   vfio: Disable only uncoordinated discards for VFIO_TYPE1 iommus
>
>  hw/vfio/common.c               | 315 +++++++++++++++++++++++++-
>  hw/virtio/virtio-mem.c         | 391 ++++++++++++++++++++++++++++-----
>  include/exec/memory.h          | 324 +++++++++++++++++++++++++--
>  include/hw/vfio/vfio-common.h  |  12 +
>  include/hw/virtio/virtio-mem.h |   3 +
>  include/migration/vmstate.h    |   1 +
>  softmmu/memory.c               |  98 +++++++++
>  softmmu/physmem.c              | 108 ++++++---
>  8 files changed, 1133 insertions(+), 119 deletions(-)
Another gentle ping; it's been almost a month with no feedback. I hope we can get this into 6.1 early, because I have other work coming up that relies on the RamDiscardManager infrastructure. Feedback/acks are appreciated so I can finally make progress with this.