Message ID | 20201119153918.120976-1-david@redhat.com (mailing list archive) |
---|---|
Series | virtio-mem: vfio support |
On 19.11.20 16:39, David Hildenbrand wrote:
> This is the follow-up of:
> https://lkml.kernel.org/r/20200924160423.106747-1-david@redhat.com
> to make vfio and virtio-mem play together. The basic idea was the result of
> Alex brainstorming with me on how to tackle this.
>
> A virtio-mem device manages a memory region in guest physical address
> space, represented as a single (currently large) memory region in QEMU,
> mapped into system memory address space. Before the guest is allowed to use
> memory blocks, it must coordinate with the hypervisor (plug blocks). After
> a reboot, all memory is usually unplugged - when the guest comes up, it
> detects the virtio-mem device and selects memory blocks to plug (based on
> resize requests from the hypervisor).
>
> Memory hot(un)plug consists of (un)plugging memory blocks via a virtio-mem
> device (triggered by the guest). When unplugging blocks, we discard the
> memory - similar to memory balloon inflation. In contrast to memory
> ballooning, we always know which memory blocks a guest may actually use -
> especially during a reboot, after a crash, or after kexec (and during
> hibernation as well). Guests agreed to not access unplugged memory again,
> especially not via DMA.
>
> The issue with vfio is that it cannot deal with random discards - for this
> reason, virtio-mem and vfio are currently mutually exclusive.
> In particular, vfio would currently map the whole memory region (with
> possibly only few/no plugged blocks), resulting in all pages getting pinned
> and therefore in a higher memory consumption than expected (making
> virtio-mem basically useless in these environments).
>
> To make vfio work nicely with virtio-mem, we have to map only the plugged
> blocks, and map/unmap properly when plugging/unplugging blocks (including
> discarding of RAM when unplugging). We achieve that by using a new notifier
> mechanism that communicates changes.
>
> It's important to map memory in the granularity in which we could see
> unmaps again (-> virtio-mem block size) - so when e.g., plugging
> consecutive 100 MB with a block size of 2 MB, we need 50 mappings. When
> unmapping, we can use a single vfio_unmap call for the applicable range.
> We expect that the block size of virtio-mem devices will be fairly large
> in the future (to not run out of mappings and to improve hot(un)plug
> performance), configured by the user, when used with vfio (e.g., 128 MB,
> 1G, ...).
>
> More info regarding virtio-mem can be found at:
> https://virtio-mem.gitlab.io/
> I'll add a guide for virtio-mem+vfio soonish.

There is now a guide/example at:

https://virtio-mem.gitlab.io/user-guide/user-guide-qemu.html#vfio-vfio-pci
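
To illustrate the idea from the cover letter (map only plugged blocks, at block granularity, and communicate changes through a notifier), here is a rough Python model. It is only a sketch: BlockDevice, MappingListener, notify_plug and notify_unplug are made-up names and not QEMU, vfio or virtio-mem identifiers. The order of "notify, then mark unplugged" is an assumption, and MB is treated as 2^20 bytes.

```python
# Illustrative model only: all class and method names are hypothetical and
# do not correspond to QEMU, vfio, or virtio-mem identifiers.

MiB = 1024 * 1024  # assumption: "MB" in the cover letter means 2^20 bytes


class MappingListener:
    """Stands in for a vfio-like consumer that maps only plugged memory."""

    def __init__(self):
        self.mappings = set()  # (offset, size) of each live mapping
        self.map_calls = 0
        self.unmap_calls = 0

    def notify_plug(self, offset, size, block_size):
        # Map per block so that any single block can be unmapped later.
        for start in range(offset, offset + size, block_size):
            self.mappings.add((start, block_size))
            self.map_calls += 1

    def notify_unplug(self, offset, size):
        # One unmap call drops every mapping fully inside the range.
        end = offset + size
        self.mappings = {
            (o, s) for (o, s) in self.mappings
            if not (offset <= o and o + s <= end)
        }
        self.unmap_calls += 1


class BlockDevice:
    """Stands in for a virtio-mem-like device tracking plugged blocks."""

    def __init__(self, region_size, block_size):
        assert region_size % block_size == 0
        self.block_size = block_size
        self.plugged = [False] * (region_size // block_size)
        self.listeners = []

    def register(self, listener):
        self.listeners.append(listener)

    def _set(self, offset, size, state):
        assert offset % self.block_size == 0 and size % self.block_size == 0
        first = offset // self.block_size
        for i in range(first, first + size // self.block_size):
            self.plugged[i] = state

    def plug(self, offset, size):
        self._set(offset, size, True)
        for listener in self.listeners:
            listener.notify_plug(offset, size, self.block_size)

    def unplug(self, offset, size):
        # Assumed ordering: listeners unmap before the blocks are marked
        # unplugged (a real device would discard the RAM at this point).
        for listener in self.listeners:
            listener.notify_unplug(offset, size)
        self._set(offset, size, False)


def demo():
    dev = BlockDevice(region_size=1024 * MiB, block_size=2 * MiB)
    vfio_like = MappingListener()
    dev.register(vfio_like)
    dev.plug(0, 100 * MiB)            # 100 MB at 2 MB blocks
    mapped_after_plug = len(vfio_like.mappings)
    dev.unplug(0, 100 * MiB)          # a single unmap for the whole range
    return mapped_after_plug, vfio_like.unmap_calls, len(vfio_like.mappings)
```

Calling demo() plugs 100 MB with a 2 MB block size and returns the number of mappings created (50, matching the figure in the cover letter), the number of unmap calls (1), and the mappings left afterwards (0). With a larger block size such as 128 MB the same range needs far fewer mappings, which is the motivation given above for large blocks when vfio is in use.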