Message ID | 1457001868-15949-5-git-send-email-liang.z.li@intel.com (mailing list archive) |
---|---|
State | New, archived |
On Thu, 3 Mar 2016 18:44:28 +0800
Liang Li <liang.z.li@intel.com> wrote:

> Get the free pages information through virtio and filter out the free
> pages in the ram bulk stage. This can significantly reduce the total
> live migration time as well as network traffic.
>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  migration/ram.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 6 deletions(-)
>

> @@ -1945,6 +1971,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>                                           DIRTY_MEMORY_MIGRATION);
>      }
>      memory_global_dirty_log_start();
> +
> +    if (balloon_free_pages_support() &&
> +        balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> +                               &free_pages_count) == 0) {
> +        qemu_mutex_unlock_iothread();
> +        while (balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> +                                      &free_pages_count) == 0) {
> +            usleep(1000);
> +        }
> +        qemu_mutex_lock_iothread();
> +
> +        filter_out_guest_free_pages(migration_bitmap_rcu->free_pages_bmap);

A general comment: Using the ballooner to get information about pages
that can be filtered out is too limited (there may be other ways to do
this; we might be able to use cmma on s390, for example), and I don't
like hardcoding to a specific method.

What about the reverse approach: Code may register a handler that
populates the free_pages_bitmap which is called during this stage?

<I like the idea of filtering in general, but I haven't looked at the
code yet>

> +    }
> +
>      migration_bitmap_sync();
>      qemu_mutex_unlock_ramlist();
>      qemu_mutex_unlock_iothread();

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
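The "reverse approach" suggested above could look roughly like the sketch below: instead of the migration code calling the balloon directly, any backend (balloon, or something like cmma on s390) registers a callback, and the setup stage just asks the registered callbacks to fill the bitmap. Everything here is hypothetical and only meant to illustrate the shape of such an interface: `FreePageProvider`, `free_page_provider_register()`, `free_page_providers_populate()` and the two `demo_provider_*` functions are invented names, not existing QEMU APIs.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))
#define MAX_PROVIDERS 4

/* Hypothetical callback: set one bit in bmap for every page known to be
 * free (nbits bits in total). Returns 0 on success, non-zero otherwise. */
typedef int (*FreePageProvider)(unsigned long *bmap, size_t nbits,
                                void *opaque);

static struct {
    FreePageProvider fn;
    void *opaque;
} providers[MAX_PROVIDERS];
static size_t nr_providers;

/* Register a provider; returns 0 on success, -1 if the table is full. */
int free_page_provider_register(FreePageProvider fn, void *opaque)
{
    assert(fn != NULL);
    if (nr_providers == MAX_PROVIDERS) {
        return -1;
    }
    providers[nr_providers].fn = fn;
    providers[nr_providers].opaque = opaque;
    nr_providers++;
    return 0;
}

/* Try each provider in registration order, clearing the bitmap before
 * each attempt so a failed provider cannot leave stale bits behind.
 * Returns 0 as soon as one provider succeeds, -1 if none did. */
int free_page_providers_populate(unsigned long *bmap, size_t nbits)
{
    size_t words = (nbits + BITS_PER_LONG - 1) / BITS_PER_LONG;

    for (size_t i = 0; i < nr_providers; i++) {
        memset(bmap, 0, words * sizeof(unsigned long));
        if (providers[i].fn(bmap, nbits, providers[i].opaque) == 0) {
            return 0;
        }
    }
    return -1;
}

/* Demo provider standing in for a backend that has nothing to report. */
int demo_provider_unavailable(unsigned long *bmap, size_t nbits, void *opaque)
{
    (void)bmap; (void)nbits; (void)opaque;
    return -1;
}

/* Demo provider that reports a single free page, passed via opaque. */
int demo_provider_one_page(unsigned long *bmap, size_t nbits, void *opaque)
{
    size_t page = *(size_t *)opaque;

    if (page >= nbits) {
        return -1;
    }
    bmap[page / BITS_PER_LONG] |= 1UL << (page % BITS_PER_LONG);
    return 0;
}
```

With something like this, the setup stage would depend only on the registry, and whether the information comes from the balloon or from another mechanism becomes a detail of whichever provider registered itself.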
On Thu, Mar 03, 2016 at 06:44:28PM +0800, Liang Li wrote:
> Get the free pages information through virtio and filter out the free
> pages in the ram bulk stage. This can significantly reduce the total
> live migration time as well as network traffic.
>
> Signed-off-by: Liang Li <liang.z.li@intel.com>
> ---
>  migration/ram.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 46 insertions(+), 6 deletions(-)

> @@ -1945,6 +1971,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>                                           DIRTY_MEMORY_MIGRATION);
>      }
>      memory_global_dirty_log_start();
> +
> +    if (balloon_free_pages_support() &&
> +        balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> +                               &free_pages_count) == 0) {
> +        qemu_mutex_unlock_iothread();
> +        while (balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> +                                      &free_pages_count) == 0) {
> +            usleep(1000);
> +        }
> +        qemu_mutex_lock_iothread();
> +
> +        filter_out_guest_free_pages(migration_bitmap_rcu->free_pages_bmap);
> +    }

IIUC, this code is synchronous wrt the guest OS balloon driver, i.e. it
is asking the guest for free pages and waiting for a response. If the
guest OS has crashed this is going to mean QEMU waits forever and thus
migration won't complete. Similarly you need to consider that the guest
OS may be malicious and simply never respond.

So if the migration code is going to use the guest balloon driver to get
info about free pages it has to be done in an asynchronous manner so
that migration can never be stalled by a slow/crashed/malicious guest
driver.

Regards,
Daniel
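One way to read the requirement above is sketched below: the request is sent once without waiting, and the migration loop checks for the answer at most once per iteration, giving up after a deadline. This is only an illustration under stated assumptions, not the design the thread settled on. `FreePageQuery`, `FreePageOps`, `free_page_query_start()`, `free_page_query_poll()` and the `DemoGuest` helpers are all hypothetical names; time is passed in as a plain millisecond counter so the sketch stays self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    FP_IDLE,       /* no request sent yet */
    FP_PENDING,    /* request sent, guest has not answered */
    FP_READY,      /* answer arrived, bitmap may be used */
    FP_ABANDONED   /* deadline passed, migrate without filtering */
} FreePageState;

typedef struct {
    FreePageState state;
    int64_t deadline_ms;
} FreePageQuery;

/* Hypothetical non-blocking hooks into whatever talks to the guest. */
typedef struct {
    void (*send_request)(void *opaque);    /* must return immediately */
    bool (*response_ready)(void *opaque);  /* true once the guest answered */
    void *opaque;
} FreePageOps;

/* Send the request and record when to stop caring about the answer. */
void free_page_query_start(FreePageQuery *q, const FreePageOps *ops,
                           int64_t now_ms, int64_t timeout_ms)
{
    assert(timeout_ms >= 0);
    ops->send_request(ops->opaque);
    q->state = FP_PENDING;
    q->deadline_ms = now_ms + timeout_ms;
}

/* Called from the migration loop; never sleeps and never spins, so a
 * silent guest costs at most one cheap check per call. */
FreePageState free_page_query_poll(FreePageQuery *q, const FreePageOps *ops,
                                   int64_t now_ms)
{
    if (q->state != FP_PENDING) {
        return q->state;
    }
    if (ops->response_ready(ops->opaque)) {
        q->state = FP_READY;
    } else if (now_ms >= q->deadline_ms) {
        q->state = FP_ABANDONED;
    }
    return q->state;
}

/* Demo guest: answers after polls_until_ready polls, or never if the
 * value is negative (standing in for a crashed or malicious guest). */
typedef struct {
    bool requested;
    int polls_until_ready;
} DemoGuest;

void demo_send_request(void *opaque)
{
    ((DemoGuest *)opaque)->requested = true;
}

bool demo_response_ready(void *opaque)
{
    DemoGuest *g = opaque;

    if (!g->requested || g->polls_until_ready < 0) {
        return false;
    }
    if (g->polls_until_ready == 0) {
        return true;
    }
    g->polls_until_ready--;
    return false;
}
```

In this shape a guest that never answers only moves the query to `FP_ABANDONED`; the migration then proceeds as if no free page information existed, which costs bandwidth but cannot stall.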
> On Thu, 3 Mar 2016 18:44:28 +0800
> Liang Li <liang.z.li@intel.com> wrote:
>
> > Get the free pages information through virtio and filter out the free
> > pages in the ram bulk stage. This can significantly reduce the total
> > live migration time as well as network traffic.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > ---
> >  migration/ram.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 46 insertions(+), 6 deletions(-)
> >
>
> > @@ -1945,6 +1971,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >                                           DIRTY_MEMORY_MIGRATION);
> >      }
> >      memory_global_dirty_log_start();
> > +
> > +    if (balloon_free_pages_support() &&
> > +        balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> > +                               &free_pages_count) == 0) {
> > +        qemu_mutex_unlock_iothread();
> > +        while (balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> > +                                      &free_pages_count) == 0) {
> > +            usleep(1000);
> > +        }
> > +        qemu_mutex_lock_iothread();
> > +
> > +        filter_out_guest_free_pages(migration_bitmap_rcu->free_pages_bmap);
>
> A general comment: Using the ballooner to get information about pages that
> can be filtered out is too limited (there may be other ways to do this; we
> might be able to use cmma on s390, for example), and I don't like hardcoding
> to a specific method.
>
> What about the reverse approach: Code may register a handler that
> populates the free_pages_bitmap which is called during this stage?

Good suggestion, thanks!

Liang

> <I like the idea of filtering in general, but I haven't looked at the code yet>
>
> On Thu, Mar 03, 2016 at 06:44:28PM +0800, Liang Li wrote:
> > Get the free pages information through virtio and filter out the free
> > pages in the ram bulk stage. This can significantly reduce the total
> > live migration time as well as network traffic.
> >
> > Signed-off-by: Liang Li <liang.z.li@intel.com>
> > ---
> >  migration/ram.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 46 insertions(+), 6 deletions(-)
>
> > @@ -1945,6 +1971,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> >                                           DIRTY_MEMORY_MIGRATION);
> >      }
> >      memory_global_dirty_log_start();
> > +
> > +    if (balloon_free_pages_support() &&
> > +        balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> > +                               &free_pages_count) == 0) {
> > +        qemu_mutex_unlock_iothread();
> > +        while (balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
> > +                                      &free_pages_count) == 0) {
> > +            usleep(1000);
> > +        }
> > +        qemu_mutex_lock_iothread();
> > +
> > +        filter_out_guest_free_pages(migration_bitmap_rcu->free_pages_bmap);
> > +    }
>
> IIUC, this code is synchronous wrt the guest OS balloon driver, i.e. it is
> asking the guest for free pages and waiting for a response. If the guest OS
> has crashed this is going to mean QEMU waits forever and thus migration won't
> complete. Similarly you need to consider that the guest OS may be malicious
> and simply never respond.
>
> So if the migration code is going to use the guest balloon driver to get info
> about free pages it has to be done in an asynchronous manner so that
> migration can never be stalled by a slow/crashed/malicious guest driver.
>
> Regards,
> Daniel

Really, thanks a lot!

Liang
diff --git a/migration/ram.c b/migration/ram.c
index ee2547d..819553b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
+#include "sysemu/balloon.h"
 
 #ifdef DEBUG_MIGRATION_RAM
 #define DPRINTF(fmt, ...) \
@@ -241,6 +242,7 @@ static struct BitmapRcu {
     struct rcu_head rcu;
     /* Main migration bitmap */
     unsigned long *bmap;
+    unsigned long *free_pages_bmap;
     /* bitmap of pages that haven't been sent even once
      * only maintained and used in postcopy at the moment
      * where it's used to send the dirtymap at the start
@@ -561,12 +563,7 @@ ram_addr_t migration_bitmap_find_dirty(RAMBlock *rb,
     unsigned long next;
 
     bitmap = atomic_rcu_read(&migration_bitmap_rcu)->bmap;
-    if (ram_bulk_stage && nr > base) {
-        next = nr + 1;
-    } else {
-        next = find_next_bit(bitmap, size, nr);
-    }
-
+    next = find_next_bit(bitmap, size, nr);
     *ram_addr_abs = next << TARGET_PAGE_BITS;
     return (next - base) << TARGET_PAGE_BITS;
 }
@@ -1415,6 +1412,9 @@ void free_xbzrle_decoded_buf(void)
 static void migration_bitmap_free(struct BitmapRcu *bmap)
 {
     g_free(bmap->bmap);
+    if (balloon_free_pages_support()) {
+        g_free(bmap->free_pages_bmap);
+    }
     g_free(bmap->unsentmap);
     g_free(bmap);
 }
@@ -1873,6 +1873,28 @@ err:
     return ret;
 }
 
+static void filter_out_guest_free_pages(unsigned long *free_pages_bmap)
+{
+    RAMBlock *block;
+    DirtyMemoryBlocks *blocks;
+    unsigned long end, page;
+
+    blocks = atomic_rcu_read(&ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION]);
+    block = QLIST_FIRST_RCU(&ram_list.blocks);
+    end = TARGET_PAGE_ALIGN(block->offset +
+                            block->used_length) >> TARGET_PAGE_BITS;
+    page = block->offset >> TARGET_PAGE_BITS;
+
+    while (page < end) {
+        unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
+        unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
+        unsigned long num = MIN(end - page, DIRTY_MEMORY_BLOCK_SIZE - offset);
+        unsigned long *p = free_pages_bmap + BIT_WORD(page);
+
+        slow_bitmap_complement(blocks->blocks[idx], p, num);
+        page += num;
+    }
+}
 
 /* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
@@ -1884,6 +1906,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 {
     RAMBlock *block;
     int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
+    uint64_t free_pages_count = 0;
 
     dirty_rate_high_cnt = 0;
     bitmap_sync_count = 0;
@@ -1931,6 +1954,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     ram_bitmap_pages = last_ram_offset() >> TARGET_PAGE_BITS;
     migration_bitmap_rcu = g_new0(struct BitmapRcu, 1);
     migration_bitmap_rcu->bmap = bitmap_new(ram_bitmap_pages);
+    if (balloon_free_pages_support()) {
+        migration_bitmap_rcu->free_pages_bmap = bitmap_new(ram_bitmap_pages);
+    }
 
     if (migrate_postcopy_ram()) {
         migration_bitmap_rcu->unsentmap = bitmap_new(ram_bitmap_pages);
@@ -1945,6 +1971,20 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
                                          DIRTY_MEMORY_MIGRATION);
     }
     memory_global_dirty_log_start();
+
+    if (balloon_free_pages_support() &&
+        balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
+                               &free_pages_count) == 0) {
+        qemu_mutex_unlock_iothread();
+        while (balloon_get_free_pages(migration_bitmap_rcu->free_pages_bmap,
+                                      &free_pages_count) == 0) {
+            usleep(1000);
+        }
+        qemu_mutex_lock_iothread();
+
+        filter_out_guest_free_pages(migration_bitmap_rcu->free_pages_bmap);
+    }
+
     migration_bitmap_sync();
     qemu_mutex_unlock_ramlist();
     qemu_mutex_unlock_iothread();
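The core of filter_out_guest_free_pages() in the diff above is a bitmap complement: assuming slow_bitmap_complement() writes the bitwise complement of its source into its destination, a page marked free in free_pages_bmap ends up clean in the dirty bitmap, and every other page ends up dirty. The standalone sketch below shows that operation on plain arrays. `dirty_from_free()` and `count_dirty()` are hypothetical helpers written for this illustration, and the handling of the partial last word (bits past nbits are left untouched) is a choice made here, not a statement about the QEMU helper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Set the first nbits bits of dirty to the complement of free_bmap:
 * free pages become clean, all other pages become dirty. Bits at or
 * beyond nbits in the last word keep their previous value. */
void dirty_from_free(unsigned long *dirty, const unsigned long *free_bmap,
                     size_t nbits)
{
    size_t full = nbits / BITS_PER_LONG;
    size_t rest = nbits % BITS_PER_LONG;

    assert(dirty != NULL && free_bmap != NULL);
    for (size_t i = 0; i < full; i++) {
        dirty[i] = ~free_bmap[i];
    }
    if (rest) {
        unsigned long mask = (1UL << rest) - 1;

        dirty[full] = (dirty[full] & ~mask) | (~free_bmap[full] & mask);
    }
}

/* Count dirty pages among the first nbits bits, i.e. the pages the
 * bulk stage would still have to send. */
size_t count_dirty(const unsigned long *dirty, size_t nbits)
{
    size_t n = 0;

    for (size_t i = 0; i < nbits; i++) {
        if (dirty[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))) {
            n++;
        }
    }
    return n;
}
```

Because the dirty bitmap is overwritten rather than masked, this only gives the intended result while every page is still marked dirty, which is the situation at setup time. That also ties in with the change to migration_bitmap_find_dirty(): once some pages are clean during the bulk stage, stepping to `nr + 1` would no longer be correct, so the patch always uses find_next_bit().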
Get the free pages information through virtio and filter out the free
pages in the ram bulk stage. This can significantly reduce the total
live migration time as well as network traffic.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 migration/ram.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 6 deletions(-)