
[RFC,v3,0/3] migration: reduce time of loading non-iterable vmstate

Message ID 20221213133510.1279488-1-xuchuangxclwt@bytedance.com

Message

Chuang Xu Dec. 13, 2022, 1:35 p.m. UTC
Hi!

In this version:

- move virtio_load_check_delay() from virtio_memory_listener_commit() to
  virtio_vmstate_change().
- add a delay_check flag to VirtIODevice to ensure virtio_load_check_delay()
  is called when delay_check is true (a sketch follows this list).
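
A sketch of the intended flow. The delay_check field and
virtio_load_check_delay() are introduced by this series; the shape below is
an assumption based on the description above, not the patch itself:

    /* assumed new field in struct VirtIODevice: bool delay_check; */

    static void virtio_vmstate_change(void *opaque, bool running, RunState state)
    {
        VirtIODevice *vdev = opaque;

        /* ... existing running/stopped handling ... */

        if (vdev->delay_check) {
            /* run the checks postponed while the vmstate was being loaded */
            virtio_load_check_delay(vdev);
            vdev->delay_check = false;
        }
    }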

Please review, Chuang.

[v2]

- rebase to latest upstream.
- add a sanity check to address_space_to_flatview() (a sketch follows this list).
- postpone the initialization of the vring caches until migration loading
  completes.
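
For the sanity check, a minimal sketch of what it might look like. The
predicate name memory_region_transaction_in_progress() is assumed here;
the series may test the transaction state differently:

    static inline FlatView *address_space_to_flatview(AddressSpace *as)
    {
        /*
         * A flatview fetched while a memory transaction is still open may
         * be stale, since pending region changes are not yet committed.
         */
        assert(!memory_region_transaction_in_progress()); /* name assumed */
        return qatomic_rcu_read(&as->current_map);
    }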

[v1]

The duration of loading non-iterable vmstate accounts for a significant
portion of downtime (measured from the timestamp at which the source QEMU
stops to the timestamp at which the target QEMU starts). Most of this time
is spent committing memory region changes repeatedly.

This series packs all memory region changes made during the loading of
non-iterable vmstate into a single memory transaction. As the number of
devices grows, the improvement becomes more significant.
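
As an illustration, a minimal sketch of the idea. The wrapper name is
hypothetical; the series hooks this into the migration loading path, e.g.
around qemu_loadvm_state_main() in migration/savevm.c:

    int load_noniterable_wrapped(QEMUFile *f, MigrationIncomingState *mis)
    {
        int ret;

        /* defer flatview rebuilds: region updates only accumulate */
        memory_region_transaction_begin();

        ret = qemu_loadvm_state_main(f, mis);

        /* one commit -> one flatview rebuild for all accumulated changes */
        memory_region_transaction_commit();

        return ret;
    }

With this, N device-level region updates cost a single commit instead of N,
which is where the downtime reduction below comes from.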

Here are the test results:
test vm info:
- 32 CPUs, 128 GB RAM
- 8 16-queue vhost-net devices
- 16 4-queue vhost-user-blk devices

	time of loading non-iterable vmstate
before		about 210 ms
after		about 40 ms

Comments

Peter Xu Dec. 16, 2022, 5:11 p.m. UTC | #1
Chuang,

On Tue, Dec 13, 2022 at 09:35:07PM +0800, Chuang Xu wrote:
> Here are the test results:
> test vm info:
> - 32 CPUs, 128 GB RAM
> - 8 16-queue vhost-net devices
> - 16 4-queue vhost-user-blk devices
> 
> 	time of loading non-iterable vmstate
> before		about 210 ms
> after		about 40 ms

It'd also be great if you could include more information in the cover letter
in the next post.  For example:

  - Per your investigation, what is the major factor influencing the memory
    updates?  Besides the number of devices, does the number of queues also
    matter?  Is there anything else?

  - A total downtime comparison (if the above is only part of the downtime)

  - The NIC used in the test.

Thanks,