[V2,00/22] Live Update

Message ID 1609861330-129855-1-git-send-email-steven.sistare@oracle.com (mailing list archive)

Steven Sistare Jan. 5, 2021, 3:41 p.m. UTC
Provide the cprsave and cprload commands for live update.  These save and
restore VM state, with minimal guest pause time, so that qemu may be updated
to a new version in between.

cprsave stops the VM and saves vmstate to an ordinary file.  It supports two
modes: restart and reboot.  For restart, cprsave exec's the qemu binary (or
/usr/bin/qemu-exec if it exists) with the same argv.  qemu restarts in a
paused state and waits for the cprload command.
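
As a rough illustration of the restart step (not the code from this series),
choosing the binary and re-exec'ing with the same argv could look like the
sketch below.  The helper names pick_exec_path and reexec are hypothetical,
and the override path is passed as a parameter so the selection logic can be
exercised on a host that has no /usr/bin/qemu-exec.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return `override` if it passes an execute-permission
 * check, else fall back to `argv0`.
 */
const char *pick_exec_path(const char *override, const char *argv0)
{
    assert(argv0 != NULL);
    if (override != NULL && access(override, X_OK) == 0) {
        return override;
    }
    return argv0;
}

/*
 * Hypothetical helper: replace the current process image, passing the
 * original argv unchanged.  Returns only on failure (-1 with errno set).
 */
int reexec(char **argv)
{
    const char *path = pick_exec_path("/usr/bin/qemu-exec", argv[0]);

    return execv(path, argv);
}
```

Descriptors that must survive the exec need close-on-exec cleared beforehand;
everything else in the old process image is discarded.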

To use the restart mode, qemu must be started with the memfd-alloc option,
which allocates guest ram using memfd_create.  The memfd's are saved to
the environment and kept open across exec, after which they are found from
the environment and re-mmap'd.  Hence guest ram is preserved in place,
albeit with new virtual addresses in the qemu process.  The caller resumes
the guest by calling cprload, which loads state from the file.  If the VM
was running at cprsave time, then VM execution resumes.  cprsave supports
any type of guest image and block device, but the caller must not modify
guest block devices between cprsave and cprload.
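
The memfd handoff described above can be sketched as follows.  This is a
minimal illustration under assumed names, not the implementation in the
patches: ram_save_fd, ram_restore, and the environment variable naming are
hypothetical.  It shows creating a memfd, keeping it open across exec,
recording the descriptor number in the environment, and mapping it again
from that record.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a memfd of `size` bytes and record its
 * descriptor number in the environment variable `name`.
 * Returns the descriptor, or -1 on failure.
 */
int ram_save_fd(const char *name, size_t size)
{
    char val[32];
    int fd, flags;

    assert(name != NULL && size > 0);
    fd = memfd_create(name, 0);         /* MFD_CLOEXEC deliberately not set */
    if (fd < 0) {
        return -1;
    }
    if (ftruncate(fd, (off_t)size) < 0) {
        goto fail;
    }
    /* Explicitly clear close-on-exec so the descriptor survives exec. */
    flags = fcntl(fd, F_GETFD);
    if (flags < 0 || fcntl(fd, F_SETFD, flags & ~FD_CLOEXEC) < 0) {
        goto fail;
    }
    snprintf(val, sizeof(val), "%d", fd);
    if (setenv(name, val, 1) < 0) {
        goto fail;
    }
    return fd;

fail:
    close(fd);
    return -1;
}

/*
 * Hypothetical helper, meant to run after exec: look up the descriptor
 * recorded under `name` and map it shared.  The contents are preserved,
 * but the returned virtual address is new.  Returns NULL on failure.
 */
void *ram_restore(const char *name, size_t size)
{
    const char *val = getenv(name);
    char *end;
    long fd;
    void *p;

    if (val == NULL) {
        return NULL;
    }
    fd = strtol(val, &end, 10);
    if (end == val || *end != '\0' || fd < 0) {
        return NULL;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, (int)fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Because the mapping is MAP_SHARED on the same memfd, data written through the
old mapping is visible through the new one without any copy.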

The restart mode supports vfio devices by preserving the vfio container,
group, device, and event descriptors across the qemu re-exec, and by
updating DMA mapping virtual addresses using VFIO_DMA_UNMAP_FLAG_SUSPEND
and VFIO_DMA_MAP_FLAG_RESUME as proposed in 
https://lore.kernel.org/kvm/1609861013-129801-1-git-send-email-steven.sistare@oracle.com

For the reboot mode, cprsave saves state and exits qemu, and the caller is
allowed to update the host kernel and system software and reboot.  The
caller resumes the guest by running qemu with the same arguments as the
original process and calling cprload.  To use this mode, guest ram must be
mapped to a persistent shared memory file such as /dev/dax0.0, or /dev/shm
PKRAM as proposed in https://lore.kernel.org/lkml/1588812129-8596-1-git-send-email-anthony.yznaga@oracle.com/
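
To illustrate why a shared file mapping lets guest ram outlive the qemu
process, here is a minimal sketch, assuming a regular file on a filesystem
such as /dev/shm.  The helper name map_persistent_ram is hypothetical.  A
device such as /dev/dax0.0 may have its own size and alignment constraints,
which this sketch does not handle, and whether the contents survive a host
reboot depends on the backing store, not on this code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `size` bytes of `path` with MAP_SHARED, creating
 * the file if needed and extending a regular file that is too small.
 * Returns NULL on failure.
 */
void *map_persistent_ram(const char *path, size_t size)
{
    struct stat st;
    void *p;
    int fd;

    assert(path != NULL && size > 0);
    fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return NULL;
    }
    if (fstat(fd, &st) < 0) {
        close(fd);
        return NULL;
    }
    /* Only regular files are resized; devices are mapped as they are. */
    if (S_ISREG(st.st_mode) && st.st_size < (off_t)size &&
        ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    return p == MAP_FAILED ? NULL : p;
}
```

Writes through a MAP_SHARED mapping land in the file's pages, so a later
process that maps the same path sees the same contents.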

The reboot mode supports vfio devices if the caller suspends the guest
instead of stopping the VM, such as by issuing guest-suspend-ram to the
qemu guest agent.  The guest drivers' suspend methods flush outstanding
requests and re-initialize the devices, and thus there is no device state
to save and restore.

The first patches add helper functions:

  - as_flat_walk
  - qemu_ram_volatile
  - oslib: qemu_clr_cloexec
  - util: env var helpers
  - vl: memfd-alloc option
  - vl: add helper to request re-exec

The next patches implement cprsave and cprload:

  - cpr
  - cpr: QMP interfaces
  - cpr: HMP interfaces

The next patches add vfio support for the restart mode:

  - pci: export functions for cpr
  - vfio-pci: refactor for cpr
  - vfio-pci: cpr

The next patches preserve various descriptor-based backend devices across
a cprsave restart:

  - vhost: reset vhost devices upon cprsave
  - chardev: cpr framework
  - chardev: cpr for simple devices
  - chardev: cpr for pty
  - chardev: socket accept subroutine
  - chardev: cpr for sockets
  - monitor: cpr support

The final patches add an option, a MAINTAINERS entry, and a cleanup:

  - cpr: only-cpr-capable option
  - cpr: maintainers
  - simplify savevm

Here is an example of updating qemu from v4.2.0 to v4.2.1 using
"cprsave restart".  The software update is performed while the guest is
running to minimize downtime.

window 1				| window 2
					|
# qemu-system-x86_64 ... 		|
QEMU 4.2.0 monitor - type 'help' ...	|
(qemu) info status			|
VM status: running			|
					| # yum update qemu
(qemu) cprsave /tmp/qemu.sav restart	|
QEMU 4.2.1 monitor - type 'help' ...	|
(qemu) info status			|
VM status: paused (prelaunch)		|
(qemu) cprload /tmp/qemu.sav		|
(qemu) info status			|
VM status: running			|


Here is an example of updating the host kernel using "cprsave reboot".

window 1					| window 2
						|
# qemu-system-x86_64 ...mem-path=/dev/dax0.0 ...|
QEMU 4.2.1 monitor - type 'help' ...		|
(qemu) info status				|
VM status: running				|
						| # yum update kernel-uek
(qemu) cprsave /tmp/qemu.sav reboot		|
						|
# systemctl kexec				|
kexec_core: Starting new kernel			|
...						|
						|
# qemu-system-x86_64 ...mem-path=/dev/dax0.0 ...|
QEMU 4.2.1 monitor - type 'help' ...		|
(qemu) info status				|
VM status: paused (prelaunch)			|
(qemu) cprload /tmp/qemu.sav			|
(qemu) info status				|
VM status: running				|

Changes from V1 to V2:
  - revert vmstate infrastructure changes
  - refactor cpr functions into new files
  - delete MADV_DOEXEC and use memfd + VFIO_DMA_UNMAP_FLAG_SUSPEND to
    preserve memory
  - add framework to filter chardev's that support cpr
  - save and restore vfio eventfd's
  - modify cprinfo QMP interface
  - incorporate misc review feedback
  - remove unrelated and unneeded patches
  - refactor all patches into a shorter and easier to review series

Steve Sistare (17):
  as_flat_walk
  qemu_ram_volatile
  oslib: qemu_clr_cloexec
  util: env var helpers
  vl: memfd-alloc option
  vl: add helper to request re-exec
  cpr
  pci: export functions for cpr
  vfio-pci: refactor for cpr
  vfio-pci: cpr
  chardev: cpr framework
  chardev: cpr for simple devices
  chardev: cpr for pty
  chardev: socket accept subroutine
  cpr: only-cpr-capable option
  cpr: maintainers
  simplify savevm

Mark Kanda (5):
  cpr: QMP interfaces
  cpr: HMP interfaces
  vhost: reset vhost devices upon cprsave
  chardev: cpr for sockets
  monitor: cpr support

 MAINTAINERS                   |  11 +++
 chardev/char-mux.c            |   1 +
 chardev/char-null.c           |   1 +
 chardev/char-pty.c            |  16 +++-
 chardev/char-serial.c         |   1 +
 chardev/char-socket.c         |  31 +++++++
 chardev/char-stdio.c          |   8 ++
 chardev/char.c                |  41 ++++++++-
 exec.c                        |  75 +++++++++++++--
 gdbstub.c                     |   1 +
 hmp-commands.hx               |  44 +++++++++
 hw/pci/msix.c                 |  20 ++--
 hw/pci/pci.c                  |   7 +-
 hw/vfio/Makefile.objs         |   2 +-
 hw/vfio/common.c              |  63 ++++++++++++-
 hw/vfio/cpr.c                 | 117 +++++++++++++++++++++++
 hw/vfio/pci.c                 | 209 ++++++++++++++++++++++++++++++++++++++----
 hw/vfio/trace-events          |   1 +
 hw/virtio/vhost.c             |  11 +++
 include/chardev/char.h        |   6 ++
 include/exec/memory.h         |  11 +++
 include/hw/pci/msix.h         |   5 +
 include/hw/pci/pci.h          |   2 +
 include/hw/vfio/vfio-common.h |   7 ++
 include/hw/virtio/vhost.h     |   1 +
 include/io/channel-socket.h   |  12 +++
 include/migration/cpr.h       |  17 ++++
 include/monitor/hmp.h         |   3 +
 include/monitor/monitor.h     |   2 +
 include/qemu/env.h            |  27 ++++++
 include/qemu/osdep.h          |   1 +
 include/sysemu/sysemu.h       |   4 +
 io/channel-socket.c           |  52 +++++++----
 linux-headers/linux/vfio.h    |   5 +
 migration/Makefile.objs       |   2 +-
 migration/cpr.c               | 198 +++++++++++++++++++++++++++++++++++++++
 migration/migration.c         |   6 ++
 migration/savevm.c            |  19 ++--
 migration/savevm.h            |   2 +
 monitor/hmp-cmds.c            |  48 ++++++++++
 monitor/monitor.c             |   5 +
 monitor/qmp-cmds.c            |  31 +++++++
 monitor/qmp.c                 |  43 +++++++++
 qapi/Makefile.objs            |   3 +-
 qapi/char.json                |   5 +-
 qapi/cpr.json                 |  68 ++++++++++++++
 qapi/qapi-schema.json         |   1 +
 qemu-options.hx               |  45 ++++++++-
 slirp                         |   2 +-
 softmmu/memory.c              |  17 ++++
 softmmu/vl.c                  |  68 +++++++++++++-
 stubs/Makefile.objs           |   1 +
 stubs/cpr.c                   |   3 +
 trace-events                  |   1 +
 util/Makefile.objs            |   2 +-
 util/env.c                    | 119 ++++++++++++++++++++++++
 util/oslib-posix.c            |   9 ++
 util/oslib-win32.c            |   4 +
 58 files changed, 1433 insertions(+), 84 deletions(-)
 create mode 100644 hw/vfio/cpr.c
 create mode 100644 include/migration/cpr.h
 create mode 100644 include/qemu/env.h
 create mode 100644 migration/cpr.c
 create mode 100644 qapi/cpr.json
 create mode 100644 stubs/cpr.c
 create mode 100644 util/env.c