[v6,00/25] Fixing record/replay and adding reverse debugging

Message ID 20180912081747.3228.21861.stgit@pasha-VirtualBox (mailing list archive)

Message

Pavel Dovgalyuk Sept. 12, 2018, 8:17 a.m. UTC
The GDB remote protocol supports reverse debugging of targets.
It includes 'reverse step' and 'reverse continue' operations.
The first one goes back to the previous step of the execution,
and the second one is intended to stop at the last breakpoint that
would have been hit when the program was executed normally.
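
For reference, in the GDB remote serial protocol these two operations
correspond to the 'bs' (backward single step) and 'bc' (backward continue)
packets. Below is a minimal sketch of how such packets are framed on the
wire. The helper name frame_packet is hypothetical and is not taken from the
patches or from GDB; how the gdbstub in this series parses the packets is not
shown here.

```python
def frame_packet(payload: str) -> str:
    """Frame a GDB remote protocol packet as $<payload>#<checksum>.

    The checksum is the sum of the payload bytes modulo 256,
    written as two lowercase hex digits.
    """
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


# Packets a GDB client would send for the two reverse operations.
REVERSE_STEP = frame_packet("bs")      # backward single step
REVERSE_CONTINUE = frame_packet("bc")  # backward continue

if __name__ == "__main__":
    print(REVERSE_STEP)
    print(REVERSE_CONTINUE)
```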

Reverse debugging is possible in replay mode, provided that at least
one snapshot was created during the record or replay phase.
QEMU can use these snapshots for travelling back in time with GDB.

Running the execution in replay mode allows using the GDB reverse debugging
commands:
 - reverse-stepi (or rsi): Steps one instruction into the past.
   QEMU loads one of the prior snapshots and replays forward to the
   desired instruction. When that step is reached, execution stops.
 - reverse-continue (or rc): Runs execution "backwards".
   QEMU tries to find a breakpoint or watchpoint by loading a prior
   snapshot and replaying the execution. Then QEMU loads the snapshot
   again and replays to the latest breakpoint. When there are no
   breakpoints in the examined section of the execution, QEMU finds one
   more snapshot and tries again. After the first snapshot is processed,
   execution stops at this snapshot.
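
The two commands above can be sketched as a toy model. This is only an
illustration under stated assumptions, not the code in
replay/replay-debugging.c: execution is modelled as a sequence of integer
steps, snapshots are identified by the step at which they were taken, and
breakpoint_hit is a hypothetical callback that stands in for replaying one
step and checking breakpoints and watchpoints.

```python
from bisect import bisect_right


def reverse_step(current_step, snapshot_steps):
    """Return (snapshot_to_load, target_step) for stepping one step back.

    The latest snapshot at or before the target is loaded, then the
    execution is replayed forward until the target step is reached.
    """
    target = current_step - 1
    snaps = sorted(snapshot_steps)
    idx = bisect_right(snaps, target) - 1
    if idx < 0:
        raise ValueError("no snapshot at or before the target step")
    return snaps[idx], target


def reverse_continue(current_step, snapshot_steps, breakpoint_hit):
    """Return the step where execution stops after a reverse continue.

    Sections between snapshots are examined from the newest to the oldest.
    Each section is replayed once to find the last breakpoint in it; a real
    implementation would then reload the snapshot and replay to that step.
    If no section contains a breakpoint, stop at the first snapshot.
    """
    snaps = sorted(s for s in snapshot_steps if s < current_step)
    if not snaps:
        raise ValueError("no snapshot before the current step")
    section_end = current_step
    for snap in reversed(snaps):
        last_hit = None
        for step in range(snap, section_end):
            if breakpoint_hit(step):
                last_hit = step
        if last_hit is not None:
            return last_hit
        section_end = snap  # nothing found, try one more snapshot
    return snaps[0]


if __name__ == "__main__":
    snapshots = [0, 100, 200]
    print(reverse_step(250, snapshots))
    print(reverse_continue(250, snapshots, lambda s: s in {120, 150}))
```

The model makes the cost visible: a reverse continue may replay several
sections, and the section that contains the breakpoint is replayed twice.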

The set of patches includes the following modifications:
 - fixes for record/replay bugs caused by QEMU core changes
 - gdbstub update for reverse debugging support
 - functions that automatically perform reverse step and reverse
   continue operations
 - hmp/qmp commands for manipulating the replay process
 - improvement of snapshotting to save the execution step
   in the snapshot parameters
 - a new clock for correct timer events from vnc and slirp
 - other record/replay fixes

The patches are available in the repository:
https://github.com/ispras/qemu/tree/rr-180911

v6 changes:
 - rebased to the new version of master
 - fixed build of linux-user configurations
 - added new clock for slirp and vnc timers

v5 changes:
 - multiple fixes for record/replay bugs that appeared after the QEMU core update
 - changed reverse debugging to 'since 3.1'

v4 changes:
 - changed 'since 2.13' to 'since 3.0' in json (as suggested by Eric Blake)

v3 changes:
 - Fixed PS/2 bug with save/load vm, which caused failures of the replay.
 - Rebased to the new code base.
 - Minor fixes.

v2 changes:
 - documented reverse debugging
 - fixed start vmstate loading in record mode
 - documented qcow2 changes (as suggested by Eric Blake)
 - made icount SnapshotInfo field optional (as suggested by Eric Blake)
 - renamed qmp commands (as suggested by Eric Blake)
 - minor changes

---

Pavel Dovgalyuk (25):
      block: implement bdrv_snapshot_goto for blkreplay
      replay: disable default snapshot for record/replay
      replay: update docs for record/replay with block devices
      replay: don't drain/flush bdrv queue while RR is working
      replay: finish record/replay before closing the disks
      qcow2: introduce icount field for snapshots
      migration: introduce icount field for snapshots
      replay: provide and accessor for rr filename
      replay: introduce info hmp/qmp command
      replay: introduce breakpoint at the specified step
      replay: implement replay-seek command to proceed to the desired step
      replay: flush events when exiting
      replay: refine replay-time module
      translator: fix breakpoint processing
      replay: flush rr queue before loading the vmstate
      gdbstub: add reverse step support in replay mode
      gdbstub: add reverse continue support in replay mode
      replay: describe reverse debugging in docs/replay.txt
      replay: allow loading any snapshots before recording
      replay: wake up vCPU when replaying
      replay: replay BH for IDE trim operation
      replay: add BH oneshot event for block layer
      timer: introduce new virtual clock
      slirp: fix ipv6 timers
      ui: fix virtual timers


 accel/tcg/translator.c    |    9 +
 block/blkreplay.c         |    8 +
 block/block-backend.c     |    3 
 block/io.c                |   22 +++
 block/qapi.c              |   17 ++-
 block/qcow2-snapshot.c    |    9 +
 block/qcow2.h             |    2 
 blockdev.c                |   10 ++
 cpus.c                    |   50 +++++---
 docs/interop/qcow2.txt    |    4 +
 docs/replay.txt           |   45 +++++++
 exec.c                    |    6 +
 gdbstub.c                 |   50 +++++++-
 hmp-commands-info.hx      |   14 ++
 hmp-commands.hx           |   30 +++++
 hmp.h                     |    3 
 hw/ide/core.c             |    3 
 include/block/snapshot.h  |    1 
 include/qemu/timer.h      |    9 +
 include/sysemu/replay.h   |   26 ++++
 migration/savevm.c        |   15 +-
 qapi/block-core.json      |    5 +
 qapi/block.json           |    3 
 qapi/misc.json            |   68 +++++++++++
 replay/Makefile.objs      |    3 
 replay/replay-debugging.c |  287 +++++++++++++++++++++++++++++++++++++++++++++
 replay/replay-events.c    |   30 +++--
 replay/replay-internal.h  |    9 +
 replay/replay-snapshot.c  |   17 ++-
 replay/replay-time.c      |   32 ++---
 replay/replay.c           |   38 ++++++
 slirp/ip6_icmp.c          |    7 +
 stubs/Makefile.objs       |    1 
 stubs/replay-user.c       |    9 +
 stubs/replay.c            |   10 ++
 ui/input.c                |    8 +
 util/qemu-timer.c         |    2 
 vl.c                      |   18 ++-
 38 files changed, 791 insertions(+), 92 deletions(-)
 create mode 100644 replay/replay-debugging.c
 create mode 100644 stubs/replay-user.c

Comments

Paolo Bonzini Sept. 13, 2018, 10:27 a.m. UTC | #1
For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.

Kevin, can you take a look at patches 1-5?  I cannot quite evaluate if 4
has any scary ramifications.

Paolo

Pavel Dovgalyuk Sept. 13, 2018, 1:40 p.m. UTC | #2
> From: Paolo Bonzini [mailto:pbonzini@redhat.com]
> For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.

What about patch 21?

Pavel Dovgalyuk

Paolo Bonzini Sept. 13, 2018, 1:46 p.m. UTC | #3
On 13/09/2018 15:40, Pavel Dovgalyuk wrote:
>> For now I'm queuing 12, 14, 19, 20 (pending question to you) and 23-25.
> What about patch 21?

I'd want an ACK from the IDE maintainer.  Let's add him to Cc.

Paolo