[v4,00/15] Reverse debugging

Message ID 160006358590.31457.16757371597343007847.stgit@pasha-ThinkPad-X280 (mailing list archive)

Message

Pavel Dovgalyuk Sept. 14, 2020, 6:06 a.m. UTC
The GDB remote protocol supports reverse debugging of targets.
It includes 'reverse step' and 'reverse continue' operations.
The first one steps back to the previous point of the execution,
and the second one stops at the most recent breakpoint that
would have been hit when the program was executed normally.
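
As a rough illustration of what these two operations look like on the wire:
in the GDB remote serial protocol they are the 'bs' (backward step) and 'bc'
(backward continue) packets, framed as $<payload>#<checksum>, where the
checksum is the modulo-256 sum of the payload bytes written as two hex digits.
The helper name rsp_packet below is made up for this sketch and is not taken
from the patches.

```python
def rsp_packet(payload: str) -> bytes:
    """Frame a payload as a GDB remote serial protocol packet."""
    # Checksum: sum of the payload bytes modulo 256, as two lowercase hex digits.
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}".encode("ascii")


# 'bs' asks the stub for a reverse step, 'bc' for a reverse continue.
REVERSE_STEP = rsp_packet("bs")
REVERSE_CONTINUE = rsp_packet("bc")
```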

Reverse debugging is possible in the replay mode, when at least
one snapshot was created at the record or replay phase.
QEMU can use these snapshots for travelling back in time with GDB.

Running the execution in replay mode allows using GDB reverse debugging
commands:
 - reverse-stepi (or rsi): Steps one instruction into the past.
   QEMU loads one of the prior snapshots and replays forward to the
   desired instruction. When that step is reached, execution stops.
 - reverse-continue (or rc): Runs execution "backwards".
   QEMU tries to find a breakpoint or watchpoint by loading a prior
   snapshot and replaying the execution. Then QEMU loads the snapshot
   again and replays to the latest breakpoint. When there are no
   breakpoints in the examined section of the execution, QEMU finds one
   more snapshot and tries again. After the first snapshot is processed,
   execution stops at this snapshot.
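
The search described above can be modelled with a small sketch. This is a toy
model only, not the code from replay/replay-debugging.c: execution is reduced
to an integer step counter, snapshots to a sorted list of steps, and
breakpoint or watchpoint hits to a list of steps at which they would fire.
The function and parameter names are assumptions made for the illustration.

```python
from bisect import bisect_right


def reverse_step(current, snapshots):
    """Return (snapshot_to_load, target_step) for one step into the past."""
    # Never go further back than the first snapshot.
    target = max(current - 1, snapshots[0])
    # Load the latest snapshot at or before the target, then replay forward.
    snap = snapshots[bisect_right(snapshots, target) - 1]
    return snap, target


def reverse_continue(current, snapshots, hits):
    """Return the step at which a reverse-continue stops.

    snapshots: sorted steps at which snapshots exist.
    hits: steps at which a breakpoint or watchpoint would fire.
    """
    upper = current
    # Start from the most recent snapshot strictly before the current step.
    idx = bisect_right(snapshots, current - 1) - 1
    while idx >= 0:
        snap = snapshots[idx]
        # "Replay" the section [snap, upper) and keep the latest hit in it.
        in_section = [h for h in hits if snap <= h < upper]
        if in_section:
            return max(in_section)
        # Nothing found: examine the section before this snapshot.
        upper = snap
        idx -= 1
    # No hit anywhere: stop at the first snapshot.
    return snapshots[0]
```

With snapshots at steps 0, 100 and 200 and hits at steps 50 and 120, a
reverse-continue from step 150 stops at 120, one from step 110 falls back to
the earlier section and stops at 50, and one from step 40 stops at the first
snapshot.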

The set of patches includes the following modifications:
 - gdbstub update for reverse debugging support
 - functions that automatically perform reverse step and reverse
   continue operations
 - hmp/qmp commands for manipulating the replay process
 - snapshot improvements that save the execution step (icount)
   in the snapshot parameters
 - avocado-based acceptance tests for reverse debugging

The patches are available in the repository:
https://github.com/ispras/qemu/tree/rr-200901

v4 changes:
 - added VM snapshot creation on gdb connect (suggested by Alex Bennée)
 - removed useless calls to error_free
 - updated poll interrupt processing
 - minor changes
v3 changes:
 - rebased to support the new build system
 - bumped avocado framework version for using fixed remote gdb client
v2 changes:
 - rebased to the latest upstream version
 - fixed replaying of the POLL interrupts after the latest debug changes

---

Pavel Dovgaluk (11):
      replay: provide an accessor for rr filename
      qcow2: introduce icount field for snapshots
      qapi: introduce replay.json for record/replay-related stuff
      replay: introduce info hmp/qmp command
      replay: introduce breakpoint at the specified step
      replay: implement replay-seek command
      replay: flush rr queue before loading the vmstate
      gdbstub: add reverse step support in replay mode
      gdbstub: add reverse continue support in replay mode
      replay: describe reverse debugging in docs/replay.txt
      tests/acceptance: add reverse debugging test

Pavel Dovgalyuk (4):
      replay: don't record interrupt poll
      migration: introduce icount field for snapshots
      docs: convert replay.txt to rst
      replay: create temporary snapshot at debugger connection


 MAINTAINERS                           |    2 
 accel/tcg/cpu-exec.c                  |   21 +-
 accel/tcg/translator.c                |    1 
 block/qapi.c                          |   18 +
 block/qcow2-snapshot.c                |    9 +
 block/qcow2.h                         |    3 
 blockdev.c                            |   10 +
 docs/interop/qcow2.txt                |    5 
 docs/replay.txt                       |  364 -----------------------------
 docs/system/index.rst                 |    1 
 docs/system/replay.rst                |  410 +++++++++++++++++++++++++++++++++
 exec.c                                |    8 +
 gdbstub.c                             |   64 +++++
 hmp-commands-info.hx                  |   11 +
 hmp-commands.hx                       |   50 ++++
 include/block/snapshot.h              |    1 
 include/monitor/hmp.h                 |    4 
 include/sysemu/replay.h               |   26 ++
 migration/savevm.c                    |   17 +
 qapi/block-core.json                  |   11 +
 qapi/meson.build                      |    1 
 qapi/misc.json                        |   18 -
 qapi/qapi-schema.json                 |    1 
 qapi/replay.json                      |  121 ++++++++++
 replay/meson.build                    |    1 
 replay/replay-debugging.c             |  334 +++++++++++++++++++++++++++
 replay/replay-events.c                |    4 
 replay/replay-internal.h              |    6 
 replay/replay.c                       |   22 ++
 softmmu/cpus.c                        |   19 +-
 stubs/replay.c                        |   15 +
 tests/acceptance/reverse_debugging.py |  203 ++++++++++++++++
 tests/qemu-iotests/267.out            |   48 ++--
 33 files changed, 1401 insertions(+), 428 deletions(-)
 delete mode 100644 docs/replay.txt
 create mode 100644 docs/system/replay.rst
 create mode 100644 qapi/replay.json
 create mode 100644 replay/replay-debugging.c
 create mode 100644 tests/acceptance/reverse_debugging.py

--
Pavel Dovgalyuk

Comments

Paolo Bonzini Sept. 20, 2020, 7:58 a.m. UTC | #1
On 14/09/20 08:06, Pavel Dovgalyuk wrote:
> GDB remote protocol supports reverse debugging of the targets.
> It includes 'reverse step' and 'reverse continue' operations.
> The first one finds the previous step of the execution,
> and the second one is intended to stop at the last breakpoint that
> would happen when the program is executed normally.
> 
> Reverse debugging is possible in the replay mode, when at least
> one snapshot was created at the record or replay phase.
> QEMU can use these snapshots for travelling back in time with GDB.

I had queued this, it is a very nice patch series.  Unfortunately, the
tests failed on gitlab:

https://gitlab.com/bonzini/qemu/-/jobs/745795080

Paolo
Pavel Dovgalyuk Sept. 21, 2020, 6:03 a.m. UTC | #2
On 20.09.2020 10:58, Paolo Bonzini wrote:
> On 14/09/20 08:06, Pavel Dovgalyuk wrote:
>> GDB remote protocol supports reverse debugging of the targets.
>> It includes 'reverse step' and 'reverse continue' operations.
>> The first one finds the previous step of the execution,
>> and the second one is intended to stop at the last breakpoint that
>> would happen when the program is executed normally.
>>
>> Reverse debugging is possible in the replay mode, when at least
>> one snapshot was created at the record or replay phase.
>> QEMU can use these snapshots for travelling back in time with GDB.
> 
> I had queued this, it is a very nice patch series.  Unfortunately, the
> tests failed on gitlab:
> 
> https://gitlab.com/bonzini/qemu/-/jobs/745795080

There is a strange thing in your environment:

15:49:41 INFO | Downloading/preparing boot image
15:49:42 INFO | Running '/builds/bonzini/qemu/build/qemu-img create -f 
qcow2 -b 
/builds/bonzini/qemu/avocado-cache/by_location/d2a8d6b607afec50de14560c064f34ffd99836b2/Fedora-Cloud-Base-31-1.9.x86_64.qcow2 
/var/tmp/avocado_tj2janfx/avocado_job_ys7ueohj/04-tests_acceptance_boot_linux.py_BootLinuxX8664.test_pc_q35_kvm/Fedora-Cloud-Base-31-1.9.x86_64-d1ac1224.qcow2'


It downloads a boot image, but the test has no such requirement.
And all of this consumes most of the time for the test.

Pavel Dovgalyuk
Pavel Dovgalyuk Sept. 21, 2020, 6:24 a.m. UTC | #3
On 21.09.2020 09:03, Pavel Dovgalyuk wrote:
> On 20.09.2020 10:58, Paolo Bonzini wrote:
>> On 14/09/20 08:06, Pavel Dovgalyuk wrote:
>>> GDB remote protocol supports reverse debugging of the targets.
>>> It includes 'reverse step' and 'reverse continue' operations.
>>> The first one finds the previous step of the execution,
>>> and the second one is intended to stop at the last breakpoint that
>>> would happen when the program is executed normally.
>>>
>>> Reverse debugging is possible in the replay mode, when at least
>>> one snapshot was created at the record or replay phase.
>>> QEMU can use these snapshots for travelling back in time with GDB.
>>
>> I had queued this, it is a very nice patch series.  Unfortunately, the
>> tests failed on gitlab:
>>
>> https://gitlab.com/bonzini/qemu/-/jobs/745795080
> 
> There is a strange thing in your environment:
> 
> 15:49:41 INFO | Downloading/preparing boot image
> 15:49:42 INFO | Running '/builds/bonzini/qemu/build/qemu-img create -f 
> qcow2 -b 
> /builds/bonzini/qemu/avocado-cache/by_location/d2a8d6b607afec50de14560c064f34ffd99836b2/Fedora-Cloud-Base-31-1.9.x86_64.qcow2 
> /var/tmp/avocado_tj2janfx/avocado_job_ys7ueohj/04-tests_acceptance_boot_linux.py_BootLinuxX8664.test_pc_q35_kvm/Fedora-Cloud-Base-31-1.9.x86_64-d1ac1224.qcow2' 
> 
> 
> 
> It downloads boot image, but there is no such requirement in the test.
> And all this stuff consumes most of the time for the test.

Sorry, that was ok. It was the output from the previous test.

For reverse debugging there was a timeout on reading from the socket: 
result = self._socket.recv(REMOTE_MAX_PACKET_SIZE)

Do you have any hints on how to debug such a failure in this environment?

Pavel Dovgalyuk
Pavel Dovgalyuk Sept. 21, 2020, 6:48 a.m. UTC | #4
On 20.09.2020 10:58, Paolo Bonzini wrote:
> On 14/09/20 08:06, Pavel Dovgalyuk wrote:
>> GDB remote protocol supports reverse debugging of the targets.
>> It includes 'reverse step' and 'reverse continue' operations.
>> The first one finds the previous step of the execution,
>> and the second one is intended to stop at the last breakpoint that
>> would happen when the program is executed normally.
>>
>> Reverse debugging is possible in the replay mode, when at least
>> one snapshot was created at the record or replay phase.
>> QEMU can use these snapshots for travelling back in time with GDB.
> 
> I had queued this, it is a very nice patch series.  Unfortunately, the
> tests failed on gitlab:
> 
> https://gitlab.com/bonzini/qemu/-/jobs/745795080

There are other tests that were disabled on gitlab for an unknown reason.

https://patchwork.kernel.org/patch/11636515/
https://patchwork.kernel.org/patch/11701681/

The latter is related to machine_rx_gdbsim.py
Could it be the same avocado/python/etc issue with socket interaction?


Pavel Dovgalyuk
Philippe Mathieu-Daudé Sept. 21, 2020, 7:20 a.m. UTC | #5
On 9/21/20 8:48 AM, Pavel Dovgalyuk wrote:
> On 20.09.2020 10:58, Paolo Bonzini wrote:
>> On 14/09/20 08:06, Pavel Dovgalyuk wrote:
>>> GDB remote protocol supports reverse debugging of the targets.
>>> It includes 'reverse step' and 'reverse continue' operations.
>>> The first one finds the previous step of the execution,
>>> and the second one is intended to stop at the last breakpoint that
>>> would happen when the program is executed normally.
>>>
>>> Reverse debugging is possible in the replay mode, when at least
>>> one snapshot was created at the record or replay phase.
>>> QEMU can use these snapshots for travelling back in time with GDB.
>>
>> I had queued this, it is a very nice patch series.  Unfortunately, the
>> tests failed on gitlab:
>>
>> https://gitlab.com/bonzini/qemu/-/jobs/745795080
> 
> There are other tests that were disabled on gitlab for the unknown reason.
> 
> https://patchwork.kernel.org/patch/11636515/

Unrelated.

> https://patchwork.kernel.org/patch/11701681/
> 
> The latter is related to machine_rx_gdbsim.py

Unrelated, 'gdbsim' is the name of the machine. It is not
using the gdbstub/gdb protocol.

> Could it be the same avocado/python/etc issue with socket interaction?

Yes, likely...

The kludge is to simply add (with an explanation if possible):

  @skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
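
For context, a minimal standalone sketch of how such a decorator is applied.
The class and method names are hypothetical stand-ins for the real acceptance
test, and the standard library's unittest.skipIf is used here so the snippet
runs on its own; the acceptance tests would presumably import skipIf from the
avocado package instead.

```python
import os
import unittest


class ReverseDebugging(unittest.TestCase):
    # Hypothetical stand-in for the real test class.

    # The condition is evaluated once, when the class body is executed.
    @unittest.skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
    def test_reverse_debugging(self):
        # Placeholder body; the real test drives QEMU through gdbstub.
        self.assertTrue(True)
```

When GITLAB_CI is set in the environment the test is reported as skipped with
the given reason; otherwise it runs normally.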
