@@ -9,6 +9,7 @@ testing infrastructure.
main
qtest
+ migration
functional
avocado
acpi-bits
@@ -96,6 +96,18 @@ QTest cases can be executed with
make check-qtest
+Migration
+~~~~~~~~~
+
+Migration tests are part of QTest, but are run independently. Refer
+to :doc:`migration` for more details.
+
+Migration test cases can be executed with
+
+.. code::
+
+ make check-qtest-migration
+
Writing portable test cases
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Both unit tests and qtests can run on POSIX hosts as well as Windows hosts.
new file mode 100644
@@ -0,0 +1,261 @@
+.. _migration:
+
+Migration tests
+===============
+
+Migration tests are part of QTest, but have some particularities of
+their own, such as:
+
+- Extended test time due to the need to exercise the iterative phase
+ of migration;
+- Extra requirements on the QEMU binary being used due to
+ :ref:`cross-version migration <cross-version-tests>`;
+- The use of a custom binary for the guest code to test memory
+ integrity (see :ref:`guest-code`).
+
+Invocation
+----------
+
+Migration tests can be ran with:
+
+.. code::
+
+ make check-qtest
+ make check-qtest-migration
+
+or directly:
+
+.. code::
+
+ # all tests
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m thorough
+
+ # single test
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m thorough -p /x86_64/migration/bad_dest
+
+ # all tests under /multifd (note no trailing slash)
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m thorough -r /x86_64/migration/multifd
+
+for cross-version tests (see :ref:`cross-version-tests`):
+
+.. code::
+
+ # old QEMU -> new QEMU
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 QTEST_QEMU_BINARY_SRC=./old/qemu-system-x86_64 -m thorough ./tests/qtest/migration-test
+ QTEST_QEMU_BINARY_DST=./qemu-system-x86_64 QTEST_QEMU_BINARY=./old/qemu-system-x86_64 -m thorough ./tests/qtest/migration-test
+
+ # new QEMU -> old QEMU (backwards migration)
+ QTEST_QEMU_BINARY_SRC=./qemu-system-x86_64 QTEST_QEMU_BINARY=./old/qemu-system-x86_64 -m thorough ./tests/qtest/migration-test
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 QTEST_QEMU_BINARY_DST=./old/qemu-system-x86_64 -m thorough ./tests/qtest/migration-test
+
+ # both _SRC and _DST variants are supported for convenience
+
+.. _cross-version-tests:
+
+Cross-version tests
+~~~~~~~~~~~~~~~~~~~
+
+To detect compatibility issues between different QEMU versions, all
+migration tests can be executed with two different QEMU versions. The
+common machine type between the two versions is used.
+
+To setup cross-version tests, a previous build of QEMU must be kept,
+e.g.:
+
+.. code::
+
+ # build current code
+ mkdir build
+ cd build
+ ../configure; make
+
+ # build previous version
+ cd ../
+ mkdir build-9.1
+ git checkout v9.1.0
+ cd build
+ ../configure; make
+
+To avoid issues with newly added features and new tests, it is highly
+recommended to run the tests from the source directory of *older*
+version being tested.
+
+.. code::
+
+ ./build/qemu-system-x86_64 --version
+ QEMU emulator version 9.1.50
+
+ ./build-9.1/qemu-system-x86_64 --version
+ QEMU emulator version 9.1.0
+
+ cd build-9.1
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 QTEST_QEMU_BINARY_DST=../build/qemu-system-x86_64 ./tests/qtest/migration-test -m thorough
+
+
+How to write migration tests
+----------------------------
+
+Add a test function (prefixed with ``test_``) that gets registered
+with QTest using the ``migration_test_add*()`` helpers.
+
+.. code::
+
+ migration_test_add("/migration/multifd/tcp/plain/cancel", test_multifd_tcp_cancel);
+
+There is no formal grammar for the definition of the test paths, but
+an informal rule is followed for consistency. Usually:
+
+``/migration/<multifd|precopy|postcopy>/<url type>/<test-specific>/``
+
+Bear in mind that the path string affects test execution order and
+filtering when using the ``-r`` flag.
+
+For simpler tests, the test function can setup the test arguments in
+the ``MigrateCommon`` structure and call into a common test
+routine. Currently there are two common test routines:
+
+ - test_precopy_common - for generic precopy migration
+ - test_file_common - for migration using the file: URL
+
+The general structure of a test routine is:
+
+- call ``migrate_start()`` to initialize the two QEMU
+ instances. Usually named "from", for the source machine and "to" for
+ the destination machine;
+
+- define the migration duration, (roughly speaking either quick or
+ slow) by altering the convergence parameters with
+ ``migrate_ensure[_non]_converge()``;
+
+- wait for the machines to be in the desired state with the ``wait_for_*``
+ helpers;
+
+- migrate with ``migrate_qmp()/migrate_incoming_qmp()/migrate_qmp_fail()``;
+
+- check that guest memory was not corrupted and clean up the QEMU
+ instances with ``migrate_end()``.
+
+If using the common test routines, the ``.start_hook`` and
+``.end_hook`` callbacks can be used to perform test-specific tasks.
+
+.. _guest-code:
+
+About guest code
+----------------
+
+The tests all use a custom, architecture-specific binary as the guest
+code. This code, known as a-b-kernel or a-b-bootblock, constantly
+iterates over the guest memory, writing a number to the start of each
+guest page, incrementing it as it loops around (i.e. a generation
+count). This allows the tests to catch memory corruption errors that
+occur during migration as every page's first byte must have the same
+value, except at the point where the transition happens.
+
+Whenever guest memory is migrated incorrectly, the test will output
+the address and amount of pages that present a value inconsistent with
+the generation count, e.g.:
+
+.. code::
+
+ Memory content inconsistency at d53000 first_byte = 27 last_byte = 26 current = 27 hit_edge = 1
+ Memory content inconsistency at d54000 first_byte = 27 last_byte = 26 current = 27 hit_edge = 1
+ Memory content inconsistency at d55000 first_byte = 27 last_byte = 26 current = 27 hit_edge = 1
+ and in another 4929 pages
+
+In the scenario above,
+
+``first_byte`` shows that the current generation number is 27, therefore
+all pages should have 27 as their first byte. Since ``hit_edge=1``, that
+means the transition point was found, i.e. the guest was stopped for
+migration while not all pages had yet been updated to the new
+generation count. So 26 is also a valid byte to find in some pages.
+
+The inconsistency here is that ``last_byte``, i.e. the previous
+generation count is smaller than the ``current`` byte, which should not
+be possible. This would indicate a memory layout such as:
+
+.. code::
+
+ 0xb00000 | 27 00 00 ...
+ ...
+ 0xc00000 | 27 00 00 ...
+ ...
+ 0xd00000 | 27 00 00 ...
+ 0x?????? | 26 00 00 ... <-- pages around this addr weren't migrated correctly
+ ...
+ 0xd53000 | 27 00 00 ...
+ 0xd54000 | 27 00 00 ...
+ 0xd55000 | 27 00 00 ...
+ ...
+
+The a-b code is located at ``tests/qtest/migration/<arch>``.
+
+Troubleshooting
+---------------
+
+Migration tests usually run as part of make check, which is most
+likely to not have been using the verbose flag, so the first thing to
+check is the test log from meson (``meson-logs/testlog.txt``).
+
+There, look for the last "Running" entry, which will be the current
+test. Notice whether the failing program is one of the QEMU instances
+or the migration-test-* themselves.
+
+E.g.:
+
+.. code::
+
+ # Running /s390x/migration/precopy/unix/plain
+ # Using machine type: s390-ccw-virtio-9.2
+ # starting QEMU: exec ./qemu-system-s390x -qtest ...
+ # starting QEMU: exec ./qemu-system-s390x -qtest ...
+ ----------------------------------- stderr -----------------------------------
+ migration-test: ../tests/qtest/migration-test.c:1712: test_precopy_common: Assertion `0' failed.
+
+ (test program exited with status code -6)
+
+.. code::
+
+ # Running /x86_64/migration/bad_dest
+ # Using machine type: pc-q35-9.2
+ # starting QEMU: exec ./qemu-system-x86_64 -qtest ...
+ # starting QEMU: exec ./qemu-system-x86_64 -qtest ...
+ ----------------------------------- stderr -----------------------------------
+ Broken pipe
+ ../tests/qtest/libqtest.c:205: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
+
+ (test program exited with status code -6)
+
+The above is usually not enough to determine what happened, so
+re-running the test directly is helpful:
+
+.. code::
+
+ QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test -m thorough -p /x86_64/migration/bad_dest
+
+There are also the QTEST_LOG and QTEST_TRACE variables for increased
+logging and tracing.
+
+The QTEST_QEMU_BINARY environment variable can be abused to hook GDB
+or valgrind into the invocation:
+
+.. code::
+
+ QTEST_QEMU_BINARY='gdb -q --ex "set pagination off" --ex "set print thread-events off" \
+ --ex "handle SIGUSR1 noprint" --ex "break <breakpoint>" --ex "run" --ex "quit \$_exitcode" \
+ --args ./qemu-system-x86_64' ./tests/qtest/migration-test -m thorough -p /x86_64/migration/multifd/file/mapped-ram/fdset/dio
+
+.. code::
+
+ QTEST_QEMU_BINARY='valgrind -q --leak-check=full --show-leak-kinds=definite,indirect \
+ ./qemu-system-x86_64' ./tests/qtest/migration-test -m thorough -r /x86_64/migration
+
+Whenever a test fails, it will leave behind a temporary
+directory. This is useful for file migrations to inspect the generated
+migration file:
+
+.. code::
+
+ $ file /tmp/migration-test-X496U2/migfile
+ /tmp/migration-test-X496U2/migfile: QEMU suspend to disk image
+ $ hexdump -C /tmp/migration-test-X496U2/migfile | less
@@ -5,6 +5,7 @@ QTest Device Emulation Testing Framework
.. toctree::
qgraph
+ migration
QTest is a device emulation testing framework. It can be very useful to test
device models; it could also control certain aspects of QEMU (such as virtual
Add documentation about how to write, run and debug migration tests. Signed-off-by: Fabiano Rosas <farosas@suse.de> --- docs/devel/testing/index.rst | 1 + docs/devel/testing/main.rst | 12 ++ docs/devel/testing/migration.rst | 261 +++++++++++++++++++++++++++++++ docs/devel/testing/qtest.rst | 1 + 4 files changed, 275 insertions(+) create mode 100644 docs/devel/testing/migration.rst