[v2,0/7] Live migration acceleration with UADK

Message ID 20240607135310.46320-1-shameerali.kolothum.thodi@huawei.com (mailing list archive)

Message

Shameerali Kolothum Thodi June 7, 2024, 1:53 p.m. UTC
Hi,

v1 --> v2
(v1: https://lore.kernel.org/qemu-devel/20240529094435.11140-1-shameerali.kolothum.thodi@huawei.com/)

-Rebased on top of Intel IAA v7 series[0].
-Addressed comments from Fabiano. Thanks.
-Gathered tags received.

Please take a look and let me know your feedback.

Thanks,
Shameer
[0] https://lore.kernel.org/qemu-devel/20240603154106.764378-1-yuan1.liu@intel.com/

---

This series adds support for UADK library based hardware acceleration
for live migration. UADK[0] is a general-purpose user space accelerator
framework that uses shared virtual addressing (SVA) to provide a unified
programming interface for hardware acceleration of cryptographic and
compression algorithms.

UADK makes use of the UACCE (Unified/User-space-access-intended Accelerator
Framework) Linux kernel module, which enables hardware accelerators from
different vendors that support SVA to adapt to UADK. The Linux kernel has
supported UACCE and SVA on ARM64 platforms since v5.9.

Currently, HiSilicon Kunpeng hardware accelerators are registered with
UACCE, and the Zip accelerator on these platforms can be used for compression,
which can free up CPU computing power and improve computing performance.

This series is on top of Intel IAA accelerator live migration support
series[1] from Yuan Liu. Many thanks for doing this.

Initial tests were carried out on HiSilicon D06 platforms and the results
are as below:

Test setup: HiSilicon D06 boards connected over a 1 Gbps network.
Host Kernel: 6.7.0 mainline kernel.
Guest VM: 64 CPUs, 16 GB memory, hugepages, prealloc=on (80% of memory
          filled with random data)

 +--------+-------------+--------+--------+----------+------+
 |        | The number  |total   |downtime|pages per | CPU  |
 | None   | of channels |time(ms)|(ms)    |second    | Util |
 | Comp   |             |        |        |          |      |
 |        +-------------+--------+--------+----------+------+
 |Network |            2|  114536|      79|   32849  |   18%|
 |BW:  1G +-------------+--------+--------+----------+------+
 |        |            4|  114327|      78|   34217  |   22%|
 |        +-------------+--------+--------+----------+------+
 |        |            8|  114231|     107|   211840 |   24%|
 +--------+-------------+--------+--------+----------+------+
 
 +--------+-------------+--------+--------+----------+------+
 |        | The number  |total   |downtime|pages per | CPU  |
 | UADK   | of channels |time(ms)|(ms)    |second    | Util |
 | Comp   |             |        |        |          |      |
 |        +-------------+--------+--------+----------+------+
 |Network |            2|  77192 |      75|   182679 |   24%|
 |BW:  1G +-------------+--------+--------+----------+------+
 |        |            4|  77000 |      86|   185600 |   25%|
 |        +-------------+--------+--------+----------+------+
 |        |            8|  76835 |      97|   330966 |   27%|
 +--------+-------------+--------+--------+----------+------+
 
 +--------+-------------+--------+--------+----------+------+
 |        | The number  |total   |downtime|pages per | CPU  |
 | ZLIB   | of channels |time(ms)|(ms)    |second    | Util |
 | Comp   |             |        |        |          |      |
 |        +-------------+--------+--------+----------+------+
 |Network |            2|  134664|      73|   42666  |  200%|
 |BW:  1G +-------------+--------+--------+----------+------+
 |        |            4|  71550 |      72|   181227 |  390%|
 |        +-------------+--------+--------+----------+------+
 |        |            8|  67781 |     108|   200960 |  460%|
 +--------+-------------+--------+--------+----------+------+
 
 +--------+-------------+--------+--------+----------+------+
 |        | The number  |total   |downtime|pages per | CPU  |
 | ZSTD   | of channels |time(ms)|(ms)    |second    | Util |
 | Comp   |             |        |        |          |      |
 |        +-------------+--------+--------+----------+------+
 |Network |            2|  67822 |      73|   202772 |  160%|
 |BW:  1G +-------------+--------+--------+----------+------+
 |        |            4|  67460 |     107|   198400 |  180%|
 |        +-------------+--------+--------+----------+------+
 |        |            8|  67422 |      83|   349808 |  215%|
 +--------+-------------+--------+--------+----------+------+
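
For reference, the two knobs varied across the tables above (number of
multifd channels and compression method) can be sketched as QMP commands.
This is only an illustration: the helper name multifd_setup_commands is
hypothetical, and "uadk" is assumed to be the multifd-compression value
this series adds alongside the none/zlib/zstd modes measured above.

```python
import json


def multifd_setup_commands(channels, compression):
    """Build QMP commands that enable multifd and select a compression method."""
    return [
        {
            "execute": "migrate-set-capabilities",
            "arguments": {
                "capabilities": [{"capability": "multifd", "state": True}]
            },
        },
        {
            "execute": "migrate-set-parameters",
            "arguments": {
                "multifd-channels": channels,
                "multifd-compression": compression,
            },
        },
    ]


# "uadk" is an assumed value name; the tables also cover none, zlib and zstd.
for cmd in multifd_setup_commands(channels=4, compression="uadk"):
    print(json.dumps(cmd))
```

Each printed line is one JSON command that could be sent over a QMP
socket on both source and destination before starting the migration.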

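From the above results, UADK has considerable CPU cycle savings
compared to both Zlib and Zstd: CPU utilisation stays at 24-27% with UADK,
against 200-460% with Zlib and 160-215% with Zstd. Compared with the QEMU
"multifd-compression none" mode, UADK also has an edge on migration
"total time" (about 77 s against about 114 s).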
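
The comparison can be reproduced from the table values with a few lines
of arithmetic. The sketch below only restates the numbers reported above;
the helper names time_saving_vs_none and cpu_ratio are made up for
illustration.

```python
# (total time in ms, CPU utilisation in %) per channel count, copied from the tables.
results = {
    "none": {2: (114536, 18), 4: (114327, 22), 8: (114231, 24)},
    "uadk": {2: (77192, 24), 4: (77000, 25), 8: (76835, 27)},
    "zlib": {2: (134664, 200), 4: (71550, 390), 8: (67781, 460)},
    "zstd": {2: (67822, 160), 4: (67460, 180), 8: (67422, 215)},
}


def time_saving_vs_none(mode, channels):
    """Percentage reduction in total migration time relative to no compression."""
    base = results["none"][channels][0]
    return 100.0 * (base - results[mode][channels][0]) / base


def cpu_ratio(mode, channels):
    """CPU utilisation of `mode` as a multiple of UADK's at the same channel count."""
    return results[mode][channels][1] / results["uadk"][channels][1]


for ch in (2, 4, 8):
    print(
        f"{ch} channels: UADK total time {time_saving_vs_none('uadk', ch):.1f}% "
        f"lower than none; zlib uses {cpu_ratio('zlib', ch):.1f}x and "
        f"zstd {cpu_ratio('zstd', ch):.1f}x the CPU of UADK"
    )
```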

[0] https://github.com/Linaro/uadk/tree/master/docs
[1] https://lore.kernel.org/qemu-devel/20240505165751.2392198-1-yuan1.liu@intel.com/

Shameer Kolothum (7):
  docs/migration: add uadk compression feature
  configure: Add uadk option
  migration/multifd: add uadk compression framework
  migration/multifd: Add UADK initialization
  migration/multifd: Add UADK based compression and decompression
  migration/multifd: Switch to no compression when no hardware support
  tests/migration-test: add uadk compression test

 docs/devel/migration/features.rst         |   1 +
 docs/devel/migration/uadk-compression.rst | 144 +++++++++
 hw/core/qdev-properties-system.c          |   2 +-
 meson.build                               |  14 +
 meson_options.txt                         |   2 +
 migration/meson.build                     |   1 +
 migration/multifd-uadk.c                  | 369 ++++++++++++++++++++++
 migration/multifd.h                       |   5 +-
 qapi/migration.json                       |   5 +-
 scripts/meson-buildoptions.sh             |   3 +
 tests/qtest/migration-test.c              |  23 ++
 11 files changed, 565 insertions(+), 4 deletions(-)
 create mode 100644 docs/devel/migration/uadk-compression.rst
 create mode 100644 migration/multifd-uadk.c

Comments

Fabiano Rosas June 14, 2024, 1:01 p.m. UTC | #1
On Fri, 07 Jun 2024 14:53:03 +0100, Shameer Kolothum wrote:
> v1 --> v2
> (v1: https://lore.kernel.org/qemu-devel/20240529094435.11140-1-shameerali.kolothum.thodi@huawei.com/)
> 
> -Rebased on top of Intel IAA v7 series[0].
> -Addressed comments from Fabiano. Thanks.
> -Gathered tags received.
> 
> [...]

Queued, thanks!