From patchwork Fri Jan 20 13:47:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniil Tatianin <d-tatianin@yandex-team.ru>
X-Patchwork-Id: 13109895
Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id 32F20C05027
	for <qemu-devel@archiver.kernel.org>; Fri, 20 Jan 2023 13:57:25 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces@nongnu.org>)
	id 1pIrs9-0007nM-6J; Fri, 20 Jan 2023 08:56:01 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs5-0007mJ-1n
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:57 -0500
Received: from forwardcorp1a.mail.yandex.net ([178.154.239.72])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs2-0001n6-Ow
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:56 -0500
Received: from vla5-b2806cb321eb.qloud-c.yandex.net
 (vla5-b2806cb321eb.qloud-c.yandex.net
 [IPv6:2a02:6b8:c18:3e0d:0:640:b280:6cb3])
 by forwardcorp1a.mail.yandex.net (Yandex) with ESMTP id F10AD5FD79;
 Fri, 20 Jan 2023 16:48:04 +0300 (MSK)
Received: from d-tatianin-nix.yandex.net (unknown
 [2a02:6b8:0:419:8f3f:2197:162b:4096])
 by vla5-b2806cb321eb.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id
 wlngUQ0WWiE1-uHMgKX7j; Fri, 20 Jan 2023 16:48:04 +0300
Precedence: bulk
X-Yandex-Fwd: 1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru;
 s=default;
 t=1674222484; bh=o94S0ghzsViae90R7vBqSqgRMFVyIdapQaUB50thZWs=;
 h=Message-Id:Date:In-Reply-To:Cc:Subject:References:To:From;
 b=DTQ7LCKX+li9YBbXe/JPabqkBBInDy8blSMswuSHY9G46Caii/XhZZZ9CG3X2wPqT
 JaJsetCprMqXgyyQ2PjdhAVcB5d6b2SD1udUxdBnlqVlVeogbT4BQwhcy9bNGeaqCR
 2F3hVLA+/kN2Oa1MEG6Poy5HSFq5caTFUBqYluhw=
Authentication-Results: vla5-b2806cb321eb.qloud-c.yandex.net;
 dkim=pass header.i=@yandex-team.ru
From: Daniil Tatianin <d-tatianin@yandex-team.ru>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniil Tatianin <d-tatianin@yandex-team.ru>, qemu-devel@nongnu.org,
 Stefan Weil <sw@weilnetz.de>, David Hildenbrand <david@redhat.com>,
 Igor Mammedov <imammedo@redhat.com>, yc-core@yandex-team.ru
Subject: [PATCH 1/4] oslib: introduce new qemu_prealloc_mem_with_timeout() api
Date: Fri, 20 Jan 2023 16:47:46 +0300
Message-Id: <20230120134749.550639-2-d-tatianin@yandex-team.ru>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
References: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
MIME-Version: 1.0
Received-SPF: pass client-ip=178.154.239.72;
 envelope-from=d-tatianin@yandex-team.ru; helo=forwardcorp1a.mail.yandex.net
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

This helper allows limiting the maximum amount of time spent
preallocating a block of memory. This is important on systems where
page allocation delays are unpredictable, for example because of
fragmentation or other backend-specific reasons.

It also provides a way to register a callback that is invoked if the
specified timeout is exceeded. The callback receives a PreallocStats
structure describing the progress so far: the total and allocated
number of pages, the page size, the number of allocation threads, and
the elapsed time in seconds.

The win32 implementation is currently a stub that just calls the old
qemu_prealloc_mem() API.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
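Note: a minimal sketch of how a caller might use the callback interface
described above. PreallocStats and PreallocTimeout are copied from the
osdep.h hunk in this patch; DemoState, demo_on_timeout() and
demo_report_timeout() are hypothetical names that are not part of the
patch. demo_report_timeout() only mimics the reporting step of
timed_join_memset_threads(); it does not preallocate anything.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Copied from the osdep.h hunk in this patch. */
typedef struct PreallocStats {
    size_t page_size;
    size_t total_pages;
    size_t allocated_pages;
    int threads;
    time_t seconds_elapsed;
} PreallocStats;

typedef struct PreallocTimeout {
    time_t seconds;
    void *user;
    void (*on_timeout)(void *user, const PreallocStats *stats);
} PreallocTimeout;

/* Hypothetical caller state, not part of the patch. */
typedef struct DemoState {
    int timed_out;
    size_t missing_bytes;
} DemoState;

/* Hypothetical callback: record how much memory is still unpopulated. */
static void demo_on_timeout(void *user, const PreallocStats *stats)
{
    DemoState *s = user;

    s->timed_out = 1;
    s->missing_bytes = (stats->total_pages - stats->allocated_pages) *
                       stats->page_size;
}

/*
 * Hypothetical stand-in for the reporting step: invoke the callback only
 * if one is registered and some pages are still missing.
 */
static void demo_report_timeout(const PreallocTimeout *timeout,
                                PreallocStats *stats)
{
    assert(timeout && stats);

    if (timeout->on_timeout && stats->allocated_pages < stats->total_pages) {
        stats->seconds_elapsed = timeout->seconds;
        timeout->on_timeout(timeout->user, stats);
    }
}
```

In QEMU itself the PreallocTimeout would be passed to
qemu_prealloc_mem_with_timeout(); passing NULL keeps the old behaviour,
which is what qemu_prealloc_mem() does after this patch.
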
 include/qemu/osdep.h |  19 ++++++++
 util/oslib-posix.c   | 114 +++++++++++++++++++++++++++++++++++++++----
 util/oslib-win32.c   |   9 ++++
 3 files changed, 133 insertions(+), 9 deletions(-)
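
Note: the timed join used in util/oslib-posix.c can be illustrated
outside of QEMU. The sketch below is a standalone model under stated
assumptions, not the patch's code: demo_worker(), demo_timed_join() and
demo_run() are hypothetical names, and plain pthreads stand in for
QemuThread. It relies on pthread_timedjoin_np(), a GNU extension that
takes an absolute CLOCK_REALTIME deadline, so all threads share a single
deadline instead of each getting a fresh timeout.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical worker: sleeps for the number of seconds passed in arg. */
static void *demo_worker(void *arg)
{
    unsigned int secs = (unsigned int)(uintptr_t)arg;

    sleep(secs);            /* sleep() is a cancellation point */
    return NULL;
}

/*
 * Join up to n threads against one shared absolute deadline.
 * Returns the number of threads joined before the deadline passed.
 */
static int demo_timed_join(pthread_t *threads, int n, time_t seconds)
{
    struct timespec ts;
    int i = 0;

    if (clock_gettime(CLOCK_REALTIME, &ts) < 0) {
        return i;
    }
    ts.tv_sec += seconds;

    for (; i < n; i++) {
        if (pthread_timedjoin_np(threads[i], NULL, &ts)) {
            break;          /* deadline passed (or another error) */
        }
    }
    return i;
}

/*
 * Start one fast and one slow worker, wait at most `seconds`, then cancel
 * and reap the stragglers. Returns how many workers finished in time.
 */
static int demo_run(time_t seconds)
{
    pthread_t threads[2];
    unsigned int delays[2] = { 0, 30 };
    int i, joined;

    for (i = 0; i < 2; i++) {
        int rc = pthread_create(&threads[i], NULL, demo_worker,
                                (void *)(uintptr_t)delays[i]);
        assert(rc == 0);
    }

    joined = demo_timed_join(threads, 2, seconds);

    for (i = joined; i < 2; i++) {
        void *ret;

        pthread_cancel(threads[i]);
        pthread_join(threads[i], &ret);
        assert(ret == PTHREAD_CANCELED);
    }
    return joined;
}
```

As in the patch, threads that miss the deadline are cancelled first and
joined afterwards, and a PTHREAD_CANCELED result is not an error.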

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index bd23a08595..21757e5144 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -595,6 +595,25 @@ typedef struct ThreadContext ThreadContext;
 void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
                        ThreadContext *tc, Error **errp);
 
+typedef struct PreallocStats {
+    size_t page_size;
+    size_t total_pages;
+    size_t allocated_pages;
+    int threads;
+    time_t seconds_elapsed;
+} PreallocStats;
+
+typedef struct PreallocTimeout {
+    time_t seconds;
+    void *user;
+    void (*on_timeout)(void *user, const PreallocStats *stats);
+} PreallocTimeout;
+
+void qemu_prealloc_mem_with_timeout(int fd, char *area, size_t sz,
+                                    int max_threads, ThreadContext *tc,
+                                    const PreallocTimeout *timeout,
+                                    Error **errp);
+
 /**
  * qemu_get_pid_name:
  * @pid: pid of a process
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 59a891b6a8..570fca601f 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -74,6 +74,7 @@ typedef struct MemsetContext {
     bool any_thread_failed;
     struct MemsetThread *threads;
     int num_threads;
+    PreallocStats stats;
 } MemsetContext;
 
 struct MemsetThread {
@@ -83,6 +84,7 @@ struct MemsetThread {
     QemuThread pgthread;
     sigjmp_buf env;
     MemsetContext *context;
+    size_t touched_pages;
 };
 typedef struct MemsetThread MemsetThread;
 
@@ -373,6 +375,7 @@ static void *do_touch_pages(void *arg)
              */
             *(volatile char *)addr = *addr;
             addr += hpagesize;
+            qatomic_inc(&memset_args->touched_pages);
         }
     }
     pthread_sigmask(SIG_SETMASK, &oldset, NULL);
@@ -396,6 +399,11 @@ static void *do_madv_populate_write_pages(void *arg)
     if (size && qemu_madvise(addr, size, QEMU_MADV_POPULATE_WRITE)) {
         ret = -errno;
     }
+
+    if (!ret) {
+        qatomic_set(&memset_args->touched_pages, memset_args->numpages);
+    }
+
     return (void *)(uintptr_t)ret;
 }
 
@@ -418,8 +426,68 @@ static inline int get_memset_num_threads(size_t hpagesize, size_t numpages,
     return ret;
 }
 
+static int do_join_memset_threads_with_timeout(MemsetContext *context,
+                                               time_t seconds)
+{
+    struct timespec ts;
+    int i = 0;
+
+    if (clock_gettime(CLOCK_REALTIME, &ts) < 0) {
+        return i;
+    }
+    ts.tv_sec += seconds;
+
+    for (; i < context->num_threads; ++i) {
+        if (pthread_timedjoin_np(context->threads[i].pgthread.thread,
+                                 NULL, &ts)) {
+            break;
+        }
+    }
+
+    return i;
+}
+
+static void memset_stats_count_pages(MemsetContext *context)
+{
+    int i;
+
+    for (i = 0; i < context->num_threads; ++i) {
+        size_t pages = qatomic_load_acquire(
+                            &context->threads[i].touched_pages);
+        context->stats.allocated_pages += pages;
+    }
+}
+
+static int timed_join_memset_threads(MemsetContext *context,
+                                     const PreallocTimeout *timeout)
+{
+    int i, off;
+    PreallocStats *stats = &context->stats;
+    off = do_join_memset_threads_with_timeout(context, timeout->seconds);
+
+    if (off != context->num_threads && timeout->on_timeout) {
+        memset_stats_count_pages(context);
+
+        /*
+         * Guard against possible races if preallocation finishes right
+         * after the timeout is exceeded.
+         */
+        if (stats->allocated_pages < stats->total_pages) {
+            stats->seconds_elapsed = timeout->seconds;
+            timeout->on_timeout(timeout->user, stats);
+        }
+    }
+
+    for (i = off; i < context->num_threads; ++i) {
+        pthread_cancel(context->threads[i].pgthread.thread);
+    }
+
+    return off;
+}
+
 static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
                            int max_threads, ThreadContext *tc,
+                           const PreallocTimeout *timeout,
                            bool use_madv_populate_write)
 {
     static gsize initialized = 0;
@@ -452,6 +520,9 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
     }
 
     context.threads = g_new0(MemsetThread, context.num_threads);
+    context.stats.page_size = hpagesize;
+    context.stats.total_pages = numpages;
+    context.stats.threads = context.num_threads;
     numpages_per_thread = numpages / context.num_threads;
     leftover = numpages % context.num_threads;
     for (i = 0; i < context.num_threads; i++) {
@@ -481,11 +552,20 @@ static int touch_all_pages(char *area, size_t hpagesize, size_t numpages,
     qemu_cond_broadcast(&page_cond);
     qemu_mutex_unlock(&page_mutex);
 
-    for (i = 0; i < context.num_threads; i++) {
-        int tmp = (uintptr_t)qemu_thread_join(&context.threads[i].pgthread);
+    if (timeout) {
+        i = timed_join_memset_threads(&context, timeout);
+
+        if (i != context.num_threads &&
+            context.stats.allocated_pages != context.stats.total_pages) {
+            ret = -ETIMEDOUT;
+        }
+    }
+
+    for (; i < context.num_threads; i++) {
+        void *thread_ret = qemu_thread_join(&context.threads[i].pgthread);
 
-        if (tmp) {
-            ret = tmp;
+        if (thread_ret && thread_ret != PTHREAD_CANCELED) {
+            ret = (uintptr_t)thread_ret;
         }
     }
 
@@ -503,8 +583,10 @@ static bool madv_populate_write_possible(char *area, size_t pagesize)
            errno != EINVAL;
 }
 
-void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
-                       ThreadContext *tc, Error **errp)
+void qemu_prealloc_mem_with_timeout(int fd, char *area, size_t sz,
+                                    int max_threads, ThreadContext *tc,
+                                    const PreallocTimeout *timeout,
+                                    Error **errp)
 {
     static gsize initialized;
     int ret;
@@ -546,10 +628,18 @@ void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
 
     /* touch pages simultaneously */
     ret = touch_all_pages(area, hpagesize, numpages, max_threads, tc,
-                          use_madv_populate_write);
+                          timeout, use_madv_populate_write);
+
     if (ret) {
-        error_setg_errno(errp, -ret,
-                         "qemu_prealloc_mem: preallocating memory failed");
+        const char *msg;
+
+        if (timeout && ret == -ETIMEDOUT) {
+            msg = "preallocation timed out";
+        } else {
+            msg = "preallocating memory failed";
+        }
+
+        error_setg_errno(errp, -ret, "qemu_prealloc_mem: %s", msg);
     }
 
     if (!use_madv_populate_write) {
@@ -563,6 +653,12 @@ void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
     }
 }
 
+void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
+                       ThreadContext *tc, Error **errp)
+{
+    qemu_prealloc_mem_with_timeout(fd, area, sz, max_threads, tc, NULL, errp);
+}
+
 char *qemu_get_pid_name(pid_t pid)
 {
     char *name = NULL;
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index 07ade41800..27f39ef66a 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -276,6 +276,15 @@ void qemu_prealloc_mem(int fd, char *area, size_t sz, int max_threads,
     }
 }
 
+void qemu_prealloc_mem_with_timeout(int fd, char *area, size_t sz,
+                                    int max_threads, ThreadContext *tc,
+                                    const PreallocTimeout *timeout,
+                                    Error **errp)
+{
+    /* FIXME: actually implement timing out here */
+    qemu_prealloc_mem(fd, area, sz, max_threads, tc, errp);
+}
+
 char *qemu_get_pid_name(pid_t pid)
 {
     /* XXX Implement me */

From patchwork Fri Jan 20 13:47:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniil Tatianin <d-tatianin@yandex-team.ru>
X-Patchwork-Id: 13109894
Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id E126DC38159
	for <qemu-devel@archiver.kernel.org>; Fri, 20 Jan 2023 13:57:14 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces@nongnu.org>)
	id 1pIrs3-0007li-If; Fri, 20 Jan 2023 08:55:55 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs1-0007lC-JX
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:53 -0500
Received: from forwardcorp1a.mail.yandex.net
 ([2a02:6b8:c0e:500:1:45:d181:df01])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrrz-0001mq-7b
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:53 -0500
Received: from vla5-b2806cb321eb.qloud-c.yandex.net
 (vla5-b2806cb321eb.qloud-c.yandex.net
 [IPv6:2a02:6b8:c18:3e0d:0:640:b280:6cb3])
 by forwardcorp1a.mail.yandex.net (Yandex) with ESMTP id 7F3C45FD7A;
 Fri, 20 Jan 2023 16:48:06 +0300 (MSK)
Received: from d-tatianin-nix.yandex.net (unknown
 [2a02:6b8:0:419:8f3f:2197:162b:4096])
 by vla5-b2806cb321eb.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id
 wlngUQ0WWiE1-JpDStUHs; Fri, 20 Jan 2023 16:48:05 +0300
Precedence: bulk
X-Yandex-Fwd: 1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru;
 s=default;
 t=1674222485; bh=0SRBKZnysXyosxKSj1Sfn7uJHGrM42Cqy1GYWxRXmZ4=;
 h=Message-Id:Date:In-Reply-To:Cc:Subject:References:To:From;
 b=N1W66G29Vv1IVDcILYVKa2vHQp5I0QLHT92PQg8M+mMEVyGUN8j9MEYrRoQHBiCtC
 OVnLozDzG28W73BzwWTRKs/o4s9/frHeRPTOWXQOldecpThzkA3BEJOgkONWD6T2JE
 JKXxKISrXJK+WgTzySnttBJaTSGwZy+5gpC6hxHM=
Authentication-Results: vla5-b2806cb321eb.qloud-c.yandex.net;
 dkim=pass header.i=@yandex-team.ru
From: Daniil Tatianin <d-tatianin@yandex-team.ru>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniil Tatianin <d-tatianin@yandex-team.ru>, qemu-devel@nongnu.org,
 Stefan Weil <sw@weilnetz.de>, David Hildenbrand <david@redhat.com>,
 Igor Mammedov <imammedo@redhat.com>, yc-core@yandex-team.ru
Subject: [PATCH 2/4] backends/hostmem: move memory region preallocation logic
 into a helper
Date: Fri, 20 Jan 2023 16:47:47 +0300
Message-Id: <20230120134749.550639-3-d-tatianin@yandex-team.ru>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
References: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
MIME-Version: 1.0
Received-SPF: pass client-ip=2a02:6b8:c0e:500:1:45:d181:df01;
 envelope-from=d-tatianin@yandex-team.ru; helo=forwardcorp1a.mail.yandex.net

...so that we don't have to duplicate it in multiple places throughout
the file.

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
 backends/hostmem.c | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 747e7838c0..842bfa9eb7 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -216,10 +216,26 @@ static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
     return backend->prealloc;
 }
 
+static bool do_prealloc_mr(HostMemoryBackend *backend, Error **errp)
+{
+    Error *local_err = NULL;
+    int fd = memory_region_get_fd(&backend->mr);
+    void *ptr = memory_region_get_ram_ptr(&backend->mr);
+    uint64_t sz = memory_region_size(&backend->mr);
+
+    qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
+                      backend->prealloc_context, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return false;
+    }
+
+    return true;
+}
+
 static void host_memory_backend_set_prealloc(Object *obj, bool value,
                                              Error **errp)
 {
-    Error *local_err = NULL;
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
 
     if (!backend->reserve && value) {
@@ -233,17 +249,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
     }
 
     if (value && !backend->prealloc) {
-        int fd = memory_region_get_fd(&backend->mr);
-        void *ptr = memory_region_get_ram_ptr(&backend->mr);
-        uint64_t sz = memory_region_size(&backend->mr);
-
-        qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
-                          backend->prealloc_context, &local_err);
-        if (local_err) {
-            error_propagate(errp, local_err);
-            return;
-        }
-        backend->prealloc = true;
+        backend->prealloc = do_prealloc_mr(backend, errp);
     }
 }
 
@@ -399,12 +405,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
          * specified NUMA policy in place.
          */
         if (backend->prealloc) {
-            qemu_prealloc_mem(memory_region_get_fd(&backend->mr), ptr, sz,
-                              backend->prealloc_threads,
-                              backend->prealloc_context, &local_err);
-            if (local_err) {
-                goto out;
-            }
+            do_prealloc_mr(backend, errp);
+            return;
         }
     }
 out:

From patchwork Fri Jan 20 13:47:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniil Tatianin <d-tatianin@yandex-team.ru>
X-Patchwork-Id: 13109892
Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id 6C25CC05027
	for <qemu-devel@archiver.kernel.org>; Fri, 20 Jan 2023 13:57:13 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces@nongnu.org>)
	id 1pIrs7-0007mx-Sl; Fri, 20 Jan 2023 08:55:59 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs5-0007mI-1o
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:57 -0500
Received: from forwardcorp1a.mail.yandex.net ([178.154.239.72])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs2-0001n8-Sp
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:56 -0500
Received: from vla5-b2806cb321eb.qloud-c.yandex.net
 (vla5-b2806cb321eb.qloud-c.yandex.net
 [IPv6:2a02:6b8:c18:3e0d:0:640:b280:6cb3])
 by forwardcorp1a.mail.yandex.net (Yandex) with ESMTP id D25545FD7B;
 Fri, 20 Jan 2023 16:48:07 +0300 (MSK)
Received: from d-tatianin-nix.yandex.net (unknown
 [2a02:6b8:0:419:8f3f:2197:162b:4096])
 by vla5-b2806cb321eb.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id
 wlngUQ0WWiE1-kiz6Q4ar; Fri, 20 Jan 2023 16:48:07 +0300
Precedence: bulk
X-Yandex-Fwd: 1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru;
 s=default;
 t=1674222487; bh=lYyT3PU2ejZSt9ohAhPKNh+ge3LC9xuJBtLV6pXra7w=;
 h=Message-Id:Date:In-Reply-To:Cc:Subject:References:To:From;
 b=qAKuDeXbi7Q3aI5NqioBcR7DJ5d36hEBM6PF7zwmk7BKFlqe1fNtj5ZAEO8XwehDA
 bqJhJbOIqL5mGks4csPVl9flg9qPmCYTcwN2A2U31UUDz8X3NFvlkwSJwSGmm6xq2c
 n+lch5yoBRZ6XsKceJlb6cqjfKjmOek/LuSma5xg=
Authentication-Results: vla5-b2806cb321eb.qloud-c.yandex.net;
 dkim=pass header.i=@yandex-team.ru
From: Daniil Tatianin <d-tatianin@yandex-team.ru>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniil Tatianin <d-tatianin@yandex-team.ru>, qemu-devel@nongnu.org,
 Stefan Weil <sw@weilnetz.de>, David Hildenbrand <david@redhat.com>,
 Igor Mammedov <imammedo@redhat.com>, yc-core@yandex-team.ru
Subject: [PATCH 3/4] backends/hostmem: add an ability to specify prealloc
 timeout
Date: Fri, 20 Jan 2023 16:47:48 +0300
Message-Id: <20230120134749.550639-4-d-tatianin@yandex-team.ru>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
References: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
MIME-Version: 1.0
Received-SPF: pass client-ip=178.154.239.72;
 envelope-from=d-tatianin@yandex-team.ru; helo=forwardcorp1a.mail.yandex.net

Use the new qemu_prealloc_mem_with_timeout() API so that we can limit
the maximum amount of time spent preallocating guest RAM. The timeout
handler also emits a warning that tells the user the timeout was
exceeded and reports the preallocation progress at that point.

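The timeout defaults to zero (no timeout) and can be configured in
seconds via the new 'prealloc-timeout' property. If the timeout fires,
the resulting error is dropped, so a timed-out preallocation is not
treated as a failure.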

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
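Note: a small sketch of the warning text produced by the timeout handler
in this patch. The format string and the PreallocStats layout are taken
from the series; demo_format_timeout_warning() is a hypothetical helper
that writes into a buffer instead of calling warn_report(), so the
example has no QEMU dependencies.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Copied from the osdep.h hunk in patch 1/4. */
typedef struct PreallocStats {
    size_t page_size;
    size_t total_pages;
    size_t allocated_pages;
    int threads;
    time_t seconds_elapsed;
} PreallocStats;

/*
 * Hypothetical helper: format the progress message into buf.
 * Returns what snprintf() returns, i.e. the untruncated length.
 */
static int demo_format_timeout_warning(char *buf, size_t len,
                                       const PreallocStats *stats)
{
    return snprintf(buf, len,
                    "HostMemory preallocation timeout %" PRIu64 "s exceeded, "
                    "allocated %zu/%zu (%zu byte) pages (%d threads)",
                    (uint64_t)stats->seconds_elapsed, stats->allocated_pages,
                    stats->total_pages, stats->page_size, stats->threads);
}
```

The cast to uint64_t mirrors the patch: time_t has no portable printf
format specifier, so it is widened before printing.
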
 backends/hostmem.c       | 48 ++++++++++++++++++++++++++++++++++++++--
 include/sysemu/hostmem.h |  2 ++
 qapi/qom.json            |  4 ++++
 3 files changed, 52 insertions(+), 2 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index 842bfa9eb7..be9af7515e 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -34,6 +34,19 @@ QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_BIND != MPOL_BIND);
 QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
 #endif
 
+static void
+host_memory_on_prealloc_timeout(void *opaque,
+                                const PreallocStats *stats)
+{
+    HostMemoryBackend *backend = opaque;
+
+    backend->prealloc_did_timeout = true;
+    warn_report("HostMemory preallocation timeout %"PRIu64"s exceeded, "
+                "allocated %zu/%zu (%zu byte) pages (%d threads)",
+                (uint64_t)stats->seconds_elapsed, stats->allocated_pages,
+                stats->total_pages, stats->page_size, stats->threads);
+}
+
 char *
 host_memory_backend_get_name(HostMemoryBackend *backend)
 {
@@ -223,8 +236,26 @@ static bool do_prealloc_mr(HostMemoryBackend *backend, Error **errp)
     void *ptr = memory_region_get_ram_ptr(&backend->mr);
     uint64_t sz = memory_region_size(&backend->mr);
 
-    qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
-                      backend->prealloc_context, &local_err);
+    if (backend->prealloc_timeout) {
+        PreallocTimeout timeout = {
+            .seconds = (time_t)backend->prealloc_timeout,
+            .user = backend,
+            .on_timeout = host_memory_on_prealloc_timeout,
+        };
+
+        qemu_prealloc_mem_with_timeout(fd, ptr, sz, backend->prealloc_threads,
+                                       backend->prealloc_context, &timeout,
+                                       &local_err);
+        if (local_err && backend->prealloc_did_timeout) {
+            error_free(local_err);
+            local_err = NULL;
+        }
+    } else {
+        qemu_prealloc_mem(fd, ptr, sz, backend->prealloc_threads,
+                          backend->prealloc_context, &local_err);
+    }
+
+
     if (local_err) {
         error_propagate(errp, local_err);
         return false;
@@ -277,6 +308,13 @@ static void host_memory_backend_set_prealloc_threads(Object *obj, Visitor *v,
     backend->prealloc_threads = value;
 }
 
+static void host_memory_backend_get_set_prealloc_timeout(Object *obj,
+    Visitor *v, const char *name, void *opaque, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+    visit_type_uint32(v, name, &backend->prealloc_timeout, errp);
+}
+
 static void host_memory_backend_init(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -516,6 +554,12 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
         object_property_allow_set_link, OBJ_PROP_LINK_STRONG);
     object_class_property_set_description(oc, "prealloc-context",
         "Context to use for creating CPU threads for preallocation");
+    object_class_property_add(oc, "prealloc-timeout", "int",
+        host_memory_backend_get_set_prealloc_timeout,
+        host_memory_backend_get_set_prealloc_timeout,
+        NULL, NULL);
+    object_class_property_set_description(oc, "prealloc-timeout",
+        "Maximum memory preallocation timeout in seconds");
     object_class_property_add(oc, "size", "int",
         host_memory_backend_get_size,
         host_memory_backend_set_size,
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 39326f1d4f..21910f3b45 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -66,7 +66,9 @@ struct HostMemoryBackend {
     uint64_t size;
     bool merge, dump, use_canonical_path;
     bool prealloc, is_mapped, share, reserve;
+    bool prealloc_did_timeout;
     uint32_t prealloc_threads;
+    uint32_t prealloc_timeout;
     ThreadContext *prealloc_context;
     DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
     HostMemPolicy policy;
diff --git a/qapi/qom.json b/qapi/qom.json
index 30e76653ad..9149c064b8 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -581,6 +581,9 @@
 # @prealloc-context: thread context to use for creation of preallocation threads
 #                    (default: none) (since 7.2)
 #
+# @prealloc-timeout: Maximum memory preallocation timeout in seconds
+#                    (default: 0) (since 8.0)
+#
 # @share: if false, the memory is private to QEMU; if true, it is shared
 #         (default: false)
 #
@@ -612,6 +615,7 @@
             '*prealloc': 'bool',
             '*prealloc-threads': 'uint32',
             '*prealloc-context': 'str',
+            '*prealloc-timeout': 'uint32',
             '*share': 'bool',
             '*reserve': 'bool',
             'size': 'size',

From patchwork Fri Jan 20 13:47:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Daniil Tatianin <d-tatianin@yandex-team.ru>
X-Patchwork-Id: 13109893
Return-Path: <qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.lore.kernel.org (Postfix) with ESMTPS id B6BB3C27C7C
	for <qemu-devel@archiver.kernel.org>; Fri, 20 Jan 2023 13:57:14 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <qemu-devel-bounces@nongnu.org>)
	id 1pIrs5-0007mH-IK; Fri, 20 Jan 2023 08:55:57 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10])
 by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrs2-0007la-Vm
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:55 -0500
Received: from forwardcorp1a.mail.yandex.net ([178.154.239.72])
 by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256)
 (Exim 4.90_1) (envelope-from <d-tatianin@yandex-team.ru>)
 id 1pIrrz-0001n0-RB
 for qemu-devel@nongnu.org; Fri, 20 Jan 2023 08:55:54 -0500
Received: from vla5-b2806cb321eb.qloud-c.yandex.net
 (vla5-b2806cb321eb.qloud-c.yandex.net
 [IPv6:2a02:6b8:c18:3e0d:0:640:b280:6cb3])
 by forwardcorp1a.mail.yandex.net (Yandex) with ESMTP id 074F05FE28;
 Fri, 20 Jan 2023 16:48:09 +0300 (MSK)
Received: from d-tatianin-nix.yandex.net (unknown
 [2a02:6b8:0:419:8f3f:2197:162b:4096])
 by vla5-b2806cb321eb.qloud-c.yandex.net (smtpcorp/Yandex) with ESMTPSA id
 wlngUQ0WWiE1-7YN7AeUl; Fri, 20 Jan 2023 16:48:08 +0300
Precedence: bulk
X-Yandex-Fwd: 1
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex-team.ru;
 s=default;
 t=1674222488; bh=ZtTbi4GXRgOO7uoBYaSNdemABJCezVJcSdF4lxOAUcw=;
 h=Message-Id:Date:In-Reply-To:Cc:Subject:References:To:From;
 b=s8C38UC+VC6JVmCa1Fho2ZH0KMiiBSfqMvqFGmQP47+FvffuBx1AXYj+cs9/0huCp
 zwpi82m6lQNoU8u5yjrXzBoDJciH+nPQQ7QEZoJ32qerzPljWw5Z/2Y9lPYQlbOsg6
 2//e7dX4PN7HnQuskrGg08yvpSbmgIfxUoTkC0xw=
Authentication-Results: vla5-b2806cb321eb.qloud-c.yandex.net;
 dkim=pass header.i=@yandex-team.ru
From: Daniil Tatianin <d-tatianin@yandex-team.ru>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Daniil Tatianin <d-tatianin@yandex-team.ru>, qemu-devel@nongnu.org,
 Stefan Weil <sw@weilnetz.de>, David Hildenbrand <david@redhat.com>,
 Igor Mammedov <imammedo@redhat.com>, yc-core@yandex-team.ru
Subject: [PATCH 4/4] backends/hostmem: add an ability to make prealloc timeout
 fatal
Date: Fri, 20 Jan 2023 16:47:49 +0300
Message-Id: <20230120134749.550639-5-d-tatianin@yandex-team.ru>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
References: <20230120134749.550639-1-d-tatianin@yandex-team.ru>
MIME-Version: 1.0
Received-SPF: pass client-ip=178.154.239.72;
 envelope-from=d-tatianin@yandex-team.ru; helo=forwardcorp1a.mail.yandex.net
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

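Add a 'prealloc-timeout-fatal' property to memory backends. When it is
set and preallocation exceeds 'prealloc-timeout', QEMU reports an error
and exits instead of printing a warning and continuing. This is useful
when all guest pages must be preallocated within a fixed time budget.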

Signed-off-by: Daniil Tatianin <d-tatianin@yandex-team.ru>
---
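Note for reviewers: below is a minimal standalone sketch of the
warn-versus-fatal decision made in the timeout callback, for illustration
only. DemoBackend, DemoPreallocStats and demo_on_prealloc_timeout() are
hypothetical stand-ins, not QEMU APIs, and the sketch returns a status
where the patch itself calls exit(1).

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for QEMU's PreallocStats. */
typedef struct {
    uint64_t seconds_elapsed;
    size_t allocated_pages;
    size_t total_pages;
    size_t page_size;
    int threads;
} DemoPreallocStats;

/* Hypothetical stand-in for the two HostMemoryBackend flags involved. */
typedef struct {
    bool prealloc_timeout_fatal;
    bool prealloc_did_timeout;
} DemoBackend;

/*
 * Returns 1 when the timeout is fatal (the patch calls exit(1) at this
 * point) and 0 when the caller may continue after a warning.
 */
static int demo_on_prealloc_timeout(DemoBackend *b,
                                    const DemoPreallocStats *s)
{
    assert(b != NULL && s != NULL);

    /* Same message for both cases; only the severity differs. */
    fprintf(stderr,
            "%s: preallocation timeout %" PRIu64 "s exceeded, "
            "allocated %zu/%zu (%zu byte) pages (%d threads)\n",
            b->prealloc_timeout_fatal ? "error" : "warning",
            s->seconds_elapsed, s->allocated_pages, s->total_pages,
            s->page_size, s->threads);

    if (b->prealloc_timeout_fatal) {
        return 1;
    }

    /* Non-fatal path: remember that the timeout happened and go on. */
    b->prealloc_did_timeout = true;
    return 0;
}
```

With prealloc_timeout_fatal unset, the sketch records the timeout and
returns 0; with it set, it returns 1 and leaves prealloc_did_timeout
untouched, mirroring the early exit in the patch.
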
 backends/hostmem.c       | 38 ++++++++++++++++++++++++++++++++++----
 include/sysemu/hostmem.h |  1 +
 qapi/qom.json            |  4 ++++
 3 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/backends/hostmem.c b/backends/hostmem.c
index be9af7515e..0808dc6951 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -39,12 +39,21 @@ host_memory_on_prealloc_timeout(void *opaque,
                                 const PreallocStats *stats)
 {
     HostMemoryBackend *backend = opaque;
+    const char *msg = "HostMemory preallocation timeout %"PRIu64"s exceeded, "
+                      "allocated %zu/%zu (%zu byte) pages (%d threads)";
+
+    if (backend->prealloc_timeout_fatal) {
+        error_report(msg, (uint64_t)stats->seconds_elapsed,
+                     stats->allocated_pages, stats->total_pages,
+                     stats->page_size, stats->threads);
+        exit(1);
+
+    }
 
     backend->prealloc_did_timeout = true;
-    warn_report("HostMemory preallocation timeout %"PRIu64"s exceeded, "
-                "allocated %zu/%zu (%zu byte) pages (%d threads)",
-                (uint64_t)stats->seconds_elapsed, stats->allocated_pages,
-                stats->total_pages, stats->page_size, stats->threads);
+    warn_report(msg, (uint64_t)stats->seconds_elapsed,
+                stats->allocated_pages, stats->total_pages,
+                stats->page_size, stats->threads);
 }
 
 char *
@@ -315,6 +324,22 @@ static void host_memory_backend_get_set_prealloc_timeout(Object *obj,
     visit_type_uint32(v, name, &backend->prealloc_timeout, errp);
 }
 
+static bool host_memory_backend_get_prealloc_timeout_fatal(
+        Object *obj, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    return backend->prealloc_timeout_fatal;
+}
+
+static void host_memory_backend_set_prealloc_timeout_fatal(
+        Object *obj, bool value, Error **errp)
+{
+    HostMemoryBackend *backend = MEMORY_BACKEND(obj);
+
+    backend->prealloc_timeout_fatal = value;
+}
+
 static void host_memory_backend_init(Object *obj)
 {
     HostMemoryBackend *backend = MEMORY_BACKEND(obj);
@@ -560,6 +585,11 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
         NULL, NULL);
     object_class_property_set_description(oc, "prealloc-timeout",
         "Maximum memory preallocation timeout in seconds");
+    object_class_property_add_bool(oc, "prealloc-timeout-fatal",
+        host_memory_backend_get_prealloc_timeout_fatal,
+        host_memory_backend_set_prealloc_timeout_fatal);
+    object_class_property_set_description(oc, "prealloc-timeout-fatal",
+        "Consider preallocation timeout a fatal error");
     object_class_property_add(oc, "size", "int",
         host_memory_backend_get_size,
         host_memory_backend_set_size,
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 21910f3b45..b501b5eff2 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -67,6 +67,7 @@ struct HostMemoryBackend {
     bool merge, dump, use_canonical_path;
     bool prealloc, is_mapped, share, reserve;
     bool prealloc_did_timeout;
+    bool prealloc_timeout_fatal;
     uint32_t prealloc_threads;
     uint32_t prealloc_timeout;
     ThreadContext *prealloc_context;
diff --git a/qapi/qom.json b/qapi/qom.json
index 9149c064b8..70644d714b 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -584,6 +584,9 @@
 # @prealloc-timeout: Maximum memory preallocation timeout in seconds
 #                    (default: 0) (since 7.3)
 #
+# @prealloc-timeout-fatal: Consider preallocation timeout a fatal error
+#                          (default: false) (since 7.3)
+#
 # @share: if false, the memory is private to QEMU; if true, it is shared
 #         (default: false)
 #
@@ -616,6 +619,7 @@
             '*prealloc-threads': 'uint32',
             '*prealloc-context': 'str',
             '*prealloc-timeout': 'uint32',
+            '*prealloc-timeout-fatal': 'bool',
             '*share': 'bool',
             '*reserve': 'bool',
             'size': 'size',