From patchwork Tue Nov 15 17:32:56 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Colton Lewis <coltonlewis@google.com>
X-Patchwork-Id: 13044022
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id E86CFC433FE
	for <kvm@archiver.kernel.org>; Tue, 15 Nov 2022 17:33:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S229874AbiKORd5 (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Tue, 15 Nov 2022 12:33:57 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35934 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S229681AbiKORd4 (ORCPT <rfc822;kvm@vger.kernel.org>);
        Tue, 15 Nov 2022 12:33:56 -0500
Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com
 [IPv6:2607:f8b0:4864:20::1149])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E25152AC5
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:54 -0800 (PST)
Received: by mail-yw1-x1149.google.com with SMTP id
 00721157ae682-373582569edso139551127b3.2
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:54 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=qN910hWbp0G+0j1u4FtQw9avjTAO8HEceklYfG84xcU=;
        b=pMxNlPLcK0k/NwfW5wny+v/2L3HNUuuknK6Cou7WrBgtxIG8dNhFPVEXi7A1M9tgtB
         MtxuT2RBujIseHio1Oi/V7tN0/KDyIxSKUXXtsRX0F/8DtfSl6m5yl184ucKgCrOmBd5
         e51CBAOdLXcwl3ofwVIINRmkV3R5c9cEI2fPcd5k6zb7ah6RBRNtW5WQQ/qqRwAm+lXm
         Jo864etKkISYNMUTO2szVS+fwZ6kL8w+CdmGRk+CAGjw3nXBhJkVUkD2IUGIZug7VTtM
         v4NZ33hIowG6ntyTYBuOPS+QzSbXtBvVK7TXlU+DCJOPNiI029VG5/CtI8WkkiiSbWl1
         4scQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=qN910hWbp0G+0j1u4FtQw9avjTAO8HEceklYfG84xcU=;
        b=nVSgHjgBx7vGCR2P1CyW7XPd7BnrN1EY5FpUuitG3YqG2ZCGca06X5aSZwx488EdDl
         jLfi/j25dBGbibtECWiJSjGD2lwqy4uoZLuF2aYNaul++hMzC1FENNhdApO1zvSXXRST
         CDfD6WRNVojECrvC/Ne2SdBeXKd60OU61M0dptn+D9iC+tT9UuYzrTx/juQ84bAhC7XR
         Rk1sGlx0glfCfzoz/w6BTuxcSkofZkdmq+NuZ86sigxAqGGtq57oPCGYxPX5+2UdNKE1
         4guDG+sppNe7H2C0w6yBpAw3qgafyhmOXLVAd0LejRz6bxjDcTI07UyqSu4Rpm02JZMB
         ndZA==
X-Gm-Message-State: ACrzQf1BHXwGWFj+r2+YB+Pw7Hx+Jd3GWIqustKhAg39PYjU4vdu7sL9
        zAnNGHCfcEzT4o5Imh1g7yEnjDxazbIE0V85u3o89GCXgQ6TrwD+QSoOjxbvZcuF6067SwixMJL
        bo533kOxaYnPSh35L8WwXDR7SIPN85OZsXArCu2LNyDgh0vJVQ3GTBgyx3UZHhJME9h/SMnE=
X-Google-Smtp-Source: 
 AMsMyM4SGJ3fRCMs9hI1adFvA/DAzsP1MqlrBwcR31lxCdARpoZTkBSMqMNgrYFQrMDSd58nTKcghV94yXn+MQtQPg==
X-Received: from coltonlewis-kvm.c.googlers.com
 ([fda3:e722:ac3:cc00:2b:ff92:c0a8:14ce])
 (user=coltonlewis job=sendgmr) by 2002:a5b:3ca:0:b0:6cf:dda2:8e64 with SMTP
 id t10-20020a5b03ca000000b006cfdda28e64mr49337908ybp.552.1668533633390; Tue,
 15 Nov 2022 09:33:53 -0800 (PST)
Date: Tue, 15 Nov 2022 17:32:56 +0000
In-Reply-To: <20221115173258.2530923-1-coltonlewis@google.com>
Mime-Version: 1.0
References: <20221115173258.2530923-1-coltonlewis@google.com>
X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog
Message-ID: <20221115173258.2530923-2-coltonlewis@google.com>
Subject: [PATCH 1/3] KVM: selftests: Allocate additional space for latency
 samples
From: Colton Lewis <coltonlewis@google.com>
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, maz@kernel.org, dmatlack@google.com,
        seanjc@google.com, bgardon@google.com, oupton@google.com,
        ricarkol@google.com, Colton Lewis <coltonlewis@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Allocate additional space for latency samples. This has been separated
out to call attention to the additional VM memory allocation. The test
runs out of physical pages without the additional allocation. The
multiplier of 100 on the page count was determined by trial and error.
A more well-reasoned calculation would be preferable.
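
For a sense of scale, the sizing expression can be worked through
outside the test. The sketch below is a standalone illustration, not
part of the patch. It assumes KVM_MAX_VCPUS is 512 and that guest pages
are 4 KiB; both are assumptions and vary by configuration.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLES_PER_VCPU 1000
/* Assumption for illustration: the real KVM_MAX_VCPUS may differ. */
#define ASSUMED_MAX_VCPUS 512

/* Bytes occupied by the flat sample array. */
static uint64_t sample_array_bytes(void)
{
	return (uint64_t)SAMPLES_PER_VCPU * ASSUMED_MAX_VCPUS * sizeof(uint64_t);
}

/* Pages strictly needed to hold the array, rounded up. */
static uint64_t min_sample_pages(uint64_t page_size)
{
	assert(page_size > 0);
	return (sample_array_bytes() + page_size - 1) / page_size;
}

/* Pages requested by the patch's expression (multiplier of 100). */
static uint64_t patch_sample_pages(uint64_t page_size)
{
	assert(page_size > 0);
	return 100 * sample_array_bytes() / page_size;
}
```

Under those assumptions the array itself spans 1000 pages while the
expression requests 100000, so the multiplier leaves a large margin
beyond the array. What consumes the difference is the open question.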

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 tools/testing/selftests/kvm/lib/perf_test_util.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index 137be359b09e..a48904b64e19 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -38,6 +38,12 @@ static bool all_vcpu_threads_running;
 
 static struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
 
+#define SAMPLES_PER_VCPU 1000
+#define SAMPLE_CAPACITY (SAMPLES_PER_VCPU * KVM_MAX_VCPUS)
+
+/* Store all samples in a flat array so they can be easily sorted later. */
+uint64_t latency_samples[SAMPLE_CAPACITY];
+
 /*
  * Continuously write to the first 8 bytes of each page in the
  * specified region.
@@ -122,7 +128,7 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 {
 	struct perf_test_args *pta = &perf_test_args;
 	struct kvm_vm *vm;
-	uint64_t guest_num_pages, slot0_pages = 0;
+	uint64_t guest_num_pages, sample_pages, slot0_pages = 0;
 	uint64_t backing_src_pagesz = get_backing_src_pagesz(backing_src);
 	uint64_t region_end_gfn;
 	int i;
@@ -161,7 +167,9 @@ struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 	 * The memory is also added to memslot 0, but that's a benign side
 	 * effect as KVM allows aliasing HVAs in meslots.
 	 */
-	vm = __vm_create_with_vcpus(mode, nr_vcpus, slot0_pages + guest_num_pages,
+	sample_pages = 100 * sizeof(latency_samples) / pta->guest_page_size;
+	vm = __vm_create_with_vcpus(mode, nr_vcpus,
+				    slot0_pages + guest_num_pages + sample_pages,
 				    perf_test_guest_code, vcpus);
 
 	pta->vm = vm;

From patchwork Tue Nov 15 17:32:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Colton Lewis <coltonlewis@google.com>
X-Patchwork-Id: 13044024
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id CB0ACC4332F
	for <kvm@archiver.kernel.org>; Tue, 15 Nov 2022 17:34:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S229681AbiKOReA (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Tue, 15 Nov 2022 12:34:00 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35956 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S229843AbiKORd5 (ORCPT <rfc822;kvm@vger.kernel.org>);
        Tue, 15 Nov 2022 12:33:57 -0500
Received: from mail-oa1-x4a.google.com (mail-oa1-x4a.google.com
 [IPv6:2001:4860:4864:20::4a])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 993452AC5
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:56 -0800 (PST)
Received: by mail-oa1-x4a.google.com with SMTP id
 586e51a60fabf-13cc24bcecbso7003898fac.14
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=VgjAyCYxfYSznAKHI0Sghitn9sjteqOiyCbyWAmSn+g=;
        b=UsLSGwQxfkxxl8w5uCBrXq2sq5cn6Cn5PavDDq/3xwTIOU5dPbTlfzNqkcZjrQYZtI
         EeMbk/Cu4GZk8tditbs/RuOf2BX1sSDN/skRB44oZtMH1Xerz5QPFy8kJKxG/tGZVm5k
         jarUs82hpIRyj3WzAdbCbZapV5uStHDSj/8LuzN+x5N6wGpjEQsemH64qjwWZI9kYj8z
         JUlvOUl2pBqpZRRxc43nCQ1dmeT5EtmbRJAJ8IwtWKptKUDSHP2vGOahcicb31StE04N
         kLzEhl12T61EbG4FViRiixbtbQrXKX4wPOcB9J1qytv2guqV32xwHEDafYpXD1Vt2rXd
         dflg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=VgjAyCYxfYSznAKHI0Sghitn9sjteqOiyCbyWAmSn+g=;
        b=8GIYE1Ao0vjCrPqO5Z459oawe3msfcFxPtMCS+hsfCjCBkKNUZL7VANAaVkBeu/ypf
         y7HHUDh+T81GP4ES+pL/QTv0YpG7tBgNZwYAqv2K0UMV21nDSMBdUzVf0S/67spikym9
         n0CkOg+FdyE3p3afn7Mdld0M0FlZhkAo5juw8/XgeRLlGfHZaHWIAzI1y5yR8ZgedAvZ
         A9x1chc9rCEI0zFn5LjzIwQnIjcdefmeWsuPelLgzxFw3VmG1P8dLiABhun+yOUIwfs1
         CfGLtLbN8QRlwXNw7eq8uZDZOhI2FwSyCnUs913mH1kJ8jiH3B40JCYRxu6P07kBHsSd
         9PQA==
X-Gm-Message-State: ANoB5pnin/Eebk3Q4tXC0+v29RQQ9YchGMM8+IhpWpINobHyW8zs0vK+
        TH+3h2uvYtreSm5ma82cSoos1KscoO2gEkjez6Q6zYXITbqSIzRsrvPryLMfq7lFfmvfi13o6CG
        4pRaqtP2jOZUf+u5WL3gDDj1EziDonW8K9FHdNJJp0kAbIIqt9AN6LqmgjDvkOVQGVS/GBtA=
X-Google-Smtp-Source: 
 AA0mqf7qMBcywdk6VjSUeeeGCovbU3tisROib/ECCHQoIFVpdXv+L9Dv4jLNcYXdeLfthnRQqXzpPfbKyWiWKj4YhQ==
X-Received: from coltonlewis-kvm.c.googlers.com
 ([fda3:e722:ac3:cc00:2b:ff92:c0a8:14ce])
 (user=coltonlewis job=sendgmr) by 2002:a05:6870:f209:b0:13b:a70a:9302 with
 SMTP id t9-20020a056870f20900b0013ba70a9302mr1759461oao.221.1668533634982;
 Tue, 15 Nov 2022 09:33:54 -0800 (PST)
Date: Tue, 15 Nov 2022 17:32:57 +0000
In-Reply-To: <20221115173258.2530923-1-coltonlewis@google.com>
Mime-Version: 1.0
References: <20221115173258.2530923-1-coltonlewis@google.com>
X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog
Message-ID: <20221115173258.2530923-3-coltonlewis@google.com>
Subject: [PATCH 2/3] KVM: selftests: Collect memory access latency samples
From: Colton Lewis <coltonlewis@google.com>
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, maz@kernel.org, dmatlack@google.com,
        seanjc@google.com, bgardon@google.com, oupton@google.com,
        ricarkol@google.com, Colton Lewis <coltonlewis@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Collect memory access latency measured in clock cycles.

This introduces a dependency on the timers for ARM and x86. No other
architectures are implemented and their samples will all be 0.

Because keeping all samples is impractical due to the space required
in some cases (pooled memory w/ 64 vcpus would be 64 GB/vcpu * 64
vcpus * 250,000 samples/GB * 8 bytes/sample ~ 8 GB extra memory just
for samples), reservoir sampling is used to keep only a small number
of samples per vcpu (1000 samples in this patch).
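
The arithmetic can be checked with a small standalone sketch. The
function names are hypothetical and the 250,000 samples/GB figure is
taken from the estimate above, not measured.

```c
#include <assert.h>
#include <stdint.h>

/* The estimate assumes 8 bytes per stored sample. */
static_assert(sizeof(uint64_t) == 8, "samples are expected to be 8 bytes");

/* Memory needed to keep one sample per access, at roughly 250,000
 * accesses per GB of guest memory (figure from the commit message). */
static uint64_t keep_all_bytes(uint64_t gb_per_vcpu, uint64_t nr_vcpus)
{
	const uint64_t samples_per_gb = 250000;

	return gb_per_vcpu * nr_vcpus * samples_per_gb * sizeof(uint64_t);
}

/* Memory needed when each vCPU keeps a fixed-size set of samples. */
static uint64_t fixed_bytes(uint64_t nr_vcpus, uint64_t samples_per_vcpu)
{
	return nr_vcpus * samples_per_vcpu * sizeof(uint64_t);
}
```

For 64 vcpus with 64 GB each, keeping everything costs 8,192,000,000
bytes, while 1000 samples per vcpu costs 512,000 bytes.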

Reservoir sampling means that, despite keeping only a small number of
samples, each sample has an equal chance of ending up in the
reservoir. Simple proofs of this can be found online. This makes the
reservoir a good representation of the distribution of samples and
enables calculation of reasonably accurate percentiles.
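
A minimal standalone sketch of the sampling step follows. It mirrors
the structure used in this patch but is not the patch code:
sketch_random_u32() is a hypothetical stand-in for guest_random_u32(),
and the modulo reduction carries a small bias whenever i + 1 does not
divide 2^32.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for guest_random_u32(): xorshift32.
 * The state must be nonzero. */
static uint32_t sketch_random_u32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Offer the i-th value (0-based) of a stream to a reservoir of size k.
 * The first k values fill the reservoir. After that, value i replaces
 * a randomly chosen slot with probability k / (i + 1).
 */
static void reservoir_offer(uint64_t *reservoir, size_t k, uint64_t i,
			    uint64_t value, uint32_t *rng_state)
{
	uint64_t slot;

	assert(k > 0);
	if (i < k) {
		reservoir[i] = value;
		return;
	}
	slot = sketch_random_u32(rng_state) % (i + 1);
	if (slot < k)
		reservoir[slot] = value;
}
```

The caller invokes reservoir_offer() once per measurement with an
increasing i, so memory use stays at k entries no matter how long the
stream runs.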

All samples are stored in a statically allocated flat array for ease
of combining them later. Samples are stored at an offset in this array
calculated from the vcpu index (so vcpu 5 sample 10 would be stored at
address latency_samples + 5 * SAMPLES_PER_VCPU + 10).
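
The layout amounts to a row-major index with SAMPLES_PER_VCPU slots per
vcpu, matching how the diff computes latency_samples_offset. A
standalone sketch, with sample_index() as a hypothetical helper name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLES_PER_VCPU 1000

/* Index of sample `sample_idx` for vCPU `vcpu_idx` in the flat array. */
static size_t sample_index(size_t vcpu_idx, size_t sample_idx)
{
	assert(sample_idx < SAMPLES_PER_VCPU);
	return vcpu_idx * SAMPLES_PER_VCPU + sample_idx;
}
```

With this rule, vcpu 5 sample 10 lands at index 5010, and each vcpu
writes only to its own 1000-entry slice.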

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/lib/perf_test_util.c        | 34 +++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index a48904b64e19..0311da76bae0 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -4,6 +4,9 @@
  */
 #include <inttypes.h>
 
+#if defined(__aarch64__)
+#include "aarch64/arch_timer.h"
+#endif
 #include "kvm_util.h"
 #include "perf_test_util.h"
 #include "processor.h"
@@ -44,6 +47,18 @@ static struct kvm_vcpu *vcpus[KVM_MAX_VCPUS];
 /* Store all samples in a flat array so they can be easily sorted later. */
 uint64_t latency_samples[SAMPLE_CAPACITY];
 
+static uint64_t perf_test_timer_read(void)
+{
+#if defined(__aarch64__)
+	return timer_get_cntct(VIRTUAL);
+#elif defined(__x86_64__)
+	return rdtsc();
+#else
+#warning "perf_test_timer_read() is not implemented for this architecture, will return 0"
+	return 0;
+#endif
+}
+
 /*
  * Continuously write to the first 8 bytes of each page in the
  * specified region.
@@ -59,6 +74,10 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 	int i;
 	struct guest_random_state rand_state =
 		new_guest_random_state(pta->random_seed + vcpu_idx);
+	uint64_t *latency_samples_offset = latency_samples + SAMPLES_PER_VCPU * vcpu_idx;
+	uint64_t count_before;
+	uint64_t count_after;
+	uint32_t maybe_sample;
 
 	gva = vcpu_args->gva;
 	pages = vcpu_args->pages;
@@ -75,10 +94,21 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 
 			addr = gva + (page * pta->guest_page_size);
 
-			if (guest_random_u32(&rand_state) % 100 < pta->write_percent)
+			if (guest_random_u32(&rand_state) % 100 < pta->write_percent) {
+				count_before = perf_test_timer_read();
 				*(uint64_t *)addr = 0x0123456789ABCDEF;
-			else
+				count_after = perf_test_timer_read();
+			} else {
+				count_before = perf_test_timer_read();
 				READ_ONCE(*(uint64_t *)addr);
+				count_after = perf_test_timer_read();
+			}
+
+			maybe_sample = guest_random_u32(&rand_state) % (i + 1);
+			if (i < SAMPLES_PER_VCPU)
+				latency_samples_offset[i] = count_after - count_before;
+			else if (maybe_sample < SAMPLES_PER_VCPU)
+				latency_samples_offset[maybe_sample] = count_after - count_before;
 		}
 
 		GUEST_SYNC(1);

From patchwork Tue Nov 15 17:32:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Colton Lewis <coltonlewis@google.com>
X-Patchwork-Id: 13044023
Return-Path: <kvm-owner@kernel.org>
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9B84DC4332F
	for <kvm@archiver.kernel.org>; Tue, 15 Nov 2022 17:33:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S229897AbiKORd6 (ORCPT <rfc822;kvm@archiver.kernel.org>);
        Tue, 15 Nov 2022 12:33:58 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35958 "EHLO
        lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S229681AbiKORd5 (ORCPT <rfc822;kvm@vger.kernel.org>);
        Tue, 15 Nov 2022 12:33:57 -0500
Received: from mail-il1-x14a.google.com (mail-il1-x14a.google.com
 [IPv6:2607:f8b0:4864:20::14a])
        by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9CCE631D
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:56 -0800 (PST)
Received: by mail-il1-x14a.google.com with SMTP id
 j7-20020a056e02154700b003025b3c0ea3so4750974ilu.10
        for <kvm@vger.kernel.org>; Tue, 15 Nov 2022 09:33:56 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:from:to:cc:subject:date:message-id:reply-to;
        bh=smrecohOmbo6ykZMskRiuETFcF02IbrRITp2mzypa4E=;
        b=U4pO1VbXzfdLK/PwQWM9KfGv0ExhhSaEIyHK29/z+fgXAoQKanQj/Z1+2topoa7p+j
         UYtpW0yMA4ObmIhgK72HMM1H/Cpwhakw7e96ufRiMKrCRxxHqYhNp0Nk/yeRAAbnUhF6
         Rf0S3oxn+RaQZ8GVrCBiNYltLUNsFvbWpDlo42h1C4l7f9YFfNdDIpEz5NY/dl3IYqwv
         G2DboPrlTRg21nAwczSuE4O886CT+fytbxVeinDJSZ0MPtTQeiysCzubi6IerZLpAZMl
         CbEnuPfYid1Ijfxvgzxv7FF18FReK0do+GS7QvIxe9wZ9AI/sN1u6zp6w72g6VbPuJLC
         4EkA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20210112;
        h=cc:to:from:subject:message-id:references:mime-version:in-reply-to
         :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
        bh=smrecohOmbo6ykZMskRiuETFcF02IbrRITp2mzypa4E=;
        b=2hDAEKmDZaZmT9Kk4x7Bht6RMcofoVAKMuw0b5GpFn1hqgfvxYT0YSWtrRcwSIZdEY
         OeztGPUNMPahgpLG4Za/6juGDuJCmxws5WUpL10EuxCZMQ3AXPwgLrxCqr9uEaJRguJ1
         5v+gUKy4zx0i+YGgl6g258UBJK9tMYkVmMueiBj5JuVMDtrKaw2CYsKa0WkDkXCyjJPR
         PMRg7QnJbG5LulzU+AH3juxBqJ3iX6C06/ZebW66P4jeUtO0v23K6/ZUTHwlymoe1bW+
         ER4YCKilToGLvu2SwXpcv7CTCZ30lX1qWFRL2FRX++eQ8aVOP1FAY4Z+AnYQll8JuoX7
         0dqw==
X-Gm-Message-State: ANoB5pmbaNQnr999BLYHW38qMLgBz6AgWt9RFcJsN2IHLlwPNs3GVTMM
        7JHGhoKKaccGYlt9YBkuIVNi3yeLfeYi/dG9Sn/pF0+p9ToRV0L9JY+cWyVXaoPIGi9yfvAAUyu
        uxJy2Ux6Z01U0CjXsLdYAU3tPwcaXuTYmuuvCix44OE/+ZZspXkwBLKSMbZIwP2+dm/YT3fk=
X-Google-Smtp-Source: 
 AA0mqf6mJuXACvtGH9CZFxpZ1YzE6molLSzGdHognbVEtsOpnMPGngp5UNdj0wNimyGt8IgDjEVfGSNMOu0wpcOGkg==
X-Received: from coltonlewis-kvm.c.googlers.com
 ([fda3:e722:ac3:cc00:2b:ff92:c0a8:14ce])
 (user=coltonlewis job=sendgmr) by 2002:a02:b691:0:b0:375:577d:688b with SMTP
 id i17-20020a02b691000000b00375577d688bmr8666992jam.255.1668533636070; Tue,
 15 Nov 2022 09:33:56 -0800 (PST)
Date: Tue, 15 Nov 2022 17:32:58 +0000
In-Reply-To: <20221115173258.2530923-1-coltonlewis@google.com>
Mime-Version: 1.0
References: <20221115173258.2530923-1-coltonlewis@google.com>
X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog
Message-ID: <20221115173258.2530923-4-coltonlewis@google.com>
Subject: [PATCH 3/3] KVM: selftests: Print summary stats of memory latency
 distribution
From: Colton Lewis <coltonlewis@google.com>
To: kvm@vger.kernel.org
Cc: pbonzini@redhat.com, maz@kernel.org, dmatlack@google.com,
        seanjc@google.com, bgardon@google.com, oupton@google.com,
        ricarkol@google.com, Colton Lewis <coltonlewis@google.com>
Precedence: bulk
List-ID: <kvm.vger.kernel.org>
X-Mailing-List: kvm@vger.kernel.org

Print summary stats of the memory latency distribution in
nanoseconds. For every iteration, this prints the minimum, the
maximum, and the 50th, 90th, and 99th percentiles.

Stats are calculated by sorting the samples taken from all vcpus and
picking from the index corresponding with each percentile.
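
The selection rule can be sketched in isolation as below. The helper
names are hypothetical; the index rule n * pct / 100 gives the same
indices as the n / 2, n * 9 / 10 and n * 99 / 100 expressions in the
diff, with 100 clamped to the last element for the maximum. The
comparator here compares full 64-bit values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for uint64_t that avoids subtraction overflow. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Sort samples in place and return the value at the given percentile
 * (0..100), picking index n * pct / 100 clamped to the last element.
 */
static uint64_t percentile(uint64_t *samples, size_t n, unsigned int pct)
{
	size_t idx;

	assert(n > 0 && pct <= 100);
	qsort(samples, n, sizeof(samples[0]), cmp_u64);
	idx = n * pct / 100;
	if (idx >= n)
		idx = n - 1;
	return samples[idx];
}
```

Sorting on every call keeps the sketch short; a real caller would sort
once and then read all percentiles from the sorted array.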

The conversion to nanoseconds needs the frequency of the Intel
timestamp counter, which is estimated by reading the counter before
and after sleeping for 1 second. This is not a pretty trick, but it
also exists in vmx_nested_tsc_scaling_test.c.
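
The conversion itself is simple once a frequency is known. The
standalone sketch below separates it from the measurement so it can be
checked without reading a hardware counter; the function names are
hypothetical and the frequency is passed in rather than estimated.

```c
#include <assert.h>
#include <stdint.h>

/* Round a measured frequency down to a whole number of MHz. */
static uint64_t round_down_to_mhz(uint64_t measured_hz)
{
	return measured_hz / 1000000 * 1000000;
}

/* Convert a cycle count to nanoseconds for a counter running at freq_hz. */
static double cycles_to_ns(double cycles, uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return cycles * (1e9 / (double)freq_hz);
}
```

For example, 2000 cycles on a 2 GHz counter come out as 1000 ns. The
result is only as accurate as the frequency estimate fed into it.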

Signed-off-by: Colton Lewis <coltonlewis@google.com>
---
 .../selftests/kvm/dirty_log_perf_test.c       |  2 +
 .../selftests/kvm/include/perf_test_util.h    |  2 +
 .../selftests/kvm/lib/perf_test_util.c        | 62 +++++++++++++++++++
 3 files changed, 66 insertions(+)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 202f38a72851..2bc066bba460 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -274,6 +274,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	ts_diff = timespec_elapsed(start);
 	pr_info("Populate memory time: %ld.%.9lds\n",
 		ts_diff.tv_sec, ts_diff.tv_nsec);
+	perf_test_print_percentiles(vm, nr_vcpus);
 
 	/* Enable dirty logging */
 	clock_gettime(CLOCK_MONOTONIC, &start);
@@ -304,6 +305,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
 		pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
 			iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
+		perf_test_print_percentiles(vm, nr_vcpus);
 
 		clock_gettime(CLOCK_MONOTONIC, &start);
 		get_dirty_log(vm, bitmaps, p->slots);
diff --git a/tools/testing/selftests/kvm/include/perf_test_util.h b/tools/testing/selftests/kvm/include/perf_test_util.h
index 3d0b75ea866a..ca378c262f12 100644
--- a/tools/testing/selftests/kvm/include/perf_test_util.h
+++ b/tools/testing/selftests/kvm/include/perf_test_util.h
@@ -47,6 +47,8 @@ struct perf_test_args {
 
 extern struct perf_test_args perf_test_args;
 
+void perf_test_print_percentiles(struct kvm_vm *vm, int nr_vcpus);
+
 struct kvm_vm *perf_test_create_vm(enum vm_guest_mode mode, int nr_vcpus,
 				   uint64_t vcpu_memory_bytes, int slots,
 				   enum vm_mem_backing_src_type backing_src,
diff --git a/tools/testing/selftests/kvm/lib/perf_test_util.c b/tools/testing/selftests/kvm/lib/perf_test_util.c
index 0311da76bae0..927d22421f7c 100644
--- a/tools/testing/selftests/kvm/lib/perf_test_util.c
+++ b/tools/testing/selftests/kvm/lib/perf_test_util.c
@@ -115,6 +115,68 @@ void perf_test_guest_code(uint32_t vcpu_idx)
 	}
 }
 
+#if defined(__x86_64__)
+/* This could be determined with the right sequence of cpuid
+ * instructions, but that's oddly complicated.
+ */
+static uint64_t perf_test_intel_timer_frequency(void)
+{
+	uint64_t count_before;
+	uint64_t count_after;
+	uint64_t measured_freq;
+	uint64_t adjusted_freq;
+
+	count_before = perf_test_timer_read();
+	sleep(1);
+	count_after = perf_test_timer_read();
+
+	/* Using 1 second implies our units are in Hz already. */
+	measured_freq = count_after - count_before;
+	/* Round down to a whole MHz. Clock frequencies are round numbers. */
+	adjusted_freq = measured_freq / 1000000 * 1000000;
+
+	return adjusted_freq;
+}
+#endif
+
+static double perf_test_cycles_to_ns(double cycles)
+{
+#if defined(__aarch64__)
+	return cycles * (1e9 / timer_get_cntfrq());
+#elif defined(__x86_64__)
+	static uint64_t timer_frequency;
+
+	if (timer_frequency == 0)
+		timer_frequency = perf_test_intel_timer_frequency();
+
+	return cycles * (1e9 / timer_frequency);
+#else
+#warning "perf_test_cycles_to_ns() is not implemented for this architecture, will return 0"
+	return 0.0;
+#endif
+}
+
+/* compare function for qsort */
+static int perf_test_qcmp(const void *a, const void *b)
+{
+	return (*(const uint64_t *)a > *(const uint64_t *)b) - (*(const uint64_t *)a < *(const uint64_t *)b);
+}
+
+void perf_test_print_percentiles(struct kvm_vm *vm, int nr_vcpus)
+{
+	uint64_t n_samples = nr_vcpus * SAMPLES_PER_VCPU;
+
+	sync_global_from_guest(vm, latency_samples);
+	qsort(latency_samples, n_samples, sizeof(uint64_t), &perf_test_qcmp);
+
+	pr_info("Latency distribution (ns) = min:%6.0lf, 50th:%6.0lf, 90th:%6.0lf, 99th:%6.0lf, max:%6.0lf\n",
+		perf_test_cycles_to_ns((double)latency_samples[0]),
+		perf_test_cycles_to_ns((double)latency_samples[n_samples / 2]),
+		perf_test_cycles_to_ns((double)latency_samples[n_samples * 9 / 10]),
+		perf_test_cycles_to_ns((double)latency_samples[n_samples * 99 / 100]),
+		perf_test_cycles_to_ns((double)latency_samples[n_samples - 1]));
+}
+
 void perf_test_setup_vcpus(struct kvm_vm *vm, int nr_vcpus,
 			   struct kvm_vcpu *vcpus[],
 			   uint64_t vcpu_memory_bytes,