From patchwork Fri Jul 1 17:04:46 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 9210285
From: Richard Henderson
To: qemu-devel@nongnu.org
Cc: pbonzini@redhat.com, serge.fdrv@gmail.com, cota@braap.org,
 alex.bennee@linaro.org, peter.maydell@linaro.org
Date: Fri, 1 Jul 2016 10:04:46 -0700
Message-Id: <1467392693-22715-21-git-send-email-rth@twiddle.net>
In-Reply-To: <1467392693-22715-1-git-send-email-rth@twiddle.net>
References: <1467392693-22715-1-git-send-email-rth@twiddle.net>
Subject: [Qemu-devel] [PATCH v2 20/27] tests: add atomic_add-bench

From: "Emilio G. Cota"

With this microbenchmark we can measure the overhead of emulating atomic
instructions with a configurable degree of contention.

The benchmark spawns $n threads, each performing $o atomic ops (additions)
in a loop. Each atomic operation is performed on a different cache line
(assuming lines are 64b long) that is randomly selected from a range [0, $r).

[ Note: each $foo corresponds to a -foo flag ]

Signed-off-by: Emilio G. Cota
Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
---
 tests/.gitignore         |   1 +
 tests/Makefile.include   |   4 +-
 tests/atomic_add-bench.c | 180 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 184 insertions(+), 1 deletion(-)
 create mode 100644 tests/atomic_add-bench.c

diff --git a/tests/.gitignore b/tests/.gitignore
index 840ea39..52488a0 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -1,3 +1,4 @@
+atomic_add-bench
 check-qdict
 check-qfloat
 check-qint
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 6c09962..7421778 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -408,7 +408,8 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \
 	tests/test-opts-visitor.o tests/test-qmp-event.o \
 	tests/rcutorture.o tests/test-rcu-list.o \
 	tests/test-qdist.o \
-	tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o
+	tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \
+	tests/atomic_add-bench.o
 
 $(test-obj-y): QEMU_INCLUDES += -Itests
 QEMU_CFLAGS += -I$(SRC_PATH)/tests
@@ -451,6 +452,7 @@ tests/test-qdist$(EXESUF): tests/test-qdist.o $(test-util-obj-y)
 tests/test-qht$(EXESUF): tests/test-qht.o $(test-util-obj-y)
 tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests/qht-bench$(EXESUF) $(test-util-obj-y)
 tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
+tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
 
 tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
 	hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
diff --git a/tests/atomic_add-bench.c b/tests/atomic_add-bench.c
new file mode 100644
index 0000000..5bbecf6
--- /dev/null
+++ b/tests/atomic_add-bench.c
@@ -0,0 +1,180 @@
+#include "qemu/osdep.h"
+#include "qemu/thread.h"
+#include "qemu/host-utils.h"
+#include "qemu/processor.h"
+
+struct thread_info {
+    uint64_t r;
+} QEMU_ALIGNED(64);
+
+struct count {
+    unsigned long val;
+} QEMU_ALIGNED(64);
+
+static QemuThread *threads;
+static struct thread_info *th_info;
+static unsigned int n_threads = 1;
+static unsigned int n_ready_threads;
+static struct count *counts;
+static unsigned long n_ops = 10000;
+static double duration;
+static unsigned int range = 1;
+static bool test_start;
+
+static const char commands_string[] =
+    " -n = number of threads\n"
+    " -o = number of ops per thread\n"
+    " -r = range (will be rounded up to pow2)";
+
+static void usage_complete(char *argv[])
+{
+    fprintf(stderr, "Usage: %s [options]\n", argv[0]);
+    fprintf(stderr, "options:\n%s\n", commands_string);
+}
+
+/*
+ * From: https://en.wikipedia.org/wiki/Xorshift
+ * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only
+ * guaranteed to be >= INT_MAX).
+ */
+static uint64_t xorshift64star(uint64_t x)
+{
+    x ^= x >> 12; /* a */
+    x ^= x << 25; /* b */
+    x ^= x >> 27; /* c */
+    return x * UINT64_C(2685821657736338717);
+}
+
+static void *thread_func(void *arg)
+{
+    struct thread_info *info = arg;
+    unsigned long i;
+
+    atomic_inc(&n_ready_threads);
+    while (!atomic_mb_read(&test_start)) {
+        cpu_relax();
+    }
+
+    for (i = 0; i < n_ops; i++) {
+        unsigned int index;
+
+        info->r = xorshift64star(info->r);
+        index = info->r & (range - 1);
+        atomic_inc(&counts[index].val);
+    }
+    return NULL;
+}
+
+static inline
+uint64_t ts_subtract(const struct timespec *a, const struct timespec *b)
+{
+    uint64_t ns;
+
+    ns = (b->tv_sec - a->tv_sec) * 1000000000ULL;
+    ns += (b->tv_nsec - a->tv_nsec);
+    return ns;
+}
+
+static void run_test(void)
+{
+    unsigned int i;
+    struct timespec ts_start, ts_end;
+
+    while (atomic_read(&n_ready_threads) != n_threads) {
+        cpu_relax();
+    }
+    atomic_mb_set(&test_start, true);
+
+    clock_gettime(CLOCK_MONOTONIC, &ts_start);
+    for (i = 0; i < n_threads; i++) {
+        qemu_thread_join(&threads[i]);
+    }
+    clock_gettime(CLOCK_MONOTONIC, &ts_end);
+    duration = ts_subtract(&ts_start, &ts_end) / 1e9;
+}
+
+static void create_threads(void)
+{
+    unsigned int i;
+
+    threads = g_new(QemuThread, n_threads);
+    th_info = g_new(struct thread_info, n_threads);
+    counts = qemu_memalign(64, sizeof(*counts) * range);
+
+    for (i = 0; i < n_threads; i++) {
+        struct thread_info *info = &th_info[i];
+
+        info->r = (i + 1) ^ time(NULL);
+        qemu_thread_create(&threads[i], NULL, thread_func, info,
+                           QEMU_THREAD_JOINABLE);
+    }
+}
+
+static void pr_params(void)
+{
+    printf("Parameters:\n");
+    printf(" # of threads: %u\n", n_threads);
+    printf(" n_ops: %lu\n", n_ops);
+    printf(" ops' range: %u\n", range);
+}
+
+static void pr_stats(void)
+{
+    unsigned long long val = 0;
+    unsigned int i;
+    double tx;
+
+    for (i = 0; i < range; i++) {
+        val += counts[i].val;
+    }
+    assert(val == n_threads * n_ops);
+    tx = val / duration / 1e6;
+
+    printf("Results:\n");
+    printf("Duration: %.2f s\n", duration);
+    printf(" Throughput: %.2f Mops/s\n", tx);
+    printf(" Throughput/thread: %.2f Mops/s/thread\n", tx / n_threads);
+}
+
+static void parse_args(int argc, char *argv[])
+{
+    unsigned long long n_ops_ull;
+    int c;
+
+    for (;;) {
+        c = getopt(argc, argv, "hn:o:r:");
+        if (c < 0) {
+            break;
+        }
+        switch (c) {
+        case 'h':
+            usage_complete(argv);
+            exit(0);
+        case 'n':
+            n_threads = atoi(optarg);
+            break;
+        case 'o':
+            n_ops_ull = atoll(optarg);
+            if (n_ops_ull > ULONG_MAX) {
+                fprintf(stderr,
+                        "fatal: -o cannot be greater than %lu\n", ULONG_MAX);
+                exit(1);
+            }
+            n_ops = n_ops_ull;
+            break;
+        case 'r':
+            range = pow2ceil(atoi(optarg));
+            break;
+        }
+    }
+}
+
+int main(int argc, char *argv[])
+{
+    parse_args(argc, argv);
+    pr_params();
+    create_threads();
+    run_test();
+    pr_stats();
+    return 0;
+}
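
For readers who want to try the idea outside the QEMU tree, the core of the benchmark can be sketched with plain C11 atomics and pthreads. This is only an illustration under stated assumptions, not the code from the patch: round_up_pow2(), struct counter, struct worker, worker_fn() and run_sketch() are hypothetical names, atomic_fetch_add() stands in for QEMU's atomic_inc(), and the start barrier and timing are left out. It keeps the two ideas that matter: one counter per 64-byte slot, and a slot index taken from xorshift64star() masked with (range - 1).

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Same generator as in the patch; the state must start non-zero. */
static uint64_t xorshift64star(uint64_t x)
{
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x * UINT64_C(2685821657736338717);
}

/* Hypothetical stand-in for pow2ceil(): smallest power of two >= v
 * (capped at 2^31 so the loop always terminates). */
static unsigned int round_up_pow2(unsigned int v)
{
    unsigned int p = 1;

    while (p < v && p < (1u << 31)) {
        p <<= 1;
    }
    return p;
}

/* One counter per 64-byte slot, mirroring struct count in the patch. */
struct counter {
    _Alignas(64) atomic_ulong val;
};

/* Hypothetical per-thread parameters. */
struct worker {
    struct counter *counts;
    unsigned int range;   /* power of two, so (range - 1) is a bit mask */
    unsigned long n_ops;
    uint64_t seed;        /* non-zero initial PRNG state */
};

static void *worker_fn(void *arg)
{
    struct worker *w = arg;
    uint64_t r = w->seed; /* local copy keeps PRNG state off shared lines */

    for (unsigned long i = 0; i < w->n_ops; i++) {
        r = xorshift64star(r);
        atomic_fetch_add(&w->counts[r & (w->range - 1)].val, 1);
    }
    return NULL;
}

/* Runs n_threads workers and returns the sum of all counters. */
unsigned long run_sketch(unsigned int n_threads, unsigned long n_ops,
                         unsigned int range_req)
{
    unsigned int range = round_up_pow2(range_req);
    unsigned long total = 0;

    if (n_threads == 0) {
        return 0;
    }
    assert((range & (range - 1)) == 0);

    struct counter *counts = aligned_alloc(64, sizeof(*counts) * range);
    pthread_t *tids = malloc(sizeof(*tids) * n_threads);
    struct worker *ws = malloc(sizeof(*ws) * n_threads);
    if (!counts || !tids || !ws) {
        abort();
    }
    for (unsigned int i = 0; i < range; i++) {
        atomic_init(&counts[i].val, 0);
    }
    for (unsigned int i = 0; i < n_threads; i++) {
        ws[i] = (struct worker){ counts, range, n_ops, i + 1 };
        if (pthread_create(&tids[i], NULL, worker_fn, &ws[i]) != 0) {
            abort();
        }
    }
    for (unsigned int i = 0; i < n_threads; i++) {
        pthread_join(tids[i], NULL);
    }
    for (unsigned int i = 0; i < range; i++) {
        total += atomic_load(&counts[i].val);
    }
    free(ws);
    free(tids);
    free(counts);
    return total;
}
```

Calling run_sketch(4, 10000, 1) sends all four threads to a single counter, which is the highest-contention case; a larger third argument spreads the increments over more slots. As in pr_stats(), the returned sum should equal n_threads * n_ops if no increment was lost. Building it likely needs a C11 compiler and, on older toolchains, the -pthread flag.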