From patchwork Fri Jun  3 11:41:53 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota <cota@braap.org>
X-Patchwork-Id: 9152947
Return-Path: 
 <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	625E660751 for <patchwork-qemu-devel@patchwork.kernel.org>;
	Fri,  3 Jun 2016 11:42:30 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4F8FD25E13
	for <patchwork-qemu-devel@patchwork.kernel.org>;
	Fri,  3 Jun 2016 11:42:30 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id 40EF828309; Fri,  3 Jun 2016 11:42:30 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
	(using TLSv1 with cipher AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 8B10C25E13
	for <patchwork-qemu-devel@patchwork.kernel.org>;
	Fri,  3 Jun 2016 11:42:29 +0000 (UTC)
Received: from localhost ([::1]:54230 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from
	<qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>)
	id 1b8nUh-0006PK-Uz for patchwork-qemu-devel@patchwork.kernel.org;
	Fri, 03 Jun 2016 07:42:27 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37188)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b8nUQ-0006PB-W1
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 07:42:12 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b8nUK-0002C9-SV
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 07:42:09 -0400
Received: from out2-smtp.messagingengine.com ([66.111.4.26]:55295)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <cota@braap.org>) id 1b8nUI-0002BP-Ia
	for qemu-devel@nongnu.org; Fri, 03 Jun 2016 07:42:04 -0400
Received: from compute1.internal (compute1.nyi.internal [10.202.2.41])
	by mailout.nyi.internal (Postfix) with ESMTP id A51F522048;
	Fri,  3 Jun 2016 07:41:53 -0400 (EDT)
Received: from frontend1 ([10.202.2.160])
	by compute1.internal (MEProxy); Fri, 03 Jun 2016 07:41:53 -0400
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=braap.org; h=cc
	:content-type:date:from:in-reply-to:message-id:mime-version
	:references:subject:to:x-sasl-enc:x-sasl-enc; s=mesmtp; bh=JejLZ
	DGtIgQvMFlZAC6HfGE7uX8=; b=en/aHB+ziDAXN55x7svVF0KKUAndeG/4EsLD6
	6HA8b5QpxWT7gK0mYNcR14EFd+yjm1/dzzp+z0Qy3zeEgvpki5kjFpT3GIQlHoRQ
	v3G8fyG+7S8iS2NaMgBgxp2KtH00LI2AdZCAUuvtQsm9ACkiviGUZE/MFB3SuCD7
	5FNgDc=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=
	messagingengine.com; h=cc:content-type:date:from:in-reply-to
	:message-id:mime-version:references:subject:to:x-sasl-enc
	:x-sasl-enc; s=smtpout; bh=JejLZDGtIgQvMFlZAC6HfGE7uX8=; b=LzyEJ
	aZETERr/9r8FKv4iC1I8z4J53sM0EGGIezvEpOWXG5u03gybRJB5qnuWGMK1Him9
	tXPOT3MfGFCH8LlLGX/ED4q1B6XtWPwvi/nap4xL+nZsSl5FXMSdqNswqr/+L8fT
	APxDUaGHrjzd9943Rlsnp68keRu7KBR4mDKM9E=
X-Sasl-enc: RobB5Ovd4LVqNMF3LkDsE+MxJymSLgBWww1913Om2Jtj 1464954113
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216])
	by mail.messagingengine.com (Postfix) with ESMTPA id 52B8EF29F3;
	Fri,  3 Jun 2016 07:41:53 -0400 (EDT)
Date: Fri, 3 Jun 2016 07:41:53 -0400
From: "Emilio G. Cota" <cota@braap.org>
To: Sergey Fedorov <serge.fdrv@gmail.com>
Message-ID: <20160603114153.GD5251@flamenco>
References: <1464138802-23503-1-git-send-email-cota@braap.org>
	<1464138802-23503-13-git-send-email-cota@braap.org>
	<574B54E3.3010908@gmail.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <574B54E3.3010908@gmail.com>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic]
X-Received-From: 66.111.4.26
Subject: Re: [Qemu-devel] [PATCH v6 12/15] qht: add qht-bench,
	a performance benchmark
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: MTTCG Devel <mttcg@listserver.greensocs.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Alex =?iso-8859-1?Q?Benn=E9e?= <alex.bennee@linaro.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Richard Henderson <rth@twiddle.net>
Errors-To: 
 qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
X-Virus-Scanned: ClamAV using ClamSMTP

On Sun, May 29, 2016 at 23:45:23 +0300, Sergey Fedorov wrote:
> On 25/05/16 04:13, Emilio G. Cota wrote:
> > diff --git a/tests/qht-bench.c b/tests/qht-bench.c
> > new file mode 100644
> > index 0000000..30d27c8
> > --- /dev/null
> > +++ b/tests/qht-bench.c
> > @@ -0,0 +1,474 @@
> (snip)
> > +static void do_rw(struct thread_info *info)
> > +{
> > +    struct thread_stats *stats = &info->stats;
> > +    uint32_t hash;
> > +    long *p;
> > +
> > +    if (info->r >= update_threshold) {
> > +        bool read;
> > +
> > +        p = &keys[info->r & (lookup_range - 1)];
> > +        hash = h(*p);
> > +        read = qht_lookup(&ht, is_equal, p, hash);
> > +        if (read) {
> > +            stats->rd++;
> > +        } else {
> > +            stats->not_rd++;
> > +        }
> > +    } else {
> > +        p = &keys[info->r & (update_range - 1)];
> > +        hash = h(*p);
> 
> The previous two lines are common for the both "if" branches. Lets move
> it above the "if".

Not quite. The mask uses lookup_range above, and update_range below.

> > +        if (info->write_op) {
> > +            bool written = false;
> > +
> > +            if (qht_lookup(&ht, is_equal, p, hash) == NULL) {
> > +                written = qht_insert(&ht, p, hash);
> > +            }
> > +            if (written) {
> > +                stats->in++;
> > +            } else {
> > +                stats->not_in++;
> > +            }
> > +        } else {
> > +            bool removed = false;
> > +
> > +            if (qht_lookup(&ht, is_equal, p, hash)) {
> > +                removed = qht_remove(&ht, p, hash);
> > +            }
> > +            if (removed) {
> > +                stats->rm++;
> > +            } else {
> > +                stats->not_rm++;
> > +            }
> > +        }
> > +        info->write_op = !info->write_op;
> > +    }
> > +}
> > +
> > +static void *thread_func(void *p)
> > +{
> > +    struct thread_info *info = p;
> > +
> > +    while (!atomic_mb_read(&test_start)) {
> > +        cpu_relax();
> > +    }
> > +
> > +    rcu_register_thread();
> 
> Shouldn't we do this before checking for 'test_start'?

From a correctness point of view it doesn't matter. But yes, it
is better to do it earlier. Changed.

> > +
> > +    rcu_read_lock();
> 
> Why don't we do rcu_read_lock()/rcu_read_unlock() inside the loop?

Because that will slow down the benchmark unnecessarily (Throughput
for single-threaded and default opts goes down from 38M/s to 35M/s).
For this benchmark we want to benchmark QHT's performance, not RCU's.
And really we're not allocating/deallocating elements dynamically,
so from a memory usage viewpoint calling this inside or outside
of the loop doesn't matter.

> > +    while (!atomic_read(&test_stop)) {
> > +        info->r = xorshift64star(info->r);
> > +        info->func(info);
> > +    }
> > +    rcu_read_unlock();
> > +
> > +    rcu_unregister_thread();
> > +    return NULL;
> > +}
> > +
> > +/* sets everything except info->func */
> > +static void prepare_thread_info(struct thread_info *info, int i)
> > +{
> > +    /* seed for the RNG; each thread should have a different one */
> > +    info->r = (i + 1) ^ time(NULL);
> > +    /* the first update will be a write */
> > +    info->write_op = true;
> > +    /* the first resize will be down */
> > +    info->resize_down = true;
> > +
> > +    memset(&info->stats, 0, sizeof(info->stats));
> > +}
> > +
> > +static void
> > +th_create_n(QemuThread **threads, struct thread_info **infos, const char *name,
> > +            void (*func)(struct thread_info *), int offset, int n)
> 
> 'offset' is not used in this function.

Good catch! Changed now:
+    prepare_thread_info(&info[i], offset + i);
The offset is passed so that each created thread has a unique
RNG seed.

> > +{
> > +    struct thread_info *info;
> > +    QemuThread *th;
> > +    int i;
> > +
> > +    th = g_malloc(sizeof(*th) * n);
> > +    *threads = th;
> > +
> > +    info = qemu_memalign(64, sizeof(*info) * n);
> > +    *infos = info;
> > +
> > +    for (i = 0; i < n; i++) {
> > +        prepare_thread_info(&info[i], i);
> > +        info[i].func = func;
> > +        qemu_thread_create(&th[i], name, thread_func, &info[i],
> > +                           QEMU_THREAD_JOINABLE);
> > +    }
> > +}
> > +
> (snip)
> > +
> > +static void run_test(void)
> > +{
> > +    unsigned int remaining;
> > +    int i;
> > +
> 
> Are we sure all the threads are ready at this point? Otherwise why
> bother with 'test_start' flag?

Good point. Added the following:


Thanks,

		Emilio

diff --git a/tests/qht-bench.c b/tests/qht-bench.c
index 885da9c..c1ed9b9 100644
--- a/tests/qht-bench.c
+++ b/tests/qht-bench.c
@@ -43,6 +43,7 @@ static unsigned long lookup_range = DEFAULT_RANGE;
 static unsigned long update_range = DEFAULT_RANGE;
 static size_t init_range = DEFAULT_RANGE;
 static size_t init_size = DEFAULT_RANGE;
+static size_t n_ready_threads;
 static long populate_offset;
 static long *keys;
 
@@ -190,6 +191,7 @@ static void *thread_func(void *p)
 
     rcu_register_thread();
 
+    atomic_inc(&n_ready_threads);
     while (!atomic_mb_read(&test_start)) {
         cpu_relax();
     }
@@ -387,6 +389,9 @@ static void run_test(void)
     unsigned int remaining;
     int i;
 
+    while (atomic_read(&n_ready_threads) != n_rw_threads + n_rz_threads) {
+        cpu_relax();
+    }
     atomic_mb_set(&test_start, true);
     do {
         remaining = sleep(duration);