From patchwork Fri Apr 22 19:59:52 2016
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 8915441
From: Richard Henderson
To: "Emilio G. Cota", QEMU Developers, MTTCG Devel
Cc: Peter Maydell, Paolo Bonzini, Sergey Fedorov, Alex Bennée,
 Peter Crosthwaite
Date: Fri, 22 Apr 2016 12:59:52 -0700
Message-ID: <571A82B8.5080908@twiddle.net>
In-Reply-To: <571A6245.4070209@twiddle.net>
References: <1461107270-19234-1-git-send-email-cota@braap.org>
 <1461107270-19234-12-git-send-email-cota@braap.org>
 <571A6245.4070209@twiddle.net>
Subject: Re: [Qemu-devel] [PATCH v3 11/11] translate-all: add tb hash bucket
 info to 'info jit' dump

On 04/22/2016 10:41 AM, Richard Henderson wrote:
> On 04/19/2016 04:07 PM, Emilio G. Cota wrote:
>> +    ht_avg_len = qht_avg_bucket_chain_length(&tcg_ctx.tb_ctx.htable, &ht_heads);
>> +    cpu_fprintf(f, "TB hash avg chain   %0.5f buckets\n", ht_avg_len);
>> +    cpu_fprintf(f, "TB hash size        %zu head buckets\n", ht_heads);
>
> I think the accounting is questionable here.
>
> Consider the following data:
>
>   TB count             230467/671088
>   TB invalidate count  25915
>   TB hash avg chain    1.03073 buckets
>   TB hash size         131072 head buckets
>
> This means that we've got 230467 - 25915 = 204552 active TBs, installed
> into a hash table with 131072 heads.  For a perfectly uniform distribution
> of TBs, that would be an average chain length of 204552 / 131072 = 1.56.
>
> In order to get the average down to 1.03, there need to be a substantial
> number of heads with zero entries.
>
> I think perhaps it might be more enlightening to separately account for
> empty and non-empty heads.  E.g.
>
>   TB hash buckets used  xxxx/131072
>   TB hash avg chain     yyyy
>
> where xxxx is the number of non-empty heads, and yyyy = |TBs| / xxxx.
>
> I also wonder if it wouldn't be better to size the hash table as
> appropriate for the maximum number of allowable TBs.

FWIW, so that I could get an idea of how the stats change as we improve the
hashing, I inserted the attachment 1 patch between patches 5 and 6, with
attachment 2 attempting to fix the accounting for patches 9 and 10.

For booting an alpha kernel to login prompt:

Before hashing changes (@5/11)

  TB count             175363/671088
  TB invalidate count  3996
  TB hash buckets      31731/32768
  TB hash avg chain    5.289 max=59

After xxhash patch (@7/11)

  TB hash buckets      32582/32768
  TB hash avg chain    5.260 max=18

So far so good!

After qht patches (@11/11)

  TB hash buckets      94360/131072
  TB hash avg chain    1.774 max=8

Do note that those last numbers are off: 1.774 avg * 94360 used buckets =
167394 total entries, which is far from 171367, the correct number of total
entries.

I'm tempted to pull over gcc's non-chaining hash table implementation
(libiberty/hashtab.c, still GPLv2+) and compare...


r~

From cdc7b3631fd78bd2e31d2823f7543e2a56681149 Mon Sep 17 00:00:00 2001
From: Richard Henderson
Date: Fri, 22 Apr 2016 11:28:52 -0700
Subject: translate-all: Add hashtable accounting to info jit

Dump hash table occupancy numbers with "info jit".
Signed-off-by: Richard Henderson

From 7f1d677f3d085b5891e1adbd5f602185d68ba81a Mon Sep 17 00:00:00 2001
From: Richard Henderson
Date: Fri, 22 Apr 2016 12:50:00 -0700
Subject: fixup to {09,10,11}/11

diff --git a/include/qemu/qht.h b/include/qemu/qht.h
index a0a1aa8..2d0b58f 100644
--- a/include/qemu/qht.h
+++ b/include/qemu/qht.h
@@ -49,6 +49,14 @@ void qht_grow(struct qht *ht);
 
 void *qht_lookup(struct qht *ht, qht_lookup_func_t func, const void *userp,
                  uint32_t hash);
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets);
+
+struct qht_stats {
+    size_t used_buckets;
+    size_t max_buckets;
+    size_t used_entries;
+    size_t max_chain;
+};
+
+struct qht_stats qht_statistics(struct qht *ht);
 
 #endif /* QHT_H */
diff --git a/translate-all.c b/translate-all.c
index 3b73b46..a9ceb0a 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1664,8 +1664,7 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
     size_t target_code_size, max_target_code_size;
     unsigned direct_jmp_count, direct_jmp2_count, cross_page;
-    unsigned used_buckets, max_chain, hash_tbs;
-    TranslationBlock *tb;
+    struct qht_stats hinfo;
     int i;
 
     target_code_size = 0;
@@ -1673,35 +1672,23 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    used_buckets = 0;
-    hash_tbs = 0;
-    max_chain = 0;
-
-    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
-        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
-            unsigned this_chain = 0;
-            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
-                 tb = tb->phys_hash_next) {
-                this_chain++;
-                hash_tbs++;
-                target_code_size += tb->size;
-                if (tb->page_addr[1] != -1) {
-                    cross_page++;
-                }
-                if (tb->tb_next_offset[0] != 0xffff) {
-                    direct_jmp_count++;
-                    if (tb->tb_next_offset[1] != 0xffff) {
-                        direct_jmp2_count++;
-                    }
-                }
-            }
-            if (this_chain > max_chain) {
-                max_chain = this_chain;
+
+    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
+        const TranslationBlock *tb = &tcg_ctx.tb_ctx.tbs[i];
+
+        target_code_size += tb->size;
+        if (tb->page_addr[1] != -1) {
+            cross_page++;
+        }
+        if (tb->tb_next_offset[0] != 0xffff) {
+            direct_jmp_count++;
+            if (tb->tb_next_offset[1] != 0xffff) {
+                direct_jmp2_count++;
             }
-            used_buckets++;
         }
     }
-    assert(hash_tbs ==
+
+    hinfo = qht_statistics(&tcg_ctx.tb_ctx.htable);
+    assert(hinfo.used_entries ==
            tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);
 
     cpu_fprintf(f, "Translation buffer state:\n");
@@ -1731,11 +1718,12 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp2_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
-                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
-    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
-                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
-                max_chain);
+    cpu_fprintf(f, "TB hash buckets     %zu/%zu\n",
+                hinfo.used_buckets, hinfo.max_buckets);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%zu\n",
+                hinfo.used_buckets
+                ? (double)hinfo.used_entries / hinfo.used_buckets : 0.0,
+                hinfo.max_chain);
     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",
diff --git a/util/qht.c b/util/qht.c
index 05ea5e8..535057b 100644
--- a/util/qht.c
+++ b/util/qht.c
@@ -556,35 +556,45 @@ void qht_grow(struct qht *ht)
  * value should be close to 1.
  * Note that each bucket tracks up to QHT_BUCKET_ENTRIES items.
  */
-double qht_avg_bucket_chain_length(struct qht *ht, size_t *n_head_buckets)
+struct qht_stats qht_statistics(struct qht *ht)
 {
     struct qht_map *map;
-    size_t count = 0;
-    size_t i;
+    struct qht_stats s = {};
+    size_t i, n;
 
     map = atomic_read(&ht->map);
     /* paired with smp_wmb() before setting ht->map */
     smp_rmb();
+    s.max_buckets = n = map->n;
 
-    for (i = 0; i < map->n; i++) {
+    for (i = 0; i < n; i++) {
         struct qht_bucket *head = &map->buckets[i];
         struct qht_bucket *b;
-        size_t bucket_count;
+        size_t this_chain;
         uint32_t version;
 
         do {
             version = seqlock_read_begin(&head->sequence);
-            bucket_count = 0;
+            this_chain = 0;
             b = head;
             do {
-                bucket_count++;
+                int j;
+                for (j = 0; j < QHT_BUCKET_ENTRIES; j++) {
+                    if (b->hashes[j]) {
+                        this_chain++;
+                    }
+                }
                 b = b->next;
             } while (b);
         } while (seqlock_read_retry(&head->sequence, version));
-        count += bucket_count;
-    }
-    if (n_head_buckets) {
-        *n_head_buckets = map->n;
+        if (this_chain != 0) {
+            s.used_entries += this_chain;
+            if (s.max_chain < this_chain) {
+                s.max_chain = this_chain;
+            }
+            s.used_buckets++;
+        }
     }
-    return (double)count / map->n;
+
+    return s;
 }
diff --git a/translate-all.c b/translate-all.c
index 1a8f68b..ed296d5 100644
--- a/translate-all.c
+++ b/translate-all.c
@@ -1671,39 +1671,55 @@ void tb_flush_jmp_cache(CPUState *cpu, target_ulong addr)
 
 void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
 {
-    int i, target_code_size, max_target_code_size;
-    int direct_jmp_count, direct_jmp2_count, cross_page;
+    size_t target_code_size, max_target_code_size;
+    unsigned direct_jmp_count, direct_jmp2_count, cross_page;
+    unsigned used_buckets, max_chain, hash_tbs;
     TranslationBlock *tb;
+    int i;
 
     target_code_size = 0;
     max_target_code_size = 0;
     cross_page = 0;
     direct_jmp_count = 0;
     direct_jmp2_count = 0;
-    for (i = 0; i < tcg_ctx.tb_ctx.nb_tbs; i++) {
-        tb = &tcg_ctx.tb_ctx.tbs[i];
-        target_code_size += tb->size;
-        if (tb->size > max_target_code_size) {
-            max_target_code_size = tb->size;
-        }
-        if (tb->page_addr[1] != -1) {
-            cross_page++;
-        }
-        if (tb->tb_next_offset[0] != 0xffff) {
-            direct_jmp_count++;
-            if (tb->tb_next_offset[1] != 0xffff) {
-                direct_jmp2_count++;
+    used_buckets = 0;
+    hash_tbs = 0;
+    max_chain = 0;
+
+    for (i = 0; i < CODE_GEN_PHYS_HASH_SIZE; i++) {
+        if (tcg_ctx.tb_ctx.tb_phys_hash[i]) {
+            unsigned this_chain = 0;
+            for (tb = tcg_ctx.tb_ctx.tb_phys_hash[i]; tb != NULL;
+                 tb = tb->phys_hash_next) {
+                this_chain++;
+                hash_tbs++;
+                target_code_size += tb->size;
+                if (tb->page_addr[1] != -1) {
+                    cross_page++;
+                }
+                if (tb->tb_next_offset[0] != 0xffff) {
+                    direct_jmp_count++;
+                    if (tb->tb_next_offset[1] != 0xffff) {
+                        direct_jmp2_count++;
+                    }
+                }
             }
+            if (this_chain > max_chain) {
+                max_chain = this_chain;
+            }
+            used_buckets++;
         }
     }
-    /* XXX: avoid using doubles ? */
+    assert(hash_tbs ==
+           tcg_ctx.tb_ctx.nb_tbs - tcg_ctx.tb_ctx.tb_phys_invalidate_count);
+
     cpu_fprintf(f, "Translation buffer state:\n");
     cpu_fprintf(f, "gen code size       %td/%zd\n",
                 tcg_ctx.code_gen_ptr - tcg_ctx.code_gen_buffer,
                 tcg_ctx.code_gen_highwater - tcg_ctx.code_gen_buffer);
     cpu_fprintf(f, "TB count            %d/%d\n",
                 tcg_ctx.tb_ctx.nb_tbs, tcg_ctx.code_gen_max_blocks);
-    cpu_fprintf(f, "TB avg target size  %d max=%d bytes\n",
+    cpu_fprintf(f, "TB avg target size  %zd max=%zd bytes\n",
                 tcg_ctx.tb_ctx.nb_tbs ? target_code_size / tcg_ctx.tb_ctx.nb_tbs : 0,
                 max_target_code_size);
@@ -1717,13 +1733,18 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf)
     cpu_fprintf(f, "cross page TB count %d (%d%%)\n", cross_page,
                 tcg_ctx.tb_ctx.nb_tbs ? (cross_page * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0);
-    cpu_fprintf(f, "direct jump count   %d (%d%%) (2 jumps=%d %d%%)\n",
+    cpu_fprintf(f, "direct jump count   %u (%u%%) (2 jumps=%u %u%%)\n",
                 direct_jmp_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0,
                 direct_jmp2_count,
                 tcg_ctx.tb_ctx.nb_tbs ? (direct_jmp2_count * 100) /
                         tcg_ctx.tb_ctx.nb_tbs : 0);
+    cpu_fprintf(f, "TB hash buckets     %u/%d\n",
+                used_buckets, CODE_GEN_PHYS_HASH_SIZE);
+    cpu_fprintf(f, "TB hash avg chain   %0.3f max=%u\n",
+                used_buckets ? (double)hash_tbs / used_buckets : 0.0,
+                max_chain);
     cpu_fprintf(f, "\nStatistics:\n");
     cpu_fprintf(f, "TB flush count      %d\n", tcg_ctx.tb_ctx.tb_flush_count);
     cpu_fprintf(f, "TB invalidate count %d\n",
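
[Editorial note: the empty-heads argument in the review above can be sanity
checked against a standard Poisson model of uniform hashing.  The following
standalone sketch is not part of either patch; it just plugs in the numbers
quoted in the email.]

#include <math.h>
#include <stdio.h>

/*
 * Poisson model of a hash table: with n entries uniformly hashed into
 * m head buckets, P(a given bucket is empty) = (1 - 1/m)^n ~= exp(-n/m).
 * Expected non-empty buckets: m * (1 - exp(-n/m)).
 * Expected avg chain over non-empty buckets: n / used.
 */
static void model(const char *label, double n, double m)
{
    double load = n / m;
    double used = m * (1.0 - exp(-load));
    printf("%s: load=%.3f used=%.0f/%.0f avg chain=%.3f\n",
           label, load, used, m, n / used);
}

int main(void)
{
    /*
     * 171367 live TBs into 131072 heads: the model predicts roughly
     * 95600 used buckets and an avg chain near 1.79, in the same
     * ballpark as the measured 94360 and 1.774 after the qht patches.
     */
    model("qht @11/11", 171367, 131072);

    /*
     * 204552 live TBs into 131072 heads gives a raw load of ~1.56,
     * so a reported avg chain of 1.03 is only possible if a large
     * fraction of heads is empty -- the point made in the review.
     */
    model("earlier data", 204552, 131072);
    return 0;
}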