From: "Emilio G. Cota"
Date: Tue, 5 Apr 2016 20:52:39 -0400
To: Richard Henderson
Cc: MTTCG Devel, Peter Maydell, Peter Crosthwaite, QEMU Developers,
    Sergey Fedorov, Paolo Bonzini, Alex Bennée
Subject: Re: [Qemu-devel] [PATCH 07/10] tb hash: hash phys_pc, pc, and flags with xxhash
Message-ID: <20160406005239.GA25081@flamenco>
In-Reply-To: <5704293D.1070105@twiddle.net>
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 8756881

On Tue, Apr 05, 2016 at 14:08:13 -0700, Richard Henderson wrote:
> But the point is that we can do better than dropping data into memory.
> Particularly for those hosts that do not support unaligned data, such as you
> created with the packed structure.

If we made sure the fields in the struct were in the right order (larger
fields first), this shouldn't be an issue.

Anyway I took your proposal and implemented the patch below. FWIW I cannot
measure a perf. difference between this and the packed struct for
arm-softmmu (i.e. 16 bytes) on an x86_64 host.

How does the appended look?

Thanks,

E.
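To make the field-ordering remark above concrete, here is a minimal sketch. It is not QEMU code: the struct name is hypothetical, and the field widths (64-bit phys_pc, 32-bit pc, 32-bit flags) are assumptions chosen to match the 16-byte arm-softmmu case mentioned above. The real tb_page_addr_t and target_ulong widths depend on the build configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the hash key. The widths are assumptions,
 * not QEMU's actual typedefs.
 */
struct tb_key_sorted {
    uint64_t phys_pc;   /* widest field first */
    uint32_t pc;
    uint32_t flags;
};

/*
 * With the widest field first, every member lands on a multiple of its
 * own size, so the struct has no padding and needs no packed attribute.
 */
static_assert(offsetof(struct tb_key_sorted, phys_pc) == 0, "phys_pc at 0");
static_assert(offsetof(struct tb_key_sorted, pc) == 8, "pc at 8");
static_assert(offsetof(struct tb_key_sorted, flags) == 12, "flags at 12");
static_assert(sizeof(struct tb_key_sorted) == 16, "no padding");
```

With this layout the key can be read as four aligned 32-bit words, which is the property the packed struct cannot guarantee on hosts without unaligned access.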
commit af92a0690f49172621cd8b80759e3ca567d43567
Author: Emilio G. Cota
Date:   Tue Apr 5 18:06:21 2016 -0400

    rth

    Signed-off-by: Emilio G. Cota

diff --git a/include/exec/tb-hash.h b/include/exec/tb-hash.h
index 6b97a7c..349a856 100644
--- a/include/exec/tb-hash.h
+++ b/include/exec/tb-hash.h
@@ -45,19 +45,124 @@ static inline unsigned int tb_jmp_cache_hash_func(target_ulong pc)
            | (tmp & TB_JMP_ADDR_MASK));
 }
 
-static inline
-uint32_t tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, int flags)
+static inline uint32_t h32_finish(uint32_t h32)
 {
-    struct {
-        tb_page_addr_t phys_pc;
-        target_ulong pc;
-        int flags;
-    } QEMU_PACKED k;
-
-    k.phys_pc = phys_pc;
-    k.pc = pc;
-    k.flags = flags;
-    return qemu_xxh32((uint32_t *)&k, sizeof(k) / sizeof(uint32_t), 1);
+    h32 ^= h32 >> 15;
+    h32 *= PRIME32_2;
+    h32 ^= h32 >> 13;
+    h32 *= PRIME32_3;
+    h32 ^= h32 >> 16;
+
+    return h32;
+}
+
+static inline uint32_t tb_hash_func3(uint32_t a, uint32_t b, uint32_t c, int seed)
+{
+    uint32_t h32 = seed + PRIME32_5;
+
+    h32 += 12;
+
+    h32 += a * PRIME32_3;
+    h32  = XXH_rotl32(h32, 17) * PRIME32_4;
+
+    h32 += b * PRIME32_3;
+    h32  = XXH_rotl32(h32, 17) * PRIME32_4;
+
+    h32 += c * PRIME32_3;
+    h32  = XXH_rotl32(h32, 17) * PRIME32_4;
+
+    return h32_finish(h32);
+}
+
+static inline uint32_t tb_hash_func4(uint64_t a0, uint32_t c, uint32_t d, int seed)
+{
+    uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
+    uint32_t v2 = seed + PRIME32_2;
+    uint32_t v3 = seed + 0;
+    uint32_t v4 = seed - PRIME32_1;
+    uint32_t a = a0 >> 31 >> 1;
+    uint32_t b = a0;
+    uint32_t h32;
+
+    v1 += a * PRIME32_2;
+    v1 = XXH_rotl32(v1, 13);
+    v1 *= PRIME32_1;
+
+    v2 += b * PRIME32_2;
+    v2 = XXH_rotl32(v2, 13);
+    v2 *= PRIME32_1;
+
+    v3 += c * PRIME32_2;
+    v3 = XXH_rotl32(v3, 13);
+    v3 *= PRIME32_1;
+
+    v4 += d * PRIME32_2;
+    v4 = XXH_rotl32(v4, 13);
+    v4 *= PRIME32_1;
+
+    h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) +
+        XXH_rotl32(v4, 18);
+    h32 += 16;
+
+    return h32_finish(h32);
+}
+
+static inline uint32_t tb_hash_func5(uint64_t a0, uint64_t b0, uint32_t e, int seed)
+{
+    uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
+    uint32_t v2 = seed + PRIME32_2;
+    uint32_t v3 = seed + 0;
+    uint32_t v4 = seed - PRIME32_1;
+    uint32_t a = a0 >> 31 >> 1;
+    uint32_t b = a0;
+    uint32_t c = b0 >> 31 >> 1;
+    uint32_t d = b0;
+    uint32_t h32;
+
+    v1 += a * PRIME32_2;
+    v1 = XXH_rotl32(v1, 13);
+    v1 *= PRIME32_1;
+
+    v2 += b * PRIME32_2;
+    v2 = XXH_rotl32(v2, 13);
+    v2 *= PRIME32_1;
+
+    v3 += c * PRIME32_2;
+    v3 = XXH_rotl32(v3, 13);
+    v3 *= PRIME32_1;
+
+    v4 += d * PRIME32_2;
+    v4 = XXH_rotl32(v4, 13);
+    v4 *= PRIME32_1;
+
+    h32 = XXH_rotl32(v1, 1) + XXH_rotl32(v2, 7) + XXH_rotl32(v3, 12) +
+        XXH_rotl32(v4, 18);
+    h32 += 20;
+
+    h32 += e * PRIME32_3;
+    h32  = XXH_rotl32(h32, 17) * PRIME32_4;
+
+    return h32_finish(h32);
+}
+
+static __attribute__((noinline))
+unsigned tb_hash_func(tb_page_addr_t phys_pc, target_ulong pc, int flags)
+{
+#if TARGET_LONG_BITS == 64
+
+    if (sizeof(phys_pc) == sizeof(pc)) {
+        return tb_hash_func5(phys_pc, pc, flags, 1);
+    }
+    return tb_hash_func4(pc, phys_pc, flags, 1);
+
+#else /* 32-bit target */
+
+    if (sizeof(phys_pc) > sizeof(pc)) {
+        return tb_hash_func4(phys_pc, pc, flags, 1);
+    }
+    return tb_hash_func3(pc, phys_pc, flags, 1);
+
+#endif /* TARGET_LONG_BITS */
 }
 
 #endif
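For anyone who wants to try the three-word variant outside the QEMU tree, the following is a standalone sketch of it. Two things here are assumptions rather than part of the patch: the PRIME32_* values are the ones from the xxHash reference implementation (the patch takes them from a header elsewhere in the series), and rotl32() is a local stand-in for XXH_rotl32. The seed is taken as uint32_t here instead of int to keep the arithmetic unsigned throughout.

```c
#include <assert.h>
#include <stdint.h>

/* Constants from the xxHash reference implementation (assumed to match
 * the values used by the series). */
#define PRIME32_2 2246822519U
#define PRIME32_3 3266489917U
#define PRIME32_4  668265263U
#define PRIME32_5  374761393U

/* Local stand-in for XXH_rotl32; r must be in 1..31. */
static inline uint32_t rotl32(uint32_t x, unsigned r)
{
    assert(r > 0 && r < 32);
    return (x << r) | (x >> (32 - r));
}

/* Final avalanche, same steps as h32_finish() in the patch. */
static inline uint32_t h32_finish(uint32_t h32)
{
    h32 ^= h32 >> 15;
    h32 *= PRIME32_2;
    h32 ^= h32 >> 13;
    h32 *= PRIME32_3;
    h32 ^= h32 >> 16;
    return h32;
}

/* Hash three 32-bit words held in registers; 12 is the input length
 * in bytes, mixed in the same way the patch does. */
static inline uint32_t hash3(uint32_t a, uint32_t b, uint32_t c,
                             uint32_t seed)
{
    uint32_t h32 = seed + PRIME32_5;

    h32 += 12;

    h32 += a * PRIME32_3;
    h32 = rotl32(h32, 17) * PRIME32_4;

    h32 += b * PRIME32_3;
    h32 = rotl32(h32, 17) * PRIME32_4;

    h32 += c * PRIME32_3;
    h32 = rotl32(h32, 17) * PRIME32_4;

    return h32_finish(h32);
}
```

Every step is invertible modulo 2^32 (adding a constant, multiplying by an odd constant, rotating, xoring with a right shift), so changing any single input word while holding the others fixed always changes the result, e.g. hash3(1, 2, 3, 1) and hash3(1, 2, 4, 1) cannot collide.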