From patchwork Fri May 31 09:51:50 2019
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 10969873
Message-Id: <5CF0F9360200007800233E01@prv1-mh.provo.novell.com>
Date: Fri, 31 May 2019 03:51:50 -0600
From: "Jan Beulich"
To: "xen-devel"
Cc: Stefano Stabellini, Wei Liu, Konrad Rzeszutek Wilk, George Dunlap,
    Andrew Cooper, Ian Jackson, Tim Deegan, Julien Grall
Subject: [Xen-devel] [PATCH 1/4] bitops: speed up hweight()
References: <5CF0F8530200007800233DE0@prv1-mh.provo.novell.com>
In-Reply-To: <5CF0F8530200007800233DE0@prv1-mh.provo.novell.com>

Algorithmically this gets us in line with current Linux, where the same
change happened about 13 years ago. See in particular Linux commits
f9b4192923 ("bitops: hweight() speedup") and 0136611c62 ("optimize
hweight64 for x86_64").

Kconfig changes for actually setting HAS_FAST_MULTIPLY will follow.

Take the opportunity and change generic_hweight64()'s return type to
unsigned int.
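(As an illustrative aside, not part of the patch itself: the
CONFIG_HAS_FAST_MULTIPLY variant first folds the word into per-byte bit
counts and then sums all four byte counts with a single multiply, leaving
the total in the top byte. A minimal standalone sketch, with a hypothetical
function name:)

/* Illustration only: the multiply-based 32-bit population count. */
static unsigned int hweight32_sketch(unsigned int w)
{
    w -= (w >> 1) & 0x55555555;                     /* 2-bit counts */
    w = (w & 0x33333333) + ((w >> 2) & 0x33333333); /* 4-bit counts */
    w = (w + (w >> 4)) & 0x0f0f0f0f;                /* per-byte counts, each <= 8 */
    /* Multiplying by 0x01010101 adds all four byte counts into the top byte. */
    return (w * 0x01010101) >> 24;
}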
Suggested-by: Andrew Cooper
Signed-off-by: Jan Beulich
Reviewed-by: Andrew Cooper
---
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -31,6 +31,9 @@ config HAS_DEVICE_TREE
 config HAS_EX_TABLE
 	bool
 
+config HAS_FAST_MULTIPLY
+	bool
+
 config MEM_ACCESS_ALWAYS_ON
 	bool
 
--- a/xen/include/xen/bitops.h
+++ b/xen/include/xen/bitops.h
@@ -153,41 +153,54 @@ static __inline__ int get_count_order(un
 
 static inline unsigned int generic_hweight32(unsigned int w)
 {
-    unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
-    res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
-    res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
-    res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
-    return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
+    w -= (w >> 1) & 0x55555555;
+    w = (w & 0x33333333) + ((w >> 2) & 0x33333333);
+    w = (w + (w >> 4)) & 0x0f0f0f0f;
+
+#ifdef CONFIG_HAS_FAST_MULTIPLY
+    return (w * 0x01010101) >> 24;
+#else
+    w += w >> 8;
+
+    return (w + (w >> 16)) & 0xff;
+#endif
 }
 
 static inline unsigned int generic_hweight16(unsigned int w)
 {
-    unsigned int res = (w & 0x5555) + ((w >> 1) & 0x5555);
-    res = (res & 0x3333) + ((res >> 2) & 0x3333);
-    res = (res & 0x0F0F) + ((res >> 4) & 0x0F0F);
-    return (res & 0x00FF) + ((res >> 8) & 0x00FF);
+    w -= ((w >> 1) & 0x5555);
+    w = (w & 0x3333) + ((w >> 2) & 0x3333);
+    w = (w + (w >> 4)) & 0x0f0f;
+
+    return (w + (w >> 8)) & 0xff;
 }
 
 static inline unsigned int generic_hweight8(unsigned int w)
 {
-    unsigned int res = (w & 0x55) + ((w >> 1) & 0x55);
-    res = (res & 0x33) + ((res >> 2) & 0x33);
-    return (res & 0x0F) + ((res >> 4) & 0x0F);
+    w -= ((w >> 1) & 0x55);
+    w = (w & 0x33) + ((w >> 2) & 0x33);
+
+    return (w + (w >> 4)) & 0x0f;
 }
 
-static inline unsigned long generic_hweight64(__u64 w)
+static inline unsigned int generic_hweight64(uint64_t w)
 {
 #if BITS_PER_LONG < 64
     return generic_hweight32((unsigned int)(w >> 32)) +
         generic_hweight32((unsigned int)w);
 #else
-    u64 res;
-    res = (w & 0x5555555555555555ul) + ((w >> 1) & 0x5555555555555555ul);
-    res = (res & 0x3333333333333333ul) + ((res >> 2) & 0x3333333333333333ul);
-    res = (res & 0x0F0F0F0F0F0F0F0Ful) + ((res >> 4) & 0x0F0F0F0F0F0F0F0Ful);
-    res = (res & 0x00FF00FF00FF00FFul) + ((res >> 8) & 0x00FF00FF00FF00FFul);
-    res = (res & 0x0000FFFF0000FFFFul) + ((res >> 16) & 0x0000FFFF0000FFFFul);
-    return (res & 0x00000000FFFFFFFFul) + ((res >> 32) & 0x00000000FFFFFFFFul);
+    w -= (w >> 1) & 0x5555555555555555ul;
+    w = (w & 0x3333333333333333ul) + ((w >> 2) & 0x3333333333333333ul);
+    w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0ful;
+
+# ifdef CONFIG_HAS_FAST_MULTIPLY
+    return (w * 0x0101010101010101ul) >> 56;
+# else
+    w += w >> 8;
+    w += w >> 16;
+
+    return (w + (w >> 32)) & 0xFF;
+# endif
 #endif
 }
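(For completeness, a quick way to sanity check the new reductions outside
the tree is to compare them against a naive bit loop. The harness below is
hypothetical and not part of this series; file and function names are made
up, and CONFIG_HAS_FAST_MULTIPLY is toggled on the compiler command line,
e.g. "gcc -DCONFIG_HAS_FAST_MULTIPLY test-hweight.c && ./a.out".)

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Same reduction as the patched generic_hweight64(). */
static unsigned int hweight64_patched(uint64_t w)
{
    w -= (w >> 1) & 0x5555555555555555ul;
    w = (w & 0x3333333333333333ul) + ((w >> 2) & 0x3333333333333333ul);
    w = (w + (w >> 4)) & 0x0f0f0f0f0f0f0f0ful;
#ifdef CONFIG_HAS_FAST_MULTIPLY
    return (w * 0x0101010101010101ul) >> 56;
#else
    w += w >> 8;
    w += w >> 16;
    return (w + (w >> 32)) & 0xff;
#endif
}

/* Reference implementation: count bits one at a time. */
static unsigned int hweight64_naive(uint64_t w)
{
    unsigned int n = 0;

    for ( ; w; w >>= 1 )
        n += w & 1;
    return n;
}

int main(void)
{
    uint64_t w = 0x123456789abcdef0ul;
    unsigned int i;

    /* Walk a simple 64-bit LCG sequence and compare the two variants. */
    for ( i = 0; i < 1000000; ++i, w = w * 6364136223846793005ul + 1 )
        assert(hweight64_patched(w) == hweight64_naive(w));
    printf("hweight64 variants agree\n");
    return 0;
}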