From patchwork Fri May  6 23:00:41 2016
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 9037011
Date: Fri, 6 May 2016 16:00:41 -0700
From: Andrew Morton
To: zengzhaoxiu@163.com
Cc: linux@horizon.com, peterz@infradead.org, jjuran@gmail.com,
	James.Bottomley@HansenPartnership.com, geert@linux-m68k.org,
	dalias@libc.org, sam@ravnborg.org, davem@davemloft.net,
	Zhaoxiu Zeng, Richard Henderson, Ivan Kokshaysky, Matt Turner,
	Vineet Gupta, Russell King, Yoshinori Sato, James Hogan,
	Michal Simek, Ralf Baechle, Ley Foon Tan, Jonas Bonn,
	"James E.J. Bottomley", Helge Deller, Martin Schwidefsky,
	Heiko Carstens, Chen Liqin, Lennox Wu,
	linux-kernel@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org,
	uclinux-h8-devel@lists.sourceforge.jp, linux-m68k@vger.kernel.org,
	linux-metag@vger.kernel.org, linux-mips@linux-mips.org,
	nios2-dev@lists.rocketboards.org, linux@lists.openrisc.net,
	linux-parisc@vger.kernel.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, sparclinux@vger.kernel.org
Subject: Re: [patch V4] lib: GCD: Use binary GCD algorithm instead of Euclidean
Message-Id: <20160506160041.2b5e47757329c288efaed4fb@linux-foundation.org>
In-Reply-To: <1462527763-15301-1-git-send-email-zengzhaoxiu@163.com>
References: <1462527763-15301-1-git-send-email-zengzhaoxiu@163.com>
X-Mailer: Sylpheed 3.4.1 (GTK+ 2.24.23; x86_64-pc-linux-gnu)

On Fri, 6 May 2016 17:42:42 +0800 zengzhaoxiu@163.com wrote:

> From: Zhaoxiu Zeng
>
> The binary GCD algorithm is based on the following facts:
> 1. If a and b are both even, then gcd(a,b) = 2 * gcd(a/2, b/2)
> 2. If a is even and b is odd, then gcd(a,b) = gcd(a/2, b)
> 3. If a and b are both odd, then gcd(a,b) = gcd((a-b)/2, b) = gcd((a+b)/2, b)
>
> Even on x86 machines with reasonable division hardware, the binary
> algorithm runs about 25% faster (80% of the execution time) than the
> division-based Euclidean algorithm.
>
> On platforms like Alpha and ARMv6, where division is a function call
> to emulation code, it's even more significant.
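[The three reduction rules quoted above can be sketched in plain C. This is a userspace illustration, not the kernel's lib/gcd.c; the function name is made up for the example.]

```c
#include <assert.h>

/* Binary GCD, using only shifts and subtraction (no division). */
unsigned long binary_gcd(unsigned long a, unsigned long b)
{
	unsigned long shift = 0;

	if (!a)
		return b;
	if (!b)
		return a;

	/* Rule 1: both even -> factor out the common powers of two. */
	while (((a | b) & 1) == 0) {
		a >>= 1;
		b >>= 1;
		shift++;
	}
	/* Rule 2: a lone even operand can be halved freely. */
	while ((a & 1) == 0)
		a >>= 1;
	do {
		while ((b & 1) == 0)
			b >>= 1;
		/* Rule 3: both odd now, so a-b (or b-a) is even. */
		if (a > b) {
			unsigned long t = a;
			a = b;
			b = t;
		}
		b -= a;
	} while (b);

	return a << shift;	/* restore the factored-out power of two */
}
```

The inner `while` loops shifting one bit at a time are exactly the "bit-at-a-time shifting loop" the patch description refers to.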
>
> There are two variants of the code here, depending on whether a
> fast __ffs (find least significant set bit) instruction is available.
> This allows the unpredictable branches in the bit-at-a-time shifting
> loop to be eliminated.
>
> ...
>
> --- a/arch/alpha/Kconfig
> +++ b/arch/alpha/Kconfig
> @@ -27,6 +27,7 @@ config ALPHA
>  	select MODULES_USE_ELF_RELA
>  	select ODD_RT_SIGACTION
>  	select OLD_SIGSUSPEND
> +	select CPU_NO_EFFICIENT_FFS if !ALPHA_EV67
>  	help

argh.  Please don't always put new items at end-of-list.  That's the
perfect way of maximizing the number of patch collisions.  Insert it at
a random position, avoiding the end (if the list isn't alpha-sorted,
which it should be).

>
> ...
>
> --- a/arch/mips/include/asm/cpu-features.h
> +++ b/arch/mips/include/asm/cpu-features.h
> @@ -180,6 +180,16 @@
>  #endif
>  #endif
>
> +/* __builtin_constant_p(cpu_has_mips_r) && cpu_has_mips_r */
> +#if !((defined(cpu_has_mips32r1) && cpu_has_mips32r1) || \
> +      (defined(cpu_has_mips32r2) && cpu_has_mips32r2) || \
> +      (defined(cpu_has_mips32r6) && cpu_has_mips32r6) || \
> +      (defined(cpu_has_mips64r1) && cpu_has_mips64r1) || \
> +      (defined(cpu_has_mips64r2) && cpu_has_mips64r2) || \
> +      (defined(cpu_has_mips64r6) && cpu_has_mips64r6))
> +#define CONFIG_CPU_NO_EFFICIENT_FFS 1
> +#endif

#defining a CONFIG_ variable is pretty rude - defining these is the
role of the Kconfig system, not of header-file macros.  This was easy:

--- a/arch/mips/include/asm/cpu-features.h~lib-gcd-use-binary-gcd-algorithm-instead-of-euclidean-fix
+++ a/arch/mips/include/asm/cpu-features.h
@@ -187,7 +187,7 @@
 	(defined(cpu_has_mips64r1) && cpu_has_mips64r1) || \
 	(defined(cpu_has_mips64r2) && cpu_has_mips64r2) || \
 	(defined(cpu_has_mips64r6) && cpu_has_mips64r6))
-#define CONFIG_CPU_NO_EFFICIENT_FFS 1
+#define CPU_NO_EFFICIENT_FFS 1
 #endif
 
 #ifndef cpu_has_mips_1
--- a/lib/gcd.c~lib-gcd-use-binary-gcd-algorithm-instead-of-euclidean-fix
+++ a/lib/gcd.c
@@ -10,7 +10,7 @@
  * has decent hardware division.
  */
 
-#if !defined(CONFIG_CPU_NO_EFFICIENT_FFS)
+#if !defined(CONFIG_CPU_NO_EFFICIENT_FFS) && !defined(CPU_NO_EFFICIENT_FFS)
 /* If __ffs is available, the even/odd algorithm benchmarks slower. */
 unsigned long gcd(unsigned long a, unsigned long b)
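[For reference, the __ffs-based variant being gated here works roughly as follows. This is a sketch, not the exact lib/gcd.c code: GCC's __builtin_ctzl stands in for the kernel's __ffs, and the function name is made up for the example. With a fast count-trailing-zeros instruction, each bit-at-a-time shifting loop collapses into a single shift, eliminating the unpredictable branches.]

```c
#include <assert.h>

/* Binary GCD when count-trailing-zeros is cheap. */
unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long r;

	if (!a)
		return b;
	if (!b)
		return a;

	r = a | b;			/* ctz(a|b) = the common power of two */
	b >>= __builtin_ctzl(b);	/* strip all trailing zeros at once */
	for (;;) {
		a >>= __builtin_ctzl(a);
		if (a == b)
			break;
		if (a < b) {		/* keep a >= b so a - b is positive */
			unsigned long t = a;
			a = b;
			b = t;
		}
		a -= b;			/* both odd, so the difference is even */
	}
	return a << __builtin_ctzl(r);
}
```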