From patchwork Wed Aug 15 12:02:43 2012
X-Patchwork-Submitter: "Markus F.X.J. Oberhumer"
X-Patchwork-Id: 1325481
Message-ID: <502B8FE3.7080501@oberhumer.com>
Date: Wed, 15 Aug 2012 14:02:43 +0200
From: "Markus F.X.J. Oberhumer"
Organization: oberhumer.com
To: Johannes Stezenbach
Subject: Re: [GIT PULL] Update LZO compression
References: <50299142.2030504@oberhumer.com> <20120814123937.GA14756@sig21.net>
In-Reply-To: <20120814123937.GA14756@sig21.net>
Cc: linux-arm-kernel@lists.infradead.org, chris.mason@fusionio.com,
 richard -rw- weinberger, linux-kernel@vger.kernel.org, Richard Purdie,
 Andi Kleen, linux-btrfs@vger.kernel.org, Nitin Gupta

Hi Johannes,

On 2012-08-14 14:39, Johannes Stezenbach wrote:
> On Tue, Aug 14, 2012 at 01:44:02AM +0200, Markus F.X.J. Oberhumer wrote:
>> On 2012-07-16 20:30, Markus F.X.J. Oberhumer wrote:
>>>
>>> As stated in the README this version is significantly faster (typically more
>>> than 2 times faster!) than the current version, has been thoroughly tested on
>>> x86_64/i386/powerpc platforms and is intended to get included into the
>>> official Linux 3.6 or 3.7 release.
>>>
>>> I encourage all compression users to test and benchmark this new version,
>>> and I also would ask some official LZO maintainer to convert the updated
>>> source files into a GIT commit and possibly push it to Linus or linux-next.
>
> Sorry for not reporting earlier, but I didn't have time to do real
> benchmarks, just a quick test on ARM926EJ-S using barebox,
> and found in the new version decompression is slower:
> http://lists.infradead.org/pipermail/barebox/2012-July/008268.html

I can only guess, but maybe your ARM cpu does not have an efficient
implementation of {get,put}_unaligned().

Could you please try the following patch and test if you can see any
significant speed difference?

Thanks,
Markus

>
> BTW, do you have userspace code matching the old and new
> lzo versions?  It would be easier to benchmark.
>
> Unfortunately I cannot claim high confidence in my benchmark results
> due to missing time to do it properly, it would be useful if
> someone else could do some benchmarks on ARM before merging this.
>
>
> Johannes

diff --git a/lib/lzo/lzodefs.h b/lib/lzo/lzodefs.h
index ddc8db5..efc5714 100644
--- a/lib/lzo/lzodefs.h
+++ b/lib/lzo/lzodefs.h
@@ -12,8 +12,15 @@
  */
 
 
+#if defined(__arm__)
+#define COPY4(dst, src)	\
+		(dst)[0] = (src)[0]; (dst)[1] = (src)[1]; \
+		(dst)[2] = (src)[2]; (dst)[3] = (src)[3]
+#endif
+#ifndef COPY4
 #define COPY4(dst, src)	\
 		put_unaligned(get_unaligned((const u32 *)(src)), (u32 *)(dst))
+#endif
 #if defined(__x86_64__)
 #define COPY8(dst, src)	\
 		put_unaligned(get_unaligned((const u64 *)(src)), (u64 *)(dst))
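
For anyone who wants to try the two COPY4 strategies from the patch outside the kernel, here is a minimal userspace sketch. It is not kernel code and makes several assumptions: the names copy4_bytewise(), copy4_word() and copy4_selftest() are hypothetical, and memcpy() through a 32-bit temporary only stands in for put_unaligned(get_unaligned(...)). It checks that both variants produce the same bytes at misaligned offsets; it does not measure speed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): the byte-wise 4-byte copy
 * that the __arm__ branch of COPY4 performs, written as a function so it
 * always behaves as a single statement. */
static inline void copy4_bytewise(unsigned char *dst, const unsigned char *src)
{
	dst[0] = src[0];
	dst[1] = src[1];
	dst[2] = src[2];
	dst[3] = src[3];
}

/* Hypothetical userspace stand-in for put_unaligned(get_unaligned(...)):
 * going through memcpy() and a u32 temporary is valid for any alignment. */
static inline void copy4_word(unsigned char *dst, const unsigned char *src)
{
	uint32_t v;

	memcpy(&v, src, sizeof(v));
	memcpy(dst, &v, sizeof(v));
}

/* Copies 4 bytes between deliberately misaligned offsets with both
 * variants and asserts that the destination buffers are identical. */
void copy4_selftest(void)
{
	unsigned char src[8] = { 0x10, 0x21, 0x32, 0x43, 0x54, 0x65, 0x76, 0x87 };
	unsigned char a[8] = { 0 };
	unsigned char b[8] = { 0 };

	copy4_bytewise(a + 1, src + 3);
	copy4_word(b + 1, src + 3);

	assert(memcmp(a, b, sizeof(a)) == 0);
	assert(a[0] == 0 && a[1] == 0x43 && a[4] == 0x76 && a[5] == 0);
}
```

Calling copy4_selftest() from a small main() is enough to exercise both paths. One design note: the __arm__ variant of COPY4 in the patch expands to four separate statements, so it is only safe where it is used as a full statement of its own (not as the body of an unbraced if or loop); a function like copy4_bytewise() above, or a do { ... } while (0) wrapper, avoids that restriction.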