From patchwork Fri May  9 21:28:10 2014
X-Patchwork-Submitter: Jason Gunthorpe
X-Patchwork-Id: 4145331
Date: Fri, 9 May 2014 15:28:10 -0600
From: Jason Gunthorpe
To: Ezequiel Garcia
Cc: Arnd Bergmann, Jingoo Han, linux-kernel@vger.kernel.org,
 linux-mtd@lists.infradead.org, Brian Norris, David Woodhouse,
 linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 2/2] mtd: orion-nand: fix build error with ARMv4
Message-ID: <20140509212810.GF18257@obsidianresearch.com>
References: <1399560433-1402630-1-git-send-email-arnd@arndb.de>
 <1399560990-1402858-1-git-send-email-arnd@arndb.de>
 <1399560990-1402858-4-git-send-email-arnd@arndb.de>
 <20140509184505.GA30330@arch.cereza>
In-Reply-To: <20140509184505.GA30330@arch.cereza>
User-Agent: Mutt/1.5.21 (2010-09-15)

On Fri, May 09, 2014 at 03:45:05PM -0300, Ezequiel Garcia wrote:
> I gave this a try in order to answer Arnd's performance
> question. First of all, the patch seems wrong. I guess it's because
> readsl reads 4-bytes pieces, instead of 8-bytes.
>
> This patch below is tested (but not completely, see below) and works:

Compilers are better now; I think you can just ditch the weirdness:

uint64_t *from;
uint64_t *to;
void foo()
{
	for (unsigned int i = 0; i != 1000; i++)
		*to++ = *from;
}

Using even gcc 4.6.3 gives good code:

(v6)
.L2:
	ldrd	r2, [ip]
	strd	r2, [r1], #8
	cmp	r1, r0
(v4)
.L2:
	ldmia	ip, {r0-r1}
	stmia	r3!, {r0-r1}
	cmp	r3, r2

For correctness this v4 version does require that the CPU executes
the ldmia reads in increasing address order, and never in any other
order. AFAIK the peripheral is just a simple FIFO that basically
ignores the address.

memcpy_fromio is not as good since it will never align if the buffer
is unaligned, while this version does.

The below gives:

  c8:	ea000002	b	d8
  cc:	e5dc0000	ldrb	r0, [ip]
  d0:	e7c30001	strb	r0, [r3, r1]
  d4:	e2811001	add	r1, r1, #1
  d8:	e1510002	cmp	r1, r2

Which looks the same as the asm version to me.

diff --git a/drivers/mtd/nand/orion_nand.c b/drivers/mtd/nand/orion_nand.c
index dc9d07f34b8a..fea1597f623e 100644
--- a/drivers/mtd/nand/orion_nand.c
+++ b/drivers/mtd/nand/orion_nand.c
@@ -95,16 +95,13 @@ static void orion_nand_read_buf(struct mtd_info *mtd, uint8_t *buf, int len)
 	}
 	buf64 = (uint64_t *)buf;
 	while (i < len/8) {
-		/*
-		 * Since GCC has no proper constraint (PR 43518)
-		 * force x variable to r2/r3 registers as ldrd instruction
-		 * requires first register to be even.
-		 */
-		register uint64_t x asm ("r2");
-
-		asm volatile ("ldrd\t%0, [%1]" : "=&r" (x) : "r" (io_base));
-		buf64[i++] = x;
+#ifdef CONFIG_64BIT
+		buf64[i++] = readq_relaxed(io_base);
+#else
+		buf64[i++] = *(const volatile u64 __force *)io_base;
+#endif
 	}
+
 	i *= 8;
 	while (i < len)
 		buf[i++] = readb(io_base);
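
For anyone who wants to poke at the loop structure without hardware,
here is a rough userspace sketch of the same three-phase shape (byte
reads until the buffer is 8-byte aligned, 64-bit reads for the bulk,
byte reads for the tail). It is only an illustration, not the driver:
fifo_next, fifo_read8, fifo_read64 and mock_read_buf are made-up names,
and the mock FIFO simply hands out an incrementing byte stream, on the
assumption stated above that the real device ignores the address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the NAND data FIFO (not a kernel API):
 * every read returns the next byte(s) of the stream, no address used. */
static uint8_t fifo_next;	/* next byte value the mock FIFO returns */

static uint8_t fifo_read8(void)
{
	return fifo_next++;
}

/* One 64-bit access: consumes eight stream bytes in memory order. */
static uint64_t fifo_read64(void)
{
	uint8_t bytes[8];
	uint64_t v;

	for (int k = 0; k < 8; k++)
		bytes[k] = fifo_read8();
	memcpy(&v, bytes, sizeof(v));
	return v;
}

/* Same head/body/tail structure as the loop in the patch. */
static void mock_read_buf(uint8_t *buf, size_t len)
{
	size_t i = 0;

	/* Head: single bytes until buf is 8-byte aligned. */
	while (len && ((uintptr_t)buf & 7)) {
		*buf++ = fifo_read8();
		len--;
	}

	/* Body: one 64-bit FIFO read per aligned 8-byte store. */
	while (i < len / 8) {
		uint64_t v = fifo_read64();

		memcpy(buf + 8 * i, &v, sizeof(v));
		i++;
	}

	/* Tail: whatever is left, one byte at a time. */
	i *= 8;
	while (i < len)
		buf[i++] = fifo_read8();
}
```

Calling mock_read_buf() on a deliberately misaligned pointer (say
storage + 3) exercises all three phases; the destination should end up
holding the FIFO bytes in stream order.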