From patchwork Mon Oct 23 16:25:35 2017
X-Patchwork-Submitter: Mark Salyzyn
X-Patchwork-Id: 10022823
From: Mark Salyzyn <salyzyn@android.com>
To: linux-kernel@vger.kernel.org
Subject: [PATCH v2] arm64: optimize __memcpy_fromio and __memcpy_toio
Date: Mon, 23 Oct 2017 09:25:35 -0700
Message-Id: <20171023162611.37098-1-salyzyn@android.com>
Cc: Tony Luck, Kees Cook, Catalin Marinas, Anton Vorontsov, Will Deacon,
 Mark Salyzyn, Colin Cross, linux-arm-kernel@lists.infradead.org

The __memcpy_fromio and __memcpy_toio functions do not deal well with
mutually unaligned addresses: unless both source and destination can
ultimately be copied as quads (u64), they fall back to byte operations
over the entire buffer. Drop the fragment that tried to align on the
normal-memory side, and instead give priority to quad alignment on the
io side. Also remove the volatile qualifier on the source argument of
__memcpy_toio, as it is unnecessary.

This change was motivated by performance issues in the pstore driver.
On a test platform, the measured probe time for pstore, with a 1/4MB
console buffer and a 1/2MB pmsg buffer, was in the 90-107ms range.
This change reduced it to 10-25ms, an improvement in boot time.

Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: Kees Cook
Cc: Anton Vorontsov
Cc: Tony Luck
Cc: Catalin Marinas
Cc: Will Deacon
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

v2:
- Simplify: do not try, through multiple steps, to align on the normal
  memory side, as it was a diminishing return. Handling pathological
  short cases was unnecessary, since there do not appear to be any.
- Drop the similar __memset_io changes completely.
---
 arch/arm64/kernel/io.c | 36 +++++++++++++++++-------------------
 1 file changed, 17 insertions(+), 19 deletions(-)

diff --git a/arch/arm64/kernel/io.c b/arch/arm64/kernel/io.c
index 354be2a872ae..fc039093fa9a 100644
--- a/arch/arm64/kernel/io.c
+++ b/arch/arm64/kernel/io.c
@@ -25,19 +25,18 @@
  */
 void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
 {
-	while (count && (!IS_ALIGNED((unsigned long)from, 8) ||
-			 !IS_ALIGNED((unsigned long)to, 8))) {
+	while (count && !IS_ALIGNED((unsigned long)from, sizeof(u64))) {
 		*(u8 *)to = __raw_readb(from);
 		from++;
 		to++;
 		count--;
 	}
 
-	while (count >= 8) {
+	while (count >= sizeof(u64)) {
 		*(u64 *)to = __raw_readq(from);
-		from += 8;
-		to += 8;
-		count -= 8;
+		from += sizeof(u64);
+		to += sizeof(u64);
+		count -= sizeof(u64);
 	}
 
 	while (count) {
@@ -54,23 +53,22 @@ EXPORT_SYMBOL(__memcpy_fromio);
  */
 void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count)
 {
-	while (count && (!IS_ALIGNED((unsigned long)to, 8) ||
-			 !IS_ALIGNED((unsigned long)from, 8))) {
-		__raw_writeb(*(volatile u8 *)from, to);
+	while (count && !IS_ALIGNED((unsigned long)to, sizeof(u64))) {
+		__raw_writeb(*(u8 *)from, to);
 		from++;
 		to++;
 		count--;
 	}
 
-	while (count >= 8) {
-		__raw_writeq(*(volatile u64 *)from, to);
-		from += 8;
-		to += 8;
-		count -= 8;
+	while (count >= sizeof(u64)) {
+		__raw_writeq(*(u64 *)from, to);
+		from += sizeof(u64);
+		to += sizeof(u64);
+		count -= sizeof(u64);
 	}
 	while (count) {
-		__raw_writeb(*(volatile u8 *)from, to);
+		__raw_writeb(*(u8 *)from, to);
 		from++;
 		to++;
 		count--;
@@ -89,16 +87,16 @@ void __memset_io(volatile void __iomem *dst, int c, size_t count)
 	qc |= qc << 16;
 	qc |= qc << 32;
 
-	while (count && !IS_ALIGNED((unsigned long)dst, 8)) {
+	while (count && !IS_ALIGNED((unsigned long)dst, sizeof(u64))) {
 		__raw_writeb(c, dst);
 		dst++;
 		count--;
 	}
 
-	while (count >= 8) {
+	while (count >= sizeof(u64)) {
 		__raw_writeq(qc, dst);
-		dst += 8;
-		count -= 8;
+		dst += sizeof(u64);
+		count -= sizeof(u64);
 	}
 
 	while (count) {