From patchwork Fri Apr 18 21:11:47 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lauri Kasanen X-Patchwork-Id: 4018621 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 0F0899F369 for ; Fri, 18 Apr 2014 21:15:36 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 34A3D203AC for ; Fri, 18 Apr 2014 21:15:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 3921A202E6 for ; Fri, 18 Apr 2014 21:15:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E585F89C55; Fri, 18 Apr 2014 14:15:32 -0700 (PDT) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org X-Greylist: delayed 302 seconds by postgrey-1.34 at gabe; Fri, 18 Apr 2014 14:15:32 PDT Received: from mout.gmx.net (mout.gmx.net [74.208.4.200]) by gabe.freedesktop.org (Postfix) with ESMTP id 1B0C589C55 for ; Fri, 18 Apr 2014 14:15:32 -0700 (PDT) Received: from Valinor ([84.248.177.169]) by mail.gmx.com (mrgmxus002) with ESMTPA (Nemesis) id 0Maquy-1WMUKF2rAQ-00KTwm for ; Fri, 18 Apr 2014 23:10:30 +0200 Date: Sat, 19 Apr 2014 00:11:47 +0300 From: Lauri Kasanen To: dri-devel@lists.freedesktop.org Subject: [PATCH] drm/radeon: Inline r100_mm_rreg, v2 Message-Id: <20140419001147.89fe2275.cand@gmx.com> X-Mailer: Sylpheed 3.1.4 (GTK+ 2.18.6; x86_64-unknown-linux-gnu) Mime-Version: 1.0 X-Provags-ID: V03:K0:waYA2JpjjsK0Kj+ejEKPzk1ift80OW+N1VOrX5xeE/dj+LxLGuT tSBd6ssMqEahZidu+n2LNneMLAqlQVYxzm57+jBRA8n9mA9JfRwCKLpR+/PGHwBzRGVMYgk NyBxcPI1dJALqTDnK73G4T48hnRC6R8REBXWUj7D4JE/MawL3xRvqT4l/9EtV5+/cONsvhb WTnGPc3VzbHD1k+JYMkcQ== X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This was originally un-inlined by Andi Kleen in 2011 citing size concerns. Indeed, a first attempt at inlining it grew radeon.ko by 7%. However, 2% of cpu is spent in this function. Simply inlining it gave 1% more fps in Urban Terror. v2: We know the minimum MMIO size. Adding it to the if allows the compiler to optimize the branch out, improving both performance and size. The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but common sense says perf is now more than 1% better. Signed-off-by: Lauri Kasanen --- drivers/gpu/drm/radeon/r100.c | 18 ------------------ drivers/gpu/drm/radeon/radeon.h | 21 +++++++++++++++++++-- 2 files changed, 19 insertions(+), 20 deletions(-) The _wreg function could be given similar treatment, but as it's nowhere as hot (0.009% vs 2%), didn't bother. diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index b6c3264..8169e82 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -4086,24 +4086,6 @@ int r100_init(struct radeon_device *rdev) return 0; } -uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, - bool always_indirect) -{ - if (reg < rdev->rmmio_size && !always_indirect) - return readl(((void __iomem *)rdev->rmmio) + reg); - else { - unsigned long flags; - uint32_t ret; - - spin_lock_irqsave(&rdev->mmio_idx_lock, flags); - writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); - ret = readl(((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); - spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); - - return ret; - } -} - void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v, bool always_indirect) { diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index f21db7a..883276a 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -2328,8 +2328,25 @@ int radeon_device_init(struct radeon_device *rdev, void radeon_device_fini(struct radeon_device *rdev); int radeon_gpu_wait_for_idle(struct radeon_device *rdev); -uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, - bool always_indirect); +static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, + bool always_indirect) +{ + /* The mmio size is 64kb at minimum. Allows the if to be optimized out. */ + if ((reg < rdev->rmmio_size || reg < 0x10000) && !always_indirect) + return readl(((void __iomem *)rdev->rmmio) + reg); + else { + unsigned long flags; + uint32_t ret; + + spin_lock_irqsave(&rdev->mmio_idx_lock, flags); + writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); + ret = readl(((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); + spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); + + return ret; + } +} + void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v, bool always_indirect); u32 r100_io_rreg(struct radeon_device *rdev, u32 reg);