From patchwork Sun Apr 20 17:29:33 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Lauri Kasanen X-Patchwork-Id: 4021751 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 0EB4EBFF02 for ; Sun, 20 Apr 2014 17:28:28 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 2B1A52022D for ; Sun, 20 Apr 2014 17:28:27 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 59CD32010F for ; Sun, 20 Apr 2014 17:28:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 11FD76E1E5; Sun, 20 Apr 2014 10:28:23 -0700 (PDT) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mout.gmx.net (mout.gmx.net [74.208.4.200]) by gabe.freedesktop.org (Postfix) with ESMTP id A00086E1E5 for ; Sun, 20 Apr 2014 10:28:21 -0700 (PDT) Received: from Valinor ([84.248.177.169]) by mail.gmx.com (mrgmxus001) with ESMTPA (Nemesis) id 0M7YVF-1Wp4je1hoN-00xIS3; Sun, 20 Apr 2014 19:28:17 +0200 Date: Sun, 20 Apr 2014 20:29:33 +0300 From: Lauri Kasanen To: dri-devel@lists.freedesktop.org Subject: [PATCH] drm/radeon: Inline r100_mm_rreg, -wreg, v3 Message-Id: <20140420202933.78b3e355.cand@gmx.com> X-Mailer: Sylpheed 3.1.4 (GTK+ 2.18.6; x86_64-unknown-linux-gnu) Mime-Version: 1.0 X-Provags-ID: V03:K0:eX8G7Yyp9UKWK6e8YWoRt0Y82RuIv6lpQ64wjs38rOXfJ3zLo2v HGMnO2p33VHIekP0tMZWMviElxsJFwPjyABNs09/QFqPHFpDOoxXPSCNLTbvahL6D8CBupH cSMYXf+e0KQyohqtQeZ5LziKRSczlCeMG4Qw9Vwyb1/fSnPoOt9sOEiRLzFG5A1tNFJySTX 2YXc5YtVPOVa//2QSRn+A== X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Spam-Status: No, score=-4.8 required=5.0 tests=BAYES_00,FREEMAIL_FROM, RCVD_IN_DNSWL_MED, RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This was originally un-inlined by Andi Kleen in 2011 citing size concerns. Indeed, a first attempt at inlining it grew radeon.ko by 7%. However, 2% of cpu is spent in this function. Simply inlining it gave 1% more fps in Urban Terror. v2: We know the minimum MMIO size. Adding it to the if allows the compiler to optimize the branch out, improving both performance and size. The v2 patch decreases radeon.ko size by 2%. I didn't re-benchmark, but common sense says perf is now more than 1% better. v3: Also change _wreg, make the threshold a define. Inlining _wreg increased the size a bit compared to v2, so now radeon.ko is only 1% smaller. Signed-off-by: Lauri Kasanen Reviewed-by: Christian König --- drivers/gpu/drm/radeon/r100.c | 33 --------------------------------- drivers/gpu/drm/radeon/radeon.h | 40 ++++++++++++++++++++++++++++++++++++---- 2 files changed, 36 insertions(+), 37 deletions(-) diff --git a/drivers/gpu/drm/radeon/r100.c b/drivers/gpu/drm/radeon/r100.c index b6c3264..a4e7871 100644 --- a/drivers/gpu/drm/radeon/r100.c +++ b/drivers/gpu/drm/radeon/r100.c @@ -4086,39 +4086,6 @@ int r100_init(struct radeon_device *rdev) return 0; } -uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, - bool always_indirect) -{ - if (reg < rdev->rmmio_size && !always_indirect) - return readl(((void __iomem *)rdev->rmmio) + reg); - else { - unsigned long flags; - uint32_t ret; - - spin_lock_irqsave(&rdev->mmio_idx_lock, flags); - writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); - ret = readl(((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); - spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); - - return ret; - } -} - -void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v, - bool always_indirect) -{ - if (reg < rdev->rmmio_size && !always_indirect) - writel(v, ((void __iomem *)rdev->rmmio) + reg); - else { - unsigned long flags; - - spin_lock_irqsave(&rdev->mmio_idx_lock, flags); - writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); - writel(v, ((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); - spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); - } -} - u32 r100_io_rreg(struct radeon_device *rdev, u32 reg) { if (reg < rdev->rio_mem_size) diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h index f21db7a..a749b6c 100644 --- a/drivers/gpu/drm/radeon/radeon.h +++ b/drivers/gpu/drm/radeon/radeon.h @@ -2328,10 +2328,42 @@ int radeon_device_init(struct radeon_device *rdev, void radeon_device_fini(struct radeon_device *rdev); int radeon_gpu_wait_for_idle(struct radeon_device *rdev); -uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, - bool always_indirect); -void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v, - bool always_indirect); +#define RADEON_MIN_MMIO_SIZE 0x10000 + +static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg, + bool always_indirect) +{ + /* The mmio size is 64kb at minimum. Allows the if to be optimized out. */ + if ((reg < rdev->rmmio_size || reg < RADEON_MIN_MMIO_SIZE) && !always_indirect) + return readl(((void __iomem *)rdev->rmmio) + reg); + else { + unsigned long flags; + uint32_t ret; + + spin_lock_irqsave(&rdev->mmio_idx_lock, flags); + writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); + ret = readl(((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); + spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); + + return ret; + } +} + +static inline void r100_mm_wreg(struct radeon_device *rdev, uint32_t reg, uint32_t v, + bool always_indirect) +{ + if ((reg < rdev->rmmio_size || reg < RADEON_MIN_MMIO_SIZE) && !always_indirect) + writel(v, ((void __iomem *)rdev->rmmio) + reg); + else { + unsigned long flags; + + spin_lock_irqsave(&rdev->mmio_idx_lock, flags); + writel(reg, ((void __iomem *)rdev->rmmio) + RADEON_MM_INDEX); + writel(v, ((void __iomem *)rdev->rmmio) + RADEON_MM_DATA); + spin_unlock_irqrestore(&rdev->mmio_idx_lock, flags); + } +} + u32 r100_io_rreg(struct radeon_device *rdev, u32 reg); void r100_io_wreg(struct radeon_device *rdev, u32 reg, u32 v);