From patchwork Mon Nov 28 19:28:28 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jordan Crouse <jcrouse@codeaurora.org>
X-Patchwork-Id: 9450059
X-Patchwork-Delegate: agross@codeaurora.org
Return-Path: <linux-arm-msm-owner@kernel.org>
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
	[172.30.200.125])
	by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id
	CC7AA6074E for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Mon, 28 Nov 2016 19:28:52 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id BBD2427FB3
	for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Mon, 28 Nov 2016 19:28:52 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
	id B06D827FBE; Mon, 28 Nov 2016 19:28:52 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
	pdx-wl-mail.web.codeaurora.org
X-Spam-Level: 
X-Spam-Status: No, score=-6.8 required=2.0 tests=BAYES_00,DKIM_SIGNED,
	RCVD_IN_DNSWL_HI,T_DKIM_INVALID autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 550F428047
	for <patchwork-linux-arm-msm@patchwork.kernel.org>;
	Mon, 28 Nov 2016 19:28:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753425AbcK1T2v (ORCPT
	<rfc822;patchwork-linux-arm-msm@patchwork.kernel.org>);
	Mon, 28 Nov 2016 14:28:51 -0500
Received: from smtp.codeaurora.org ([198.145.29.96]:45404 "EHLO
	smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753632AbcK1T2u (ORCPT
	<rfc822;linux-arm-msm@vger.kernel.org>);
	Mon, 28 Nov 2016 14:28:50 -0500
Received: by smtp.codeaurora.org (Postfix, from userid 1000)
	id 76D6B612CA; Mon, 28 Nov 2016 19:28:49 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=codeaurora.org;
	s=default; t=1480361329;
	bh=p8g3vQ6C3DGMEomK1/YbMyMmuF5WZGf7MXaFiUf+eQo=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=TcCC1k0do4d+MXCbDEkd/Z7GLn+aMrh0RSzRvIP5lrmRQ5oO3Xa87s0RXCHxz7eJW
	1uAgTash7+Gx83w38fiWKS2iuvlaRxG5FBINjf5KzuCPmQ6YAUeZrAdWpFvYo6km5U
	2GUL+KmwiWP2DvPcHUDQ7Gtl+jRsbV8jy2pLtpgs=
Received: from jcrouse-lnx.qualcomm.com (i-global254.qualcomm.com
	[199.106.103.254])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-SHA256 (128/128 bits))
	(No client certificate requested)
	(Authenticated sender: jcrouse@smtp.codeaurora.org)
	by smtp.codeaurora.org (Postfix) with ESMTPSA id D2FCB6023E;
	Mon, 28 Nov 2016 19:28:48 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.1 smtp.codeaurora.org D2FCB6023E
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
	dmarc=none header.from=codeaurora.org
Authentication-Results: pdx-caf-mail.web.codeaurora.org;
	spf=pass smtp.mailfrom=jcrouse@codeaurora.org
From: Jordan Crouse <jcrouse@codeaurora.org>
To: freedreno@lists.freedesktop.org
Cc: linux-arm-msm@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: [PATCH 03/12] drm/msm: gpu: Add new gpu register read/write functions
Date: Mon, 28 Nov 2016 12:28:28 -0700
Message-Id: <1480361317-9937-4-git-send-email-jcrouse@codeaurora.org>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1480361317-9937-1-git-send-email-jcrouse@codeaurora.org>
References: <1480361317-9937-1-git-send-email-jcrouse@codeaurora.org>
Sender: linux-arm-msm-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-arm-msm.vger.kernel.org>
X-Mailing-List: linux-arm-msm@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Add some new functions to manipulate GPU registers.  gpu_read64() and
gpu_write64() read/write a 64-bit value as a pair of 32-bit registers.
On 4XX and older these are normally perfcounter registers, but
future targets will use 64-bit addressing, so many more places will
need 64-bit reads and writes.

Convert a4xx_get_timestamp() to gpu_read64(), which replaces the
open-coded hi/lo/hi retry loop with a single lo-then-hi read.

gpu_rmw() does a read/modify/write on a 32-bit register: it clears the
bits in the given mask, then ORs in the new bits.

Signed-off-by: Jordan Crouse <jcrouse@codeaurora.org>
---
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 12 ++---------
 drivers/gpu/drm/msm/msm_gpu.h         | 39 +++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 9e7f5b7..4f68b63 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -513,16 +513,8 @@ static int a4xx_pm_suspend(struct msm_gpu *gpu) {
 
 static int a4xx_get_timestamp(struct msm_gpu *gpu, uint64_t *value)
 {
-	uint32_t hi, lo, tmp;
-
-	tmp = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_HI);
-	do {
-		hi = tmp;
-		lo = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO);
-		tmp = gpu_read(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_HI);
-	} while (tmp != hi);
-
-	*value = (((uint64_t)hi) << 32) | lo;
+	*value = gpu_read64(gpu, REG_A4XX_RBBM_PERFCTR_CP_0_LO,
+		REG_A4XX_RBBM_PERFCTR_CP_0_HI);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index 19a7254..baca428 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -154,6 +154,45 @@ static inline u32 gpu_read(struct msm_gpu *gpu, u32 reg)
 	return msm_readl(gpu->mmio + (reg << 2));
 }
 
+static inline void gpu_rmw(struct msm_gpu *gpu, u32 reg, u32 mask, u32 or)
+{
+	u32 val = gpu_read(gpu, reg);
+
+	val &= ~mask;
+	gpu_write(gpu, reg, val | or);
+}
+
+static inline u64 gpu_read64(struct msm_gpu *gpu, u32 lo, u32 hi)
+{
+	u64 val;
+
+	/*
+	 * Why not a readq here? Two reasons: 1) many of the LO registers are
+	 * not quad word aligned and 2) the GPU hardware designers have a bit
+	 * of a history of putting registers where they fit, especially in
+	 * spins. The longer a GPU family goes the higher the chance that
+	 * we'll get burned.  We could do a series of validity checks if we
+	 * wanted to, but really is a readq() that much better? Nah.
+	 */
+
+	/*
+	 * For some lo/hi register pairs (like perfcounters), the hi value is
+	 * latched when the lo is read, so read the lo first to trigger the
+	 * latch.
+	 */
+	val = (u64) msm_readl(gpu->mmio + (lo << 2));
+	val |= ((u64) msm_readl(gpu->mmio + (hi << 2)) << 32);
+
+	return val;
+}
+
+static inline void gpu_write64(struct msm_gpu *gpu, u32 lo, u32 hi, u64 val)
+{
+	/* Why not a writeq here? Read the screed above */
+	msm_writel(lower_32_bits(val), gpu->mmio + (lo << 2));
+	msm_writel(upper_32_bits(val), gpu->mmio + (hi << 2));
+}
+
 int msm_gpu_pm_suspend(struct msm_gpu *gpu);
 int msm_gpu_pm_resume(struct msm_gpu *gpu);