From patchwork Thu Sep 26 10:06:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Micha=C5=82_Winiarski?=
 <michal.winiarski@intel.com>
X-Patchwork-Id: 11162235
Return-Path: <SRS0=kowy=XV=lists.freedesktop.org=intel-gfx-bounces@kernel.org>
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org
 [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D2CA514ED
	for <patchwork-intel-gfx@patchwork.kernel.org>;
 Thu, 26 Sep 2019 10:07:06 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id BB308217F4
	for <patchwork-intel-gfx@patchwork.kernel.org>;
 Thu, 26 Sep 2019 10:07:06 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org BB308217F4
Authentication-Results: mail.kernel.org;
 dmarc=fail (p=none dis=none) header.from=intel.com
Authentication-Results: mail.kernel.org;
 spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id 38B3E6EDAA;
	Thu, 26 Sep 2019 10:07:06 +0000 (UTC)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by gabe.freedesktop.org (Postfix) with ESMTPS id 59D356EDAA
 for <intel-gfx@lists.freedesktop.org>; Thu, 26 Sep 2019 10:07:04 +0000 (UTC)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 26 Sep 2019 03:07:04 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.64,551,1559545200"; d="scan'208";a="203825642"
Received: from irsmsx102.ger.corp.intel.com ([163.33.3.155])
 by fmsmga001.fm.intel.com with ESMTP; 26 Sep 2019 03:07:03 -0700
Received: from localhost (172.28.172.64) by IRSMSX102.ger.corp.intel.com
 (163.33.3.155) with Microsoft SMTP Server (TLS) id 14.3.439.0; Thu, 26 Sep
 2019 11:07:02 +0100
From: =?utf-8?q?Micha=C5=82_Winiarski?= <michal.winiarski@intel.com>
To: <intel-gfx@lists.freedesktop.org>
Date: Thu, 26 Sep 2019 12:06:34 +0200
Message-ID: <20190926100635.9416-5-michal.winiarski@intel.com>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190926100635.9416-1-michal.winiarski@intel.com>
References: <20190926100635.9416-1-michal.winiarski@intel.com>
MIME-Version: 1.0
X-Originating-IP: [172.28.172.64]
Subject: [Intel-gfx] [PATCH 5/6] drm/i915: Don't use scratch in WA batch.
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Intel graphics driver community testing & development
 <intel-gfx.lists.freedesktop.org>
List-Unsubscribe: <https://lists.freedesktop.org/mailman/options/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=unsubscribe>
List-Archive: <https://lists.freedesktop.org/archives/intel-gfx>
List-Post: <mailto:intel-gfx@lists.freedesktop.org>
List-Help: <mailto:intel-gfx-request@lists.freedesktop.org?subject=help>
List-Subscribe: <https://lists.freedesktop.org/mailman/listinfo/intel-gfx>,
 <mailto:intel-gfx-request@lists.freedesktop.org?subject=subscribe>
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

We're currently doing one workaround where we're using scratch as a
temporary storage place, while we're overwriting the value of one
register with some known constant value in order to perform a
workaround.
While we could just do similar thing with CS_GPR register
and MI_LOAD_REGISTER_REG instead of scratch, since we would use CS_GPR
anyways, let's just drop the constant values and do the bitops using
MI_MATH.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_engine.h   | 53 ++++++++++++++++++++++++
 drivers/gpu/drm/i915/gt/intel_gt_types.h |  3 --
 drivers/gpu/drm/i915/gt/intel_lrc.c      | 27 +++---------
 3 files changed, 58 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_engine.h b/drivers/gpu/drm/i915/gt/intel_engine.h
index d3c6993f4f46..2dfe0b23af1d 100644
--- a/drivers/gpu/drm/i915/gt/intel_engine.h
+++ b/drivers/gpu/drm/i915/gt/intel_engine.h
@@ -400,6 +400,59 @@ gen8_emit_ggtt_write(u32 *cs, u32 value, u32 gtt_offset, u32 flags)
 	return cs;
 }
 
+/*
+ * We're using CS_GPR registers for the MI_MATH ops.
+ * Note that user contexts are free to use those registers, which means that we
+ * should only use this this function during context initialization, before
+ * context restore (WA batch) or inside i915-owned contexts.
+ */
+static inline u32 *
+gen8_emit_bitop_reg_mask(struct intel_engine_cs *engine,
+			 u32 *cs, u32 op, i915_reg_t reg, u32 mask)
+{
+	const u32 base = engine->mmio_base;
+
+	GEM_BUG_ON(op != MI_MATH_OR && op != MI_MATH_AND);
+
+	*cs++ = MI_LOAD_REGISTER_REG;
+	*cs++ = i915_mmio_reg_offset(reg);
+	*cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0));
+	*cs++ = MI_NOOP;
+
+	*cs++ = MI_LOAD_REGISTER_IMM(1);
+	*cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 1));
+	*cs++ = mask;
+	*cs++ = MI_NOOP;
+
+	*cs++ = MI_MATH(4);
+	*cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCA, MI_MATH_REG(0));
+	*cs++ = MI_MATH_LOAD(MI_MATH_REG_SRCB, MI_MATH_REG(1));
+	*cs++ = op;
+	*cs++ = MI_MATH_STORE(MI_MATH_REG(0), MI_MATH_REG_ACCU);
+	*cs++ = MI_NOOP;
+
+	*cs++ = MI_LOAD_REGISTER_REG;
+	*cs++ = i915_mmio_reg_offset(GEN8_RING_CS_GPR(base, 0));
+	*cs++ = i915_mmio_reg_offset(reg);
+	*cs++ = MI_NOOP;
+
+	return cs;
+}
+
+static inline u32 *
+gen8_emit_set_register(struct intel_engine_cs *engine,
+		       u32 *cs, i915_reg_t reg, u32 mask)
+{
+	return gen8_emit_bitop_reg_mask(engine, cs, MI_MATH_OR, reg, mask);
+}
+
+static inline u32 *
+gen8_emit_clear_register(struct intel_engine_cs *engine,
+			 u32 *cs, i915_reg_t reg, u32 mask)
+{
+	return gen8_emit_bitop_reg_mask(engine, cs, MI_MATH_AND, reg, ~mask);
+}
+
 static inline void __intel_engine_reset(struct intel_engine_cs *engine,
 					bool stalled)
 {
diff --git a/drivers/gpu/drm/i915/gt/intel_gt_types.h b/drivers/gpu/drm/i915/gt/intel_gt_types.h
index e64db4c13df6..d9f25ac520a8 100644
--- a/drivers/gpu/drm/i915/gt/intel_gt_types.h
+++ b/drivers/gpu/drm/i915/gt/intel_gt_types.h
@@ -92,9 +92,6 @@ enum intel_gt_scratch_field {
 	/* 8 bytes */
 	INTEL_GT_SCRATCH_FIELD_RENDER_FLUSH = 128,
 
-	/* 8 bytes */
-	INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA = 256,
-
 };
 
 #endif /* __INTEL_GT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index fa385218ce92..089c17599ca4 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2269,13 +2269,7 @@ static int execlists_request_alloc(struct i915_request *request)
  * PIPE_CONTROL instruction. This is required for the flush to happen correctly
  * but there is a slight complication as this is applied in WA batch where the
  * values are only initialized once so we cannot take register value at the
- * beginning and reuse it further; hence we save its value to memory, upload a
- * constant value with bit21 set and then we restore it back with the saved value.
- * To simplify the WA, a constant value is formed by using the default value
- * of this register. This shouldn't be a problem because we are only modifying
- * it for a short period and this batch in non-premptible. We can ofcourse
- * use additional instructions that read the actual value of the register
- * at that time and set our bit of interest but it makes the WA complicated.
+ * beginning and reuse it further;
  *
  * This WA is also required for Gen9 so extracting as a function avoids
  * code duplication.
@@ -2283,27 +2277,16 @@ static int execlists_request_alloc(struct i915_request *request)
 static u32 *
 gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 {
-	/* NB no one else is allowed to scribble over scratch + 256! */
-	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = intel_gt_scratch_offset(engine->gt,
-					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
-	*batch++ = 0;
-
-	*batch++ = MI_LOAD_REGISTER_IMM(1);
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = 0x40400000 | GEN8_LQSC_FLUSH_COHERENT_LINES;
+	batch = gen8_emit_set_register(engine, batch, GEN8_L3SQCREG4,
+				       GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 	batch = gen8_emit_pipe_control(batch,
 				       PIPE_CONTROL_CS_STALL |
 				       PIPE_CONTROL_DC_FLUSH_ENABLE,
 				       0);
 
-	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
-	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = intel_gt_scratch_offset(engine->gt,
-					   INTEL_GT_SCRATCH_FIELD_COHERENTL3_WA);
-	*batch++ = 0;
+	batch = gen8_emit_clear_register(engine, batch, GEN8_L3SQCREG4,
+					 GEN8_LQSC_FLUSH_COHERENT_LINES);
 
 	return batch;
 }