diff mbox

[v3,1/2] drm/i915: add render state initialization

Message ID 1399386627-25287-1-git-send-email-mika.kuoppala@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mika Kuoppala May 6, 2014, 2:30 p.m. UTC
HW guys say that it is not a cool idea to let device
go into rc6 without proper 3d pipeline state.

For each new uninitialized context, generate a
valid null render state to be run on context
creation.

This patch introduces a skeleton with empty states.

v2: - No need to vmap (Chris Wilson)
    - use .c files for state (Daniel Vetter)
    - no need to flush as i915_add_request does it
    - remove parameter for batch alloc size
    - don't wait for the init (Ben Widawsky)

v3: - move to cpu/gpu (Chris Wilson)

Tested-by: Kristen Carlson Accardi <kristen@linux.intel.com> (v1)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/Makefile                 |    6 +
 drivers/gpu/drm/i915/i915_drv.h               |    2 +
 drivers/gpu/drm/i915/i915_gem_context.c       |    6 +
 drivers/gpu/drm/i915/i915_gem_render_state.c  |  194 +++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_renderstate.h      |   48 ++++++
 drivers/gpu/drm/i915/intel_renderstate_gen6.c |   10 ++
 drivers/gpu/drm/i915/intel_renderstate_gen7.c |   10 ++
 drivers/gpu/drm/i915/intel_renderstate_gen8.c |   10 ++
 8 files changed, 286 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_render_state.c
 create mode 100644 drivers/gpu/drm/i915/intel_renderstate.h
 create mode 100644 drivers/gpu/drm/i915/intel_renderstate_gen6.c
 create mode 100644 drivers/gpu/drm/i915/intel_renderstate_gen7.c
 create mode 100644 drivers/gpu/drm/i915/intel_renderstate_gen8.c

Comments

oscar.mateo@intel.com May 14, 2014, 10:24 a.m. UTC | #1
Hi Mika,

> -----Original Message-----

> From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of

> Mika Kuoppala

> Sent: Tuesday, May 06, 2014 3:30 PM

> To: intel-gfx@lists.freedesktop.org

> Cc: ben@bwidawsk.net; miku@iki.fi; kristen@linux.intel.com

> Subject: [Intel-gfx] [PATCH v3 1/2] drm/i915: add render state initialization

> 

> HW guys say that it is not a cool idea to let device go into rc6 without proper 3d

> pipeline state.

> 

> For each new uninitialized context, generate a valid null render state to be run

> on context creation.


In Android, we have been seeing a problem in BDW D0 stepping (C0 is fine), in which actual rendering does not happen, even though everything seems to be healthy. The only "tell" seems to be that the pixel shader invocation count does not go up.
I wouldn´t dare say I understand why this fixes our problem, but it clearly does, so feel free to add:

Tested-by: Oscar Mateo <oscar.mateo@intel.com>


to both patches...
Lespiau, Damien May 14, 2014, 11:13 a.m. UTC | #2
On Wed, May 14, 2014 at 10:24:53AM +0000, Mateo Lozano, Oscar wrote:
> Hi Mika,
> 
> > -----Original Message-----
> > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On Behalf Of
> > Mika Kuoppala
> > Sent: Tuesday, May 06, 2014 3:30 PM
> > To: intel-gfx@lists.freedesktop.org
> > Cc: ben@bwidawsk.net; miku@iki.fi; kristen@linux.intel.com
> > Subject: [Intel-gfx] [PATCH v3 1/2] drm/i915: add render state initialization
> > 
> > HW guys say that it is not a cool idea to let device go into rc6 without proper 3d
> > pipeline state.
> > 
> > For each new uninitialized context, generate a valid null render state to be run
> > on context creation.
> 
> In Android, we have been seeing a problem in BDW D0 stepping (C0 is
> fine), in which actual rendering does not happen, even though
> everything seems to be healthy. The only "tell" seems to be that the
> pixel shader invocation count does not go up.  I wouldn´t dare say I
> understand why this fixes our problem, but it clearly does, so feel
> free to add:
> 
> Tested-by: Oscar Mateo <oscar.mateo@intel.com>

Stating the obvious here, mainly for my own understanding :)

FWIW, that looks like the userspace driver you are using actually
relying on the kernel setting up a golden state and missing "one bit" or
one packet. For instance it could be that it's missing a
3DSTATE_WM_HZ_OP command (that's the last fix Ken did in the gen8 render
copy state).

Out of sheer luck (or almost :) we happen to have this working.

So that's another kind of papering over than the one "needed" for rc6.

It seems to potentially be a useful service the kernel is providing to
user space apps though, trying to setup a sane state so user space
batches don't hang the GPU if they are missing one command. Of course
hangs can still happen if the batches themselves have bugs.

How to be sure that's the correct golden state is another interesting
question we need to answer.
oscar.mateo@intel.com May 14, 2014, 11:24 a.m. UTC | #3
> -----Original Message-----
> From: Lespiau, Damien
> Sent: Wednesday, May 14, 2014 12:14 PM
> To: Mateo Lozano, Oscar
> Cc: Mika Kuoppala; intel-gfx@lists.freedesktop.org; ben@bwidawsk.net;
> miku@iki.fi; kristen@linux.intel.com
> Subject: Re: [Intel-gfx] [PATCH v3 1/2] drm/i915: add render state initialization
> 
> On Wed, May 14, 2014 at 10:24:53AM +0000, Mateo Lozano, Oscar wrote:
> > Hi Mika,
> >
> > > -----Original Message-----
> > > From: Intel-gfx [mailto:intel-gfx-bounces@lists.freedesktop.org] On
> > > Behalf Of Mika Kuoppala
> > > Sent: Tuesday, May 06, 2014 3:30 PM
> > > To: intel-gfx@lists.freedesktop.org
> > > Cc: ben@bwidawsk.net; miku@iki.fi; kristen@linux.intel.com
> > > Subject: [Intel-gfx] [PATCH v3 1/2] drm/i915: add render state
> > > initialization
> > >
> > > HW guys say that it is not a cool idea to let device go into rc6
> > > without proper 3d pipeline state.
> > >
> > > For each new uninitialized context, generate a valid null render
> > > state to be run on context creation.
> >
> > In Android, we have been seeing a problem in BDW D0 stepping (C0 is
> > fine), in which actual rendering does not happen, even though
> > everything seems to be healthy. The only "tell" seems to be that the
> > pixel shader invocation count does not go up.  I wouldn´t dare say I
> > understand why this fixes our problem, but it clearly does, so feel
> > free to add:
> >
> > Tested-by: Oscar Mateo <oscar.mateo@intel.com>
> 
> Stating the obvious here, mainly for my own understanding :)
> 
> FWIW, that looks like the userspace driver you are using actually relying on the
> kernel setting up a golden state and missing "one bit" or one packet. For
> instance it could be that it's missing a 3DSTATE_WM_HZ_OP command (that's
> the last fix Ken did in the gen8 render copy state).
> 
> Out of sheer luck (or almost :) we happen to have this working.
> 
> So that's another kind of papering over than the one "needed" for rc6.
> 
> It seems to potentially be a useful service the kernel is providing to user space
> apps though, trying to setup a sane state so user space batches don't hang the
> GPU if they are missing one command. Of course hangs can still happen if the
> batches themselves have bugs.
> 
> How to be sure that's the correct golden state is another interesting question
> we need to answer.

Oops, sorry, I forgot a very important detail: the same userspace driver works just fine with a 3.10 kernel. I attribute this to the fact that the 3.10 kernel didn´t do MI_SET_CONTEXT unless you were explicitly creating contexts, but there could be another explanation, of course.
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index b1445b7..2446916 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -18,6 +18,7 @@  i915-$(CONFIG_DEBUG_FS) += i915_debugfs.o
 # GEM code
 i915-y += i915_cmd_parser.o \
 	  i915_gem_context.o \
+	  i915_gem_render_state.o \
 	  i915_gem_debug.o \
 	  i915_gem_dmabuf.o \
 	  i915_gem_evict.o \
@@ -32,6 +33,11 @@  i915-y += i915_cmd_parser.o \
 	  intel_ringbuffer.o \
 	  intel_uncore.o
 
+# autogenerated null render state
+i915-y += intel_renderstate_gen6.o \
+	  intel_renderstate_gen7.o \
+	  intel_renderstate_gen8.o
+
 # modesetting core code
 i915-y += intel_bios.o \
 	  intel_display.o \
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 3fc2e3d..a2fc605 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2336,6 +2336,8 @@  int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file);
 
+/* i915_gem_render_state.c */
+int i915_gem_render_state_init(struct intel_ring_buffer *ring);
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
 					  struct i915_address_space *vm,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f77b4c1..f7ad59e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -699,6 +699,12 @@  static int do_switch(struct intel_ring_buffer *ring,
 		/* obj is kept alive until the next request by its active ref */
 		i915_gem_object_ggtt_unpin(from->obj);
 		i915_gem_context_unreference(from);
+	} else {
+		if (to->is_initialized == false) {
+			ret = i915_gem_render_state_init(ring);
+			if (ret)
+				DRM_ERROR("init render state: %d\n", ret);
+		}
 	}
 
 	to->is_initialized = true;
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
new file mode 100644
index 0000000..392aa7b
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -0,0 +1,194 @@ 
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Mika Kuoppala <mika.kuoppala@intel.com>
+ *
+ */
+
+#include "i915_drv.h"
+#include "intel_renderstate.h"
+
+struct i915_render_state {
+	struct drm_i915_gem_object *obj;
+	unsigned long ggtt_offset;
+	void *batch;
+	u32 size;
+	u32 len;
+};
+
+static struct i915_render_state *render_state_alloc(struct drm_device *dev)
+{
+	struct i915_render_state *so;
+	struct page *page;
+	int ret;
+
+	so = kzalloc(sizeof(*so), GFP_KERNEL);
+	if (!so)
+		return ERR_PTR(-ENOMEM);
+
+	so->obj = i915_gem_alloc_object(dev, 4096);
+	if (so->obj == NULL) {
+		ret = -ENOMEM;
+		goto free;
+	}
+	so->size = 4096;
+
+	ret = i915_gem_obj_ggtt_pin(so->obj, 4096, 0);
+	if (ret)
+		goto free_gem;
+
+	BUG_ON(so->obj->pages->nents != 1);
+	page = sg_page(so->obj->pages->sgl);
+
+	so->batch = kmap(page);
+	if (!so->batch) {
+		ret = -ENOMEM;
+		goto unpin;
+	}
+
+	so->ggtt_offset = i915_gem_obj_ggtt_offset(so->obj);
+
+	return so;
+unpin:
+	i915_gem_object_ggtt_unpin(so->obj);
+free_gem:
+	drm_gem_object_unreference(&so->obj->base);
+free:
+	kfree(so);
+	return ERR_PTR(ret);
+}
+
+static void render_state_free(struct i915_render_state *so)
+{
+	kunmap(so->batch);
+	i915_gem_object_ggtt_unpin(so->obj);
+	drm_gem_object_unreference(&so->obj->base);
+	kfree(so);
+}
+
+static const struct intel_renderstate_rodata *
+render_state_get_rodata(struct drm_device *dev, const int gen)
+{
+	switch (gen) {
+	case 6:
+		return &gen6_null_state;
+	case 7:
+		return &gen7_null_state;
+	case 8:
+		return &gen8_null_state;
+	}
+
+	return NULL;
+}
+
+static int render_state_setup(const int gen,
+			      const struct intel_renderstate_rodata *rodata,
+			      struct i915_render_state *so)
+{
+	const u64 goffset = i915_gem_obj_ggtt_offset(so->obj);
+	u32 reloc_index = 0;
+	u32 * const d = so->batch;
+	unsigned int i = 0;
+	int ret;
+
+	if (!rodata || rodata->batch_items * 4 > so->size)
+		return -EINVAL;
+
+	ret = i915_gem_object_set_to_cpu_domain(so->obj, true);
+	if (ret)
+		return ret;
+
+	while (i < rodata->batch_items) {
+		u32 s = rodata->batch[i];
+
+		if (reloc_index < rodata->reloc_items &&
+		    i * 4  == rodata->reloc[reloc_index]) {
+
+			s += goffset & 0xffffffff;
+
+			/* We keep batch offsets max 32bit */
+			if (gen >= 8) {
+				if (i + 1 >= rodata->batch_items ||
+				    rodata->batch[i + 1] != 0)
+					return -EINVAL;
+
+				d[i] = s;
+				i++;
+				s = (goffset & 0xffffffff00000000ull) >> 32;
+			}
+
+			reloc_index++;
+		}
+
+		d[i] = s;
+		i++;
+	}
+
+	ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
+	if (ret)
+		return ret;
+
+	if (rodata->reloc_items != reloc_index) {
+		DRM_ERROR("not all relocs resolved, %d out of %d\n",
+			  reloc_index, rodata->reloc_items);
+		return -EINVAL;
+	}
+
+	so->len = rodata->batch_items * 4;
+
+	return 0;
+}
+
+int i915_gem_render_state_init(struct intel_ring_buffer *ring)
+{
+	const int gen = INTEL_INFO(ring->dev)->gen;
+	struct i915_render_state *so;
+	const struct intel_renderstate_rodata *rodata;
+	u32 seqno;
+	int ret;
+
+	rodata = render_state_get_rodata(ring->dev, gen);
+	if (rodata == NULL)
+		return 0;
+
+	so = render_state_alloc(ring->dev);
+	if (IS_ERR(so))
+		return PTR_ERR(so);
+
+	ret = render_state_setup(gen, rodata, so);
+	if (ret)
+		goto out;
+
+	ret = ring->dispatch_execbuffer(ring,
+					i915_gem_obj_ggtt_offset(so->obj),
+					so->len,
+					I915_DISPATCH_SECURE);
+	if (ret)
+		goto out;
+
+	ret = i915_add_request(ring, &seqno);
+
+out:
+	render_state_free(so);
+	return ret;
+}
diff --git a/drivers/gpu/drm/i915/intel_renderstate.h b/drivers/gpu/drm/i915/intel_renderstate.h
new file mode 100644
index 0000000..a5e783a
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_renderstate.h
@@ -0,0 +1,48 @@ 
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _INTEL_RENDERSTATE_H
+#define _INTEL_RENDERSTATE_H
+
+#include <linux/types.h>
+
+struct intel_renderstate_rodata {
+	const u32 *reloc;
+	const u32 reloc_items;
+	const u32 *batch;
+	const u32 batch_items;
+};
+
+extern const struct intel_renderstate_rodata gen6_null_state;
+extern const struct intel_renderstate_rodata gen7_null_state;
+extern const struct intel_renderstate_rodata gen8_null_state;
+
+#define RO_RENDERSTATE(_g)						\
+	const struct intel_renderstate_rodata gen ## _g ## _null_state = { \
+		.reloc = gen ## _g ## _null_state_relocs,		\
+		.reloc_items = sizeof(gen ## _g ## _null_state_relocs)/4, \
+		.batch = gen ## _g ## _null_state_batch,		\
+		.batch_items = sizeof(gen ## _g ## _null_state_batch)/4, \
+	}
+
+#endif /* INTEL_RENDERSTATE_H */
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen6.c b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
new file mode 100644
index 0000000..5ed251a
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen6.c
@@ -0,0 +1,10 @@ 
+#include "intel_renderstate.h"
+
+static const u32 gen6_null_state_relocs[] = {
+};
+
+static const u32 gen6_null_state_batch[] = {
+	0x0a << 23, /* MI_BATCH_BUFFER_END */
+};
+
+RO_RENDERSTATE(6);
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen7.c b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
new file mode 100644
index 0000000..5333f44
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen7.c
@@ -0,0 +1,10 @@ 
+#include "intel_renderstate.h"
+
+static const u32 gen7_null_state_relocs[] = {
+};
+
+static const u32 gen7_null_state_batch[] = {
+	0x0a << 23, /* MI_BATCH_BUFFER_END */
+};
+
+RO_RENDERSTATE(7);
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen8.c b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
new file mode 100644
index 0000000..88c3733
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen8.c
@@ -0,0 +1,10 @@ 
+#include "intel_renderstate.h"
+
+static const u32 gen8_null_state_relocs[] = {
+};
+
+static const u32 gen8_null_state_batch[] = {
+	0x0a << 23, /* MI_BATCH_BUFFER_END */
+};
+
+RO_RENDERSTATE(8);