
[RFC,3/4] drm/i915: Use new i915_gem_object_pin_map for LRC

Message ID 1460123698-16832-4-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived

Commit Message

Tvrtko Ursulin April 8, 2016, 1:54 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

We can use the new pin/lazy unpin API for more performance
in the execlist submission paths.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
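For readers outside the thread: the win from pin/lazy unpin is that a pin can return an already-built kernel mapping and only take a reference, while unpin merely drops the reference and leaves the mapping cached for the next pin; actual teardown is deferred until reclaim. A minimal userspace sketch of that scheme, with hypothetical names (malloc() standing in for the kernel's vmap/kmap, not the real i915 API):

```c
#include <stdlib.h>

/* Hypothetical object with a lazily cached kernel mapping. */
struct obj {
	int   pin_count;   /* active users of the mapping            */
	void *vaddr;       /* cached mapping, kept across unpins     */
	int   map_calls;   /* how many times we actually (re)mapped  */
};

/* Pin: reuse the cached mapping if it exists, build it otherwise. */
static void *obj_pin_map(struct obj *o)
{
	if (!o->vaddr) {
		o->vaddr = malloc(4096);   /* stands in for vmap()/kmap() */
		if (!o->vaddr)
			return NULL;
		o->map_calls++;
	}
	o->pin_count++;
	return o->vaddr;
}

/* Lazy unpin: drop the reference but keep the mapping cached. */
static void obj_unpin_map(struct obj *o)
{
	if (--o->pin_count < 0)
		abort();   /* unbalanced unpin */
	/* vaddr is intentionally left in place for the next pin. */
}

/* Only reclaim (e.g. under memory pressure) tears the mapping down. */
static void obj_reclaim(struct obj *o)
{
	if (o->pin_count == 0 && o->vaddr) {
		free(o->vaddr);
		o->vaddr = NULL;
	}
}
```

Under this scheme two back-to-back pin/unpin cycles map only once, which is exactly what the submission path above is after.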

Comments

Chris Wilson April 8, 2016, 2:40 p.m. UTC | #1
On Fri, Apr 08, 2016 at 02:54:57PM +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> We can use the new pin/lazy unpin API for more performance
> in the execlist submission paths.

The sticking point for me was the ringbuffer and Braswell having to
re-ioremap it every context pin. I hadn't noticed the kmap of the
context, but this does seem like a valid use for pin_map so I'm not
complaining... (Just saying this is half the solution ;)

Oh, and you don't have pin_map for your ringbuffer either yet.
-Chris
Tvrtko Ursulin April 11, 2016, 9:05 a.m. UTC | #2
On 08/04/16 15:40, Chris Wilson wrote:
> On Fri, Apr 08, 2016 at 02:54:57PM +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> We can use the new pin/lazy unpin API for more performance
>> in the execlist submission paths.
>
> The sticking point for me was the ringbuffer and Braswell having to
> re-ioremap it every context pin. I hadn't noticed the kmap of the
> context, but this does seem like a valid use for pin_map so I'm not
> complaining... (Just saying this is half the solution ;)

Pull in to the series "drm/i915: Move ioremap_wc tracking onto VMA" as 
well then?

> Oh, and you don't have pin_map for your ringbuffer either yet.

Confused - I did base this on top of "drm/i915: Refactor duplicate object 
vmap functions", and the missing half is the ioremap paths, which you 
commented on above. What other ringbuffer is missing?

Regards,

Tvrtko
Chris Wilson April 11, 2016, 9:18 a.m. UTC | #3
On Mon, Apr 11, 2016 at 10:05:34AM +0100, Tvrtko Ursulin wrote:
> 
> On 08/04/16 15:40, Chris Wilson wrote:
> >On Fri, Apr 08, 2016 at 02:54:57PM +0100, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>We can use the new pin/lazy unpin API for more performance
> >>in the execlist submission paths.
> >
> >The sticking point for me was the ringbuffer and Braswell having to
> >re-ioremap it every context pin. I hadn't noticed the kmap of the
> >context, but this does seem like a valid use for pin_map so I'm not
> >complaining... (Just saying this is half the solution ;)
> 
> Pull in to the series "drm/i915: Move ioremap_wc tracking onto VMA"
> as well then?

That depends on "io-mapping: Specify mapping size for io_mapping_map_wc"
and the vma->map_and_fenceable (though you can work around that) IIRC.
 
> >Oh, and you don't have pin_map for your ringbuffer either yet.
> 
> Confused - I did base this on top of "drm/i915: Refactor duplicate
> object vmap functions", and the missing half is the ioremap paths,
> which you commented on above. What other ringbuffer is missing?

No, I was just thinking it was still pending because I had a fixup in my
tree: drm/i915: Treat ringbuffer writes as write to normal memory

That and a patch to use WC vmaps for Braswell as well (that's still in
preliminary form, but if you are interested:
https://cgit.freedesktop.org/~ickle/linux-2.6/commit/?h=tasklet&id=3fa97e044f3b5a2c23e44ec659a8199f134d8fb5).
It really helps with context-creation stress tests - and not much else!
;)
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 3d08dac0a9b0..5e967943ce49 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1111,8 +1111,8 @@  static int intel_lr_context_do_pin(struct intel_context *ctx,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *ctx_obj = ctx->engine[engine->id].state;
 	struct intel_ringbuffer *ringbuf = ctx->engine[engine->id].ringbuf;
-	struct page *lrc_state_page;
-	uint32_t *lrc_reg_state;
+	void *obj_addr;
+	u32 *lrc_reg_state;
 	int ret;
 
 	WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex));
@@ -1122,11 +1122,11 @@  static int intel_lr_context_do_pin(struct intel_context *ctx,
 	if (ret)
 		return ret;
 
-	lrc_state_page = i915_gem_object_get_dirty_page(ctx_obj, LRC_STATE_PN);
-	if (WARN_ON(!lrc_state_page)) {
-		ret = -ENODEV;
+	obj_addr = i915_gem_object_pin_map(ctx_obj);
+	if (!obj_addr)
 		goto unpin_ctx_obj;
-	}
+
+	lrc_reg_state = obj_addr + LRC_STATE_PN * PAGE_SIZE;
 
 	ret = intel_pin_and_map_ringbuffer_obj(engine->dev, ringbuf);
 	if (ret)
@@ -1134,7 +1134,6 @@  static int intel_lr_context_do_pin(struct intel_context *ctx,
 
 	ctx->engine[engine->id].lrc_vma = i915_gem_obj_to_ggtt(ctx_obj);
 	intel_lr_context_descriptor_update(ctx, engine);
-	lrc_reg_state = kmap(lrc_state_page);
 	lrc_reg_state[CTX_RING_BUFFER_START+1] = ringbuf->vma->node.start;
 	ctx->engine[engine->id].lrc_reg_state = lrc_reg_state;
 	ctx_obj->dirty = true;
@@ -1177,7 +1176,7 @@  void intel_lr_context_unpin(struct intel_context *ctx,
 
 	WARN_ON(!mutex_is_locked(&ctx->i915->dev->struct_mutex));
 	if (--ctx->engine[engine->id].pin_count == 0) {
-		kunmap(kmap_to_page(ctx->engine[engine->id].lrc_reg_state));
+		i915_gem_object_unpin_map(ctx_obj);
 		intel_unpin_ringbuffer_obj(ctx->engine[engine->id].ringbuf);
 		i915_gem_object_ggtt_unpin(ctx_obj);
 		ctx->engine[engine->id].lrc_vma = NULL;
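One detail worth double-checking in the second hunk above: the old code set ret = -ENODEV before jumping to unpin_ctx_obj, while the new failure branch jumps there with ret still holding the 0 from the successful GGTT pin, so a failed i915_gem_object_pin_map() would be reported as success. A minimal userspace sketch of the goto-unwind shape with the error code set (hypothetical stand-in functions, not the i915 API):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the pin and map calls in the patch above. */
static int fail_map;   /* test knob: force map_obj() to fail            */
static int pinned;     /* tracks the resource the goto path must undo   */
static int backing;    /* dummy storage returned by map_obj()           */

static int pin_obj(void)    { pinned = 1; return 0; }
static void unpin_obj(void) { pinned = 0; }
static void *map_obj(void)  { return fail_map ? NULL : &backing; }

/* Mirrors the shape of intel_lr_context_do_pin(): every failure branch
 * must set ret before the goto, otherwise the function returns the 0
 * left over from the successful pin while the object is half set up. */
static int do_pin(void)
{
	void *addr;
	int ret;

	ret = pin_obj();
	if (ret)
		return ret;

	addr = map_obj();
	if (!addr) {
		ret = -ENODEV;   /* without this, ret is still 0 here */
		goto unpin;
	}

	return 0;

unpin:
	unpin_obj();
	return ret;
}
```

The same shape with the assignment restored would keep the patch's behaviour identical to the code it replaces on the error path.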