[2/3] drm/i915: mark GEM objects as dirty when updated by the CPU

Message ID 1449593478-33649-3-git-send-email-david.s.gordon@intel.com (mailing list archive)
State New, archived

Commit Message

Dave Gordon Dec. 8, 2015, 4:51 p.m. UTC
In various places, one or more pages of a GEM object are mapped into CPU
address space and updated. In each such case, either the page or the
object should be marked dirty, to ensure that the modifications are not
discarded if the object is evicted under memory pressure.

Ideally, we would like to mark only the updated pages dirty; but it
isn't clear at this point whether this will work for all types of GEM
objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now,
let's ensure correctness by marking the whole object dirty.
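
For illustration, the pattern in question looks roughly like this (a
sketch only, not taken verbatim from the driver; obj->dirty is the
actual flag, the surrounding names are placeholders):

	vaddr = kmap_atomic(page);
	*(uint32_t *)(vaddr + offset) = value;	/* CPU-side write */
	kunmap_atomic(vaddr);
	obj->dirty = 1;	/* whole-object marking, so the write survives eviction */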

Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 2 ++
 drivers/gpu/drm/i915/i915_gem_render_state.c | 1 +
 drivers/gpu/drm/i915/i915_guc_submission.c   | 1 +
 drivers/gpu/drm/i915/intel_lrc.c             | 6 +++++-
 4 files changed, 9 insertions(+), 1 deletion(-)

Comments

Chris Wilson Dec. 8, 2015, 5 p.m. UTC | #1
On Tue, Dec 08, 2015 at 04:51:17PM +0000, Dave Gordon wrote:
> In various places, one or more pages of a GEM object are mapped into CPU
> address space and updated. In each such case, either the page or the
> object should be marked dirty, to ensure that the modifications are not
> discarded if the object is evicted under memory pressure.
> 
> Ideally, we would like to mark only the updated pages dirty; but it
> isn't clear at this point whether this will work for all types of GEM
> objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now,
> let's ensure correctness by marking the whole object dirty.
> 
> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 2 ++
>  drivers/gpu/drm/i915/i915_gem_render_state.c | 1 +
>  drivers/gpu/drm/i915/i915_guc_submission.c   | 1 +
>  drivers/gpu/drm/i915/intel_lrc.c             | 6 +++++-
>  4 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index a4c243c..bc28a10 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
>  	}
>  
>  	kunmap_atomic(vaddr);
> +	obj->dirty = 1;
Nak. CPU dirtying is a per-page interface.
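
A per-page version of the same update would mark only the page that was
actually written, roughly as follows (a sketch, not part of this patch;
set_page_dirty() is the generic mm helper, and the exact wiring shown
here is an assumption):

	struct page *page = i915_gem_object_get_page(obj, page_index);
	uint32_t *vaddr = kmap_atomic(page);

	vaddr[offset_in_dwords] = value;	/* CPU-side write */
	kunmap_atomic(vaddr);
	set_page_dirty(page);	/* dirty just this page, not the whole object */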
-Chris
Dave Gordon Dec. 8, 2015, 6:43 p.m. UTC | #2
On 08/12/15 17:00, Chris Wilson wrote:
> On Tue, Dec 08, 2015 at 04:51:17PM +0000, Dave Gordon wrote:
>> In various places, one or more pages of a GEM object are mapped into CPU
>> address space and updated. In each such case, either the page or the
>> object should be marked dirty, to ensure that the modifications are not
>> discarded if the object is evicted under memory pressure.
>>
>> Ideally, we would like to mark only the updated pages dirty; but it
>> isn't clear at this point whether this will work for all types of GEM
>> objects (regular/gtt, phys, stolen, userptr, dmabuf, ...). So for now,
>> let's ensure correctness by marking the whole object dirty.
>>
>> Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c   | 2 ++
>>   drivers/gpu/drm/i915/i915_gem_render_state.c | 1 +
>>   drivers/gpu/drm/i915/i915_guc_submission.c   | 1 +
>>   drivers/gpu/drm/i915/intel_lrc.c             | 6 +++++-
>>   4 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index a4c243c..bc28a10 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -281,6 +281,7 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
>>   	}
>>
>>   	kunmap_atomic(vaddr);
>> +	obj->dirty = 1;
> Nak. CPU dirtying is a per-page interface.
> -Chris

That's what my commit message said. But let's at least have /correct/ 
behaviour while we work out which object types we (can) support here.

Also, in:

         if (use_cpu_reloc(obj))
                 ret = relocate_entry_cpu(obj, reloc, target_offset);
         else if (obj->map_and_fenceable)
                 ret = relocate_entry_gtt(obj, reloc, target_offset);
         else if (cpu_has_clflush)
                 ret = relocate_entry_clflush(obj, reloc, target_offset);

both of the other routines parallel to relocate_entry_cpu() [i.e. 
relocate_entry_gtt() and relocate_entry_clflush()] mark the whole object 
dirty. Why be inconsistent?

Can we be sure that the object in question actually has per-page 
tracking of dirty pages? shmfs objects do, but phys objects don't; they 
have only object-level dirty tracking. Can we guarantee that only the 
right sort of objects will be handled here? And what happens when stolen 
objects are exposed to the user?
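
For reference, the object-level flag is consumed roughly like this on
the shmfs put_pages path (a simplified sketch; phys objects instead copy
back their whole contiguous buffer when obj->dirty is set):

	struct sg_page_iter sg_iter;

	/* one object-level dirty bit fans out to every backing page */
	for_each_sg_page(obj->pages->sgl, &sg_iter, obj->pages->nents, 0) {
		struct page *page = sg_page_iter_page(&sg_iter);

		if (obj->dirty)
			set_page_dirty(page);
	}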

.Dave.

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a4c243c..bc28a10 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -281,6 +281,7 @@  relocate_entry_cpu(struct drm_i915_gem_object *obj,
 	}
 
 	kunmap_atomic(vaddr);
+	obj->dirty = 1;
 
 	return 0;
 }
@@ -372,6 +373,7 @@  relocate_entry_clflush(struct drm_i915_gem_object *obj,
 	}
 
 	kunmap_atomic(vaddr);
+	obj->dirty = 1;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 5026a62..dd1976c 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -144,6 +144,7 @@  static int render_state_setup(struct render_state *so)
 	so->aux_batch_size = ALIGN(so->aux_batch_size, 8);
 
 	kunmap(page);
+	so->obj->dirty = 1;
 
 	ret = i915_gem_object_set_to_gtt_domain(so->obj, false);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0d23785b..c0e58f8 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -574,6 +574,7 @@  static void lr_context_update(struct drm_i915_gem_request *rq)
 	reg_state[CTX_RING_BUFFER_START+1] = i915_gem_obj_ggtt_offset(rb_obj);
 
 	kunmap_atomic(reg_state);
+	ctx_obj->dirty = 1;
 }
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab..bc77794 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -391,6 +391,7 @@  static int execlists_update_context(struct drm_i915_gem_request *rq)
 	}
 
 	kunmap_atomic(reg_state);
+	ctx_obj->dirty = 1;
 
 	return 0;
 }
@@ -1030,7 +1031,7 @@  static int intel_lr_context_do_pin(struct intel_engine_cs *ring,
 	if (ret)
 		goto unpin_ctx_obj;
 
-	ctx_obj->dirty = true;
+	ctx_obj->dirty = 1;
 
 	/* Invalidate GuC TLB. */
 	if (i915.enable_guc_submission)
@@ -1461,6 +1462,8 @@  static int intel_init_workaround_bb(struct intel_engine_cs *ring)
 
 out:
 	kunmap_atomic(batch);
+	wa_ctx->obj->dirty = 1;
+
 	if (ret)
 		lrc_destroy_wa_ctx_obj(ring);
 
@@ -2536,6 +2539,7 @@  void intel_lr_context_reset(struct drm_device *dev,
 		reg_state[CTX_RING_TAIL+1] = 0;
 
 		kunmap_atomic(reg_state);
+		ctx_obj->dirty = 1;
 
 		ringbuf->head = 0;
 		ringbuf->tail = 0;