diff mbox

drm/i915/guc: WA to address the Ringbuffer coherency issue

Message ID 1476469424-25618-1-git-send-email-akash.goel@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

akash.goel@intel.com Oct. 14, 2016, 6:23 p.m. UTC
From: Akash Goel <akash.goel@intel.com>

Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
pinned in mappable aperture portion of GGTT and for ringbuffer pages
allocated from Stolen memory, access can only be done through GMADR BAR.
In case of GuC based submission, updates done in ringbuffer via GMADR
may not get commited to memory by the time the Command streamer starts
reading them, resulting in fetching of stale data.
For Host based submission, such problem is not there as the write to Ring
Tail or ELSP register happens from the Host side prior to submission.
Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
enforces the ordering between outstanding GMADR writes & new GTTMADR access.
MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
registers within GT is contained within GT, so ordering is not enforced
resulting in a race, which can manifest in form of a hang.
To ensure the flush of in flight GMADR writes, a POSTING READ is done to
GuC register prior to doorbell ring.
There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
which takes care of GMADR writes from User space to GEM buffers, but not the
ringbuffer writes from KMD.
This WA is needed on all recent HW.

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Chris Wilson Oct. 14, 2016, 6:15 p.m. UTC | #1
On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.goel@intel.com wrote:
> From: Akash Goel <akash.goel@intel.com>
> 
> Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
> pinned in mappable aperture portion of GGTT and for ringbuffer pages
> allocated from Stolen memory, access can only be done through GMADR BAR.
> In case of GuC based submission, updates done in ringbuffer via GMADR
> may not get commited to memory by the time the Command streamer starts
> reading them, resulting in fetching of stale data.

Please leave a blank line between paragraphs, or try to not leave so
much whitespace at the end of a sentence.

> For Host based submission, such problem is not there as the write to Ring
> Tail or ELSP register happens from the Host side prior to submission.
> Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
> enforces the ordering between outstanding GMADR writes & new GTTMADR access.
> MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
> registers within GT is contained within GT, so ordering is not enforced
> resulting in a race, which can manifest in form of a hang.
> To ensure the flush of in flight GMADR writes, a POSTING READ is done to
> GuC register prior to doorbell ring.
> There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
> which takes care of GMADR writes from User space to GEM buffers, but not the
> ringbuffer writes from KMD.
> This WA is needed on all recent HW.
> 
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
> index a1f76c8..43c8a72 100644
> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
> @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>   */
>  static void i915_guc_submit(struct drm_i915_gem_request *rq)
>  {
> +	struct drm_i915_private *dev_priv = rq->i915;
>  	unsigned int engine_id = rq->engine->id;
>  	struct intel_guc *guc = &rq->i915->guc;
>  	struct i915_guc_client *client = guc->execbuf_client;
> @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
>  
>  	spin_lock(&client->wq_lock);
>  	guc_wq_item_append(client, rq);
> +
> +	/* WA to flush out the pending GMADR writes to ring buffer. */
> +	if (i915_vma_is_map_and_fenceable(rq->ring->vma))
> +		POSTING_READ(GUC_STATUS);

Did you test POSTING_READ_FW() ?

Otherwise it makes an unfortunate amount of sense, and I feel justified
in what I had to do in flush_gtt_write_domwin! :)
-Chris
akash.goel@intel.com Oct. 14, 2016, 6:32 p.m. UTC | #2
On 10/14/2016 11:45 PM, Chris Wilson wrote:
> On Fri, Oct 14, 2016 at 11:53:44PM +0530, akash.goel@intel.com wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> Driver accesses the ringbuffer pages, via GMADR BAR, if the pages are
>> pinned in mappable aperture portion of GGTT and for ringbuffer pages
>> allocated from Stolen memory, access can only be done through GMADR BAR.
>> In case of GuC based submission, updates done in ringbuffer via GMADR
>> may not get commited to memory by the time the Command streamer starts
>> reading them, resulting in fetching of stale data.
>
> Please leave a blank line between paragraphs, or try to not leave so
> much whitespace at the end of a sentence.
>
I am sorry. Will be mindful of this from now.

>> For Host based submission, such problem is not there as the write to Ring
>> Tail or ELSP register happens from the Host side prior to submission.
>> Access to any GFX register from CPU side goes to GTTMMADR BAR and Hw already
>> enforces the ordering between outstanding GMADR writes & new GTTMADR access.
>> MMIO writes from GuC side do not go to GTTMMADR BAR as GuC communication to
>> registers within GT is contained within GT, so ordering is not enforced
>> resulting in a race, which can manifest in form of a hang.
>> To ensure the flush of in flight GMADR writes, a POSTING READ is done to
>> GuC register prior to doorbell ring.
>> There is already a similar WA in i915_gem_object_flush_gtt_write_domain(),
>> which takes care of GMADR writes from User space to GEM buffers, but not the
>> ringbuffer writes from KMD.
>> This WA is needed on all recent HW.
>>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> Signed-off-by: Sagar Arun Kamble <sagar.a.kamble@intel.com>
>> Signed-off-by: Akash Goel <akash.goel@intel.com>
>> ---
>>  drivers/gpu/drm/i915/i915_guc_submission.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
>> index a1f76c8..43c8a72 100644
>> --- a/drivers/gpu/drm/i915/i915_guc_submission.c
>> +++ b/drivers/gpu/drm/i915/i915_guc_submission.c
>> @@ -601,6 +601,7 @@ static int guc_ring_doorbell(struct i915_guc_client *gc)
>>   */
>>  static void i915_guc_submit(struct drm_i915_gem_request *rq)
>>  {
>> +	struct drm_i915_private *dev_priv = rq->i915;
>>  	unsigned int engine_id = rq->engine->id;
>>  	struct intel_guc *guc = &rq->i915->guc;
>>  	struct i915_guc_client *client = guc->execbuf_client;
>> @@ -608,6 +609,11 @@ static void i915_guc_submit(struct drm_i915_gem_request *rq)
>>
>>  	spin_lock(&client->wq_lock);
>>  	guc_wq_item_append(client, rq);
>> +
>> +	/* WA to flush out the pending GMADR writes to ring buffer. */
>> +	if (i915_vma_is_map_and_fenceable(rq->ring->vma))
>> +		POSTING_READ(GUC_STATUS);
>
> Did you test POSTING_READ_FW() ?
Sorry though we haven't explicitly tried POSTING_READ_FW() but it should 
work since, as per the __gen9_fw_ranges[] table, GuC registers 
(C000-Cxxx) do not lie in any Forcewake domain range.

>
> Otherwise it makes an unfortunate amount of sense, and I feel justified
> in what I had to do in flush_gtt_write_domwin! :)
Yes your hunch, expectedly, was spot on :).

Best regards
Akash
> -Chris
>
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index a1f76c8..43c8a72 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -601,6 +601,7 @@  static int guc_ring_doorbell(struct i915_guc_client *gc)
  */
 static void i915_guc_submit(struct drm_i915_gem_request *rq)
 {
+	struct drm_i915_private *dev_priv = rq->i915;
 	unsigned int engine_id = rq->engine->id;
 	struct intel_guc *guc = &rq->i915->guc;
 	struct i915_guc_client *client = guc->execbuf_client;
@@ -608,6 +609,11 @@  static void i915_guc_submit(struct drm_i915_gem_request *rq)
 
 	spin_lock(&client->wq_lock);
 	guc_wq_item_append(client, rq);
+
+	/* WA to flush out the pending GMADR writes to ring buffer. */
+	if (i915_vma_is_map_and_fenceable(rq->ring->vma))
+		POSTING_READ(GUC_STATUS);
+
 	b_ret = guc_ring_doorbell(client);
 
 	client->submissions[engine_id] += 1;