diff mbox series

[v2,2/4] drm/i915: Fix WaProgramMgsrForL3BankSpecificMmioReads

Message ID 20190711155939.18059-1-tvrtko.ursulin@linux.intel.com (mailing list archive)
State New, archived
Headers show
Series None | expand

Commit Message

Tvrtko Ursulin July 11, 2019, 3:59 p.m. UTC
From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Two issues in this code:

1.
fls() usage is incorrect causing off by one in subslice mask lookup,
which in other words means subslice mask of all zeroes is always used
(subslice mask of a slice which is not present, or even out of bounds
array access), rendering the checks in wa_init_mcr either futile or
random.

2.
Condition in WARN_ON is not correct. It is doing a bitwise and operation
between a positive (present subslices) and negative mask (disabled L3
banks).

This means that with corrected fls() usage the assert would always
incorrectly fail.

We can fix this by inverting the fuse bits in the check.

v2:
 * Simplify check for logic and redability.
 * Improve commentary explaining what is really happening ie. what the
   assert is really trying to check and why.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Cc: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/i915/gt/intel_workarounds.c | 80 ++++++++++-----------
 1 file changed, 40 insertions(+), 40 deletions(-)

Comments

Summers, Stuart July 11, 2019, 11:51 p.m. UTC | #1
On Thu, 2019-07-11 at 16:59 +0100, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This is generally much more readable than the previous implementation,
thanks! Some minor comments below...

> 
> Two issues in this code:
> 
> 1.
> fls() usage is incorrect causing off by one in subslice mask lookup,
> which in other words means subslice mask of all zeroes is always used
> (subslice mask of a slice which is not present, or even out of bounds
> array access), rendering the checks in wa_init_mcr either futile or
> random.
> 
> 2.
> Condition in WARN_ON is not correct. It is doing a bitwise and
> operation
> between a positive (present subslices) and negative mask (disabled L3
> banks).
> 
> This means that with corrected fls() usage the assert would always
> incorrectly fail.
> 
> We can fix this by inverting the fuse bits in the check.
> 
> v2:
>  * Simplify check for logic and redability.
>  * Improve commentary explaining what is really happening ie. what
> the
>    assert is really trying to check and why.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Fixes: fe864b76c2ab ("drm/i915: Implement
> WaProgramMgsrForL3BankSpecificMmioReads")
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
> Cc: Michał Winiarski <michal.winiarski@intel.com>
> ---
>  drivers/gpu/drm/i915/gt/intel_workarounds.c | 80 ++++++++++---------
> --
>  1 file changed, 40 insertions(+), 40 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> index 9e069286d3ce..80f1159e5cda 100644
> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
> @@ -761,7 +761,27 @@ static void
>  wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>  {
>  	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
> -	u32 mcr_slice_subslice_mask;
> +	u32 mcr_mask, mcr;
> +
> +	/*
> +	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
> +	 * Before any MMIO read into slice/subslice specific registers,
> MCR
> +	 * packet control register needs to be programmed to point to
> any
> +	 * enabled s/ss pair. Otherwise, incorrect values will be
> returned.
> +	 * This means each subsequent MMIO read will be forwarded to an
> +	 * specific s/ss combination, but this is OK since these
> registers
> +	 * are consistent across s/ss in almost all cases. In the rare
> +	 * occasions, such as INSTDONE, where this value is dependent
> +	 * on s/ss combo, the read should be done with
> read_subslice_reg.
> +	 */
> +	mcr = intel_calculate_mcr_s_ss_select(i915);
> +
> +	if (INTEL_GEN(i915) >= 11)
> +		mcr_mask = GEN11_MCR_SLICE_MASK |
> GEN11_MCR_SUBSLICE_MASK;
> +	else
> +		mcr_mask = GEN8_MCR_SLICE_MASK |
> GEN8_MCR_SUBSLICE_MASK;
> +
> +	wa_write_masked_or(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);

Was there a specific reason to move this up to the top? Or this is
purely to move this functionality all together rather than spread out
through the function? Looking at the documentation, we do want to
specifically apply WaProgramMgsrForL3BankSpecificMmioReads before any
other workarounds. So maybe just move this whole block to the bottom of
the function instead? 

>  
>  	/*
>  	 * WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
> @@ -776,49 +796,29 @@ wa_init_mcr(struct drm_i915_private *i915,
> struct i915_wa_list *wal)
>  	 * something more complex that requires checking the range of
> every
>  	 * MMIO read).
>  	 */
> -	if (INTEL_GEN(i915) >= 10 &&
> -	    is_power_of_2(sseu->slice_mask)) {
> +	if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
>  		/*
> -		 * read FUSE3 for enabled L3 Bank IDs, if L3 Bank
> matches
> -		 * enabled subslice, no need to redirect MCR packet
> +		 * GEN8_MCR_SELECTOR contains dual-purpose bits which
> select
> +		 * both to which subslice, or to which L3 bank, the
> respective
> +		 * mmio reads will go.
> +		 * Since we have selected one enabled subslice in
> +		 * WaProgramMgsrForCorrectSliceSpecificMmioReads, we
> now
> +		 * need to check if the L3 bank of the equal "instance"
> is also
> +		 * enabled.
> +		 * If that is not the case we could try to find a
> number which
> +		 * works for both, or going even further, implement a
> dynamic
> +		 * scheme where we switch at before every affected mmio

s/at //?

> read.
> +		 * Fortunately neither seems to be needed at the moment
> for
> +		 * current parts and current driver behaviour.
>  		 */
> -		u32 slice = fls(sseu->slice_mask);
> -		u32 fuse3 =
> -			intel_uncore_read(&i915->uncore,
> GEN10_MIRROR_FUSE3);
> -		u8 ss_mask = sseu->subslice_mask[slice];
> -
> -		u8 enabled_mask = (ss_mask | ss_mask >>
> -				   GEN10_L3BANK_PAIR_COUNT) &
> GEN10_L3BANK_MASK;
> -		u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
> +		unsigned int mcr_ss = BIT((mcr >> 24) & 0x7);

Can you add macros for these magic numbers?

> +		unsigned int l3_fuse =
> +			intel_uncore_read(&i915->uncore,
> GEN10_MIRROR_FUSE3) &
> +			GEN10_L3BANK_MASK;
> +		unsigned int l3_en = ~(l3_fuse << 4 | l3_fuse);

Macro here for the 4?

Thanks,
Stuart

>  
> -		/*
> -		 * Production silicon should have matched L3Bank and
> -		 * subslice enabled
> -		 */
> -		WARN_ON((enabled_mask & disabled_mask) !=
> enabled_mask);
> +		WARN_ON(!(mcr_ss & l3_en));
>  	}
> -
> -	if (INTEL_GEN(i915) >= 11)
> -		mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
> -					  GEN11_MCR_SUBSLICE_MASK;
> -	else
> -		mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
> -					  GEN8_MCR_SUBSLICE_MASK;
> -	/*
> -	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
> -	 * Before any MMIO read into slice/subslice specific registers,
> MCR
> -	 * packet control register needs to be programmed to point to
> any
> -	 * enabled s/ss pair. Otherwise, incorrect values will be
> returned.
> -	 * This means each subsequent MMIO read will be forwarded to an
> -	 * specific s/ss combination, but this is OK since these
> registers
> -	 * are consistent across s/ss in almost all cases. In the rare
> -	 * occasions, such as INSTDONE, where this value is dependent
> -	 * on s/ss combo, the read should be done with
> read_subslice_reg.
> -	 */
> -	wa_write_masked_or(wal,
> -			   GEN8_MCR_SELECTOR,
> -			   mcr_slice_subslice_mask,
> -			   intel_calculate_mcr_s_ss_select(i915));
>  }
>  
>  static void
Tvrtko Ursulin July 12, 2019, 5:32 a.m. UTC | #2
On 12/07/2019 00:51, Summers, Stuart wrote:
> On Thu, 2019-07-11 at 16:59 +0100, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This is generally much more readable than the previous implementation,
> thanks! Some minor comments below...
> 
>>
>> Two issues in this code:
>>
>> 1.
>> fls() usage is incorrect causing off by one in subslice mask lookup,
>> which in other words means subslice mask of all zeroes is always used
>> (subslice mask of a slice which is not present, or even out of bounds
>> array access), rendering the checks in wa_init_mcr either futile or
>> random.
>>
>> 2.
>> Condition in WARN_ON is not correct. It is doing a bitwise and
>> operation
>> between a positive (present subslices) and negative mask (disabled L3
>> banks).
>>
>> This means that with corrected fls() usage the assert would always
>> incorrectly fail.
>>
>> We can fix this by inverting the fuse bits in the check.
>>
>> v2:
>>   * Simplify check for logic and redability.
>>   * Improve commentary explaining what is really happening ie. what
>> the
>>     assert is really trying to check and why.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Fixes: fe864b76c2ab ("drm/i915: Implement
>> WaProgramMgsrForL3BankSpecificMmioReads")
>> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
>> Cc: Michał Winiarski <michal.winiarski@intel.com>
>> ---
>>   drivers/gpu/drm/i915/gt/intel_workarounds.c | 80 ++++++++++---------
>> --
>>   1 file changed, 40 insertions(+), 40 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> index 9e069286d3ce..80f1159e5cda 100644
>> --- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> +++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
>> @@ -761,7 +761,27 @@ static void
>>   wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
>>   {
>>   	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
>> -	u32 mcr_slice_subslice_mask;
>> +	u32 mcr_mask, mcr;
>> +
>> +	/*
>> +	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
>> +	 * Before any MMIO read into slice/subslice specific registers,
>> MCR
>> +	 * packet control register needs to be programmed to point to
>> any
>> +	 * enabled s/ss pair. Otherwise, incorrect values will be
>> returned.
>> +	 * This means each subsequent MMIO read will be forwarded to an
>> +	 * specific s/ss combination, but this is OK since these
>> registers
>> +	 * are consistent across s/ss in almost all cases. In the rare
>> +	 * occasions, such as INSTDONE, where this value is dependent
>> +	 * on s/ss combo, the read should be done with
>> read_subslice_reg.
>> +	 */
>> +	mcr = intel_calculate_mcr_s_ss_select(i915);
>> +
>> +	if (INTEL_GEN(i915) >= 11)
>> +		mcr_mask = GEN11_MCR_SLICE_MASK |
>> GEN11_MCR_SUBSLICE_MASK;
>> +	else
>> +		mcr_mask = GEN8_MCR_SLICE_MASK |
>> GEN8_MCR_SUBSLICE_MASK;
>> +
>> +	wa_write_masked_or(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
> 
> Was there a specific reason to move this up to the top? Or this is

Yes, so below can check the actually selected MCR instead of deriving it 
from slice/subslice mask as stored in the driver.

> purely to move this functionality all together rather than spread out
> through the function? Looking at the documentation, we do want to
> specifically apply WaProgramMgsrForL3BankSpecificMmioReads before any
> other workarounds. So maybe just move this whole block to the bottom of
> the function instead?

I think it works better if MCR selection is first. Going forward, and 
for robustness, this probably needs to be improved to do both 
workarounds in a single block. Along the lines of ffs(ss_ena & l3_ena). 
And WARN_ON if no common bits. Oh well.. now I got no excuses not to do 
it...

>>   
>>   	/*
>>   	 * WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
>> @@ -776,49 +796,29 @@ wa_init_mcr(struct drm_i915_private *i915,
>> struct i915_wa_list *wal)
>>   	 * something more complex that requires checking the range of
>> every
>>   	 * MMIO read).
>>   	 */
>> -	if (INTEL_GEN(i915) >= 10 &&
>> -	    is_power_of_2(sseu->slice_mask)) {
>> +	if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
>>   		/*
>> -		 * read FUSE3 for enabled L3 Bank IDs, if L3 Bank
>> matches
>> -		 * enabled subslice, no need to redirect MCR packet
>> +		 * GEN8_MCR_SELECTOR contains dual-purpose bits which
>> select
>> +		 * both to which subslice, or to which L3 bank, the
>> respective
>> +		 * mmio reads will go.
>> +		 * Since we have selected one enabled subslice in
>> +		 * WaProgramMgsrForCorrectSliceSpecificMmioReads, we
>> now
>> +		 * need to check if the L3 bank of the equal "instance"
>> is also
>> +		 * enabled.
>> +		 * If that is not the case we could try to find a
>> number which
>> +		 * works for both, or going even further, implement a
>> dynamic
>> +		 * scheme where we switch at before every affected mmio
> 
> s/at //?
> 
>> read.
>> +		 * Fortunately neither seems to be needed at the moment
>> for
>> +		 * current parts and current driver behaviour.
>>   		 */
>> -		u32 slice = fls(sseu->slice_mask);
>> -		u32 fuse3 =
>> -			intel_uncore_read(&i915->uncore,
>> GEN10_MIRROR_FUSE3);
>> -		u8 ss_mask = sseu->subslice_mask[slice];
>> -
>> -		u8 enabled_mask = (ss_mask | ss_mask >>
>> -				   GEN10_L3BANK_PAIR_COUNT) &
>> GEN10_L3BANK_MASK;
>> -		u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
>> +		unsigned int mcr_ss = BIT((mcr >> 24) & 0x7);
> 
> Can you add macros for these magic numbers?
> 
>> +		unsigned int l3_fuse =
>> +			intel_uncore_read(&i915->uncore,
>> GEN10_MIRROR_FUSE3) &
>> +			GEN10_L3BANK_MASK;
>> +		unsigned int l3_en = ~(l3_fuse << 4 | l3_fuse);
> 
> Macro here for the 4?

I knew someone would ask for it. You give fixes, people want perfection. 
;) Will do.

Regards,

Tvrtko

> Thanks,
> Stuart
> 
>>   
>> -		/*
>> -		 * Production silicon should have matched L3Bank and
>> -		 * subslice enabled
>> -		 */
>> -		WARN_ON((enabled_mask & disabled_mask) !=
>> enabled_mask);
>> +		WARN_ON(!(mcr_ss & l3_en));
>>   	}
>> -
>> -	if (INTEL_GEN(i915) >= 11)
>> -		mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
>> -					  GEN11_MCR_SUBSLICE_MASK;
>> -	else
>> -		mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
>> -					  GEN8_MCR_SUBSLICE_MASK;
>> -	/*
>> -	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
>> -	 * Before any MMIO read into slice/subslice specific registers,
>> MCR
>> -	 * packet control register needs to be programmed to point to
>> any
>> -	 * enabled s/ss pair. Otherwise, incorrect values will be
>> returned.
>> -	 * This means each subsequent MMIO read will be forwarded to an
>> -	 * specific s/ss combination, but this is OK since these
>> registers
>> -	 * are consistent across s/ss in almost all cases. In the rare
>> -	 * occasions, such as INSTDONE, where this value is dependent
>> -	 * on s/ss combo, the read should be done with
>> read_subslice_reg.
>> -	 */
>> -	wa_write_masked_or(wal,
>> -			   GEN8_MCR_SELECTOR,
>> -			   mcr_slice_subslice_mask,
>> -			   intel_calculate_mcr_s_ss_select(i915));
>>   }
>>   
>>   static void
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/gt/intel_workarounds.c b/drivers/gpu/drm/i915/gt/intel_workarounds.c
index 9e069286d3ce..80f1159e5cda 100644
--- a/drivers/gpu/drm/i915/gt/intel_workarounds.c
+++ b/drivers/gpu/drm/i915/gt/intel_workarounds.c
@@ -761,7 +761,27 @@  static void
 wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 {
 	const struct sseu_dev_info *sseu = &RUNTIME_INFO(i915)->sseu;
-	u32 mcr_slice_subslice_mask;
+	u32 mcr_mask, mcr;
+
+	/*
+	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
+	 * Before any MMIO read into slice/subslice specific registers, MCR
+	 * packet control register needs to be programmed to point to any
+	 * enabled s/ss pair. Otherwise, incorrect values will be returned.
+	 * This means each subsequent MMIO read will be forwarded to an
+	 * specific s/ss combination, but this is OK since these registers
+	 * are consistent across s/ss in almost all cases. In the rare
+	 * occasions, such as INSTDONE, where this value is dependent
+	 * on s/ss combo, the read should be done with read_subslice_reg.
+	 */
+	mcr = intel_calculate_mcr_s_ss_select(i915);
+
+	if (INTEL_GEN(i915) >= 11)
+		mcr_mask = GEN11_MCR_SLICE_MASK | GEN11_MCR_SUBSLICE_MASK;
+	else
+		mcr_mask = GEN8_MCR_SLICE_MASK | GEN8_MCR_SUBSLICE_MASK;
+
+	wa_write_masked_or(wal, GEN8_MCR_SELECTOR, mcr_mask, mcr);
 
 	/*
 	 * WaProgramMgsrForL3BankSpecificMmioReads: cnl,icl
@@ -776,49 +796,29 @@  wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
 	 * something more complex that requires checking the range of every
 	 * MMIO read).
 	 */
-	if (INTEL_GEN(i915) >= 10 &&
-	    is_power_of_2(sseu->slice_mask)) {
+	if (INTEL_GEN(i915) >= 10 && is_power_of_2(sseu->slice_mask)) {
 		/*
-		 * read FUSE3 for enabled L3 Bank IDs, if L3 Bank matches
-		 * enabled subslice, no need to redirect MCR packet
+		 * GEN8_MCR_SELECTOR contains dual-purpose bits which select
+		 * both to which subslice, or to which L3 bank, the respective
+		 * mmio reads will go.
+		 * Since we have selected one enabled subslice in
+		 * WaProgramMgsrForCorrectSliceSpecificMmioReads, we now
+		 * need to check if the L3 bank of the equal "instance" is also
+		 * enabled.
+		 * If that is not the case we could try to find a number which
+		 * works for both, or going even further, implement a dynamic
+		 * scheme where we switch at before every affected mmio read.
+		 * Fortunately neither seems to be needed at the moment for
+		 * current parts and current driver behaviour.
 		 */
-		u32 slice = fls(sseu->slice_mask);
-		u32 fuse3 =
-			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3);
-		u8 ss_mask = sseu->subslice_mask[slice];
-
-		u8 enabled_mask = (ss_mask | ss_mask >>
-				   GEN10_L3BANK_PAIR_COUNT) & GEN10_L3BANK_MASK;
-		u8 disabled_mask = fuse3 & GEN10_L3BANK_MASK;
+		unsigned int mcr_ss = BIT((mcr >> 24) & 0x7);
+		unsigned int l3_fuse =
+			intel_uncore_read(&i915->uncore, GEN10_MIRROR_FUSE3) &
+			GEN10_L3BANK_MASK;
+		unsigned int l3_en = ~(l3_fuse << 4 | l3_fuse);
 
-		/*
-		 * Production silicon should have matched L3Bank and
-		 * subslice enabled
-		 */
-		WARN_ON((enabled_mask & disabled_mask) != enabled_mask);
+		WARN_ON(!(mcr_ss & l3_en));
 	}
-
-	if (INTEL_GEN(i915) >= 11)
-		mcr_slice_subslice_mask = GEN11_MCR_SLICE_MASK |
-					  GEN11_MCR_SUBSLICE_MASK;
-	else
-		mcr_slice_subslice_mask = GEN8_MCR_SLICE_MASK |
-					  GEN8_MCR_SUBSLICE_MASK;
-	/*
-	 * WaProgramMgsrForCorrectSliceSpecificMmioReads:cnl,icl
-	 * Before any MMIO read into slice/subslice specific registers, MCR
-	 * packet control register needs to be programmed to point to any
-	 * enabled s/ss pair. Otherwise, incorrect values will be returned.
-	 * This means each subsequent MMIO read will be forwarded to an
-	 * specific s/ss combination, but this is OK since these registers
-	 * are consistent across s/ss in almost all cases. In the rare
-	 * occasions, such as INSTDONE, where this value is dependent
-	 * on s/ss combo, the read should be done with read_subslice_reg.
-	 */
-	wa_write_masked_or(wal,
-			   GEN8_MCR_SELECTOR,
-			   mcr_slice_subslice_mask,
-			   intel_calculate_mcr_s_ss_select(i915));
 }
 
 static void