diff mbox series

drm/i915: Don't rely that 2 VDSC engines are always enough for pixel rate

Message ID 20230628100816.16528-1-stanislav.lisovskiy@intel.com (mailing list archive)
State New, archived
Headers show
Series drm/i915: Don't rely that 2 VDSC engines are always enough for pixel rate | expand

Commit Message

Lisovskiy, Stanislav June 28, 2023, 10:08 a.m. UTC
We are currently having FIFO underruns happening for kms_dsc test case,
problem is that, we check if curreny cdclk is >= pixel rate only if
there is a single VDSC engine enabled(i.e dsc_split=false) however if
we happen to have 2 VDSC engines enabled, we just kinda rely that this
would be automatically enough.
However pixel rate can be even >= than VDSC clock(cdclk) * 2, so in that
case even with 2 VDSC engines enabled, we still need to tweak it up.
So lets compare pixel rate with cdclk * slice count(VDSC engine count) and
check if it still requires bumping up.
Previously we had to bump up CDCLK many times for similar reasons.

Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/display/intel_cdclk.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Nautiyal, Ankit K July 3, 2023, 4:53 a.m. UTC | #1
On 6/28/2023 3:38 PM, Stanislav Lisovskiy wrote:
> We are currently having FIFO underruns happening for kms_dsc test case,
> problem is that, we check if curreny cdclk is >= pixel rate only if
> there is a single VDSC engine enabled(i.e dsc_split=false) however if
> we happen to have 2 VDSC engines enabled, we just kinda rely that this
> would be automatically enough.
> However pixel rate can be even >= than VDSC clock(cdclk) * 2, so in that
> case even with 2 VDSC engines enabled, we still need to tweak it up.
> So lets compare pixel rate with cdclk * slice count(VDSC engine count) and

Is it not that we use slice count for the number of DSC slices in which 
the horizontal scanline count is divided. So this can be 1,2, 4.

Whereas VDSC engine count is the number of VDSC engines the stream is 
splitted.

IIUC for a case where number of horizontal DSC slices is 4 and we use 2 
VDSC engines, each VDSC engine will get two slices and slice width will 
be HACTIVE/4.

Perhaps what we want to do is to compare pixel rate with cdclk * (number 
of vdsc engine count = dsc_split ? 2 : 1)

Regards,

Ankit


> check if it still requires bumping up.
> Previously we had to bump up CDCLK many times for similar reasons.
>
> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>   drivers/gpu/drm/i915/display/intel_cdclk.c | 9 +++++++--
>   1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
> index 4207863b7b2a..5880dcb11588 100644
> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> @@ -2607,9 +2607,14 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
>   	 * When we decide to use only one VDSC engine, since
>   	 * each VDSC operates with 1 ppc throughput, pixel clock
>   	 * cannot be higher than the VDSC clock (cdclk)
> +	 * If there 2 VDSC engines, then pixel clock can't be higher than
> +	 * VDSC clock(cdclk) * 2. However even that can still be not enough.
> +	 * Slice count reflects amount of VDSC engines,
> +	 * so lets use that to determine, if need still need to tweak CDCLK higher.
>   	 */
> -	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)
> -		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
> +	if (crtc_state->dsc.compression_enable)
> +		min_cdclk = max_t(int, min_cdclk * crtc_state->dsc.slice_count,
> +			          crtc_state->pixel_rate);
>   
>   	/*
>   	 * HACK. Currently for TGL/DG2 platforms we calculate
Lisovskiy, Stanislav July 3, 2023, 8:50 a.m. UTC | #2
On Mon, Jul 03, 2023 at 10:23:00AM +0530, Nautiyal, Ankit K wrote:
> 
> On 6/28/2023 3:38 PM, Stanislav Lisovskiy wrote:
> > We are currently having FIFO underruns happening for kms_dsc test case,
> > problem is that, we check if curreny cdclk is >= pixel rate only if
> > there is a single VDSC engine enabled(i.e dsc_split=false) however if
> > we happen to have 2 VDSC engines enabled, we just kinda rely that this
> > would be automatically enough.
> > However pixel rate can be even >= than VDSC clock(cdclk) * 2, so in that
> > case even with 2 VDSC engines enabled, we still need to tweak it up.
> > So lets compare pixel rate with cdclk * slice count(VDSC engine count) and
> 
> Is it not that we use slice count for the number of DSC slices in which the
> horizontal scanline count is divided. So this can be 1,2, 4.
> 
> Whereas VDSC engine count is the number of VDSC engines the stream is
> splitted.
> 
> IIUC for a case where number of horizontal DSC slices is 4 and we use 2 VDSC
> engines, each VDSC engine will get two slices and slice width will be
> HACTIVE/4.
> 
> Perhaps what we want to do is to compare pixel rate with cdclk * (number of
> vdsc engine count = dsc_split ? 2 : 1)

Yes, we of course need amount of DSC engines here, however I was wondering is there
any other way to get amount of VDSC engines used more precisely, except just assuming
"2" if dsc_split is set to true?

As I understand amount of slices will always be >= amount of VDSC engines, however of course
if we will have 2 slices for each VDSC engines - that would be too optimistic.
However I just really don't want to hardcode "2" here.
Need to check if there is any other way..

Stan

> 
> Regards,
> 
> Ankit
> 
> 
> > check if it still requires bumping up.
> > Previously we had to bump up CDCLK many times for similar reasons.
> > 
> > Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > ---
> >   drivers/gpu/drm/i915/display/intel_cdclk.c | 9 +++++++--
> >   1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > index 4207863b7b2a..5880dcb11588 100644
> > --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
> > +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
> > @@ -2607,9 +2607,14 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
> >   	 * When we decide to use only one VDSC engine, since
> >   	 * each VDSC operates with 1 ppc throughput, pixel clock
> >   	 * cannot be higher than the VDSC clock (cdclk)
> > +	 * If there 2 VDSC engines, then pixel clock can't be higher than
> > +	 * VDSC clock(cdclk) * 2. However even that can still be not enough.
> > +	 * Slice count reflects amount of VDSC engines,
> > +	 * so lets use that to determine, if need still need to tweak CDCLK higher.
> >   	 */
> > -	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)
> > -		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
> > +	if (crtc_state->dsc.compression_enable)
> > +		min_cdclk = max_t(int, min_cdclk * crtc_state->dsc.slice_count,
> > +			          crtc_state->pixel_rate);
> >   	/*
> >   	 * HACK. Currently for TGL/DG2 platforms we calculate
Nautiyal, Ankit K July 3, 2023, 9:35 a.m. UTC | #3
On 7/3/2023 2:20 PM, Lisovskiy, Stanislav wrote:
> On Mon, Jul 03, 2023 at 10:23:00AM +0530, Nautiyal, Ankit K wrote:
>> On 6/28/2023 3:38 PM, Stanislav Lisovskiy wrote:
>>> We are currently having FIFO underruns happening for kms_dsc test case,
>>> problem is that, we check if curreny cdclk is >= pixel rate only if
>>> there is a single VDSC engine enabled(i.e dsc_split=false) however if
>>> we happen to have 2 VDSC engines enabled, we just kinda rely that this
>>> would be automatically enough.
>>> However pixel rate can be even >= than VDSC clock(cdclk) * 2, so in that
>>> case even with 2 VDSC engines enabled, we still need to tweak it up.
>>> So lets compare pixel rate with cdclk * slice count(VDSC engine count) and
>> Is it not that we use slice count for the number of DSC slices in which the
>> horizontal scanline count is divided. So this can be 1,2, 4.
>>
>> Whereas VDSC engine count is the number of VDSC engines the stream is
>> splitted.
>>
>> IIUC for a case where number of horizontal DSC slices is 4 and we use 2 VDSC
>> engines, each VDSC engine will get two slices and slice width will be
>> HACTIVE/4.
>>
>> Perhaps what we want to do is to compare pixel rate with cdclk * (number of
>> vdsc engine count = dsc_split ? 2 : 1)
> Yes, we of course need amount of DSC engines here, however I was wondering is there
> any other way to get amount of VDSC engines used more precisely, except just assuming
> "2" if dsc_split is set to true?
>
> As I understand amount of slices will always be >= amount of VDSC engines, however of course
> if we will have 2 slices for each VDSC engines - that would be too optimistic.
> However I just really don't want to hardcode "2" here.
> Need to check if there is any other way..

Hmm right there is no exact way. DSS_CTL2 defines Left and right vdsc 
branch as of now and dsc_split is true if right VDSC engine also gets used.

Perhaps dsc_split can be defined as flag, each bit representing a VDSC 
engine?

Regards,

Ankit


>
> Stan
>
>> Regards,
>>
>> Ankit
>>
>>
>>> check if it still requires bumping up.
>>> Previously we had to bump up CDCLK many times for similar reasons.
>>>
>>> Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
>>> ---
>>>    drivers/gpu/drm/i915/display/intel_cdclk.c | 9 +++++++--
>>>    1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
>>> index 4207863b7b2a..5880dcb11588 100644
>>> --- a/drivers/gpu/drm/i915/display/intel_cdclk.c
>>> +++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
>>> @@ -2607,9 +2607,14 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
>>>    	 * When we decide to use only one VDSC engine, since
>>>    	 * each VDSC operates with 1 ppc throughput, pixel clock
>>>    	 * cannot be higher than the VDSC clock (cdclk)
>>> +	 * If there 2 VDSC engines, then pixel clock can't be higher than
>>> +	 * VDSC clock(cdclk) * 2. However even that can still be not enough.
>>> +	 * Slice count reflects amount of VDSC engines,
>>> +	 * so lets use that to determine, if need still need to tweak CDCLK higher.
>>>    	 */
>>> -	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)
>>> -		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
>>> +	if (crtc_state->dsc.compression_enable)
>>> +		min_cdclk = max_t(int, min_cdclk * crtc_state->dsc.slice_count,
>>> +			          crtc_state->pixel_rate);
>>>    	/*
>>>    	 * HACK. Currently for TGL/DG2 platforms we calculate
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/display/intel_cdclk.c b/drivers/gpu/drm/i915/display/intel_cdclk.c
index 4207863b7b2a..5880dcb11588 100644
--- a/drivers/gpu/drm/i915/display/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/display/intel_cdclk.c
@@ -2607,9 +2607,14 @@  int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
 	 * When we decide to use only one VDSC engine, since
 	 * each VDSC operates with 1 ppc throughput, pixel clock
 	 * cannot be higher than the VDSC clock (cdclk)
+	 * If there 2 VDSC engines, then pixel clock can't be higher than
+	 * VDSC clock(cdclk) * 2. However even that can still be not enough.
+	 * Slice count reflects amount of VDSC engines,
+	 * so lets use that to determine, if need still need to tweak CDCLK higher.
 	 */
-	if (crtc_state->dsc.compression_enable && !crtc_state->dsc.dsc_split)
-		min_cdclk = max(min_cdclk, (int)crtc_state->pixel_rate);
+	if (crtc_state->dsc.compression_enable)
+		min_cdclk = max_t(int, min_cdclk * crtc_state->dsc.slice_count,
+			          crtc_state->pixel_rate);
 
 	/*
 	 * HACK. Currently for TGL/DG2 platforms we calculate