diff mbox series

[v3] drm/i915: program wm blocks to at least blocks required per line

Message ID 20220406174311.82104-1-vinod.govindapillai@intel.com (mailing list archive)
State New, archived
Headers show
Series [v3] drm/i915: program wm blocks to at least blocks required per line | expand

Commit Message

Vinod Govindapillai April 6, 2022, 5:43 p.m. UTC
In configurations with single DRAM channel, for usecases like
4K 60 Hz, FIFO underruns are observed quite frequently. Looks
like the wm0 watermark values need to bumped up because the wm0
memory latency calculations are probably not taking the DRAM
channel's impact into account.

As per the Bspec 49325, if the ddb allocation can hold at least
one plane_blocks_per_line we should have selected method2.
Assuming that modern HW versions have enough dbuf to hold
at least one line, set the wm blocks to equivalent to blocks
per line.

V2: styling and comments changes (Ville)
V3: Updated the reviewed-by tag

cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>

Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
---
 drivers/gpu/drm/i915/intel_pm.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Ville Syrjälä April 14, 2022, 5:59 p.m. UTC | #1
On Wed, Apr 06, 2022 at 08:43:11PM +0300, Vinod Govindapillai wrote:
> In configurations with single DRAM channel, for usecases like
> 4K 60 Hz, FIFO underruns are observed quite frequently. Looks
> like the wm0 watermark values need to bumped up because the wm0
> memory latency calculations are probably not taking the DRAM
> channel's impact into account.
> 
> As per the Bspec 49325, if the ddb allocation can hold at least
> one plane_blocks_per_line we should have selected method2.
> Assuming that modern HW versions have enough dbuf to hold
> at least one line, set the wm blocks to equivalent to blocks
> per line.
> 
> V2: styling and comments changes (Ville)
> V3: Updated the reviewed-by tag
> 
> cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> 

Normally people put all the tags in one block, ie. no extra newlines
like this. Also 'Cc:' is the normal form for these.

I was expecting to see gitlab issue links here. Maybe not Closes:
since this is not the final form I guess. But could slap them on here
as References:.

> Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
> Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_pm.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 8824f269e5f5..d284ec8aff21 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5475,6 +5475,25 @@ static void skl_compute_plane_wm(const struct intel_crtc_state *crtc_state,
>  	}
>  
>  	blocks = fixed16_to_u32_round_up(selected_result) + 1;
> +	/*
> +	 * Lets have blocks at minimum equivalent to plane_blocks_per_line
> +	 * as there will be at minimum one line for lines configuration. This
> +	 * is a work around for FIFO underruns observed with resolutions like
> +	 * 4k 60 Hz in single channel DRAM configurations.
> +	 *
> +	 * As per the Bspec 49325, if the ddb allocation can hold at least
> +	 * one plane_blocks_per_line, we should have selected method2 in
> +	 * the above logic. Assuming that modern versions have enough dbuf
> +	 * and method2 guarantees blocks equivalent to at least 1 line,
> +	 * select the blocks as plane_blocks_per_line.
> +	 *
> +	 * TODO: Revisit the logic when we have better understanding on DRAM
> +	 * channels' impact on the level 0 memory latency and the relevant
> +	 * wm calculations.
> +	 */
> +	if (skl_wm_has_lines(dev_priv, level))
> +		blocks = max_t(u32, blocks,
> +			       fixed16_to_u32_round_up(wp->plane_blocks_per_line));

A simple max() should do since the types are the same.

>  	lines = div_round_up_fixed16(selected_result,
>  				     wp->plane_blocks_per_line);
>  
> -- 
> 2.25.1
Vinod Govindapillai April 17, 2022, 9:32 a.m. UTC | #2
On Thu, 2022-04-14 at 20:59 +0300, Ville Syrjälä wrote:
> On Wed, Apr 06, 2022 at 08:43:11PM +0300, Vinod Govindapillai wrote:
> > In configurations with single DRAM channel, for usecases like
> > 4K 60 Hz, FIFO underruns are observed quite frequently. Looks
> > like the wm0 watermark values need to bumped up because the wm0
> > memory latency calculations are probably not taking the DRAM
> > channel's impact into account.
> > 
> > As per the Bspec 49325, if the ddb allocation can hold at least
> > one plane_blocks_per_line we should have selected method2.
> > Assuming that modern HW versions have enough dbuf to hold
> > at least one line, set the wm blocks to equivalent to blocks
> > per line.
> > 
> > V2: styling and comments changes (Ville)
> > V3: Updated the reviewed-by tag
> > 
> > cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > 
> 
> Normally people put all the tags in one block, ie. no extra newlines
> like this. Also 'Cc:' is the normal form for these.
> 
> I was expecting to see gitlab issue links here. Maybe not Closes:
> since this is not the final form I guess. But could slap them on here
> as References:.
> 
> > Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
> > Reviewed-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
> > ---
> >  drivers/gpu/drm/i915/intel_pm.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 8824f269e5f5..d284ec8aff21 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -5475,6 +5475,25 @@ static void skl_compute_plane_wm(const struct intel_crtc_state
> > *crtc_state,
> >  	}
> >  
> >  	blocks = fixed16_to_u32_round_up(selected_result) + 1;
> > +	/*
> > +	 * Lets have blocks at minimum equivalent to plane_blocks_per_line
> > +	 * as there will be at minimum one line for lines configuration. This
> > +	 * is a work around for FIFO underruns observed with resolutions like
> > +	 * 4k 60 Hz in single channel DRAM configurations.
> > +	 *
> > +	 * As per the Bspec 49325, if the ddb allocation can hold at least
> > +	 * one plane_blocks_per_line, we should have selected method2 in
> > +	 * the above logic. Assuming that modern versions have enough dbuf
> > +	 * and method2 guarantees blocks equivalent to at least 1 line,
> > +	 * select the blocks as plane_blocks_per_line.
> > +	 *
> > +	 * TODO: Revisit the logic when we have better understanding on DRAM
> > +	 * channels' impact on the level 0 memory latency and the relevant
> > +	 * wm calculations.
> > +	 */
> > +	if (skl_wm_has_lines(dev_priv, level))
> > +		blocks = max_t(u32, blocks,
> > +			       fixed16_to_u32_round_up(wp->plane_blocks_per_line));
> 
> A simple max() should do since the types are the same.

Thanks! I updated that.

> 
> >  	lines = div_round_up_fixed16(selected_result,
> >  				     wp->plane_blocks_per_line);
> >  
> > -- 
> > 2.25.1

V4 Sent.

BR
Vinod
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 8824f269e5f5..d284ec8aff21 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5475,6 +5475,25 @@  static void skl_compute_plane_wm(const struct intel_crtc_state *crtc_state,
 	}
 
 	blocks = fixed16_to_u32_round_up(selected_result) + 1;
+	/*
+	 * Lets have blocks at minimum equivalent to plane_blocks_per_line
+	 * as there will be at minimum one line for lines configuration. This
+	 * is a work around for FIFO underruns observed with resolutions like
+	 * 4k 60 Hz in single channel DRAM configurations.
+	 *
+	 * As per the Bspec 49325, if the ddb allocation can hold at least
+	 * one plane_blocks_per_line, we should have selected method2 in
+	 * the above logic. Assuming that modern versions have enough dbuf
+	 * and method2 guarantees blocks equivalent to at least 1 line,
+	 * select the blocks as plane_blocks_per_line.
+	 *
+	 * TODO: Revisit the logic when we have better understanding on DRAM
+	 * channels' impact on the level 0 memory latency and the relevant
+	 * wm calculations.
+	 */
+	if (skl_wm_has_lines(dev_priv, level))
+		blocks = max_t(u32, blocks,
+			       fixed16_to_u32_round_up(wp->plane_blocks_per_line));
 	lines = div_round_up_fixed16(selected_result,
 				     wp->plane_blocks_per_line);