[RFC,5/5] drm/i915/display: Add Nearest-neighbor based integer scaling support

Message ID 20200225070545.4482-6-pankaj.laxminarayan.bharadiya@intel.com
State New
Series
  • Introduce drm scaling filter property

Commit Message

Bharadiya,Pankaj Feb. 25, 2020, 7:05 a.m. UTC
Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer (i.e., whole
number) multiplier. Nearest-neighbor (NN) interpolation works by
filling in the missing color values in the upscaled image with the
value of the coordinate-mapped nearest source pixel.

Both IS and NN preserve the clarity of the original image. Integer
scaling is particularly useful for pixel art games that rely on
sharp, blocky images to deliver their distinctive look.
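
As an illustration of the technique described above (not code from this
patch), a minimal CPU-side sketch of nearest-neighbor integer upscaling
might look as follows. The helper name nn_integer_upscale and the packed
32-bit pixel layout are assumptions made for the example; the hardware
scaler does the equivalent work through its filter coefficients.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Upscale a src_w x src_h image by an integer factor using
 * nearest-neighbor sampling. dst must have room for
 * (src_w * factor) * (src_h * factor) pixels.
 * Hypothetical helper for illustration only; not part of the patch.
 */
static void nn_integer_upscale(const uint32_t *src, size_t src_w, size_t src_h,
			       uint32_t *dst, size_t factor)
{
	size_t dst_w = src_w * factor;
	size_t dst_h = src_h * factor;

	for (size_t y = 0; y < dst_h; y++) {
		for (size_t x = 0; x < dst_w; x++) {
			/* Map the destination pixel back to its source pixel. */
			dst[y * dst_w + x] =
				src[(y / factor) * src_w + (x / factor)];
		}
	}
}
```

Each source pixel becomes a factor x factor block of identical pixels,
which is why no new colors are introduced and edges stay sharp.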

Program the scaler filter coefficients to enable the NN filter if the
scaling filter property is set to DRM_SCALING_FILTER_NEAREST_NEIGHBOR,
and enable integer scaling.

Bspec: 49247

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
---
 drivers/gpu/drm/i915/display/intel_display.c | 83 +++++++++++++++++++-
 drivers/gpu/drm/i915/display/intel_display.h |  2 +
 drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
 3 files changed, 97 insertions(+), 8 deletions(-)

Comments

Daniel Stone Feb. 25, 2020, 7:29 a.m. UTC | #1
Hi,

On Tue, 25 Feb 2020 at 07:17, Pankaj Bharadiya
<pankaj.laxminarayan.bharadiya@intel.com> wrote:
> @@ -415,18 +415,26 @@ skl_program_scaler(struct intel_plane *plane,
>         u16 y_vphase, uv_rgb_vphase;
>         int hscale, vscale;
>         const struct drm_plane_state *state = &plane_state->uapi;
> +       u32 src_w = drm_rect_width(&plane_state->uapi.src) >> 16;
> +       u32 src_h = drm_rect_height(&plane_state->uapi.src) >> 16;
>         u32 scaling_filter = PS_FILTER_MEDIUM;
> +       struct drm_rect dst;
>
>         if (state->scaling_filter == DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
>                 scaling_filter = PS_FILTER_PROGRAMMED;
> +               skl_setup_nearest_neighbor_filter(dev_priv, pipe, scaler_id);
> +
> +               /* Make the scaling window size to integer multiple of source
> +                * TODO: Should userspace take desision to round scaling window
> +                * to integer multiple?
> +                */
> +               crtc_w = rounddown(crtc_w, src_w);
> +               crtc_h = rounddown(crtc_h, src_h);

The kernel should absolutely not be changing the co-ordinates that
userspace requested.
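
If the rounding is left to userspace instead, the calculation could look
roughly like the sketch below. It is only an illustration: the struct
int_scale_rect type and the pick_integer_scale helper are hypothetical
names, and centring the window is just one possible policy.

```c
#include <stdint.h>

/* Hypothetical destination rectangle type for this example. */
struct int_scale_rect {
	uint32_t x, y, w, h;
};

/*
 * Pick the largest destination window that is an integer multiple of
 * the source size and still fits inside the mode, centred on screen.
 * Returns the scale factor, or 0 if the source does not fit at 1x.
 */
static uint32_t pick_integer_scale(uint32_t src_w, uint32_t src_h,
				   uint32_t mode_w, uint32_t mode_h,
				   struct int_scale_rect *dst)
{
	uint32_t fx, fy, f;

	if (src_w == 0 || src_h == 0)
		return 0;

	fx = mode_w / src_w;
	fy = mode_h / src_h;
	f = fx < fy ? fx : fy;
	if (f == 0)
		return 0;

	dst->w = src_w * f;
	dst->h = src_h * f;
	dst->x = (mode_w - dst->w) / 2;
	dst->y = (mode_h - dst->h) / 2;
	return f;
}
```

Userspace would then pass the resulting rectangle as the plane's CRTC_X,
CRTC_Y, CRTC_W and CRTC_H properties, so the kernel programs exactly the
coordinates it was given.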

Cheers,
Daniel
Bharadiya,Pankaj Feb. 28, 2020, 5:50 a.m. UTC | #2
> -----Original Message-----
> From: Daniel Stone <daniel@fooishbar.org>
> Sent: 25 February 2020 13:00
> To: Laxminarayan Bharadiya, Pankaj
> <pankaj.laxminarayan.bharadiya@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>; Daniel Vetter
> <daniel@ffwll.ch>; intel-gfx <intel-gfx@lists.freedesktop.org>; dri-devel
> <dri-devel@lists.freedesktop.org>; Ville Syrjälä
> <ville.syrjala@linux.intel.com>; David Airlie <airlied@linux.ie>; Maarten
> Lankhorst <maarten.lankhorst@linux.intel.com>; tzimmermann@suse.de;
> Maxime Ripard <mripard@kernel.org>; mihail.atanassov@arm.com; Joonas
> Lahtinen <joonas.lahtinen@linux.intel.com>; Vivi, Rodrigo
> <rodrigo.vivi@intel.com>; Chris Wilson <chris@chris-wilson.co.uk>; Souza,
> Jose <jose.souza@intel.com>; De Marchi, Lucas
> <lucas.demarchi@intel.com>; Roper, Matthew D
> <matthew.d.roper@intel.com>; Deak, Imre <imre.deak@intel.com>;
> Shankar, Uma <uma.shankar@intel.com>; Nautiyal, Ankit K
> <ankit.k.nautiyal@intel.com>; Linux Kernel Mailing List <linux-
> kernel@vger.kernel.org>
> Subject: Re: [Intel-gfx] [RFC][PATCH 5/5] drm/i915/display: Add Nearest-
> neighbor based integer scaling support
> 
> Hi,
> 
> On Tue, 25 Feb 2020 at 07:17, Pankaj Bharadiya
> <pankaj.laxminarayan.bharadiya@intel.com> wrote:
> > @@ -415,18 +415,26 @@ skl_program_scaler(struct intel_plane *plane,
> >         u16 y_vphase, uv_rgb_vphase;
> >         int hscale, vscale;
> >         const struct drm_plane_state *state = &plane_state->uapi;
> > +       u32 src_w = drm_rect_width(&plane_state->uapi.src) >> 16;
> > +       u32 src_h = drm_rect_height(&plane_state->uapi.src) >> 16;
> >         u32 scaling_filter = PS_FILTER_MEDIUM;
> > +       struct drm_rect dst;
> >
> >         if (state->scaling_filter ==
> DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
> >                 scaling_filter = PS_FILTER_PROGRAMMED;
> > +               skl_setup_nearest_neighbor_filter(dev_priv, pipe,
> > + scaler_id);
> > +
> > +               /* Make the scaling window size to integer multiple of source
> > +                * TODO: Should userspace take desision to round scaling window
> > +                * to integer multiple?
> > +                */
> > +               crtc_w = rounddown(crtc_w, src_w);
> > +               crtc_h = rounddown(crtc_h, src_h);
> 
> The kernel should absolutely not be changing the co-ordinates that
> userspace requested.

Thanks, will get rid of this in v2.

Thanks,
Pankaj
> 
> Cheers,
> Daniel
Ville Syrjälä March 10, 2020, 4:17 p.m. UTC | #3
On Tue, Feb 25, 2020 at 12:35:45PM +0530, Pankaj Bharadiya wrote:
> Integer scaling (IS) is a nearest-neighbor upscaling technique that
> simply scales up the existing pixels by an integer
> (i.e., whole number) multiplier. Nearest-neighbor (NN) interpolation
> works by filling in the missing color values in the upscaled image
> with that of the coordinate-mapped nearest source pixel value.
> 
> Both IS and NN preserve the clarity of the original image. Integer
> scaling is particularly useful for pixel art games that rely on
> sharp, blocky images to deliver their distinctive look.
> 
> Program the scaler filter coefficients to enable the NN filter if
> scaling filter property is set to DRM_SCALING_FILTER_NEAREST_NEIGHBOR
> and enable integer scaling.
> 
> Bspec: 49247
> 
> Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
> ---
>  drivers/gpu/drm/i915/display/intel_display.c | 83 +++++++++++++++++++-
>  drivers/gpu/drm/i915/display/intel_display.h |  2 +
>  drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
>  3 files changed, 97 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> index b5903ef3c5a0..6d5f59203258 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.c
> +++ b/drivers/gpu/drm/i915/display/intel_display.c
> @@ -6237,6 +6237,73 @@ void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state)
>  		skl_detach_scaler(crtc, i);
>  }
>  
> +/**
> + *  Theory behind setting nearest-neighbor integer scaling:
> + *
> + *  17 phase of 7 taps requires 119 coefficients in 60 dwords per set.
> + *  The letter represents the filter tap (D is the center tap) and the number
> + *  represents the coefficient set for a phase (0-16).
> + *
> + *         +------------+------------------------+------------------------+
> + *         |Index value |Data value coefficient 1|Data value coefficient 2|
> + *         +------------+------------------------+------------------------+
> + *         |   00h      |          B0            |          A0            |
> + *         +------------+------------------------+------------------------+
> + *         |   01h      |          D0            |          C0            |
> + *         +------------+------------------------+------------------------+
> + *         |   02h      |          F0            |          E0            |
> + *         +------------+------------------------+------------------------+
> + *         |   03h      |          A1            |          G0            |
> + *         +------------+------------------------+------------------------+
> + *         |   04h      |          C1            |          B1            |
> + *         +------------+------------------------+------------------------+
> + *         |   ...      |          ...           |          ...           |
> + *         +------------+------------------------+------------------------+
> + *         |   38h      |          B16           |          A16           |
> + *         +------------+------------------------+------------------------+
> + *         |   39h      |          D16           |          C16           |
> + *         +------------+------------------------+------------------------+
> + *         |   3Ah      |          F16           |          E16           |
> + *         +------------+------------------------+------------------------+
> + *         |   3Bh      |        Reserved        |          G16           |
> + *         +------------+------------------------+------------------------+
> + *
> + *  To enable nearest-neighbor scaling:  program scaler coefficents with
> + *  the center tap (Dxx) values set to 1 and all other values set to 0 as per
> + *  SCALER_COEFFICIENT_FORMAT
> + *
> + */
> +void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
> +				  enum pipe pipe, int scaler_id)

skl_scaler_... 

> +{
> +
> +	int coeff = 0;
> +	int phase = 0;
> +	int tap;
> +	int val = 0;

Needlessly wide scope for most of these.

> +
> +	/*enable the index auto increment.*/
> +	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe, scaler_id),
> +			  _PS_COEE_INDEX_AUTO_INC);
> +
> +	for (phase = 0; phase < 17; phase++) {
> +		for (tap = 0; tap < 7; tap++) {
> +			coeff++;

Can be part of the % check.

> +			if (tap == 3)
> +				val = (phase % 2) ? (0x800) : (0x800 << 16);

Parens overload.

> +
> +			if (coeff % 2 == 0) {
> +				intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
> +				val = 0;

Can drop this val=0 if you move the variable into tight scope and
initialize there.

I was trying to think of a more generic way to do this, but couldn't
really think of anything apart from pre-filling the entire coefficient
set and then programming it blindly. And that seems a bit wasteful if we
only care about nearest neighbour.

> +			}
> +
> +		}
> +
> +	}
> +
> +	intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), 0);
> +}
> +
>  static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
>  {
>  	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> @@ -6260,9 +6327,23 @@ static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
>  		pfit_w = (crtc_state->pch_pfit.size >> 16) & 0xFFFF;
>  		pfit_h = crtc_state->pch_pfit.size & 0xFFFF;
>  
> +		id = scaler_state->scaler_id;
> +
>  		if (state->scaling_filter ==
>  		    DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
>  			scaling_filter = PS_FILTER_PROGRAMMED;
> +			skl_setup_nearest_neighbor_filter(dev_priv, pipe, id);

This should be sitting alongside the other register writes.

> +
> +			/* Make the scaling window size to integer multiple of
> +			 * source.
> +			 *
> +			 * TODO: Should userspace take desision to round
> +			 * scaling window to integer multiple?

To give userspace actual control of the pfit window size we need the border
props (or something along those lines). Step 1 is
https://patchwork.freedesktop.org/series/68409/. There are further steps
in my branch after that, but it's still missing the border props for
eDP/LVDS/DSI since I was too lazy to think how they should interact with
the existing scaling mode prop.

> +			 */
> +			pfit_w = rounddown(pfit_w,
> +					   (crtc_state->pipe_src_w << 16));
> +			pfit_h = rounddown(pfit_h,
> +					   (crtc_state->pipe_src_h << 16));
>  		}

This part should be dropped as Daniel mentioned.

>  
>  		hscale = (crtc_state->pipe_src_w << 16) / pfit_w;
> @@ -6271,8 +6352,6 @@ static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
>  		uv_rgb_hphase = skl_scaler_calc_phase(1, hscale, false);
>  		uv_rgb_vphase = skl_scaler_calc_phase(1, vscale, false);
>  
> -		id = scaler_state->scaler_id;
> -
>  		spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
>  
>  		intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id),

I think we should also explicitly indicate here which coefficient set(s)
we're going to use, even if using set0 does mean those bits will be 0.

> diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
> index f92efbbec838..49f58d3c98fe 100644
> --- a/drivers/gpu/drm/i915/display/intel_display.h
> +++ b/drivers/gpu/drm/i915/display/intel_display.h
> @@ -586,6 +586,8 @@ void intel_crtc_arm_fifo_underrun(struct intel_crtc *crtc,
>  u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_center);
>  int skl_update_scaler_crtc(struct intel_crtc_state *crtc_state);
>  void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state);
> +void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
> +				  enum pipe pipe, int scaler_id);
>  void ilk_pfit_disable(const struct intel_crtc_state *old_crtc_state);
>  u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
>  			const struct intel_plane_state *plane_state);
> diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c
> index fd7b31a21723..5bef5c031374 100644
> --- a/drivers/gpu/drm/i915/display/intel_sprite.c
> +++ b/drivers/gpu/drm/i915/display/intel_sprite.c
> @@ -415,18 +415,26 @@ skl_program_scaler(struct intel_plane *plane,
>  	u16 y_vphase, uv_rgb_vphase;
>  	int hscale, vscale;
>  	const struct drm_plane_state *state = &plane_state->uapi;
> +	u32 src_w = drm_rect_width(&plane_state->uapi.src) >> 16;
> +	u32 src_h = drm_rect_height(&plane_state->uapi.src) >> 16;
>  	u32 scaling_filter = PS_FILTER_MEDIUM;
> +	struct drm_rect dst;
>  
>  	if (state->scaling_filter == DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
>  		scaling_filter = PS_FILTER_PROGRAMMED;
> +		skl_setup_nearest_neighbor_filter(dev_priv, pipe, scaler_id);
> +
> +		/* Make the scaling window size to integer multiple of source
> +		 * TODO: Should userspace take desision to round scaling window
> +		 * to integer multiple?
> +		 */
> +		crtc_w = rounddown(crtc_w, src_w);
> +		crtc_h = rounddown(crtc_h, src_h);
>  	}
>  
> -	hscale = drm_rect_calc_hscale(&plane_state->uapi.src,
> -				      &plane_state->uapi.dst,
> -				      0, INT_MAX);
> -	vscale = drm_rect_calc_vscale(&plane_state->uapi.src,
> -				      &plane_state->uapi.dst,
> -				      0, INT_MAX);
> +	drm_rect_init(&dst, crtc_x, crtc_y, crtc_w, crtc_h);

Drop as well.

> +	hscale = drm_rect_calc_hscale(&plane_state->uapi.src, &dst, 0, INT_MAX);
> +	vscale = drm_rect_calc_vscale(&plane_state->uapi.src, &dst, 0, INT_MAX);
>  
>  	/* TODO: handle sub-pixel coordinates */
>  	if (intel_format_info_is_yuv_semiplanar(fb->format, fb->modifier) &&
> -- 
> 2.23.0
Bharadiya,Pankaj March 12, 2020, 9:13 a.m. UTC | #4
> -----Original Message-----
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Sent: 10 March 2020 21:47
> To: Laxminarayan Bharadiya, Pankaj
> <pankaj.laxminarayan.bharadiya@intel.com>
> Cc: jani.nikula@linux.intel.com; daniel@ffwll.ch; intel-
> gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; airlied@linux.ie;
> maarten.lankhorst@linux.intel.com; tzimmermann@suse.de;
> mripard@kernel.org; mihail.atanassov@arm.com; Joonas Lahtinen
> <joonas.lahtinen@linux.intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>;
> Chris Wilson <chris@chris-wilson.co.uk>; Souza, Jose
> <jose.souza@intel.com>; De Marchi, Lucas <lucas.demarchi@intel.com>;
> Roper, Matthew D <matthew.d.roper@intel.com>; Deak, Imre
> <imre.deak@intel.com>; Shankar, Uma <uma.shankar@intel.com>; linux-
> kernel@vger.kernel.org; Nautiyal, Ankit K <ankit.k.nautiyal@intel.com>
> Subject: Re: [RFC][PATCH 5/5] drm/i915/display: Add Nearest-neighbor
> based integer scaling support
> 
> On Tue, Feb 25, 2020 at 12:35:45PM +0530, Pankaj Bharadiya wrote:
> > Integer scaling (IS) is a nearest-neighbor upscaling technique that
> > simply scales up the existing pixels by an integer (i.e., whole
> > number) multiplier. Nearest-neighbor (NN) interpolation works by
> > filling in the missing color values in the upscaled image with that of
> > the coordinate-mapped nearest source pixel value.
> >
> > Both IS and NN preserve the clarity of the original image. Integer
> > scaling is particularly useful for pixel art games that rely on sharp,
> > blocky images to deliver their distinctive look.
> >
> > Program the scaler filter coefficients to enable the NN filter if
> > scaling filter property is set to DRM_SCALING_FILTER_NEAREST_NEIGHBOR
> > and enable integer scaling.
> >
> > Bspec: 49247
> >
> > Signed-off-by: Pankaj Bharadiya
> > <pankaj.laxminarayan.bharadiya@intel.com>
> > Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
> > ---
> >  drivers/gpu/drm/i915/display/intel_display.c | 83
> > +++++++++++++++++++-  drivers/gpu/drm/i915/display/intel_display.h |
> > 2 +  drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
> >  3 files changed, 97 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > b/drivers/gpu/drm/i915/display/intel_display.c
> > index b5903ef3c5a0..6d5f59203258 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > @@ -6237,6 +6237,73 @@ void skl_scaler_disable(const struct
> intel_crtc_state *old_crtc_state)
> >  		skl_detach_scaler(crtc, i);
> >  }
> >
> > +/**
> > + *  Theory behind setting nearest-neighbor integer scaling:
> > + *
> > + *  17 phase of 7 taps requires 119 coefficients in 60 dwords per set.
> > + *  The letter represents the filter tap (D is the center tap) and
> > +the number
> > + *  represents the coefficient set for a phase (0-16).
> > + *
> > + *         +------------+------------------------+------------------------+
> > + *         |Index value | Data value coeffient 1 | Data value coeffient 2 |
> > + *         +------------+------------------------+------------------------+
> > + *         |   00h      |          B0            |          A0            |
> > + *         +------------+------------------------+------------------------+
> > + *         |   01h      |          D0            |          C0            |
> > + *         +------------+------------------------+------------------------+
> > + *         |   02h      |          F0            |          E0            |
> > + *         +------------+------------------------+------------------------+
> > + *         |   03h      |          A1            |          G0            |
> > + *         +------------+------------------------+------------------------+
> > + *         |   04h      |          C1            |          B1            |
> > + *         +------------+------------------------+------------------------+
> > + *         |   ...      |          ...           |          ...           |
> > + *         +------------+------------------------+------------------------+
> > + *         |   38h      |          B16           |          A16           |
> > + *         +------------+------------------------+------------------------+
> > + *         |   39h      |          D16           |          C16           |
> > + *         +------------+------------------------+------------------------+
> > + *         |   3Ah      |          F16           |          C16           |
> > + *         +------------+------------------------+------------------------+
> > + *         |   3Bh      |        Reserved        |          G16           |
> > + *         +------------+------------------------+------------------------+
> > + *
> > + *  To enable nearest-neighbor scaling:  program scaler coefficents
> > +with
> > + *  the center tap (Dxx) values set to 1 and all other values set to
> > +0 as per
> > + *  SCALER_COEFFICIENT_FORMAT
> > + *
> > + */
> > +void skl_setup_nearest_neighbor_filter(struct drm_i915_private
> *dev_priv,
> > +				  enum pipe pipe, int scaler_id)
> 
> skl_scaler_...
> 
> > +{
> > +
> > +	int coeff = 0;
> > +	int phase = 0;
> > +	int tap;
> > +	int val = 0;
> 
> Needlessly wide scope for most of these.
> 
> > +
> > +	/*enable the index auto increment.*/
> > +	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe,
> scaler_id),
> > +			  _PS_COEE_INDEX_AUTO_INC);
> > +
> > +	for (phase = 0; phase < 17; phase++) {
> > +		for (tap = 0; tap < 7; tap++) {
> > +			coeff++;
> 
> Can be part of the % check.

OK.

> 
> > +			if (tap == 3)
> > +				val = (phase % 2) ? (0x800) : (0x800 << 16);
> 
> Parens overload.

OK. Will remove.
> 
> > +
> > +			if (coeff % 2 == 0) {
> > +				intel_de_write_fw(dev_priv,
> SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
> > +				val = 0;
> 
> Can drop this val=0 if you move the variable into tight scope and initialize
> there.

Moving the val = 0 initialization into the tight scope will not work
here, as we need to retain "val" and write it only when 2 coefficients
are ready (since 2 coefficients are packed into 1 dword).

E.g. for the (12th, 11th) coefficients, the coefficient register value
should be ((0 << 16) | 0x800). If we initialize val = 0 inside the tight
loop, 0 will be written to the coefficient register.
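
To make the packing concrete, here is a small standalone sketch (not
driver code) that computes the dword for a given register index using
the values from the patch. The names NN_COEF_ONE, nn_coef and
nn_pack_dword are made up for the example, and coefficients are counted
from 0 here, so the "11th" coefficient above is index 10.

```c
#include <stdint.h>

/* Value the patch programs for the center tap; all other taps are 0. */
#define NN_COEF_ONE 0x800u

/* Coefficient n (0-based) of a 17-phase, 7-tap set: tap index is n % 7. */
static uint16_t nn_coef(unsigned int n)
{
	return (n % 7 == 3) ? NN_COEF_ONE : 0;
}

/*
 * Dword d holds coefficient 2*d in its low 16 bits and coefficient
 * 2*d + 1 in its high 16 bits. The last dword only has a low half
 * (119 coefficients in 60 dwords), so its high half is left at 0.
 */
static uint32_t nn_pack_dword(unsigned int d)
{
	uint32_t lo = nn_coef(2 * d);
	uint32_t hi = (2 * d + 1 < 17 * 7) ? nn_coef(2 * d + 1) : 0;

	return lo | (hi << 16);
}
```

With this layout, dword 5 (coefficients 10 and 11) comes out as 0x800,
and dword 1 (coefficients 2 and 3) as 0x800 << 16, matching the two
cases the loop in the patch distinguishes with phase % 2.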

> 
> I was trying to think of a more generic way to do this, but couldn't really
> think of anything apart from pre-filling the entire coefficient set and then
> programming it blindly. And that seems a bit wasteful if we only care about
> nearest neighbour.
> 
> > +			}
> > +
> > +		}
> > +
> > +	}
> > +
> > +	intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe,
> scaler_id),
> > +0); }
> > +
> >  static void skl_pfit_enable(const struct intel_crtc_state
> > *crtc_state)  {
> >  	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
> > @@ -6260,9 +6327,23 @@ static void skl_pfit_enable(const struct
> intel_crtc_state *crtc_state)
> >  		pfit_w = (crtc_state->pch_pfit.size >> 16) & 0xFFFF;
> >  		pfit_h = crtc_state->pch_pfit.size & 0xFFFF;
> >
> > +		id = scaler_state->scaler_id;
> > +
> >  		if (state->scaling_filter ==
> >  		    DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
> >  			scaling_filter = PS_FILTER_PROGRAMMED;
> > +			skl_setup_nearest_neighbor_filter(dev_priv, pipe,
> id);
> 
> This should be sitting alongside the other register writes.

I missed this, thanks for pointing out.

> 
> > +
> > +			/* Make the scaling window size to integer multiple
> of
> > +			 * source.
> > +			 *
> > +			 * TODO: Should userspace take desision to round
> > +			 * scaling window to integer multiple?
> 
> To give userspace actual control of the pfit window size we need the border
> props (or something along those lines). Step 1 is
> https://patchwork.freedesktop.org/series/68409/. There are further steps in
> my branch after that, but it's still missing the border props for eDP/LVDS/DSI
> since I was too lazy to think how they should interact with the existing scaling
> mode prop.
> 
> > +			 */
> > +			pfit_w = rounddown(pfit_w,
> > +					   (crtc_state->pipe_src_w << 16));
> > +			pfit_h = rounddown(pfit_h,
> > +					   (crtc_state->pipe_src_h << 16));
> >  		}
> 
> This part should be dropped as Daniel mentioned.

Will remove.

Thanks,
Pankaj

> 
> >
> >  		hscale = (crtc_state->pipe_src_w << 16) / pfit_w; @@ -
> 6271,8
> > +6352,6 @@ static void skl_pfit_enable(const struct intel_crtc_state
> *crtc_state)
> >  		uv_rgb_hphase = skl_scaler_calc_phase(1, hscale, false);
> >  		uv_rgb_vphase = skl_scaler_calc_phase(1, vscale, false);
> >
> > -		id = scaler_state->scaler_id;
> > -
> >  		spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
> >
> >  		intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id),
> 
> I think we should also explicitly indicate here which coefficient set(s) we're
> going to use, even if using set0 does mean those bits will be 0.
> 
> > diff --git a/drivers/gpu/drm/i915/display/intel_display.h
> > b/drivers/gpu/drm/i915/display/intel_display.h
> > index f92efbbec838..49f58d3c98fe 100644
> > --- a/drivers/gpu/drm/i915/display/intel_display.h
> > +++ b/drivers/gpu/drm/i915/display/intel_display.h
> > @@ -586,6 +586,8 @@ void intel_crtc_arm_fifo_underrun(struct
> > intel_crtc *crtc,
> >  u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_center);
> > int skl_update_scaler_crtc(struct intel_crtc_state *crtc_state);  void
> > skl_scaler_disable(const struct intel_crtc_state *old_crtc_state);
> > +void skl_setup_nearest_neighbor_filter(struct drm_i915_private
> *dev_priv,
> > +				  enum pipe pipe, int scaler_id);
> >  void ilk_pfit_disable(const struct intel_crtc_state *old_crtc_state);
> >  u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
> >  			const struct intel_plane_state *plane_state); diff --
> git
> > a/drivers/gpu/drm/i915/display/intel_sprite.c
> > b/drivers/gpu/drm/i915/display/intel_sprite.c
> > index fd7b31a21723..5bef5c031374 100644
> > --- a/drivers/gpu/drm/i915/display/intel_sprite.c
> > +++ b/drivers/gpu/drm/i915/display/intel_sprite.c
> > @@ -415,18 +415,26 @@ skl_program_scaler(struct intel_plane *plane,
> >  	u16 y_vphase, uv_rgb_vphase;
> >  	int hscale, vscale;
> >  	const struct drm_plane_state *state = &plane_state->uapi;
> > +	u32 src_w = drm_rect_width(&plane_state->uapi.src) >> 16;
> > +	u32 src_h = drm_rect_height(&plane_state->uapi.src) >> 16;
> >  	u32 scaling_filter = PS_FILTER_MEDIUM;
> > +	struct drm_rect dst;
> >
> >  	if (state->scaling_filter ==
> DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
> >  		scaling_filter = PS_FILTER_PROGRAMMED;
> > +		skl_setup_nearest_neighbor_filter(dev_priv, pipe,
> scaler_id);
> > +
> > +		/* Make the scaling window size to integer multiple of source
> > +		 * TODO: Should userspace take desision to round scaling
> window
> > +		 * to integer multiple?
> > +		 */
> > +		crtc_w = rounddown(crtc_w, src_w);
> > +		crtc_h = rounddown(crtc_h, src_h);
> >  	}
> >
> > -	hscale = drm_rect_calc_hscale(&plane_state->uapi.src,
> > -				      &plane_state->uapi.dst,
> > -				      0, INT_MAX);
> > -	vscale = drm_rect_calc_vscale(&plane_state->uapi.src,
> > -				      &plane_state->uapi.dst,
> > -				      0, INT_MAX);
> > +	drm_rect_init(&dst, crtc_x, crtc_y, crtc_w, crtc_h);
> 
> Drop as well.
> 
> > +	hscale = drm_rect_calc_hscale(&plane_state->uapi.src, &dst, 0,
> INT_MAX);
> > +	vscale = drm_rect_calc_vscale(&plane_state->uapi.src, &dst, 0,
> > +INT_MAX);
> >
> >  	/* TODO: handle sub-pixel coordinates */
> >  	if (intel_format_info_is_yuv_semiplanar(fb->format, fb->modifier)
> &&
> > --
> > 2.23.0
> 
> --
> Ville Syrjälä
> Intel
Ville Syrjälä March 12, 2020, 1:54 p.m. UTC | #5
On Thu, Mar 12, 2020 at 09:13:24AM +0000, Laxminarayan Bharadiya, Pankaj wrote:
> 
> 
> > -----Original Message-----
> > From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> > Sent: 10 March 2020 21:47
> > To: Laxminarayan Bharadiya, Pankaj
> > <pankaj.laxminarayan.bharadiya@intel.com>
> > Cc: jani.nikula@linux.intel.com; daniel@ffwll.ch; intel-
> > gfx@lists.freedesktop.org; dri-devel@lists.freedesktop.org; airlied@linux.ie;
> > maarten.lankhorst@linux.intel.com; tzimmermann@suse.de;
> > mripard@kernel.org; mihail.atanassov@arm.com; Joonas Lahtinen
> > <joonas.lahtinen@linux.intel.com>; Vivi, Rodrigo <rodrigo.vivi@intel.com>;
> > Chris Wilson <chris@chris-wilson.co.uk>; Souza, Jose
> > <jose.souza@intel.com>; De Marchi, Lucas <lucas.demarchi@intel.com>;
> > Roper, Matthew D <matthew.d.roper@intel.com>; Deak, Imre
> > <imre.deak@intel.com>; Shankar, Uma <uma.shankar@intel.com>; linux-
> > kernel@vger.kernel.org; Nautiyal, Ankit K <ankit.k.nautiyal@intel.com>
> > Subject: Re: [RFC][PATCH 5/5] drm/i915/display: Add Nearest-neighbor
> > based integer scaling support
> > 
> > On Tue, Feb 25, 2020 at 12:35:45PM +0530, Pankaj Bharadiya wrote:
> > > Integer scaling (IS) is a nearest-neighbor upscaling technique that
> > > simply scales up the existing pixels by an integer (i.e., whole
> > > number) multiplier. Nearest-neighbor (NN) interpolation works by
> > > filling in the missing color values in the upscaled image with that of
> > > the coordinate-mapped nearest source pixel value.
> > >
> > > Both IS and NN preserve the clarity of the original image. Integer
> > > scaling is particularly useful for pixel art games that rely on sharp,
> > > blocky images to deliver their distinctive look.
> > >
> > > Program the scaler filter coefficients to enable the NN filter if
> > > scaling filter property is set to DRM_SCALING_FILTER_NEAREST_NEIGHBOR
> > > and enable integer scaling.
> > >
> > > Bspec: 49247
> > >
> > > Signed-off-by: Pankaj Bharadiya
> > > <pankaj.laxminarayan.bharadiya@intel.com>
> > > Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/display/intel_display.c | 83
> > > +++++++++++++++++++-  drivers/gpu/drm/i915/display/intel_display.h |
> > > 2 +  drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
> > >  3 files changed, 97 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c
> > > b/drivers/gpu/drm/i915/display/intel_display.c
> > > index b5903ef3c5a0..6d5f59203258 100644
> > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > @@ -6237,6 +6237,73 @@ void skl_scaler_disable(const struct
> > intel_crtc_state *old_crtc_state)
> > >  		skl_detach_scaler(crtc, i);
> > >  }
> > >
> > > +/**
> > > + *  Theory behind setting nearest-neighbor integer scaling:
> > > + *
> > > + *  17 phase of 7 taps requires 119 coefficients in 60 dwords per set.
> > > + *  The letter represents the filter tap (D is the center tap) and
> > > +the number
> > > + *  represents the coefficient set for a phase (0-16).
> > > + *
> > > + *         +------------+------------------------+------------------------+
> > > + *         |Index value | Data value coeffient 1 | Data value coeffient 2 |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   00h      |          B0            |          A0            |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   01h      |          D0            |          C0            |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   02h      |          F0            |          E0            |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   03h      |          A1            |          G0            |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   04h      |          C1            |          B1            |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   ...      |          ...           |          ...           |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   38h      |          B16           |          A16           |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   39h      |          D16           |          C16           |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   3Ah      |          F16           |          C16           |
> > > + *         +------------+------------------------+------------------------+
> > > + *         |   3Bh      |        Reserved        |          G16           |
> > > + *         +------------+------------------------+------------------------+
> > > + *
> > > + *  To enable nearest-neighbor scaling:  program scaler coefficents
> > > +with
> > > + *  the center tap (Dxx) values set to 1 and all other values set to
> > > +0 as per
> > > + *  SCALER_COEFFICIENT_FORMAT
> > > + *
> > > + */
> > > +void skl_setup_nearest_neighbor_filter(struct drm_i915_private
> > *dev_priv,
> > > +				  enum pipe pipe, int scaler_id)
> > 
> > skl_scaler_...
> > 
> > > +{
> > > +
> > > +	int coeff = 0;
> > > +	int phase = 0;
> > > +	int tap;
> > > +	int val = 0;
> > 
> > Needlessly wide scope for most of these.
> > 
> > > +
> > > +	/*enable the index auto increment.*/
> > > +	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe, scaler_id),
> > > +			  _PS_COEE_INDEX_AUTO_INC);
> > > +
> > > +	for (phase = 0; phase < 17; phase++) {
> > > +		for (tap = 0; tap < 7; tap++) {
> > > +			coeff++;
> > 
> > Can be part of the % check.
> 
> OK.
> 
> > 
> > > +			if (tap == 3)
> > > +				val = (phase % 2) ? (0x800) : (0x800 << 16);
> > 
> > Parens overload.
> 
> OK. Will remove.
> > 
> > > +
> > > +			if (coeff % 2 == 0) {
> > > +				intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
> > > +				val = 0;
> > 
> > Can drop this val=0 if you move the variable into tight scope and initialize
> > there.
> 
> Moving val=0 initialization to the tight scope will not work here as we need
> to retain "val" and write only when 2 coefficients are ready (since 2 
> coefficients are packed in 1 dword).
> 
> e.g. for (12th , 11th)  coefficients, coefficient reg value should be ( (0 << 16) | 0x800).
> If we initialize val = 0 in tight loop, 0 will be written to  coefficient register.

Hmm, right. I guess I'd try to rearrange this to iterate the
registers directly instead of the phases and taps. Something
like this perhaps:

static int cnl_coef_tap(int i)
{
	return i % 7;
}

static u16 cnl_coef(int t)
{
	return t == 3 ? 0x0800 : 0x3000;
}

static void cnl_program_nearest_filter_coefs(void)
{
	int i;

	for (i = 0; i < 17 * 7; i += 2) {
		uint32_t tmp;
		int t;

		t = cnl_coef_tap(i);
		tmp = cnl_nearest_filter_coef(t);

		t = cnl_coef_tap(i + 1);
		tmp |= cnl_nearest_filter_coef(t) << 16;

		intel_de_write_fw(tmp);
	}
}

More readable I think. The downside being all those modulo operations
but hopefully that's all in the noise when it comes to performance.
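
For readers who want to try the register-order loop off-target, here is a
minimal, self-contained sketch of the suggestion above. It is not the driver
code: the register write is replaced by a store into an array, the helper is
named cnl_nearest_filter_coef throughout, and nearest_filter_dwords() plus the
NUM_* macros are hypothetical names introduced only for this example. The
0x0800 and 0x3000 values are copied from the suggestion as-is.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PHASES 17
#define NUM_TAPS   7
/* 17 * 7 = 119 coefficients, two per dword, rounded up to 60 dwords. */
#define NUM_DWORDS ((NUM_PHASES * NUM_TAPS + 1) / 2)

/* Tap index (0 = A ... 6 = G) of the i-th coefficient in register order. */
static int cnl_coef_tap(int i)
{
	return i % NUM_TAPS;
}

/* Values taken from the thread: 0x0800 for the center tap (D, index 3),
 * 0x3000 for every other tap. */
static uint16_t cnl_nearest_filter_coef(int t)
{
	return t == 3 ? 0x0800 : 0x3000;
}

/* Hypothetical stand-in for the register writes: stores each packed dword
 * in out[] in the order it would be written and returns the dword count. */
static size_t nearest_filter_dwords(uint32_t out[NUM_DWORDS])
{
	size_t n = 0;
	int i;

	for (i = 0; i < NUM_PHASES * NUM_TAPS; i += 2) {
		uint32_t tmp;

		/* Even coefficient goes in the low half, odd one in the high half. */
		tmp = cnl_nearest_filter_coef(cnl_coef_tap(i));
		tmp |= (uint32_t)cnl_nearest_filter_coef(cnl_coef_tap(i + 1)) << 16;

		out[n++] = tmp;
	}

	assert(n == NUM_DWORDS);
	return n;
}
```

With these values, dword 01h comes out as 0x08003000 (D0 in the high half)
and dword 05h as 0x30000800 (D1 in the low half). The last iteration also
fills the high half of dword 3Bh, which the table in the patch marks as
Reserved; whether that matters on real hardware is not settled in this thread.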
Bharadiya,Pankaj March 13, 2020, 8:45 a.m. UTC | #6
> On 12 March 2020 19:25, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> 
> On Thu, Mar 12, 2020 at 09:13:24AM +0000, Laxminarayan Bharadiya, Pankaj
> wrote:
> >
> >
> > > On 10 March 2020 21:47, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > >
> > > On Tue, Feb 25, 2020 at 12:35:45PM +0530, Pankaj Bharadiya wrote:
> > > > Integer scaling (IS) is a nearest-neighbor upscaling technique that
> > > > simply scales up the existing pixels by an integer
> > > > (i.e., whole number) multiplier. Nearest-neighbor (NN) interpolation
> > > > works by filling in the missing color values in the upscaled image
> > > > with that of the coordinate-mapped nearest source pixel value.
> > > >
> > > > Both IS and NN preserve the clarity of the original image. Integer
> > > > scaling is particularly useful for pixel art games that rely on
> > > > sharp, blocky images to deliver their distinctive look.
> > > >
> > > > Program the scaler filter coefficients to enable the NN filter if
> > > > scaling filter property is set to
> > > > DRM_SCALING_FILTER_NEAREST_NEIGHBOR
> > > > and enable integer scaling.
> > > >
> > > > Bspec: 49247
> > > >
> > > > Signed-off-by: Pankaj Bharadiya
> > > > <pankaj.laxminarayan.bharadiya@intel.com>
> > > > Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
> > > > ---
> > > >  drivers/gpu/drm/i915/display/intel_display.c | 83 +++++++++++++++++++-
> > > >  drivers/gpu/drm/i915/display/intel_display.h |  2 +
> > > >  drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
> > > >  3 files changed, 97 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > > > index b5903ef3c5a0..6d5f59203258 100644
> > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > @@ -6237,6 +6237,73 @@ void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state)
> > > >  		skl_detach_scaler(crtc, i);
> > > >  }
> > > >
> > > > +/**
> > > > + *  Theory behind setting nearest-neighbor integer scaling:
> > > > + *
> > > > + *  17 phases of 7 taps require 119 coefficients in 60 dwords per set.
> > > > + *  The letter represents the filter tap (D is the center tap) and the number
> > > > + *  represents the coefficient set for a phase (0-16).
> > > > + *
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |Index value |Data value coefficient 1|Data value coefficient 2|
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   00h      |          B0            |          A0            |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   01h      |          D0            |          C0            |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   02h      |          F0            |          E0            |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   03h      |          A1            |          G0            |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   04h      |          C1            |          B1            |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   ...      |          ...           |          ...           |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   38h      |          B16           |          A16           |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   39h      |          D16           |          C16           |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   3Ah      |          F16           |          E16           |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *         |   3Bh      |        Reserved        |          G16           |
> > > > + *         +------------+------------------------+------------------------+
> > > > + *
> > > > + *  To enable nearest-neighbor scaling:  program scaler coefficients with
> > > > + *  the center tap (Dxx) values set to 1 and all other values set to 0 as per
> > > > + *  SCALER_COEFFICIENT_FORMAT
> > > > + *
> > > > + */
> > > > +void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
> > > > +				  enum pipe pipe, int scaler_id)
> > >
> > > skl_scaler_...
> > >
> > > > +{
> > > > +
> > > > +	int coeff = 0;
> > > > +	int phase = 0;
> > > > +	int tap;
> > > > +	int val = 0;
> > >
> > > Needlessly wide scope for most of these.
> > >
> > > > +
> > > > +	/*enable the index auto increment.*/
> > > > +	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe, scaler_id),
> > > > +			  _PS_COEE_INDEX_AUTO_INC);
> > > > +
> > > > +	for (phase = 0; phase < 17; phase++) {
> > > > +		for (tap = 0; tap < 7; tap++) {
> > > > +			coeff++;
> > >
> > > Can be part of the % check.
> >
> > OK.
> >
> > >
> > > > +			if (tap == 3)
> > > > +				val = (phase % 2) ? (0x800) : (0x800 << 16);
> > >
> > > Parens overload.
> >
> > OK. Will remove.
> > >
> > > > +
> > > > +			if (coeff % 2 == 0) {
> > > > +				intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
> > > > +				val = 0;
> > >
> > > Can drop this val=0 if you move the variable into tight scope and
> > > initialize there.
> >
> > Moving val=0 initialization to the tight scope will not work here as
> > we need to retain "val" and write only when 2 coefficients are ready
> > (since 2 coefficients are packed in 1 dword).
> >
> > e.g. for (12th , 11th)  coefficients, coefficient reg value should be ( (0 << 16) | 0x800).
> > If we initialize val = 0 in tight loop, 0 will be written to  coefficient register.
> 
> Hmm, right. I guess I'd try to rearrange this to iterate the registers directly
> instead of the phases and taps. Something like this perhaps:
> 
> static int cnl_coef_tap(int i)
> {
> 	return i % 7;
> }
> 
> static u16 cnl_coef(int t)

cnl_coef -> cnl_nearest_filter_coef.  Right?

> {
> 	return t == 3 ? 0x0800 : 0x3000;
> }
> 
> static void cnl_program_nearest_filter_coefs(void)
> {
> 	int i;
> 
> 	for (i = 0; i < 17 * 7; i += 2) {
> 		uint32_t tmp;
> 		int t;
> 
> 		t = cnl_coef_tap(i);
> 		tmp = cnl_nearest_filter_coef(t);
> 
> 		t = cnl_coef_tap(i + 1);
> 		tmp |= cnl_nearest_filter_coef(t) << 16;
> 
> 		intel_de_write_fw(tmp);
> 	}
> }
> 
> More readable I think. The downside being all those modulo operations but
> hopefully that's all in the noise when it comes to performance.

Looks better, thanks for spending time on this.
I will try this out.

Thanks,
Pankaj 
> 
> --
> Ville Syrjälä
> Intel
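
The packing argument quoted above (two 16-bit coefficients share one dword,
so val has to survive until the pair is complete) can be checked off-target
with a small simulation of the patch's phase/tap loop. This is only a sketch:
nn_pack_by_phase_and_tap() and the NN_* macros are hypothetical names, and the
register write is replaced by a store into an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NN_PHASES 17
#define NN_TAPS   7
#define NN_DWORDS 60

/* Mirrors the nested loop from the patch, but records each value that
 * would be written to the data register in out[] instead. */
static size_t nn_pack_by_phase_and_tap(uint32_t out[NN_DWORDS])
{
	size_t n = 0;
	uint32_t val = 0;
	int coeff = 0;
	int phase, tap;

	for (phase = 0; phase < NN_PHASES; phase++) {
		for (tap = 0; tap < NN_TAPS; tap++) {
			coeff++;

			/* Center tap: low half for odd phases, high half for even. */
			if (tap == 3)
				val = (phase % 2) ? 0x800u : 0x800u << 16;

			/* Emit only once both halves of the dword are known. */
			if (coeff % 2 == 0) {
				out[n++] = val;
				val = 0;
			}
		}
	}

	/* 119 is odd: the patch finishes with one extra write of 0 for G16. */
	out[n++] = 0;

	assert(n == NN_DWORDS);
	return n;
}
```

In this simulation the 11th and 12th coefficients land in dword 05h as
0x00000800, matching the example given in the reply, while phase 0 puts its
center tap in the high half of dword 01h (0x08000000). Resetting val at the
top of every inner iteration instead would lose the phase-1 value before it
is written.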
Ville Syrjälä March 13, 2020, 7:53 p.m. UTC | #7
On Fri, Mar 13, 2020 at 08:45:35AM +0000, Laxminarayan Bharadiya, Pankaj wrote:
> 
> 
> > On 12 March 2020 19:25, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > 
> > On Thu, Mar 12, 2020 at 09:13:24AM +0000, Laxminarayan Bharadiya, Pankaj
> > wrote:
> > >
> > >
> > > > On 10 March 2020 21:47, Ville Syrjälä <ville.syrjala@linux.intel.com> wrote:
> > > >
> > > > On Tue, Feb 25, 2020 at 12:35:45PM +0530, Pankaj Bharadiya wrote:
> > > > > Integer scaling (IS) is a nearest-neighbor upscaling technique that
> > > > > simply scales up the existing pixels by an integer
> > > > > (i.e., whole number) multiplier. Nearest-neighbor (NN) interpolation
> > > > > works by filling in the missing color values in the upscaled image
> > > > > with that of the coordinate-mapped nearest source pixel value.
> > > > >
> > > > > Both IS and NN preserve the clarity of the original image. Integer
> > > > > scaling is particularly useful for pixel art games that rely on
> > > > > sharp, blocky images to deliver their distinctive look.
> > > > >
> > > > > Program the scaler filter coefficients to enable the NN filter if
> > > > > scaling filter property is set to
> > > > > DRM_SCALING_FILTER_NEAREST_NEIGHBOR
> > > > > and enable integer scaling.
> > > > >
> > > > > Bspec: 49247
> > > > >
> > > > > Signed-off-by: Pankaj Bharadiya
> > > > > <pankaj.laxminarayan.bharadiya@intel.com>
> > > > > Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/display/intel_display.c | 83 +++++++++++++++++++-
> > > > >  drivers/gpu/drm/i915/display/intel_display.h |  2 +
> > > > >  drivers/gpu/drm/i915/display/intel_sprite.c  | 20 +++--
> > > > >  3 files changed, 97 insertions(+), 8 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > index b5903ef3c5a0..6d5f59203258 100644
> > > > > --- a/drivers/gpu/drm/i915/display/intel_display.c
> > > > > +++ b/drivers/gpu/drm/i915/display/intel_display.c
> > > > > @@ -6237,6 +6237,73 @@ void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state)
> > > > >  		skl_detach_scaler(crtc, i);
> > > > >  }
> > > > >
> > > > > +/**
> > > > > + *  Theory behind setting nearest-neighbor integer scaling:
> > > > > + *
> > > > > + *  17 phases of 7 taps require 119 coefficients in 60 dwords per set.
> > > > > + *  The letter represents the filter tap (D is the center tap) and the number
> > > > > + *  represents the coefficient set for a phase (0-16).
> > > > > + *
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |Index value |Data value coefficient 1|Data value coefficient 2|
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   00h      |          B0            |          A0            |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   01h      |          D0            |          C0            |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   02h      |          F0            |          E0            |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   03h      |          A1            |          G0            |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   04h      |          C1            |          B1            |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   ...      |          ...           |          ...           |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   38h      |          B16           |          A16           |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   39h      |          D16           |          C16           |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   3Ah      |          F16           |          E16           |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *         |   3Bh      |        Reserved        |          G16           |
> > > > > + *         +------------+------------------------+------------------------+
> > > > > + *
> > > > > + *  To enable nearest-neighbor scaling:  program scaler coefficients with
> > > > > + *  the center tap (Dxx) values set to 1 and all other values set to 0 as per
> > > > > + *  SCALER_COEFFICIENT_FORMAT
> > > > > + *
> > > > > + */
> > > > > +void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
> > > > > +				  enum pipe pipe, int scaler_id)
> > > >
> > > > skl_scaler_...
> > > >
> > > > > +{
> > > > > +
> > > > > +	int coeff = 0;
> > > > > +	int phase = 0;
> > > > > +	int tap;
> > > > > +	int val = 0;
> > > >
> > > > Needlessly wide scope for most of these.
> > > >
> > > > > +
> > > > > +	/*enable the index auto increment.*/
> > > > > +	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe, scaler_id),
> > > > > +			  _PS_COEE_INDEX_AUTO_INC);
> > > > > +
> > > > > +	for (phase = 0; phase < 17; phase++) {
> > > > > +		for (tap = 0; tap < 7; tap++) {
> > > > > +			coeff++;
> > > >
> > > > Can be part of the % check.
> > >
> > > OK.
> > >
> > > >
> > > > > +			if (tap == 3)
> > > > > +				val = (phase % 2) ? (0x800) : (0x800 << 16);
> > > >
> > > > Parens overload.
> > >
> > > OK. Will remove.
> > > >
> > > > > +
> > > > > +			if (coeff % 2 == 0) {
> > > > > +				intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
> > > > > +				val = 0;
> > > >
> > > > Can drop this val=0 if you move the variable into tight scope and
> > > > initialize there.
> > >
> > > Moving val=0 initialization to the tight scope will not work here as
> > > we need to retain "val" and write only when 2 coefficients are ready
> > > (since 2 coefficients are packed in 1 dword).
> > >
> > > e.g. for (12th , 11th)  coefficients, coefficient reg value should be ( (0 << 16) | 0x800).
> > > If we initialize val = 0 in tight loop, 0 will be written to  coefficient register.
> > 
> > Hmm, right. I guess I'd try to rearrange this to iterate the registers directly
> > instead of the phases and taps. Something like this perhaps:
> > 
> > static int cnl_coef_tap(int i)
> > {
> > 	return i % 7;
> > }
> > 
> > static u16 cnl_coef(int t)
> 
> cnl_coef -> cnl_nearest_filter_coef.  Right?

Right.

> 
> > {
> > 	return t == 3 ? 0x0800 : 0x3000;
> > }
> > 
> > static void cnl_program_nearest_filter_coefs(void)
> > {
> > 	int i;
> > 
> > 	for (i = 0; i < 17 * 7; i += 2) {
> > 		uint32_t tmp;
> > 		int t;
> > 
> > 		t = cnl_coef_tap(i);
> > 		tmp = cnl_nearest_filter_coef(t);
> > 
> > 		t = cnl_coef_tap(i + 1);
> > 		tmp |= cnl_nearest_filter_coef(t) << 16;
> > 
> > 		intel_de_write_fw(tmp);
> > 	}
> > }
> > 
> > More readable I think. The downside being all those modulo operations but
> > hopefully that's all in the noise when it comes to performance.
> 
> Looks better, thanks for spending time on this.
> I will try this out.
> 
> Thanks,
> Pankaj 
> > 
> > --
> > Ville Syrjälä
> > Intel
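
The patch below also rounds the destination window down to an integer
multiple of the source size with rounddown(). A minimal sketch of that
arithmetic, assuming both sizes are in whole pixels; the helper names
round_down_to_multiple() and integer_scale_factor() are hypothetical and not
part of the patch:

```c
#include <stdint.h>

/* Largest multiple of src that does not exceed dst.
 * Returns 0 when dst < src, or when src is 0 (guarded to avoid a
 * division by zero in this standalone example). */
static uint32_t round_down_to_multiple(uint32_t dst, uint32_t src)
{
	if (src == 0)
		return 0;

	return dst - dst % src;
}

/* Integer factor by which src is scaled once dst has been rounded down. */
static uint32_t integer_scale_factor(uint32_t dst, uint32_t src)
{
	return src ? round_down_to_multiple(dst, src) / src : 0;
}
```

For a 640x480 source and a 1920x1080 window this gives 1920x960, i.e. 3x
horizontally and 2x vertically. Each axis is rounded independently, as in the
patch, so the two factors can differ. A window smaller than the source rounds
down to 0, which a real caller would have to reject.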

Patch
diff mbox series

diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index b5903ef3c5a0..6d5f59203258 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -6237,6 +6237,73 @@  void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state)
 		skl_detach_scaler(crtc, i);
 }
 
+/**
+ *  Theory behind setting nearest-neighbor integer scaling:
+ *
+ *  17 phases of 7 taps require 119 coefficients in 60 dwords per set.
+ *  The letter represents the filter tap (D is the center tap) and the number
+ *  represents the coefficient set for a phase (0-16).
+ *
+ *         +------------+------------------------+------------------------+
+ *         |Index value |Data value coefficient 1|Data value coefficient 2|
+ *         +------------+------------------------+------------------------+
+ *         |   00h      |          B0            |          A0            |
+ *         +------------+------------------------+------------------------+
+ *         |   01h      |          D0            |          C0            |
+ *         +------------+------------------------+------------------------+
+ *         |   02h      |          F0            |          E0            |
+ *         +------------+------------------------+------------------------+
+ *         |   03h      |          A1            |          G0            |
+ *         +------------+------------------------+------------------------+
+ *         |   04h      |          C1            |          B1            |
+ *         +------------+------------------------+------------------------+
+ *         |   ...      |          ...           |          ...           |
+ *         +------------+------------------------+------------------------+
+ *         |   38h      |          B16           |          A16           |
+ *         +------------+------------------------+------------------------+
+ *         |   39h      |          D16           |          C16           |
+ *         +------------+------------------------+------------------------+
+ *         |   3Ah      |          F16           |          E16           |
+ *         +------------+------------------------+------------------------+
+ *         |   3Bh      |        Reserved        |          G16           |
+ *         +------------+------------------------+------------------------+
+ *
+ *  To enable nearest-neighbor scaling:  program scaler coefficients with
+ *  the center tap (Dxx) values set to 1 and all other values set to 0 as per
+ *  SCALER_COEFFICIENT_FORMAT
+ *
+ */
+void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
+				  enum pipe pipe, int scaler_id)
+{
+
+	int coeff = 0;
+	int phase = 0;
+	int tap;
+	int val = 0;
+
+	/* Enable the index auto increment. */
+	intel_de_write_fw(dev_priv, SKL_PS_COEF_INDEX_SET0(pipe, scaler_id),
+			  _PS_COEE_INDEX_AUTO_INC);
+
+	for (phase = 0; phase < 17; phase++) {
+		for (tap = 0; tap < 7; tap++) {
+			coeff++;
+			if (tap == 3)
+				val = (phase % 2) ? (0x800) : (0x800 << 16);
+
+			if (coeff % 2 == 0) {
+				intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), val);
+				val = 0;
+			}
+
+		}
+
+	}
+
+	intel_de_write_fw(dev_priv, SKL_PS_COEF_DATA_SET0(pipe, scaler_id), 0);
+}
+
 static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
 {
 	struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
@@ -6260,9 +6327,23 @@  static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
 		pfit_w = (crtc_state->pch_pfit.size >> 16) & 0xFFFF;
 		pfit_h = crtc_state->pch_pfit.size & 0xFFFF;
 
+		id = scaler_state->scaler_id;
+
 		if (state->scaling_filter ==
 		    DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
 			scaling_filter = PS_FILTER_PROGRAMMED;
+			skl_setup_nearest_neighbor_filter(dev_priv, pipe, id);
+
+			/* Make the scaling window size an integer multiple of
+			 * the source.
+			 *
+			 * TODO: Should userspace take the decision to round the
+			 * scaling window to an integer multiple?
+			 */
+			pfit_w = rounddown(pfit_w,
+					   (crtc_state->pipe_src_w << 16));
+			pfit_h = rounddown(pfit_h,
+					   (crtc_state->pipe_src_h << 16));
 		}
 
 		hscale = (crtc_state->pipe_src_w << 16) / pfit_w;
@@ -6271,8 +6352,6 @@  static void skl_pfit_enable(const struct intel_crtc_state *crtc_state)
 		uv_rgb_hphase = skl_scaler_calc_phase(1, hscale, false);
 		uv_rgb_vphase = skl_scaler_calc_phase(1, vscale, false);
 
-		id = scaler_state->scaler_id;
-
 		spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
 		intel_de_write_fw(dev_priv, SKL_PS_CTRL(pipe, id),
diff --git a/drivers/gpu/drm/i915/display/intel_display.h b/drivers/gpu/drm/i915/display/intel_display.h
index f92efbbec838..49f58d3c98fe 100644
--- a/drivers/gpu/drm/i915/display/intel_display.h
+++ b/drivers/gpu/drm/i915/display/intel_display.h
@@ -586,6 +586,8 @@  void intel_crtc_arm_fifo_underrun(struct intel_crtc *crtc,
 u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_center);
 int skl_update_scaler_crtc(struct intel_crtc_state *crtc_state);
 void skl_scaler_disable(const struct intel_crtc_state *old_crtc_state);
+void skl_setup_nearest_neighbor_filter(struct drm_i915_private *dev_priv,
+				  enum pipe pipe, int scaler_id);
 void ilk_pfit_disable(const struct intel_crtc_state *old_crtc_state);
 u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
 			const struct intel_plane_state *plane_state);
diff --git a/drivers/gpu/drm/i915/display/intel_sprite.c b/drivers/gpu/drm/i915/display/intel_sprite.c
index fd7b31a21723..5bef5c031374 100644
--- a/drivers/gpu/drm/i915/display/intel_sprite.c
+++ b/drivers/gpu/drm/i915/display/intel_sprite.c
@@ -415,18 +415,26 @@  skl_program_scaler(struct intel_plane *plane,
 	u16 y_vphase, uv_rgb_vphase;
 	int hscale, vscale;
 	const struct drm_plane_state *state = &plane_state->uapi;
+	u32 src_w = drm_rect_width(&plane_state->uapi.src) >> 16;
+	u32 src_h = drm_rect_height(&plane_state->uapi.src) >> 16;
 	u32 scaling_filter = PS_FILTER_MEDIUM;
+	struct drm_rect dst;
 
 	if (state->scaling_filter == DRM_SCALING_FILTER_NEAREST_NEIGHBOR) {
 		scaling_filter = PS_FILTER_PROGRAMMED;
+		skl_setup_nearest_neighbor_filter(dev_priv, pipe, scaler_id);
+
+		/* Make the scaling window size an integer multiple of the
+		 * source. TODO: Should userspace take the decision to round
+		 * the scaling window to an integer multiple?
+		 */
+		crtc_w = rounddown(crtc_w, src_w);
+		crtc_h = rounddown(crtc_h, src_h);
 	}
 
-	hscale = drm_rect_calc_hscale(&plane_state->uapi.src,
-				      &plane_state->uapi.dst,
-				      0, INT_MAX);
-	vscale = drm_rect_calc_vscale(&plane_state->uapi.src,
-				      &plane_state->uapi.dst,
-				      0, INT_MAX);
+	drm_rect_init(&dst, crtc_x, crtc_y, crtc_w, crtc_h);
+	hscale = drm_rect_calc_hscale(&plane_state->uapi.src, &dst, 0, INT_MAX);
+	vscale = drm_rect_calc_vscale(&plane_state->uapi.src, &dst, 0, INT_MAX);
 
 	/* TODO: handle sub-pixel coordinates */
 	if (intel_format_info_is_yuv_semiplanar(fb->format, fb->modifier) &&