[v5,11/16] drm/vkms: Add YUV support

Message ID	20240313-yuv-v5-11-e610cbd03f52@bootlin.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <dri-devel-bounces@lists.freedesktop.org> From: Louis Chauvet <louis.chauvet@bootlin.com> Date: Wed, 13 Mar 2024 18:45:05 +0100 Subject: [PATCH v5 11/16] drm/vkms: Add YUV support MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20240313-yuv-v5-11-e610cbd03f52@bootlin.com> References: <20240313-yuv-v5-0-e610cbd03f52@bootlin.com> In-Reply-To: <20240313-yuv-v5-0-e610cbd03f52@bootlin.com> To: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com>, Melissa Wen <melissa.srw@gmail.com>, =?utf-8?q?Ma=C3=ADra_Canal?= <mairacanal@riseup.net>, Haneen Mohammed <hamohammed.sa@gmail.com>, Daniel Vetter <daniel@ffwll.ch>, Maarten Lankhorst <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de>, David Airlie <airlied@gmail.com>, arthurgrillo@riseup.net, Jonathan Corbet <corbet@lwn.net>, pekka.paalanen@haloniitty.fi Cc: dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, jeremie.dautheribes@bootlin.com, miquel.raynal@bootlin.com, thomas.petazzoni@bootlin.com, seanpaul@google.com, marcheu@google.com, nicolejadeyee@google.com, Louis Chauvet <louis.chauvet@bootlin.com> Precedence: list Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" <dri-devel-bounces@lists.freedesktop.org>
Series	drm/vkms: Reimplement line-per-line pixel conversion for plane reading \| expand [v5,00/16] drm/vkms: Reimplement line-per-line pixel conversion for plane reading [v5,01/16] drm/vkms: Code formatting [v5,02/16] drm/vkms: Use drm_frame directly [v5,03/16] drm/vkms: write/update the documentation for pixel conversion and pixel write functions [v5,04/16] drm/vkms: Add typedef and documentation for pixel_read and pixel_write functions [v5,05/16] drm/vkms: Add dummy pixel_read/pixel_write callbacks to avoid NULL pointers [v5,06/16] drm/vkms: Use const for input pointers in pixel_read an pixel_write functions [v5,07/16] drm/vkms: Update pixels accessor to support packed and multi-plane formats. [v5,08/16] drm/vkms: Avoid computing blending limits inside pre_mul_alpha_blend [v5,09/16] drm/vkms: Introduce pixel_read_direction enum [v5,10/16] drm/vkms: Re-introduce line-per-line composition algorithm [v5,11/16] drm/vkms: Add YUV support [v5,12/16] drm/vkms: Add range and encoding properties to the plane [v5,13/16] drm/vkms: Drop YUV formats TODO [v5,14/16] drm/vkms: Create KUnit tests for YUV conversions [v5,15/16] drm/vkms: Add how to run the Kunit tests [v5,16/16] drm/vkms: Add support for DRM_FORMAT_R*

Louis Chauvet March 13, 2024, 5:45 p.m. UTC

From: Arthur Grillo <arthurgrillo@riseup.net>

Add support to the YUV formats bellow:

- NV12/NV16/NV24
- NV21/NV61/NV42
- YUV420/YUV422/YUV444
- YVU420/YVU422/YVU444

The conversion from yuv to rgb is done with fixed-point arithmetic, using
32.32 floats and the drm_fixed helpers.

To do the conversion, a specific matrix must be used for each color range
(DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
the `conversion_matrix` struct, along with the specific y_offset needed.
This matrix is queried only once, in `vkms_plane_atomic_update` and
stored in a `vkms_plane_state`. Those conversion matrices of each
encoding and range were obtained by rounding the values of the original
conversion matrices multiplied by 2^32. This is done to avoid the use of
floating point operations.

The same reading function is used for YUV and YVU formats. As the only
difference between those two category of formats is the order of field, a
simple swap in conversion matrix columns allows using the same function.

Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
[Louis Chauvet:
- Adapted Arthur's work
- Implemented the read_line_t callbacks for yuv
- add struct conversion_matrix
- remove struct pixel_yuv_u8
- update the commit message
- Merge the modifications from Arthur]
Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
---
 drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
 drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/vkms/vkms_formats.h |   4 +
 drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
 4 files changed, 473 insertions(+), 1 deletion(-)

Randy Dunlap March 13, 2024, 7:20 p.m. UTC | #1

Hi,

On 3/13/24 10:45, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 

> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>  4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

This should be

+ * define CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

to conform to kernel-doc format.

> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +

> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c

> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> + * diagonal matrix, which act as a "no-op".
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {
> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrixies were generated using the colour python framework

	         matrices

> +	 *
> +	 * Below are the function calls used to generate eac matrix, go to

	                                                 each

> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */

> +
>  /**

Please convert this comment to kernel-doc format or just use "/*" to begin
the comment.

>   * Retrieve the correct write_pixel function for a specific format.
>   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"

> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c


thanks.

Louis Chauvet March 14, 2024, 2:41 p.m. UTC | #2

Le 13/03/24 - 12:20, Randy Dunlap a écrit :
> Hi,
> 
> On 3/13/24 10:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> 
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >  4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> This should be
> 
> + * define CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> to conform to kernel-doc format.
> 
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +

Hi Randy,

Thanks for your feedback.

I missed it while squashing Arthur's work, but this constant is not needed 
anymore, it will be removed in v6.

For all other kernel-doc formatting (PATCHv5 03/16, PATCH V5 05/16), I 
will correct them in the v6.

Kind regards,
Louis Chauvet

> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> 
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +
> > +	/*
> > +	 * Those matrixies were generated using the colour python framework
> 
> 	         matrices
> 
> > +	 *
> > +	 * Below are the function calls used to generate eac matrix, go to
> 
> 	                                                 each
> 
> > +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> > +	 * for more info:
> > +	 *
> > +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                                  is_legal = False,
> > +	 *                                  bits = 8) * 2**32).astype(int)
> > +	 */
> 
> > +
> >  /**
> 
> Please convert this comment to kernel-doc format or just use "/*" to begin
> the comment.
> 
> >   * Retrieve the correct write_pixel function for a specific format.
> >   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> 
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> 
> 
> thanks.
> -- 
> #Randy

Maíra Canal March 25, 2024, 2:26 p.m. UTC | #3

On 3/13/24 14:45, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support to the YUV formats bellow:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 floats and the drm_fixed helpers.
> 
> To do the conversion, a specific matrix must be used for each color range
> (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> the `conversion_matrix` struct, along with the specific y_offset needed.
> This matrix is queried only once, in `vkms_plane_atomic_update` and
> stored in a `vkms_plane_state`. Those conversion matrices of each
> encoding and range were obtained by rounding the values of the original
> conversion matrices multiplied by 2^32. This is done to avoid the use of
> floating point operations.
> 
> The same reading function is used for YUV and YVU formats. As the only
> difference between those two category of formats is the order of field, a
> simple swap in conversion matrix columns allows using the same function.
> 
> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]

A Co-developed-by tag would be more appropriate.

> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>   4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>   				  int y_start, enum pixel_read_direction direction, int count,
>   				  struct pixel_argb_u16 out_pixel[]);
>   
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH > + * @y_offest: Offset to apply on the y value.

s/y_offest/y_offset

> + */
> +struct conversion_matrix {
> +	s64 matrix[3][3];
> +	s64 y_offset;
> +};
> +
>   /**
>    * vkms_plane_state - Driver specific plane state
>    * @base: base plane state
> @@ -110,6 +131,7 @@ struct vkms_plane_state {
>   	struct drm_shadow_plane_state base;
>   	struct vkms_frame_info *frame_info;
>   	pixel_read_line_t pixel_read_line;
> +	struct conversion_matrix *conversion_matrix;

Add @conversion_matrix on the kernel-doc from the struct
vkms_plane_state.

>   };
>   
>   struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>   	return 0;
>   }
>   
> +/**
> + * get_subsampling() - Get the subsampling divisor value on a specific direction

Where are the arguments?

> + */
> +static int get_subsampling(const struct drm_format_info *format,
> +			   enum pixel_read_direction direction)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +	case READ_TOP_TO_BOTTOM:
> +		return format->vsub;
> +	case READ_RIGHT_TO_LEFT:
> +	case READ_LEFT_TO_RIGHT:
> +		return format->hsub;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 1;
> +}
> +
> +/**
> + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> + * x_start and y_start values

Same.

> + */
> +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +		return -y_start - 1;
> +	case READ_TOP_TO_BOTTOM:
> +		return y_start;
> +	case READ_RIGHT_TO_LEFT:
> +		return -x_start - 1;
> +	case READ_LEFT_TO_RIGHT:
> +		return x_start;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 0;
> +}
> +
>   /*
>    * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>    * ARGB16161616 in out_pixel.
> @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>   	return out_pixel;
>   }
>   

[...]

>   
> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> + * diagonal matrix, which act as a "no-op".
> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix

A bit odd to see the arguments after the description.

> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {
> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrixies were generated using the colour python framework
> +	 *
> +	 * Below are the function calls used to generate eac matrix, go to
> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 0,           6021544149 },
> +			{ 4294967296, -1478054095, -3067191994 },
> +			{ 4294967296, 7610682049,  0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,           6881764740 },
> +			{ 5020601039, -1689204679, -3505362278 },
> +			{ 5020601039, 8697922339,  0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6763714498 },
> +			{ 4294967296, -804551626, -2010578443 },
> +			{ 4294967296, 7969741314, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7729959424 },
> +			{ 5020601039, -919487572, -2297803934 },
> +			{ 5020601039, 9108275786, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6333358775 },
> +			{ 4294967296, -706750298, -2453942994 },
> +			{ 4294967296, 8080551471, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7238124312 },
> +			{ 5020601039, -807714626, -2804506279 },
> +			{ 5020601039, 9234915964, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * The next matrices are just the previous ones, but with the first and
> +	 * second columns swapped
> +	 */
> +	static struct conversion_matrix yvu_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 6021544149,  0 },
> +			{ 4294967296, -3067191994, -1478054095 },
> +			{ 4294967296, 0,           7610682049 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 6881764740,  0 },
> +			{ 5020601039, -3505362278, -1689204679 },
> +			{ 5020601039, 0,           8697922339 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 6763714498,  0 },
> +			{ 4294967296, -2010578443, -804551626 },
> +			{ 4294967296, 0,           7969741314 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 7729959424,  0 },
> +			{ 5020601039, -2297803934, -919487572 },
> +			{ 5020601039, 0,           9108275786 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 6333358775,  0 },
> +			{ 4294967296, -2453942994, -706750298 },
> +			{ 4294967296, 0,           8080551471 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 7238124312,  0 },
> +			{ 5020601039, -2804506279, -807714626 },
> +			{ 5020601039, 0,           9234915964 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/* Breaking in this switch means that the color format+encoding+range is not supported */

s/color format+encoding+range/color format + encoding + range

> +	switch (format) {
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_ARGB8888:
> +	case DRM_FORMAT_XRGB8888:
> +	case DRM_FORMAT_ARGB16161616:
> +	case DRM_FORMAT_XRGB16161616:
> +	case DRM_FORMAT_RGB565:
> +		/*
> +		 * Those formats are supported, but they don't need a conversion matrix. Return
> +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> +		 * somewhere.
> +		 */
> +		return &no_operation;
> +	default:
> +		break;
> +	}
> +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> +	     encoding, range, &format);
> +	return &no_operation;
> +}
> +
>   /**
>    * Retrieve the correct write_pixel function for a specific format.
>    * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 8d2bef95ff79..e1d324764b17 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
>   
>   pixel_write_t get_pixel_write_function(u32 format);
>   
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range);
> +
>   #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
>   	DRM_FORMAT_XRGB8888,
>   	DRM_FORMAT_XRGB16161616,
>   	DRM_FORMAT_ARGB16161616,
> -	DRM_FORMAT_RGB565
> +	DRM_FORMAT_RGB565,
> +	DRM_FORMAT_NV12,
> +	DRM_FORMAT_NV16,
> +	DRM_FORMAT_NV24,
> +	DRM_FORMAT_NV21,
> +	DRM_FORMAT_NV61,
> +	DRM_FORMAT_NV42,
> +	DRM_FORMAT_YUV420,
> +	DRM_FORMAT_YUV422,
> +	DRM_FORMAT_YUV444,
> +	DRM_FORMAT_YVU420,
> +	DRM_FORMAT_YVU422,
> +	DRM_FORMAT_YVU444

Let's add a comma by the end of this entry, to avoid deleting this line
when adding a new format.

>   };
>   
>   static struct drm_plane_state *
> @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>   	drm_framebuffer_get(frame_info->fb);
>   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
>   									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_180 |

Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
reflecting both along the X and Y axis?

Best Regards,
- Maíra

>   									  DRM_MODE_ROTATE_270 |
>   									  DRM_MODE_REFLECT_X |
>   									  DRM_MODE_REFLECT_Y);
>   
>   
>   	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> +		(fmt, new_state->color_encoding, new_state->color_range);
>   }
>   
>   static int vkms_plane_atomic_check(struct drm_plane *plane,
>

Louis Chauvet March 26, 2024, 3:57 p.m. UTC | #4

Le 25/03/24 - 11:26, Maíra Canal a écrit :
> On 3/13/24 14:45, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> > 
> > To do the conversion, a specific matrix must be used for each color range
> > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > the `conversion_matrix` struct, along with the specific y_offset needed.
> > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > stored in a `vkms_plane_state`. Those conversion matrices of each
> > encoding and range were obtained by rounding the values of the original
> > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > floating point operations.
> > 
> > The same reading function is used for YUV and YVU formats. As the only
> > difference between those two category of formats is the order of field, a
> > simple swap in conversion matrix columns allows using the same function.
> > 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> 
> A Co-developed-by tag would be more appropriate.

I am not the main author of this part, I only applied a few simple 
suggestions, the complex part was done by Arthur.

I will wait for Arthur's confirmation to change it to Co-developed by if
he agrees.

 
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >   4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >   				  int y_start, enum pixel_read_direction direction, int count,
> >   				  struct pixel_argb_u16 out_pixel[]);
> >   
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH > + * @y_offest: Offset to apply on the y value.
> 
> s/y_offest/y_offset

Fixed in v6.

> > + */
> > +struct conversion_matrix {
> > +	s64 matrix[3][3];
> > +	s64 y_offset;
> > +};
> > +
> >   /**
> >    * vkms_plane_state - Driver specific plane state
> >    * @base: base plane state
> > @@ -110,6 +131,7 @@ struct vkms_plane_state {
> >   	struct drm_shadow_plane_state base;
> >   	struct vkms_frame_info *frame_info;
> >   	pixel_read_line_t pixel_read_line;
> > +	struct conversion_matrix *conversion_matrix;
> 
> Add @conversion_matrix on the kernel-doc from the struct
> vkms_plane_state.

Fixed in v6.

> >   };
> >   
> >   struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >   	return 0;
> >   }
> >   
> > +/**
> > + * get_subsampling() - Get the subsampling divisor value on a specific direction
> 
> Where are the arguments?

Fixed in v6.

> > + */
> > +static int get_subsampling(const struct drm_format_info *format,
> > +			   enum pixel_read_direction direction)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +	case READ_TOP_TO_BOTTOM:
> > +		return format->vsub;
> > +	case READ_RIGHT_TO_LEFT:
> > +	case READ_LEFT_TO_RIGHT:
> > +		return format->hsub;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 1;
> > +}
> > +
> > +/**
> > + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> > + * x_start and y_start values
> 
> Same.

Fixed in v6.

> > + */
> > +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -y_start - 1;
> > +	case READ_TOP_TO_BOTTOM:
> > +		return y_start;
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -x_start - 1;
> > +	case READ_LEFT_TO_RIGHT:
> > +		return x_start;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 0;
> > +}
> > +
> >   /*
> >    * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >    * ARGB16161616 in out_pixel.
> > @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >   	return out_pixel;
> >   }
> >   
> 
> [...]
> 
> >   
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> 
> A bit odd to see the arguments after the description.

Fixed in v6.

> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};

[...]

> > +
> > +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> 
> s/color format+encoding+range/color format + encoding + range

Fixed in v6.

> > +	switch (format) {
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:

[...]

> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 8d2bef95ff79..e1d324764b17 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
> >   
> >   pixel_write_t get_pixel_write_function(u32 format);
> >   
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range);
> > +
> >   #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
> >   	DRM_FORMAT_XRGB8888,
> >   	DRM_FORMAT_XRGB16161616,
> >   	DRM_FORMAT_ARGB16161616,
> > -	DRM_FORMAT_RGB565
> > +	DRM_FORMAT_RGB565,
> > +	DRM_FORMAT_NV12,
> > +	DRM_FORMAT_NV16,
> > +	DRM_FORMAT_NV24,
> > +	DRM_FORMAT_NV21,
> > +	DRM_FORMAT_NV61,
> > +	DRM_FORMAT_NV42,
> > +	DRM_FORMAT_YUV420,
> > +	DRM_FORMAT_YUV422,
> > +	DRM_FORMAT_YUV444,
> > +	DRM_FORMAT_YVU420,
> > +	DRM_FORMAT_YVU422,
> > +	DRM_FORMAT_YVU444
> 
> Let's add a comma by the end of this entry, to avoid deleting this line
> when adding a new format.

Fixed in v6.

> >   };
> >   
> >   static struct drm_plane_state *
> > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >   	drm_framebuffer_get(frame_info->fb);
> >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> >   									  DRM_MODE_ROTATE_90 |
> > +									  DRM_MODE_ROTATE_180 |
> 
> Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> reflecting both along the X and Y axis?

Oops, I had no intention of putting that change here. I will move it to 
another patch.

I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
the drm_rotation_simplify documentation, it explains that this argument 
should contain all supported rotations and reflections, and ROT_180 is 
supported by VKMS. Perhaps this call is unnecessary because all 
combinations are supported by vkms?

Thanks,
Louis Chauvet

> Best Regards,
> - Maíra
> 
> >   									  DRM_MODE_ROTATE_270 |
> >   									  DRM_MODE_REFLECT_X |
> >   									  DRM_MODE_REFLECT_Y);
> >   
> >   
> >   	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> > +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> > +		(fmt, new_state->color_encoding, new_state->color_range);
> >   }
> >   
> >   static int vkms_plane_atomic_check(struct drm_plane *plane,
> >

Philipp Zabel March 27, 2024, 12:11 p.m. UTC | #5

Hi Louis,

On Mi, 2024-03-13 at 18:45 +0100, Louis Chauvet wrote:
> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support to the YUV formats bellow:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 floats and the drm_fixed helpers.

s/floats/fixed-point numbers/

Nothing floating here, the point is fixed.

[...]
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values

s/CONVERSION_MATRIX_FLOAT_DEPTH/CONVERSION_MATRIX_FRACTIONAL_BITS/

Just a suggestion, maybe there are better terms, but using "FLOAT" here
is confusing.

> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH

s/fixed floats/fixed-point numbers/

regards
Philipp

Pekka Paalanen March 27, 2024, 12:59 p.m. UTC | #6

On Tue, 26 Mar 2024 16:57:03 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Le 25/03/24 - 11:26, Maíra Canal a écrit :
> > On 3/13/24 14:45, Louis Chauvet wrote:  
> > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > 
> > > Add support to the YUV formats bellow:
> > > 
> > > - NV12/NV16/NV24
> > > - NV21/NV61/NV42
> > > - YUV420/YUV422/YUV444
> > > - YVU420/YVU422/YVU444
> > > 
> > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > 32.32 floats and the drm_fixed helpers.
> > > 
> > > To do the conversion, a specific matrix must be used for each color range
> > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > encoding and range were obtained by rounding the values of the original
> > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > floating point operations.
> > > 
> > > The same reading function is used for YUV and YVU formats. As the only
> > > difference between those two category of formats is the order of field, a
> > > simple swap in conversion matrix columns allows using the same function.
> > > 
> > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > [Louis Chauvet:
> > > - Adapted Arthur's work
> > > - Implemented the read_line_t callbacks for yuv
> > > - add struct conversion_matrix
> > > - remove struct pixel_yuv_u8
> > > - update the commit message
> > > - Merge the modifications from Arthur]  
> > 
> > A Co-developed-by tag would be more appropriate.  
> 
> I am not the main author of this part, I only applied a few simple 
> suggestions, the complex part was done by Arthur.
> 
> I will wait for Arthur's confirmation to change it to Co-developed by if
> he agrees.

Co-developed-by is an additional tag, and does not replace S-o-b. To my
understanding, the kernel rules and Developers' Certificate of Origin
require S-o-b to be added by anyone who has taken a patch and
re-submitted it, regardless of who the original author is, and
especially if the patch was modified.

Personally I also like to keep the list of changes like Louis added, to
credit people better.

> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > >   4 files changed, 473 insertions(+), 1 deletion(-)
> > > 

...

> > >   };
> > >   
> > >   static struct drm_plane_state *
> > > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> > >   	drm_framebuffer_get(frame_info->fb);
> > >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> > >   									  DRM_MODE_ROTATE_90 |
> > > +									  DRM_MODE_ROTATE_180 |  
> > 
> > Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> > reflecting both along the X and Y axis?  
> 
> Oops, I had no intention of putting that change here. I will move it to 
> another patch.
> 
> I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
> the drm_rotation_simplify documentation, it explains that this argument 
> should contain all supported rotations and reflections, and ROT_180 is 
> supported by VKMS. Perhaps this call is unnecessary because all 
> combinations are supported by vkms?

If you truly handle all bit patterns that the rotation bitfield can
have, then yes, the call seems unnecessary.

However, as documented, the bitfield contains redundancy: the same
orientation can be expressed in more than one bit pattern. One example
is that ROTATE_180 is equivalent to REFLECT_X | REFLECT_Y.

Since it's a bitmask, userspace can give you funny values like
ROTATE_0 | ROTATE_90 | ROTATE_180. That is a valid orientation of
270-degree rotation (according to UAPI doc), but it is very awkwardly
expressed, hence the need to normalise it into a minimal bit pattern.

It does not look like drm_rotation_simplify() actually does this
minimisation!

I was not able to tell if DRM common code actually stops userspace from
combining multiple ROTATE bits in the same value. I suspect it must
stop them, or perhaps all code dealing with rotation is actually broken.

drm_rotation_simplify() is useful for cases where your hardware does
not have exactly the same flexibility. Maybe it cannot do REFLECT_Y but
it can do everything else? Then drm_rotation_simplify() gives you a bit
pattern that you can use directly, or fails if the orientation is not
representable with what your hardware can do.

At least, that's my understanding of quickly glancing over it.

IOW, if you wanted to never have to deal with REFLECT_Y bit, you could
leave it out here. Or, if you never want to deal with ROTATE_180, leave
that out - you will get REFLECT_X | REFLECT_Y instead. In theory.

Thanks,
pq

Pekka Paalanen March 27, 2024, 2:23 p.m. UTC | #7

On Wed, 13 Mar 2024 18:45:05 +0100
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> From: Arthur Grillo <arthurgrillo@riseup.net>
> 
> Add support to the YUV formats bellow:
> 
> - NV12/NV16/NV24
> - NV21/NV61/NV42
> - YUV420/YUV422/YUV444
> - YVU420/YVU422/YVU444
> 
> The conversion from yuv to rgb is done with fixed-point arithmetic, using
> 32.32 floats and the drm_fixed helpers.

You mean fixed-point, not floating-point (floats).

> 
> To do the conversion, a specific matrix must be used for each color range
> (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> the `conversion_matrix` struct, along with the specific y_offset needed.
> This matrix is queried only once, in `vkms_plane_atomic_update` and
> stored in a `vkms_plane_state`. Those conversion matrices of each
> encoding and range were obtained by rounding the values of the original
> conversion matrices multiplied by 2^32. This is done to avoid the use of
> floating point operations.
> 
> The same reading function is used for YUV and YVU formats. As the only
> difference between those two category of formats is the order of field, a
> simple swap in conversion matrix columns allows using the same function.

Sounds good!

> Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> [Louis Chauvet:
> - Adapted Arthur's work
> - Implemented the read_line_t callbacks for yuv
> - add struct conversion_matrix
> - remove struct pixel_yuv_u8
> - update the commit message
> - Merge the modifications from Arthur]
> Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> ---
>  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
>  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
>  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
>  4 files changed, 473 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> index 23e1d247468d..f3116084de5a 100644
> --- a/drivers/gpu/drm/vkms/vkms_drv.h
> +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
>  				  int y_start, enum pixel_read_direction direction, int count,
>  				  struct pixel_argb_u16 out_pixel[]);
>  
> +/**
> + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> + */
> +#define CONVERSION_MATRIX_FLOAT_DEPTH 32

Fraction, not float.

> +
> +/**
> + * struct conversion_matrix - Matrix to use for a specific encoding and range
> + *
> + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> + * used to compute rgb values from yuv values:
> + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> + *   OR for yvu formats:
> + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH

Fixed float is not a thing. They are signed fixed-point values with
32-bit fractional part.

> + * @y_offest: Offset to apply on the y value.
> + */
> +struct conversion_matrix {
> +	s64 matrix[3][3];
> +	s64 y_offset;
> +};

Btw. too bad that drm_fixed.h does not use something like

	typedef struct drm_fixed {
		s64 v;
	} drm_fixed_t;

and use that in all the API where a fixed-point value is passed. It
would make the type very explicit, and the struct prevents it from
implicitly casting to/from regular integer formats.

Then you could use drm_fixed_t instead of s64 and it would be obvious
how the values must be handled and which API is appropriate.

> +
>  /**
>   * vkms_plane_state - Driver specific plane state
>   * @base: base plane state
> @@ -110,6 +131,7 @@ struct vkms_plane_state {
>  	struct drm_shadow_plane_state base;
>  	struct vkms_frame_info *frame_info;
>  	pixel_read_line_t pixel_read_line;
> +	struct conversion_matrix *conversion_matrix;

If the matrix was embedded as a copy instead of a pointer to (const!)
matrix, you would not need to manually hardcode YVU variant of the
matrices, but you could simply swap the columns of the YUV matrix while
copying them into this field.


>  };
>  
>  struct vkms_plane {
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> index 1449a0e6c706..edbf4b321b91 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.c
> +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
>  	return 0;
>  }
>  
> +/**
> + * get_subsampling() - Get the subsampling divisor value on a specific direction
> + */
> +static int get_subsampling(const struct drm_format_info *format,
> +			   enum pixel_read_direction direction)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +	case READ_TOP_TO_BOTTOM:
> +		return format->vsub;
> +	case READ_RIGHT_TO_LEFT:
> +	case READ_LEFT_TO_RIGHT:
> +		return format->hsub;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 1;
> +}
> +
> +/**
> + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> + * x_start and y_start values
> + */
> +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> +{
> +	switch (direction) {
> +	case READ_BOTTOM_TO_TOP:
> +		return -y_start - 1;
> +	case READ_TOP_TO_BOTTOM:
> +		return y_start;
> +	case READ_RIGHT_TO_LEFT:
> +		return -x_start - 1;
> +	case READ_LEFT_TO_RIGHT:
> +		return x_start;
> +	}
> +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> +	return 0;
> +}
> +
>  /*
>   * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
>   * ARGB16161616 in out_pixel.
> @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
>  	return out_pixel;
>  }
>  
> +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> +						  struct conversion_matrix *matrix)

If you are using the "swap the matrix columns" trick, then you cannot
call these cb, cr nor even u,v, because they might be the opposite.
They are simply the first and second chroma channel, and their meaning
depends on the given matrix.

> +{
> +	u8 r, g, b;
> +	s64 fp_y, fp_cb, fp_cr;
> +	s64 fp_r, fp_g, fp_b;
> +
> +	fp_y = y - matrix->y_offset;
> +	fp_cb = cb - 128;
> +	fp_cr = cr - 128;

This looks like an incorrect way to convert u8 to fixed-point, but...

> +
> +	fp_y = drm_int2fixp(fp_y);
> +	fp_cb = drm_int2fixp(fp_cb);
> +	fp_cr = drm_int2fixp(fp_cr);

I find it confusing to re-purpose variables like this.

I'd do just

	fp_c1 = drm_int2fixp((int)c1 - 128);

If the function arguments were int to begin with, then the cast would
be obviously unnecessary.

So, what you have in fp variables at this point is fractional numbers
in the 8-bit integer scale. However, because the target format is
16-bit, you should not show the extra precision away here. Instead,
multiply by 257 to bring the values to 16-bit scale, and do the RGB
clamping to 16-bit, not 8-bit.

> +
> +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> +
> +	fp_r = drm_fixp2int_round(fp_r);
> +	fp_g = drm_fixp2int_round(fp_g);
> +	fp_b = drm_fixp2int_round(fp_b);
> +
> +	r = clamp(fp_r, 0, 0xff);
> +	g = clamp(fp_g, 0, 0xff);
> +	b = clamp(fp_b, 0, 0xff);
> +
> +	return argb_u16_from_u8888(255, r, g, b);

Going through argb_u16_from_u8888() will throw away precision.

> +}
> +
>  /*
>   * The following functions are read_line function for each pixel format supported by VKMS.
>   *
> @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
>  	}
>  }
>  
> +/*
> + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> + * (column inversion)

Would be nice to explain what semi_planar_yuv means, so that the
documentation for these functions would show how they differ rather
than all saying exactly the same thing.

> + */
> +static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> +				      int y_start, enum pixel_read_direction direction, int count,
> +				      struct pixel_argb_u16 out_pixel[])
> +{
> +	int rem_x, rem_y;
> +	u8 *y_plane;
> +	u8 *uv_plane;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);

Assert rem_x, rem_y are zero, or block is 1x1.

> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   1, &uv_plane, &rem_x, &rem_y);

Assert rem_x, rem_y are zero, or block is 1x1.

Actually, this is so common, that maybe there should be a wrapper for
packed_pixels_addr() or another variant of it, that asserts that the
block size is 1x1 and does not return rem_x, rem_y at all.

> +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> +	int step_uv = get_step_next_block(plane->frame_info->fb, direction, 1);
> +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> +
> +	for (int i = 0; i < count; i++) {
> +		*out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
> +						  conversion_matrix);
> +		out_pixel += 1;
> +		y_plane += step_y;
> +		if ((i + subsampling_offset + 1) % subsampling == 0)
> +			uv_plane += step_uv;
> +	}
> +}
> +
> +/*
> + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> + * (column inversion)
> + */
> +static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> +				 int y_start, enum pixel_read_direction direction, int count,
> +				 struct pixel_argb_u16 out_pixel[])
> +{
> +	int rem_x, rem_y;
> +	u8 *y_plane;
> +	u8 *u_plane;
> +	u8 *v_plane;
> +
> +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   1, &u_plane, &rem_x, &rem_y);
> +	packed_pixels_addr(plane->frame_info,
> +			   x_start / plane->frame_info->fb->format->hsub,
> +			   y_start / plane->frame_info->fb->format->vsub,
> +			   2, &v_plane, &rem_x, &rem_y);
> +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> +	int step_u = get_step_next_block(plane->frame_info->fb, direction, 1);
> +	int step_v = get_step_next_block(plane->frame_info->fb, direction, 2);
> +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> +
> +	for (int i = 0; i < count; i++) {
> +		*out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
> +		out_pixel += 1;
> +		y_plane += step_y;
> +		if ((i + subsampling_offset + 1) % subsampling == 0) {
> +			u_plane += step_u;
> +			v_plane += step_v;
> +		}
> +	}
> +}
> +
>  /*
>   * The following functions take one argb_u16 pixel and convert it to a specific format. The
>   * result is stored in @out_pixel.
> @@ -418,6 +565,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
>  		return &XRGB16161616_read_line;
>  	case DRM_FORMAT_RGB565:
>  		return &RGB565_read_line;
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		return &semi_planar_yuv_read_line;
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +		return &planar_yuv_read_line;
>  	default:
>  		/*
>  		 * This is a bug in vkms_plane_atomic_check. All the supported
> @@ -435,6 +596,276 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
>  	}
>  }
>  
> +/**
> + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> + * given encoding and range.
> + *
> + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> + * diagonal matrix, which act as a "no-op".

This comment about NULL seems bogus.

> + *
> + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> + */
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range)
> +{
> +	static struct conversion_matrix no_operation = {

Every matrix here should be 'static const' rather than only 'static'.

> +		.matrix = {
> +			{ 4294967296, 0,          0, },
> +			{ 0,          4294967296, 0, },
> +			{ 0,          0,          4294967296, },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * Those matrixies were generated using the colour python framework
> +	 *
> +	 * Below are the function calls used to generate eac matrix, go to
> +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> +	 * for more info:
> +	 *
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = False,

Ugh, colour.matrix_YCbCr documentation is confusing. This is the first
time I've heard of "legal range", so I had to look it up. Of course,
the doc does not explain it.

Reading
https://kb.pomfort.com/livegrade/advanced-grading-features/legal-and-extended-sdi-signals-and-luts-in-livegrade/
it sounds like extended range in 8-bit is 1-254, not 0-255 that
we use in computer graphics. This matches what I've read before
elsewhere in ITU or SMPTE specs.

SDI signals reserve the 8-bit code points 0 and 255 for
synchronization, making them invalid as data. It scales to higher bit
depths, so 10-bit code points 0-3 and 1020-1023 inclusive are reserved
for synchronization.

IOW, there are two different "full range" quantizations: extended and full.

Does is_legal=False refer to extended or full? The documentation
does not say.

However, given that changing 'bits' value with is_legal=False does not
change the result, and with is_legal=True it does change the result, I
suspect is_legal=False means full range, not extended range.

So I think the python snippet is correct.

> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 0,           6021544149 },
> +			{ 4294967296, -1478054095, -3067191994 },
> +			{ 4294967296, 7610682049,  0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,           6881764740 },
> +			{ 5020601039, -1689204679, -3505362278 },
> +			{ 5020601039, 8697922339,  0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6763714498 },
> +			{ 4294967296, -804551626, -2010578443 },
> +			{ 4294967296, 7969741314, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.709"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7729959424 },
> +			{ 5020601039, -919487572, -2297803934 },
> +			{ 5020601039, 9108275786, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = False,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 0,          6333358775 },
> +			{ 4294967296, -706750298, -2453942994 },
> +			{ 4294967296, 8080551471, 0 },
> +		},
> +		.y_offset = 0,
> +	};
> +
> +	/*
> +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.2020"],
> +	 *                                  is_legal = True,
> +	 *                                  bits = 8) * 2**32).astype(int)
> +	 */
> +	static struct conversion_matrix yuv_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 0,          7238124312 },
> +			{ 5020601039, -807714626, -2804506279 },
> +			{ 5020601039, 9234915964, 0 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/*
> +	 * The next matrices are just the previous ones, but with the first and
> +	 * second columns swapped

As I mentioned earlier, you could derive those below from the above
matrices in code, so you don't need all these open-coded.

You also would not need twice the switch-ladders below, you'd only need
a 'bool need_to_swap_columns' from the pixel format.

You could also have a 'bool limited_range', and do

	case DRM_COLOR_YCBCR_BT601:
		return limited_range ? &yuv_bt601_limited : &yuv_bt601_full;


> +	 */
> +	static struct conversion_matrix yvu_bt601_full = {
> +		.matrix = {
> +			{ 4294967296, 6021544149,  0 },
> +			{ 4294967296, -3067191994, -1478054095 },
> +			{ 4294967296, 0,           7610682049 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt601_limited = {
> +		.matrix = {
> +			{ 5020601039, 6881764740,  0 },
> +			{ 5020601039, -3505362278, -1689204679 },
> +			{ 5020601039, 0,           8697922339 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt709_full = {
> +		.matrix = {
> +			{ 4294967296, 6763714498,  0 },
> +			{ 4294967296, -2010578443, -804551626 },
> +			{ 4294967296, 0,           7969741314 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt709_limited = {
> +		.matrix = {
> +			{ 5020601039, 7729959424,  0 },
> +			{ 5020601039, -2297803934, -919487572 },
> +			{ 5020601039, 0,           9108275786 },
> +		},
> +		.y_offset = 16,
> +	};
> +	static struct conversion_matrix yvu_bt2020_full = {
> +		.matrix = {
> +			{ 4294967296, 6333358775,  0 },
> +			{ 4294967296, -2453942994, -706750298 },
> +			{ 4294967296, 0,           8080551471 },
> +		},
> +		.y_offset = 0,
> +	};
> +	static struct conversion_matrix yvu_bt2020_limited = {
> +		.matrix = {
> +			{ 5020601039, 7238124312,  0 },
> +			{ 5020601039, -2804506279, -807714626 },
> +			{ 5020601039, 0,           9234915964 },
> +		},
> +		.y_offset = 16,
> +	};
> +
> +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> +	switch (format) {
> +	case DRM_FORMAT_NV12:
> +	case DRM_FORMAT_NV16:
> +	case DRM_FORMAT_NV24:
> +	case DRM_FORMAT_YUV420:
> +	case DRM_FORMAT_YUV422:
> +	case DRM_FORMAT_YUV444:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yuv_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yuv_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_YVU420:
> +	case DRM_FORMAT_YVU422:
> +	case DRM_FORMAT_YVU444:
> +	case DRM_FORMAT_NV21:
> +	case DRM_FORMAT_NV61:
> +	case DRM_FORMAT_NV42:
> +		switch (encoding) {
> +		case DRM_COLOR_YCBCR_BT601:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt601_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt601_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT709:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt709_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt709_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_YCBCR_BT2020:
> +			switch (range) {
> +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> +				return &yvu_bt2020_limited;
> +			case DRM_COLOR_YCBCR_FULL_RANGE:
> +				return &yvu_bt2020_full;
> +			case DRM_COLOR_RANGE_MAX:
> +				break;
> +			}
> +			break;
> +		case DRM_COLOR_ENCODING_MAX:
> +			break;
> +		}
> +		break;
> +	case DRM_FORMAT_ARGB8888:
> +	case DRM_FORMAT_XRGB8888:
> +	case DRM_FORMAT_ARGB16161616:
> +	case DRM_FORMAT_XRGB16161616:
> +	case DRM_FORMAT_RGB565:
> +		/*
> +		 * Those formats are supported, but they don't need a conversion matrix. Return
> +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> +		 * somewhere.
> +		 */
> +		return &no_operation;
> +	default:
> +		break;
> +	}
> +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> +	     encoding, range, &format);
> +	return &no_operation;
> +}
> +
>  /**
>   * Retrieve the correct write_pixel function for a specific format.
>   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> index 8d2bef95ff79..e1d324764b17 100644
> --- a/drivers/gpu/drm/vkms/vkms_formats.h
> +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
>  
>  pixel_write_t get_pixel_write_function(u32 format);
>  
> +struct conversion_matrix *
> +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> +				  enum drm_color_range range);
> +
>  #endif /* _VKMS_FORMATS_H_ */
> diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> index 8875bed76410..987dd2b686a8 100644
> --- a/drivers/gpu/drm/vkms/vkms_plane.c
> +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
>  	DRM_FORMAT_XRGB8888,
>  	DRM_FORMAT_XRGB16161616,
>  	DRM_FORMAT_ARGB16161616,
> -	DRM_FORMAT_RGB565
> +	DRM_FORMAT_RGB565,
> +	DRM_FORMAT_NV12,
> +	DRM_FORMAT_NV16,
> +	DRM_FORMAT_NV24,
> +	DRM_FORMAT_NV21,
> +	DRM_FORMAT_NV61,
> +	DRM_FORMAT_NV42,
> +	DRM_FORMAT_YUV420,
> +	DRM_FORMAT_YUV422,
> +	DRM_FORMAT_YUV444,
> +	DRM_FORMAT_YVU420,
> +	DRM_FORMAT_YVU422,
> +	DRM_FORMAT_YVU444
>  };
>  
>  static struct drm_plane_state *
> @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
>  	drm_framebuffer_get(frame_info->fb);
>  	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
>  									  DRM_MODE_ROTATE_90 |
> +									  DRM_MODE_ROTATE_180 |
>  									  DRM_MODE_ROTATE_270 |
>  									  DRM_MODE_REFLECT_X |
>  									  DRM_MODE_REFLECT_Y);
>  
>  
>  	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> +		(fmt, new_state->color_encoding, new_state->color_range);
>  }
>  
>  static int vkms_plane_atomic_check(struct drm_plane *plane,
> 

I couldn't pinpoint what would need to be fixed so that rotation would
not change chroma siting, but I also cannot say that chroma siting is
definitely correct already.


Thanks,
pq

Louis Chauvet April 8, 2024, 7:50 a.m. UTC | #8

Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> On Wed, 13 Mar 2024 18:45:05 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> 
> You mean fixed-point, not floating-point (floats).
> 
> > 
> > To do the conversion, a specific matrix must be used for each color range
> > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > the `conversion_matrix` struct, along with the specific y_offset needed.
> > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > stored in a `vkms_plane_state`. Those conversion matrices of each
> > encoding and range were obtained by rounding the values of the original
> > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > floating point operations.
> > 
> > The same reading function is used for YUV and YVU formats. As the only
> > difference between those two category of formats is the order of field, a
> > simple swap in conversion matrix columns allows using the same function.
> 
> Sounds good!
> 
> > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > [Louis Chauvet:
> > - Adapted Arthur's work
> > - Implemented the read_line_t callbacks for yuv
> > - add struct conversion_matrix
> > - remove struct pixel_yuv_u8
> > - update the commit message
> > - Merge the modifications from Arthur]
> > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > ---
> >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> >  4 files changed, 473 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> 
> Fraction, not float.
> 
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> 
> Fixed float is not a thing. They are signed fixed-point values with
> 32-bit fractional part.

I will edit all of this for v6.
 
> > + * @y_offest: Offset to apply on the y value.
> > + */
> > +struct conversion_matrix {
> > +	s64 matrix[3][3];
> > +	s64 y_offset;
> > +};
> 
> Btw. too bad that drm_fixed.h does not use something like
> 
> 	typedef struct drm_fixed {
> 		s64 v;
> 	} drm_fixed_t;
> 
> and use that in all the API where a fixed-point value is passed. It
> would make the type very explicit, and the struct prevents it from
> implicitly casting to/from regular integer formats.
> 
> Then you could use drm_fixed_t instead of s64 and it would be obvious
> how the values must be handled and which API is appropriate.

I agree this could be a nice improvment, but it may require touching a lot 
of places. 

> > +
> >  /**
> >   * vkms_plane_state - Driver specific plane state
> >   * @base: base plane state
> > @@ -110,6 +131,7 @@ struct vkms_plane_state {
> >  	struct drm_shadow_plane_state base;
> >  	struct vkms_frame_info *frame_info;
> >  	pixel_read_line_t pixel_read_line;
> > +	struct conversion_matrix *conversion_matrix;
> 
> If the matrix was embedded as a copy instead of a pointer to (const!)
> matrix, you would not need to manually hardcode YVU variant of the
> matrices, but you could simply swap the columns of the YUV matrix while
> copying them into this field.

Very good suggestion thanks, applied for the v6!

> 
> >  };
> >  
> >  struct vkms_plane {
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.c b/drivers/gpu/drm/vkms/vkms_formats.c
> > index 1449a0e6c706..edbf4b321b91 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.c
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.c
> > @@ -105,6 +105,44 @@ static int get_step_next_block(struct drm_framebuffer *fb, enum pixel_read_direc
> >  	return 0;
> >  }
> >  
> > +/**
> > + * get_subsampling() - Get the subsampling divisor value on a specific direction
> > + */
> > +static int get_subsampling(const struct drm_format_info *format,
> > +			   enum pixel_read_direction direction)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +	case READ_TOP_TO_BOTTOM:
> > +		return format->vsub;
> > +	case READ_RIGHT_TO_LEFT:
> > +	case READ_LEFT_TO_RIGHT:
> > +		return format->hsub;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 1;
> > +}
> > +
> > +/**
> > + * get_subsampling_offset() - An offset for keeping the chroma siting consistent regardless of
> > + * x_start and y_start values
> > + */
> > +static int get_subsampling_offset(enum pixel_read_direction direction, int x_start, int y_start)
> > +{
> > +	switch (direction) {
> > +	case READ_BOTTOM_TO_TOP:
> > +		return -y_start - 1;
> > +	case READ_TOP_TO_BOTTOM:
> > +		return y_start;
> > +	case READ_RIGHT_TO_LEFT:
> > +		return -x_start - 1;
> > +	case READ_LEFT_TO_RIGHT:
> > +		return x_start;
> > +	}
> > +	WARN_ONCE(true, "Invalid direction for pixel reading: %d\n", direction);
> > +	return 0;
> > +}
> > +
> >  /*
> >   * The following  functions take pixel data (a, r, g, b, pixel, ...), convert them to the format
> >   * ARGB16161616 in out_pixel.
> > @@ -161,6 +199,42 @@ static struct pixel_argb_u16 argb_u16_from_RGB565(const u16 *pixel)
> >  	return out_pixel;
> >  }
> >  
> > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > +						  struct conversion_matrix *matrix)
> 
> If you are using the "swap the matrix columns" trick, then you cannot
> call these cb, cr nor even u,v, because they might be the opposite.
> They are simply the first and second chroma channel, and their meaning
> depends on the given matrix.

I will rename them for v6, channel_1 and channel_2.

> > +{
> > +	u8 r, g, b;
> > +	s64 fp_y, fp_cb, fp_cr;
> > +	s64 fp_r, fp_g, fp_b;
> > +
> > +	fp_y = y - matrix->y_offset;
> > +	fp_cb = cb - 128;
> > +	fp_cr = cr - 128;
> 
> This looks like an incorrect way to convert u8 to fixed-point, but...
>
> > +
> > +	fp_y = drm_int2fixp(fp_y);
> > +	fp_cb = drm_int2fixp(fp_cb);
> > +	fp_cr = drm_int2fixp(fp_cr);
> 
> I find it confusing to re-purpose variables like this.
> 
> I'd do just
> 
> 	fp_c1 = drm_int2fixp((int)c1 - 128);

I agree with this remark, I will change it for the v6.

> If the function arguments were int to begin with, then the cast would
> be obviously unnecessary.

For this I'm less sure. The name of the function and the usage is 
explicit: we want to use u8 as input. As we manipulate pointers in 
read_line, I don't know how it will works if the pointer is dereferenced 
to a int instead of a u8.

> So, what you have in fp variables at this point is fractional numbers
> in the 8-bit integer scale. However, because the target format is
> 16-bit, you should not show the extra precision away here. Instead,
> multiply by 257 to bring the values to 16-bit scale, and do the RGB
> clamping to 16-bit, not 8-bit.
> 
> > +
> > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > +
> > +	fp_r = drm_fixp2int_round(fp_r);
> > +	fp_g = drm_fixp2int_round(fp_g);
> > +	fp_b = drm_fixp2int_round(fp_b);
> > +
> > +	r = clamp(fp_r, 0, 0xff);
> > +	g = clamp(fp_g, 0, 0xff);
> > +	b = clamp(fp_b, 0, 0xff);
> > +
> > +	return argb_u16_from_u8888(255, r, g, b);
> 
> Going through argb_u16_from_u8888() will throw away precision.

I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
v6, please let me know.

> > +}
> > +
> >  /*
> >   * The following functions are read_line function for each pixel format supported by VKMS.
> >   *
> > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> >  	}
> >  }
> >  
> > +/*
> > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > + * (column inversion)
> 
> Would be nice to explain what semi_planar_yuv means, so that the
> documentation for these functions would show how they differ rather
> than all saying exactly the same thing.

 /* This callback can be used for YUV format where each color component is 
  * stored in a different plane (often called planar formats). It will 
  * handle correctly subsampling.

 /*
  * This callback can be used for YUV formats where U and V values are 
  * stored in the same plane (often called semi-planar formats). It will 
  * corectly handle subsampling.
  * 
  * The conversion matrix stored in the @plane is used to:
  * - Apply the correct color range and encoding
  * - Convert YUV and YVU with the same function (a simple column swap is 
  *   needed)
  */
 
> > + */
> > +static void semi_planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				      int y_start, enum pixel_read_direction direction, int count,
> > +				      struct pixel_argb_u16 out_pixel[])
> > +{
> > +	int rem_x, rem_y;
> > +	u8 *y_plane;
> > +	u8 *uv_plane;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> 
> Assert rem_x, rem_y are zero, or block is 1x1.
> 
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   1, &uv_plane, &rem_x, &rem_y);
> 
> Assert rem_x, rem_y are zero, or block is 1x1.
> 
> Actually, this is so common, that maybe there should be a wrapper for
> packed_pixels_addr() or another variant of it, that asserts that the
> block size is 1x1 and does not return rem_x, rem_y at all.

I will create a packed_pixel_addr_1x1 for this, and add this assert 
inside, so no code duplication.

> > +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +	int step_uv = get_step_next_block(plane->frame_info->fb, direction, 1);
> > +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> > +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> > +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> > +
> > +	for (int i = 0; i < count; i++) {
> > +		*out_pixel = argb_u16_from_yuv888(y_plane[0], uv_plane[0], uv_plane[1],
> > +						  conversion_matrix);
> > +		out_pixel += 1;
> > +		y_plane += step_y;
> > +		if ((i + subsampling_offset + 1) % subsampling == 0)
> > +			uv_plane += step_uv;
> > +	}
> > +}
> > +
> > +/*
> > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > + * (column inversion)
> > + */
> > +static void planar_yuv_read_line(const struct vkms_plane_state *plane, int x_start,
> > +				 int y_start, enum pixel_read_direction direction, int count,
> > +				 struct pixel_argb_u16 out_pixel[])
> > +{
> > +	int rem_x, rem_y;
> > +	u8 *y_plane;
> > +	u8 *u_plane;
> > +	u8 *v_plane;
> > +
> > +	packed_pixels_addr(plane->frame_info, x_start, y_start, 0, &y_plane, &rem_x, &rem_y);
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   1, &u_plane, &rem_x, &rem_y);
> > +	packed_pixels_addr(plane->frame_info,
> > +			   x_start / plane->frame_info->fb->format->hsub,
> > +			   y_start / plane->frame_info->fb->format->vsub,
> > +			   2, &v_plane, &rem_x, &rem_y);
> > +	int step_y = get_step_next_block(plane->frame_info->fb, direction, 0);
> > +	int step_u = get_step_next_block(plane->frame_info->fb, direction, 1);
> > +	int step_v = get_step_next_block(plane->frame_info->fb, direction, 2);
> > +	int subsampling = get_subsampling(plane->frame_info->fb->format, direction);
> > +	int subsampling_offset = get_subsampling_offset(direction, x_start, y_start);
> > +	struct conversion_matrix *conversion_matrix = plane->conversion_matrix;
> > +
> > +	for (int i = 0; i < count; i++) {
> > +		*out_pixel = argb_u16_from_yuv888(*y_plane, *u_plane, *v_plane, conversion_matrix);
> > +		out_pixel += 1;
> > +		y_plane += step_y;
> > +		if ((i + subsampling_offset + 1) % subsampling == 0) {
> > +			u_plane += step_u;
> > +			v_plane += step_v;
> > +		}
> > +	}
> > +}
> > +
> >  /*
> >   * The following functions take one argb_u16 pixel and convert it to a specific format. The
> >   * result is stored in @out_pixel.
> > @@ -418,6 +565,20 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  		return &XRGB16161616_read_line;
> >  	case DRM_FORMAT_RGB565:
> >  		return &RGB565_read_line;
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_NV21:
> > +	case DRM_FORMAT_NV61:
> > +	case DRM_FORMAT_NV42:
> > +		return &semi_planar_yuv_read_line;
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +	case DRM_FORMAT_YVU420:
> > +	case DRM_FORMAT_YVU422:
> > +	case DRM_FORMAT_YVU444:
> > +		return &planar_yuv_read_line;
> >  	default:
> >  		/*
> >  		 * This is a bug in vkms_plane_atomic_check. All the supported
> > @@ -435,6 +596,276 @@ pixel_read_line_t get_pixel_read_line_function(u32 format)
> >  	}
> >  }
> >  
> > +/**
> > + * get_conversion_matrix_to_argb_u16() - Retrieve the correct yuv to rgb conversion matrix for a
> > + * given encoding and range.
> > + *
> > + * If the matrix is not found, return a null pointer. In all other cases, it return a simple
> > + * diagonal matrix, which act as a "no-op".
> 
> This comment about NULL seems bogus.

Because it is... and it become useless when using the "copy matrix" 
method.

> > + *
> > + * @format: DRM_FORMAT_* value for which to obtain a conversion function (see [drm_fourcc.h])
> > + * @encoding: DRM_COLOR_* value for which to obtain a conversion matrix
> > + * @range: DRM_COLOR_*_RANGE value for which to obtain a conversion matrix
> > + */
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range)
> > +{
> > +	static struct conversion_matrix no_operation = {
> 
> Every matrix here should be 'static const' rather than only 'static'.
>
> > +		.matrix = {
> > +			{ 4294967296, 0,          0, },
> > +			{ 0,          4294967296, 0, },
> > +			{ 0,          0,          4294967296, },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +
> > +	/*
> > +	 * Those matrixies were generated using the colour python framework
> > +	 *
> > +	 * Below are the function calls used to generate eac matrix, go to
> > +	 * https://colour.readthedocs.io/en/develop/generated/colour.matrix_YCbCr.html
> > +	 * for more info:
> > +	 *
> > +	 * numpy.around(colour.matrix_YCbCr(K=colour.WEIGHTS_YCBCR["ITU-R BT.601"],
> > +	 *                                  is_legal = False,
> 
> Ugh, colour.matrix_YCbCr documentation is confusing. This is the first
> time I've heard of "legal range", so I had to look it up. Of course,
> the doc does not explain it.
> 
> Reading
> https://kb.pomfort.com/livegrade/advanced-grading-features/legal-and-extended-sdi-signals-and-luts-in-livegrade/
> it sounds like extended range in 8-bit is 1-254, not 0-255 that
> we use in computer graphics. This matches what I've read before
> elsewhere in ITU or SMPTE specs.
> 
> SDI signals reserve the 8-bit code points 0 and 255 for
> synchronization, making them invalid as data. It scales to higher bit
> depths, so 10-bit code points 0-3 and 1020-1023 inclusive are reserved
> for synchronization.
> 
> IOW, there are two different "full range" quantizations: extended and full.
> 
> Does is_legal=False refer to extended or full? The documentation
> does not say.
> 
> However, given that changing 'bits' value with is_legal=False does not
> change the result, and with is_legal=True it does change the result, I
> suspect is_legal=False means full range, not extended range.
> 
> So I think the python snippet is correct.
> 
> > +	 *                                  bits = 8) * 2**32).astype(int)
> > +	 */
> > +	static struct conversion_matrix yuv_bt601_full = {
> > +		.matrix = {
> > +			{ 4294967296, 0,           6021544149 },

[...]

> > +			{ 5020601039, 9234915964, 0 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +
> > +	/*
> > +	 * The next matrices are just the previous ones, but with the first and
> > +	 * second columns swapped
> 
> As I mentioned earlier, you could derive those below from the above
> matrices in code, so you don't need all these open-coded.
> 
> You also would not need twice the switch-ladders below, you'd only need
> a 'bool need_to_swap_columns' from the pixel format.

It is done in the v6, the code is much simpler.

Thanks,
Louis Chauvet

> You could also have a 'bool limited_range', and do
> 
> 	case DRM_COLOR_YCBCR_BT601:
> 		return limited_range ? &yuv_bt601_limited : &yuv_bt601_full;
> 
> 
> > +	 */
> > +	static struct conversion_matrix yvu_bt601_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6021544149,  0 },
> > +			{ 4294967296, -3067191994, -1478054095 },
> > +			{ 4294967296, 0,           7610682049 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt601_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 6881764740,  0 },
> > +			{ 5020601039, -3505362278, -1689204679 },
> > +			{ 5020601039, 0,           8697922339 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +	static struct conversion_matrix yvu_bt709_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6763714498,  0 },
> > +			{ 4294967296, -2010578443, -804551626 },
> > +			{ 4294967296, 0,           7969741314 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt709_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 7729959424,  0 },
> > +			{ 5020601039, -2297803934, -919487572 },
> > +			{ 5020601039, 0,           9108275786 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +	static struct conversion_matrix yvu_bt2020_full = {
> > +		.matrix = {
> > +			{ 4294967296, 6333358775,  0 },
> > +			{ 4294967296, -2453942994, -706750298 },
> > +			{ 4294967296, 0,           8080551471 },
> > +		},
> > +		.y_offset = 0,
> > +	};
> > +	static struct conversion_matrix yvu_bt2020_limited = {
> > +		.matrix = {
> > +			{ 5020601039, 7238124312,  0 },
> > +			{ 5020601039, -2804506279, -807714626 },
> > +			{ 5020601039, 0,           9234915964 },
> > +		},
> > +		.y_offset = 16,
> > +	};
> > +
> > +	/* Breaking in this switch means that the color format+encoding+range is not supported */
> > +	switch (format) {
> > +	case DRM_FORMAT_NV12:
> > +	case DRM_FORMAT_NV16:
> > +	case DRM_FORMAT_NV24:
> > +	case DRM_FORMAT_YUV420:
> > +	case DRM_FORMAT_YUV422:
> > +	case DRM_FORMAT_YUV444:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt601_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT709:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt709_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt709_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT2020:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yuv_bt2020_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yuv_bt2020_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_ENCODING_MAX:
> > +			break;
> > +		}
> > +		break;
> > +	case DRM_FORMAT_YVU420:
> > +	case DRM_FORMAT_YVU422:
> > +	case DRM_FORMAT_YVU444:
> > +	case DRM_FORMAT_NV21:
> > +	case DRM_FORMAT_NV61:
> > +	case DRM_FORMAT_NV42:
> > +		switch (encoding) {
> > +		case DRM_COLOR_YCBCR_BT601:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt601_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt601_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT709:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt709_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt709_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_YCBCR_BT2020:
> > +			switch (range) {
> > +			case DRM_COLOR_YCBCR_LIMITED_RANGE:
> > +				return &yvu_bt2020_limited;
> > +			case DRM_COLOR_YCBCR_FULL_RANGE:
> > +				return &yvu_bt2020_full;
> > +			case DRM_COLOR_RANGE_MAX:
> > +				break;
> > +			}
> > +			break;
> > +		case DRM_COLOR_ENCODING_MAX:
> > +			break;
> > +		}
> > +		break;
> > +	case DRM_FORMAT_ARGB8888:
> > +	case DRM_FORMAT_XRGB8888:
> > +	case DRM_FORMAT_ARGB16161616:
> > +	case DRM_FORMAT_XRGB16161616:
> > +	case DRM_FORMAT_RGB565:
> > +		/*
> > +		 * Those formats are supported, but they don't need a conversion matrix. Return
> > +		 * a valid pointer to avoid kernel panic in case this matrix is used/checked
> > +		 * somewhere.
> > +		 */
> > +		return &no_operation;
> > +	default:
> > +		break;
> > +	}
> > +	WARN(true, "Unsupported encoding (%d), range (%d) and format (%p4cc) combination\n",
> > +	     encoding, range, &format);
> > +	return &no_operation;
> > +}
> > +
> >  /**
> >   * Retrieve the correct write_pixel function for a specific format.
> >   * If the format is not supported by VKMS a warn is emitted and a dummy "don't do anything"
> > diff --git a/drivers/gpu/drm/vkms/vkms_formats.h b/drivers/gpu/drm/vkms/vkms_formats.h
> > index 8d2bef95ff79..e1d324764b17 100644
> > --- a/drivers/gpu/drm/vkms/vkms_formats.h
> > +++ b/drivers/gpu/drm/vkms/vkms_formats.h
> > @@ -9,4 +9,8 @@ pixel_read_line_t get_pixel_read_line_function(u32 format);
> >  
> >  pixel_write_t get_pixel_write_function(u32 format);
> >  
> > +struct conversion_matrix *
> > +get_conversion_matrix_to_argb_u16(u32 format, enum drm_color_encoding encoding,
> > +				  enum drm_color_range range);
> > +
> >  #endif /* _VKMS_FORMATS_H_ */
> > diff --git a/drivers/gpu/drm/vkms/vkms_plane.c b/drivers/gpu/drm/vkms/vkms_plane.c
> > index 8875bed76410..987dd2b686a8 100644
> > --- a/drivers/gpu/drm/vkms/vkms_plane.c
> > +++ b/drivers/gpu/drm/vkms/vkms_plane.c
> > @@ -17,7 +17,19 @@ static const u32 vkms_formats[] = {
> >  	DRM_FORMAT_XRGB8888,
> >  	DRM_FORMAT_XRGB16161616,
> >  	DRM_FORMAT_ARGB16161616,
> > -	DRM_FORMAT_RGB565
> > +	DRM_FORMAT_RGB565,
> > +	DRM_FORMAT_NV12,
> > +	DRM_FORMAT_NV16,
> > +	DRM_FORMAT_NV24,
> > +	DRM_FORMAT_NV21,
> > +	DRM_FORMAT_NV61,
> > +	DRM_FORMAT_NV42,
> > +	DRM_FORMAT_YUV420,
> > +	DRM_FORMAT_YUV422,
> > +	DRM_FORMAT_YUV444,
> > +	DRM_FORMAT_YVU420,
> > +	DRM_FORMAT_YVU422,
> > +	DRM_FORMAT_YVU444
> >  };
> >  
> >  static struct drm_plane_state *
> > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> >  	drm_framebuffer_get(frame_info->fb);
> >  	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> >  									  DRM_MODE_ROTATE_90 |
> > +									  DRM_MODE_ROTATE_180 |
> >  									  DRM_MODE_ROTATE_270 |
> >  									  DRM_MODE_REFLECT_X |
> >  									  DRM_MODE_REFLECT_Y);
> >  
> >  
> >  	vkms_plane_state->pixel_read_line = get_pixel_read_line_function(fmt);
> > +	vkms_plane_state->conversion_matrix = get_conversion_matrix_to_argb_u16
> > +		(fmt, new_state->color_encoding, new_state->color_range);
> >  }
> >  
> >  static int vkms_plane_atomic_check(struct drm_plane *plane,
> > 
> 
> I couldn't pinpoint what would need to be fixed so that rotation would
> not change chroma siting, but I also cannot say that chroma siting is
> definitely correct already.
> 
> Thanks,
> pq

Louis Chauvet April 8, 2024, 7:50 a.m. UTC | #9

Le 27/03/24 - 14:59, Pekka Paalanen a écrit :
> On Tue, 26 Mar 2024 16:57:03 +0100
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 25/03/24 - 11:26, Maíra Canal a écrit :
> > > On 3/13/24 14:45, Louis Chauvet wrote:  
> > > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > > 
> > > > Add support to the YUV formats bellow:
> > > > 
> > > > - NV12/NV16/NV24
> > > > - NV21/NV61/NV42
> > > > - YUV420/YUV422/YUV444
> > > > - YVU420/YVU422/YVU444
> > > > 
> > > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > > 32.32 floats and the drm_fixed helpers.
> > > > 
> > > > To do the conversion, a specific matrix must be used for each color range
> > > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > > encoding and range were obtained by rounding the values of the original
> > > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > > floating point operations.
> > > > 
> > > > The same reading function is used for YUV and YVU formats. As the only
> > > > difference between those two category of formats is the order of field, a
> > > > simple swap in conversion matrix columns allows using the same function.
> > > > 
> > > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > > [Louis Chauvet:
> > > > - Adapted Arthur's work
> > > > - Implemented the read_line_t callbacks for yuv
> > > > - add struct conversion_matrix
> > > > - remove struct pixel_yuv_u8
> > > > - update the commit message
> > > > - Merge the modifications from Arthur]  
> > > 
> > > A Co-developed-by tag would be more appropriate.  
> > 
> > I am not the main author of this part, I only applied a few simple 
> > suggestions, the complex part was done by Arthur.
> > 
> > I will wait for Arthur's confirmation to change it to Co-developed by if
> > he agrees.
> 
> Co-developed-by is an additional tag, and does not replace S-o-b. To my
> understanding, the kernel rules and Developers' Certificate of Origin
> require S-o-b to be added by anyone who has taken a patch and
> re-submitted it, regardless of who the original author is, and
> especially if the patch was modified.
> 
> Personally I also like to keep the list of changes like Louis added, to
> credit people better.

I will keep everything, don't worry!
 
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >   drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > > >   drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > > >   drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > > >   drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > > >   4 files changed, 473 insertions(+), 1 deletion(-)
> > > > 
> 
> ...
> 
> > > >   };
> > > >   
> > > >   static struct drm_plane_state *
> > > > @@ -117,12 +129,15 @@ static void vkms_plane_atomic_update(struct drm_plane *plane,
> > > >   	drm_framebuffer_get(frame_info->fb);
> > > >   	frame_info->rotation = drm_rotation_simplify(new_state->rotation, DRM_MODE_ROTATE_0 |
> > > >   									  DRM_MODE_ROTATE_90 |
> > > > +									  DRM_MODE_ROTATE_180 |  
> > > 
> > > Why do we need to add DRM_MODE_ROTATE_180 here? Isn't the same as
> > > reflecting both along the X and Y axis?  
> > 
> > Oops, I had no intention of putting that change here. I will move it to 
> > another patch.
> > 
> > I don't understand why DRM_MODE_ROTATE_180 isn't in this list. If I read 
> > the drm_rotation_simplify documentation, it explains that this argument 
> > should contain all supported rotations and reflections, and ROT_180 is 
> > supported by VKMS. Perhaps this call is unnecessary because all 
> > combinations are supported by vkms?
> 
> If you truly handle all bit patterns that the rotation bitfield can
> have, then yes, the call seems unnecessary.
> 
> However, as documented, the bitfield contains redundancy: the same
> orientation can be expressed in more than one bit pattern. One example
> is that ROTATE_180 is equivalent to REFLECT_X | REFLECT_Y.
> 
> Since it's a bitmask, userspace can give you funny values like
> ROTATE_0 | ROTATE_90 | ROTATE_180. That is a valid orientation of
> 270-degree rotation (according to UAPI doc), but it is very awkwardly
> expressed, hence the need to normalise it into a minimal bit pattern.

The userspace can't set multiple bit, if you look at [1]:

	if (!is_power_of_2(val & DRM_MODE_ROTATE_MASK)) {
		drm_dbg_atomic(plane->dev,
			       "[PLANE:%d:%s] bad rotation bitmask: 0x%llx\n",
			       plane->base.id, plane->name, val);
		return -EINVAL;
	}

I have a series ready for improving the drm documentation, you will be in 
copy.

[1]: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/drm_atomic_uapi.c#L527
 
> It does not look like drm_rotation_simplify() actually does this
> minimisation!

drm_rotation_simplify() apply an additionnal 
REFLECT_X|REFLECT_Y|ROTATE_180, it is a "no-op" operation, but it 
transform ROT_90|REF_X into ROT_270|REF_Y, so you have eliminated REF_X 
and ROT_90. I will create a patch to document a bit more this function.

In the current vkms code, it will just remove the 180° rotation. I 
think we can just delete this call as everything is supported. I will add 
the patch in the v6 (I don't know why it was there at the first place, 
and I don't want to introduce regression).

> I was not able to tell if DRM common code actually stops userspace from
> combining multiple ROTATE bits in the same value. I suspect it must
> stop them, or perhaps all code dealing with rotation is actually broken.

See [1], and yes, drm helpers are broken with multiple bit sets, they 
expect only one rotation bit.

> drm_rotation_simplify() is useful for cases where your hardware does
> not have exactly the same flexibility. Maybe it cannot do REFLECT_Y but
> it can do everything else? Then drm_rotation_simplify() gives you a bit
> pattern that you can use directly, or fails if the orientation is not
> representable with what your hardware can do.
> 
> At least, that's my understanding of quickly glancing over it.
> 
> IOW, if you wanted to never have to deal with REFLECT_Y bit, you could
> leave it out here. Or, if you never want to deal with ROTATE_180, leave
> that out - you will get REFLECT_X | REFLECT_Y instead. In theory.
> 
> 
> Thanks,
> pq

Louis Chauvet April 8, 2024, 7:50 a.m. UTC | #10

Le 27/03/24 - 13:11, Philipp Zabel a écrit :
> Hi Louis,
> 
> On Mi, 2024-03-13 at 18:45 +0100, Louis Chauvet wrote:
> > From: Arthur Grillo <arthurgrillo@riseup.net>
> > 
> > Add support to the YUV formats bellow:
> > 
> > - NV12/NV16/NV24
> > - NV21/NV61/NV42
> > - YUV420/YUV422/YUV444
> > - YVU420/YVU422/YVU444
> > 
> > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > 32.32 floats and the drm_fixed helpers.
> 
> s/floats/fixed-point numbers/
> 
> Nothing floating here, the point is fixed.
> 
> [...]
> > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > index 23e1d247468d..f3116084de5a 100644
> > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> > @@ -99,6 +99,27 @@ typedef void (*pixel_read_line_t)(const struct vkms_plane_state *plane, int x_st
> >  				  int y_start, enum pixel_read_direction direction, int count,
> >  				  struct pixel_argb_u16 out_pixel[]);
> >  
> > +/**
> > + * CONVERSION_MATRIX_FLOAT_DEPTH - Number of digits after the point for conversion matrix values
> 
> s/CONVERSION_MATRIX_FLOAT_DEPTH/CONVERSION_MATRIX_FRACTIONAL_BITS/
> 
> Just a suggestion, maybe there are better terms, but using "FLOAT" here
> is confusing.
> 
> > + */
> > +#define CONVERSION_MATRIX_FLOAT_DEPTH 32
> > +
> > +/**
> > + * struct conversion_matrix - Matrix to use for a specific encoding and range
> > + *
> > + * @matrix: Conversion matrix from yuv to rgb. The matrix is stored in a row-major manner and is
> > + * used to compute rgb values from yuv values:
> > + *     [[r],[g],[b]] = @matrix * [[y],[u],[v]]
> > + *   OR for yvu formats:
> > + *     [[r],[g],[b]] = @matrix * [[y],[v],[u]]
> > + *  The values of the matrix are fixed floats, 32.CONVERSION_MATRIX_FLOAT_DEPTH
> 
> s/fixed floats/fixed-point numbers/

Thanks for those precision, I will change the wording in v6.

Louis Chauvet
 
> regards
> Philipp

Pekka Paalanen April 9, 2024, 7:58 a.m. UTC | #11

On Mon, 8 Apr 2024 09:50:19 +0200
Louis Chauvet <louis.chauvet@bootlin.com> wrote:

> Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> > On Wed, 13 Mar 2024 18:45:05 +0100
> > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> >   
> > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > 
> > > Add support to the YUV formats bellow:
> > > 
> > > - NV12/NV16/NV24
> > > - NV21/NV61/NV42
> > > - YUV420/YUV422/YUV444
> > > - YVU420/YVU422/YVU444
> > > 
> > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > 32.32 floats and the drm_fixed helpers.  
> > 
> > You mean fixed-point, not floating-point (floats).
> >   
> > > 
> > > To do the conversion, a specific matrix must be used for each color range
> > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > encoding and range were obtained by rounding the values of the original
> > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > floating point operations.
> > > 
> > > The same reading function is used for YUV and YVU formats. As the only
> > > difference between those two category of formats is the order of field, a
> > > simple swap in conversion matrix columns allows using the same function.  
> > 
> > Sounds good!
> >   
> > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > [Louis Chauvet:
> > > - Adapted Arthur's work
> > > - Implemented the read_line_t callbacks for yuv
> > > - add struct conversion_matrix
> > > - remove struct pixel_yuv_u8
> > > - update the commit message
> > > - Merge the modifications from Arthur]
> > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > >  4 files changed, 473 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > index 23e1d247468d..f3116084de5a 100644
> > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h

...

> > > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > > +						  struct conversion_matrix *matrix)  
> > 
> > If you are using the "swap the matrix columns" trick, then you cannot
> > call these cb, cr nor even u,v, because they might be the opposite.
> > They are simply the first and second chroma channel, and their meaning
> > depends on the given matrix.  
> 
> I will rename them for v6, channel_1 and channel_2.
> 
> > > +{
> > > +	u8 r, g, b;
> > > +	s64 fp_y, fp_cb, fp_cr;
> > > +	s64 fp_r, fp_g, fp_b;
> > > +
> > > +	fp_y = y - matrix->y_offset;
> > > +	fp_cb = cb - 128;
> > > +	fp_cr = cr - 128;  
> > 
> > This looks like an incorrect way to convert u8 to fixed-point, but...
> >  
> > > +
> > > +	fp_y = drm_int2fixp(fp_y);
> > > +	fp_cb = drm_int2fixp(fp_cb);
> > > +	fp_cr = drm_int2fixp(fp_cr);  
> > 
> > I find it confusing to re-purpose variables like this.
> > 
> > I'd do just
> > 
> > 	fp_c1 = drm_int2fixp((int)c1 - 128);  
> 
> I agree with this remark, I will change it for the v6.
> 
> > If the function arguments were int to begin with, then the cast would
> > be obviously unnecessary.  
> 
> For this I'm less sure. The name of the function and the usage is 
> explicit: we want to use u8 as input. As we manipulate pointers in 
> read_line, I don't know how it will works if the pointer is dereferenced 
> to a int instead of a u8.

Dereference operator acts on its input type. What happens to the result
is irrelevant.

If we have

u8 *p = ...;

void foo(int x);

then you can call

foo(*v);

if that was your question. Dereference acts on u8* which results in u8.
Then it gets implicitly cast to int.

However, you have a semantic reason to keep the argument as u8, and
that is fine.

> > So, what you have in fp variables at this point is fractional numbers
> > in the 8-bit integer scale. However, because the target format is
> > 16-bit, you should not show the extra precision away here. Instead,
> > multiply by 257 to bring the values to 16-bit scale, and do the RGB
> > clamping to 16-bit, not 8-bit.
> >   
> > > +
> > > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > > +
> > > +	fp_r = drm_fixp2int_round(fp_r);
> > > +	fp_g = drm_fixp2int_round(fp_g);
> > > +	fp_b = drm_fixp2int_round(fp_b);
> > > +
> > > +	r = clamp(fp_r, 0, 0xff);
> > > +	g = clamp(fp_g, 0, 0xff);
> > > +	b = clamp(fp_b, 0, 0xff);
> > > +
> > > +	return argb_u16_from_u8888(255, r, g, b);  
> > 
> > Going through argb_u16_from_u8888() will throw away precision.  
> 
> I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
> v6, please let me know.
> 
> > > +}
> > > +
> > >  /*
> > >   * The following functions are read_line function for each pixel format supported by VKMS.
> > >   *
> > > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > >  	}
> > >  }
> > >  
> > > +/*
> > > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > > + * (column inversion)  
> > 
> > Would be nice to explain what semi_planar_yuv means, so that the
> > documentation for these functions would show how they differ rather
> > than all saying exactly the same thing.  
> 
>  /* This callback can be used for YUV format where each color component is 
>   * stored in a different plane (often called planar formats). It will 
>   * handle correctly subsampling.
> 
>  /*
>   * This callback can be used for YUV formats where U and V values are 
>   * stored in the same plane (often called semi-planar formats). It will 
>   * corectly handle subsampling.
>   * 
>   * The conversion matrix stored in the @plane is used to:
>   * - Apply the correct color range and encoding
>   * - Convert YUV and YVU with the same function (a simple column swap is 
>   *   needed)
>   */

Sounds good. I'd just drop the "It will handle correctly subsampling."
because all code is supposed to be correct by default.

If there is a function that intentionally overlooks something, that
certainly should be documented.


Thanks,
pq

Louis Chauvet April 9, 2024, 10:06 a.m. UTC | #12

Le 09/04/24 - 10:58, Pekka Paalanen a écrit :
> On Mon, 8 Apr 2024 09:50:19 +0200
> Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> 
> > Le 27/03/24 - 16:23, Pekka Paalanen a écrit :
> > > On Wed, 13 Mar 2024 18:45:05 +0100
> > > Louis Chauvet <louis.chauvet@bootlin.com> wrote:
> > >   
> > > > From: Arthur Grillo <arthurgrillo@riseup.net>
> > > > 
> > > > Add support to the YUV formats bellow:
> > > > 
> > > > - NV12/NV16/NV24
> > > > - NV21/NV61/NV42
> > > > - YUV420/YUV422/YUV444
> > > > - YVU420/YVU422/YVU444
> > > > 
> > > > The conversion from yuv to rgb is done with fixed-point arithmetic, using
> > > > 32.32 floats and the drm_fixed helpers.  
> > > 
> > > You mean fixed-point, not floating-point (floats).
> > >   
> > > > 
> > > > To do the conversion, a specific matrix must be used for each color range
> > > > (DRM_COLOR_*_RANGE) and encoding (DRM_COLOR_*). This matrix is stored in
> > > > the `conversion_matrix` struct, along with the specific y_offset needed.
> > > > This matrix is queried only once, in `vkms_plane_atomic_update` and
> > > > stored in a `vkms_plane_state`. Those conversion matrices of each
> > > > encoding and range were obtained by rounding the values of the original
> > > > conversion matrices multiplied by 2^32. This is done to avoid the use of
> > > > floating point operations.
> > > > 
> > > > The same reading function is used for YUV and YVU formats. As the only
> > > > difference between those two category of formats is the order of field, a
> > > > simple swap in conversion matrix columns allows using the same function.  
> > > 
> > > Sounds good!
> > >   
> > > > Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
> > > > [Louis Chauvet:
> > > > - Adapted Arthur's work
> > > > - Implemented the read_line_t callbacks for yuv
> > > > - add struct conversion_matrix
> > > > - remove struct pixel_yuv_u8
> > > > - update the commit message
> > > > - Merge the modifications from Arthur]
> > > > Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>
> > > > ---
> > > >  drivers/gpu/drm/vkms/vkms_drv.h     |  22 ++
> > > >  drivers/gpu/drm/vkms/vkms_formats.c | 431 ++++++++++++++++++++++++++++++++++++
> > > >  drivers/gpu/drm/vkms/vkms_formats.h |   4 +
> > > >  drivers/gpu/drm/vkms/vkms_plane.c   |  17 +-
> > > >  4 files changed, 473 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/vkms/vkms_drv.h b/drivers/gpu/drm/vkms/vkms_drv.h
> > > > index 23e1d247468d..f3116084de5a 100644
> > > > --- a/drivers/gpu/drm/vkms/vkms_drv.h
> > > > +++ b/drivers/gpu/drm/vkms/vkms_drv.h
> 
> ...
> 
> > > > +static struct pixel_argb_u16 argb_u16_from_yuv888(u8 y, u8 cb, u8 cr,
> > > > +						  struct conversion_matrix *matrix)  
> > > 
> > > If you are using the "swap the matrix columns" trick, then you cannot
> > > call these cb, cr nor even u,v, because they might be the opposite.
> > > They are simply the first and second chroma channel, and their meaning
> > > depends on the given matrix.  
> > 
> > I will rename them for v6, channel_1 and channel_2.
> > 
> > > > +{
> > > > +	u8 r, g, b;
> > > > +	s64 fp_y, fp_cb, fp_cr;
> > > > +	s64 fp_r, fp_g, fp_b;
> > > > +
> > > > +	fp_y = y - matrix->y_offset;
> > > > +	fp_cb = cb - 128;
> > > > +	fp_cr = cr - 128;  
> > > 
> > > This looks like an incorrect way to convert u8 to fixed-point, but...
> > >  
> > > > +
> > > > +	fp_y = drm_int2fixp(fp_y);
> > > > +	fp_cb = drm_int2fixp(fp_cb);
> > > > +	fp_cr = drm_int2fixp(fp_cr);  
> > > 
> > > I find it confusing to re-purpose variables like this.
> > > 
> > > I'd do just
> > > 
> > > 	fp_c1 = drm_int2fixp((int)c1 - 128);  
> > 
> > I agree with this remark, I will change it for the v6.
> > 
> > > If the function arguments were int to begin with, then the cast would
> > > be obviously unnecessary.  
> > 
> > For this I'm less sure. The name of the function and the usage is 
> > explicit: we want to use u8 as input. As we manipulate pointers in 
> > read_line, I don't know how it will works if the pointer is dereferenced 
> > to a int instead of a u8.
> 
> Dereference operator acts on its input type. What happens to the result
> is irrelevant.
> 
> If we have
> 
> u8 *p = ...;
> 
> void foo(int x);
> 
> then you can call
> 
> foo(*v);
> 
> if that was your question. Dereference acts on u8* which results in u8.
> Then it gets implicitly cast to int.

Thanks for the clear explaination!
 
> However, you have a semantic reason to keep the argument as u8, and
> that is fine.

So I will keep u8 for the v6.

> > > So, what you have in fp variables at this point is fractional numbers
> > > in the 8-bit integer scale. However, because the target format is
> > > 16-bit, you should not show the extra precision away here. Instead,
> > > multiply by 257 to bring the values to 16-bit scale, and do the RGB
> > > clamping to 16-bit, not 8-bit.
> > >   
> > > > +
> > > > +	fp_r = drm_fixp_mul(matrix->matrix[0][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[0][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[0][2], fp_cr);
> > > > +	fp_g = drm_fixp_mul(matrix->matrix[1][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[1][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[1][2], fp_cr);
> > > > +	fp_b = drm_fixp_mul(matrix->matrix[2][0], fp_y) +
> > > > +	       drm_fixp_mul(matrix->matrix[2][1], fp_cb) +
> > > > +	       drm_fixp_mul(matrix->matrix[2][2], fp_cr);
> > > > +
> > > > +	fp_r = drm_fixp2int_round(fp_r);
> > > > +	fp_g = drm_fixp2int_round(fp_g);
> > > > +	fp_b = drm_fixp2int_round(fp_b);
> > > > +
> > > > +	r = clamp(fp_r, 0, 0xff);
> > > > +	g = clamp(fp_g, 0, 0xff);
> > > > +	b = clamp(fp_b, 0, 0xff);
> > > > +
> > > > +	return argb_u16_from_u8888(255, r, g, b);  
> > > 
> > > Going through argb_u16_from_u8888() will throw away precision.  
> > 
> > I tried to fix it in the v6, IGT tests pass. If something is wrong in the 
> > v6, please let me know.
> > 
> > > > +}
> > > > +
> > > >  /*
> > > >   * The following functions are read_line function for each pixel format supported by VKMS.
> > > >   *
> > > > @@ -293,6 +367,79 @@ static void RGB565_read_line(const struct vkms_plane_state *plane, int x_start,
> > > >  	}
> > > >  }
> > > >  
> > > > +/*
> > > > + * This callback can be used for yuv and yvu formats, given a properly modified conversion matrix
> > > > + * (column inversion)  
> > > 
> > > Would be nice to explain what semi_planar_yuv means, so that the
> > > documentation for these functions would show how they differ rather
> > > than all saying exactly the same thing.  
> > 
> >  /* This callback can be used for YUV format where each color component is 
> >   * stored in a different plane (often called planar formats). It will 
> >   * handle correctly subsampling.
> > 
> >  /*
> >   * This callback can be used for YUV formats where U and V values are 
> >   * stored in the same plane (often called semi-planar formats). It will 
> >   * corectly handle subsampling.
> >   * 
> >   * The conversion matrix stored in the @plane is used to:
> >   * - Apply the correct color range and encoding
> >   * - Convert YUV and YVU with the same function (a simple column swap is 
> >   *   needed)
> >   */
> 
> Sounds good. I'd just drop the "It will handle correctly subsampling."
> because all code is supposed to be correct by default.

Will do for the v6.

Thanks,
Louis Chauvet
 
> If there is a function that intentionally overlooks something, that
> certainly should be documented.
> 
> 
> Thanks,
> pq

[v5,11/16] drm/vkms: Add YUV support

Commit Message

Comments

Patch