diff mbox series

[v2,25/25] drm/xlnx: Compute dumb-buffer sizes with drm_mode_size_dumb()

Message ID 20250109150310.219442-26-tzimmermann@suse.de (mailing list archive)
State New
Delegated to: Kieran Bingham
Headers show
Series drm/dumb-buffers: Fix and improve buffer-size calculation | expand

Commit Message

Thomas Zimmermann Jan. 9, 2025, 2:57 p.m. UTC
Call drm_mode_size_dumb() to compute dumb-buffer scanline pitch and
buffer size. Align the pitch according to hardware requirements.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
---
 drivers/gpu/drm/xlnx/zynqmp_kms.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

Comments

Tomi Valkeinen Jan. 15, 2025, 10:13 a.m. UTC | #1
Hi!

On 09/01/2025 16:57, Thomas Zimmermann wrote:
> Call drm_mode_size_dumb() to compute dumb-buffer scanline pitch and
> buffer size. Align the pitch according to hardware requirements.
> 
> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
> ---
>   drivers/gpu/drm/xlnx/zynqmp_kms.c | 7 +++++--
>   1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c b/drivers/gpu/drm/xlnx/zynqmp_kms.c
> index b47463473472..7ea0cd4f71d3 100644
> --- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
> +++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
> @@ -19,6 +19,7 @@
>   #include <drm/drm_crtc.h>
>   #include <drm/drm_device.h>
>   #include <drm/drm_drv.h>
> +#include <drm/drm_dumb_buffers.h>
>   #include <drm/drm_encoder.h>
>   #include <drm/drm_fbdev_dma.h>
>   #include <drm/drm_fourcc.h>
> @@ -363,10 +364,12 @@ static int zynqmp_dpsub_dumb_create(struct drm_file *file_priv,
>   				    struct drm_mode_create_dumb *args)
>   {
>   	struct zynqmp_dpsub *dpsub = to_zynqmp_dpsub(drm);
> -	unsigned int pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
> +	int ret;
>   
>   	/* Enforce the alignment constraints of the DMA engine. */
> -	args->pitch = ALIGN(pitch, dpsub->dma_align);
> +	ret = drm_mode_size_dumb(drm, args, dpsub->dma_align, 0);
> +	if (ret)
> +		return ret;
>   
>   	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
>   }

I have some trouble with this one.

I have sent a series to add some pixel formats:

https://lore.kernel.org/all/20250115-xilinx-formats-v2-0-160327ca652a@ideasonboard.com/

Let's look at XV15. It's similar to NV12, but 10 bits per component, and 
some packing and padding.

First plane: 3 pixels in a 32 bit group
Second plane: 3 pixels in a 64 bit group, 2x2 subsampled

So, on average, a pixel on the first plane takes 32 / 3 = 10.666... bits 
on a line. That's not a usable number for the DRM_IOCTL_MODE_CREATE_DUMB 
ioctl.

What I did was to use the pixel group size as "bpp" for 
DRM_IOCTL_MODE_CREATE_DUMB. So, e.g., for 720 x 576:

Stride for first plane: 720 * (32 / 3) / 8 = 960 bytes
Stride for second plane: 720 / 2 * (64 / 3) / 8 = 960 bytes

First plane: 720 / 3 = 240 pixel groups
Second plane: 720 / 2 / 3 = 120 pixel groups

So I allocated the two planes with:
240 x 576 with 32 bitspp
120 x 288 with 64 bitspp

This worked, and if I look at the DRM_IOCTL_MODE_CREATE_DUMB in the 
docs, I can't right away see anything there that says my tactic was not 
allowed.

The above doesn't work anymore with this patch, as the code calls 
drm_driver_color_mode_format(), which fails for 64 bitspp. It feels a 
bit odd that DRM_IOCTL_MODE_CREATE_DUMB will try to guess the RGB fourcc 
for a dumb buffer allocation.

So, what to do here? Am I doing something silly? What's the correct way 
to allocate the buffers for XV15? Should I just use 32 bitspp for the 
plane 2 too, and double the width (this works)?

Is DRM_IOCTL_MODE_CREATE_DUMB only meant for simple RGB formats? The 
xilinx driver can, of course, just not use drm_mode_size_dumb(). But if 
so, I guess the limitations of drm_mode_size_dumb() should be documented.

Do we need a new dumb-alloc ioctl that takes the format and plane number 
as parameters? Or alternatively a simpler dumb-alloc that doesn't have 
width and bpp, but instead takes a stride and height as parameters? I 
think those would be easier for the userspace to use, instead of trying 
to adjust the parameters to be suitable for the kernel.

  Tomi
Thomas Zimmermann Jan. 15, 2025, 10:26 a.m. UTC | #2
Hi


Am 15.01.25 um 11:13 schrieb Tomi Valkeinen:
> Hi!
>
> On 09/01/2025 16:57, Thomas Zimmermann wrote:
>> Call drm_mode_size_dumb() to compute dumb-buffer scanline pitch and
>> buffer size. Align the pitch according to hardware requirements.
>>
>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
>> ---
>>   drivers/gpu/drm/xlnx/zynqmp_kms.c | 7 +++++--
>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c 
>> b/drivers/gpu/drm/xlnx/zynqmp_kms.c
>> index b47463473472..7ea0cd4f71d3 100644
>> --- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
>> +++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
>> @@ -19,6 +19,7 @@
>>   #include <drm/drm_crtc.h>
>>   #include <drm/drm_device.h>
>>   #include <drm/drm_drv.h>
>> +#include <drm/drm_dumb_buffers.h>
>>   #include <drm/drm_encoder.h>
>>   #include <drm/drm_fbdev_dma.h>
>>   #include <drm/drm_fourcc.h>
>> @@ -363,10 +364,12 @@ static int zynqmp_dpsub_dumb_create(struct 
>> drm_file *file_priv,
>>                       struct drm_mode_create_dumb *args)
>>   {
>>       struct zynqmp_dpsub *dpsub = to_zynqmp_dpsub(drm);
>> -    unsigned int pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
>> +    int ret;
>>         /* Enforce the alignment constraints of the DMA engine. */
>> -    args->pitch = ALIGN(pitch, dpsub->dma_align);
>> +    ret = drm_mode_size_dumb(drm, args, dpsub->dma_align, 0);
>> +    if (ret)
>> +        return ret;
>>         return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
>>   }
>
> I have some trouble with this one.
>
> I have sent a series to add some pixel formats:
>
> https://lore.kernel.org/all/20250115-xilinx-formats-v2-0-160327ca652a@ideasonboard.com/ 
>
>
> Let's look at XV15. It's similar to NV12, but 10 bits per component, 
> and some packing and padding.
>
> First plane: 3 pixels in a 32 bit group
> Second plane: 3 pixels in a 64 bit group, 2x2 subsampled
>
> So, on average, a pixel on the first plane takes 32 / 3 = 10.666... 
> bits on a line. That's not a usable number for the 
> DRM_IOCTL_MODE_CREATE_DUMB ioctl.
>
> What I did was to use the pixel group size as "bpp" for 
> DRM_IOCTL_MODE_CREATE_DUMB. So, e.g., for 720 x 576:
>
> Stride for first plane: 720 * (32 / 3) / 8 = 960 bytes
> Stride for second plane: 720 / 2 * (64 / 3) / 8 = 960 bytes
>
> First plane: 720 / 3 = 240 pixel groups
> Second plane: 720 / 2 / 3 = 120 pixel groups
>
> So I allocated the two planes with:
> 240 x 576 with 32 bitspp
> 120 x 288 with 64 bitspp
>
> This worked, and if I look at the DRM_IOCTL_MODE_CREATE_DUMB in the 
> docs, I can't right away see anything there that says my tactic was 
> not allowed.
>
> The above doesn't work anymore with this patch, as the code calls 
> drm_driver_color_mode_format(), which fails for 64 bitspp. It feels a 
> bit odd that DRM_IOCTL_MODE_CREATE_DUMB will try to guess the RGB 
> fourcc for a dumb buffer allocation.
>
> So, what to do here? Am I doing something silly? What's the correct 
> way to allocate the buffers for XV15? Should I just use 32 bitspp for 
> the plane 2 too, and double the width (this works)?
>
> Is DRM_IOCTL_MODE_CREATE_DUMB only meant for simple RGB formats? The 
> xilinx driver can, of course, just not use drm_mode_size_dumb(). But 
> if so, I guess the limitations of drm_mode_size_dumb() should be 
> documented.
>
> Do we need a new dumb-alloc ioctl that takes the format and plane 
> number as parameters? Or alternatively a simpler dumb-alloc that 
> doesn't have width and bpp, but instead takes a stride and height as 
> parameters? I think those would be easier for the userspace to use, 
> instead of trying to adjust the parameters to be suitable for the kernel.

These are all good points. Did you read my discussion with Andy on patch 
2? I think it resolves all the points you have. The current CREATE_DUMB 
ioctl is unsuited for anything but the simple RGB formats. The bpp 
parameter is not very precise. The solution would be a new ioctl call 
that receives the DRM format and returns a buffer for each individual plane.

I provided a workaround patch that uses the bpp value directly if 
drm_driver_color_mode_format() does not support the bpp value. 
User-space code has to allocate a large enough buffer via the current 
CREATE_DUMB and compute the individual planes itself. See [1] for an 
example. [1] 
https://gitlab.freedesktop.org/mesa/drm/-/blob/main/tests/modetest/buffers.c?ref_type=heads#L302 
Does this work for you? Otherwise, I guess we should be talking about a 
possible CREATE_DUMB2 that fixes these shortcomings. Best regards Thomas
>
>  Tomi
>
Tomi Valkeinen Jan. 15, 2025, 10:58 a.m. UTC | #3
Hi,

On 15/01/2025 12:26, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 15.01.25 um 11:13 schrieb Tomi Valkeinen:
>> Hi!
>>
>> On 09/01/2025 16:57, Thomas Zimmermann wrote:
>>> Call drm_mode_size_dumb() to compute dumb-buffer scanline pitch and
>>> buffer size. Align the pitch according to hardware requirements.
>>>
>>> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
>>> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
>>> ---
>>>   drivers/gpu/drm/xlnx/zynqmp_kms.c | 7 +++++--
>>>   1 file changed, 5 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c b/drivers/gpu/drm/ 
>>> xlnx/zynqmp_kms.c
>>> index b47463473472..7ea0cd4f71d3 100644
>>> --- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
>>> +++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
>>> @@ -19,6 +19,7 @@
>>>   #include <drm/drm_crtc.h>
>>>   #include <drm/drm_device.h>
>>>   #include <drm/drm_drv.h>
>>> +#include <drm/drm_dumb_buffers.h>
>>>   #include <drm/drm_encoder.h>
>>>   #include <drm/drm_fbdev_dma.h>
>>>   #include <drm/drm_fourcc.h>
>>> @@ -363,10 +364,12 @@ static int zynqmp_dpsub_dumb_create(struct 
>>> drm_file *file_priv,
>>>                       struct drm_mode_create_dumb *args)
>>>   {
>>>       struct zynqmp_dpsub *dpsub = to_zynqmp_dpsub(drm);
>>> -    unsigned int pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
>>> +    int ret;
>>>         /* Enforce the alignment constraints of the DMA engine. */
>>> -    args->pitch = ALIGN(pitch, dpsub->dma_align);
>>> +    ret = drm_mode_size_dumb(drm, args, dpsub->dma_align, 0);
>>> +    if (ret)
>>> +        return ret;
>>>         return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
>>>   }
>>
>> I have some trouble with this one.
>>
>> I have sent a series to add some pixel formats:
>>
>> https://lore.kernel.org/all/20250115-xilinx-formats- 
>> v2-0-160327ca652a@ideasonboard.com/
>>
>> Let's look at XV15. It's similar to NV12, but 10 bits per component, 
>> and some packing and padding.
>>
>> First plane: 3 pixels in a 32 bit group
>> Second plane: 3 pixels in a 64 bit group, 2x2 subsampled
>>
>> So, on average, a pixel on the first plane takes 32 / 3 = 10.666... 
>> bits on a line. That's not a usable number for the 
>> DRM_IOCTL_MODE_CREATE_DUMB ioctl.
>>
>> What I did was to use the pixel group size as "bpp" for 
>> DRM_IOCTL_MODE_CREATE_DUMB. So, e.g., for 720 x 576:
>>
>> Stride for first plane: 720 * (32 / 3) / 8 = 960 bytes
>> Stride for second plane: 720 / 2 * (64 / 3) / 8 = 960 bytes
>>
>> First plane: 720 / 3 = 240 pixel groups
>> Second plane: 720 / 2 / 3 = 120 pixel groups
>>
>> So I allocated the two planes with:
>> 240 x 576 with 32 bitspp
>> 120 x 288 with 64 bitspp
>>
>> This worked, and if I look at the DRM_IOCTL_MODE_CREATE_DUMB in the 
>> docs, I can't right away see anything there that says my tactic was 
>> not allowed.
>>
>> The above doesn't work anymore with this patch, as the code calls 
>> drm_driver_color_mode_format(), which fails for 64 bitspp. It feels a 
>> bit odd that DRM_IOCTL_MODE_CREATE_DUMB will try to guess the RGB 
>> fourcc for a dumb buffer allocation.
>>
>> So, what to do here? Am I doing something silly? What's the correct 
>> way to allocate the buffers for XV15? Should I just use 32 bitspp for 
>> the plane 2 too, and double the width (this works)?
>>
>> Is DRM_IOCTL_MODE_CREATE_DUMB only meant for simple RGB formats? The 
>> xilinx driver can, of course, just not use drm_mode_size_dumb(). But 
>> if so, I guess the limitations of drm_mode_size_dumb() should be 
>> documented.
>>
>> Do we need a new dumb-alloc ioctl that takes the format and plane 
>> number as parameters? Or alternatively a simpler dumb-alloc that 
>> doesn't have width and bpp, but instead takes a stride and height as 
>> parameters? I think those would be easier for the userspace to use, 
>> instead of trying to adjust the parameters to be suitable for the kernel.
> 
> These are all good points. Did you read my discussion with Andy on patch 
> 2? I think it resolves all the points you have. The current CREATE_DUMB 

I had missed the discussion, and, indeed, the patch you attached fixes 
the problem on Xilinx.

> ioctl is unsuited for anything but the simple RGB formats. The bpp 

It's a bit difficult to use, but is it really unsuited? bitsperpixel, 
width and height do give an exact pitch and size, do they not? It does 
require the userspace to handle the subsampling and planes, though, so 
far from perfect.

So, I'm all for a new ioctl, but I don't right away see why the current 
ioctl couldn't be used. Which makes me wonder about the drm_warn() in 
your patch, and the "userspace throws in arbitrary values for bpp and 
relies on the kernel to figure it out". Maybe I'm missing something here.

> parameter is not very precise. The solution would be a new ioctl call 
> that receives the DRM format and returns a buffer for each individual 
> plane.

Yes, I think that makes sense. That's a long road, though =). So my 
question is, is CREATE_DUMB really unsuitable for other than simple RGB 
formats, or can it be suitable if we just define how the userspace 
should use it for multiplanar, subsampled formats?

  Tomi
Thomas Zimmermann Jan. 15, 2025, 11:37 a.m. UTC | #4
Hi


Am 15.01.25 um 11:58 schrieb Tomi Valkeinen:
[...]
>> These are all good points. Did you read my discussion with Andy on 
>> patch 2? I think it resolves all the points you have. The current 
>> CREATE_DUMB 
>
> I had missed the discussion, and, indeed, the patch you attached fixes 
> the problem on Xilinx.

Great. Thanks for testing.

>
>> ioctl is unsuited for anything but the simple RGB formats. The bpp 
>
> It's a bit difficult to use, but is it really unsuited? bitsperpixel, 
> width and height do give an exact pitch and size, do they not? It does 
> require the userspace to handle the subsampling and planes, though, so 
> far from perfect.

The bpp value sets the number of bits per pixel; except for bpp==15 
(XRGB1555), where it sets the color depth. OR bpp is the color depth; 
except for bpp==32 (XRGB8888), where it is the number of bits per pixel. 
It's therefore best to interpret it like a color-mode enum.

>
> So, I'm all for a new ioctl, but I don't right away see why the 
> current ioctl couldn't be used. Which makes me wonder about the 
> drm_warn() in your patch, and the "userspace throws in arbitrary 
> values for bpp and relies on the kernel to figure it out". Maybe I'm 
> missing something here.

I was unsure about the drm_warn() as well. It's not really wrong to have 
odd bpp values, but handing in an unknown bpp value might point to a 
user-space error. At least there should be a drm_dbg().

>
>> parameter is not very precise. The solution would be a new ioctl call 
>> that receives the DRM format and returns a buffer for each individual 
>> plane.
>
> Yes, I think that makes sense. That's a long road, though =). So my 
> question is, is CREATE_DUMB really unsuitable for other than simple 
> RGB formats, or can it be suitable if we just define how the userspace 
> should use it for multiplanar, subsampled formats?

That would duplicate format and hardware information in user-space. Some 
hardware might have odd per-plane limitations that only the driver knows 
about. For example, there's another discussion on dri-devel about 
pitch-alignment requirements of DRM_FORMAT_MOD_LINEAR on various 
hardware. That affects dumb buffers as well. I don't think that there's 
an immediate need for a CREATE_DUMB2, but it seems worth to keep in mind.

Best regards
Thomas

>
>  Tomi
>
Tomi Valkeinen Jan. 15, 2025, 12:06 p.m. UTC | #5
Hi,

On 15/01/2025 13:37, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 15.01.25 um 11:58 schrieb Tomi Valkeinen:
> [...]
>>> These are all good points. Did you read my discussion with Andy on 
>>> patch 2? I think it resolves all the points you have. The current 
>>> CREATE_DUMB 
>>
>> I had missed the discussion, and, indeed, the patch you attached fixes 
>> the problem on Xilinx.
> 
> Great. Thanks for testing.
> 
>>
>>> ioctl is unsuited for anything but the simple RGB formats. The bpp 
>>
>> It's a bit difficult to use, but is it really unsuited? bitsperpixel, 
>> width and height do give an exact pitch and size, do they not? It does 
>> require the userspace to handle the subsampling and planes, though, so 
>> far from perfect.
> 
> The bpp value sets the number of bits per pixel; except for bpp==15 
> (XRGB1555), where it sets the color depth. OR bpp is the color depth; 
> except for bpp==32 (XRGB8888), where it is the number of bits per pixel. 
> It's therefore best to interpret it like a color-mode enum.

Ah, right... That's horrible =).

And I assume it's not really possible to define the bpp to mean bits per 
pixel, except for a few special cases like 15?

Why do we even really care about color depth here? We're just allocating 
memory. Doesn't DIV_ROUND_UP(args->bpp, SZ_8) work fine for XRGB1555 too?

>> So, I'm all for a new ioctl, but I don't right away see why the 
>> current ioctl couldn't be used. Which makes me wonder about the 
>> drm_warn() in your patch, and the "userspace throws in arbitrary 
>> values for bpp and relies on the kernel to figure it out". Maybe I'm 
>> missing something here.
> 
> I was unsure about the drm_warn() as well. It's not really wrong to have 
> odd bpp values, but handing in an unknown bpp value might point to a 
> user-space error. At least there should be a drm_dbg().
> 
>>
>>> parameter is not very precise. The solution would be a new ioctl call 
>>> that receives the DRM format and returns a buffer for each individual 
>>> plane.
>>
>> Yes, I think that makes sense. That's a long road, though =). So my 
>> question is, is CREATE_DUMB really unsuitable for other than simple 
>> RGB formats, or can it be suitable if we just define how the userspace 
>> should use it for multiplanar, subsampled formats?
> 
> That would duplicate format and hardware information in user-space. Some 

But we already have that, don't we? We have drivers and userspace that 
support, say, NV12 via dumb buffers. But (correct me if I'm wrong) we 
don't document how CREATE_DUMB has to be used to allocate multiplanar 
subsampled buffers, so the userspace devs have to "guess".

> hardware might have odd per-plane limitations that only the driver knows 
> about. For example, there's another discussion on dri-devel about pitch- 
> alignment requirements of DRM_FORMAT_MOD_LINEAR on various hardware. 
> That affects dumb buffers as well. I don't think that there's an 
> immediate need for a CREATE_DUMB2, but it seems worth to keep in mind.

Yes, the current CREATE_DUMB can't cover all the hardware. We do need 
CREATE_DUMB2, sooner or later. I just hope we can define and document a 
set of rules that allows using CREATE_DUMB for the cases where it 
sensibly works (and is already being used).

  Tomi
Thomas Zimmermann Jan. 15, 2025, 12:34 p.m. UTC | #6
Hi


Am 15.01.25 um 13:06 schrieb Tomi Valkeinen:
> Hi,
>
> On 15/01/2025 13:37, Thomas Zimmermann wrote:
>> Hi
>>
>>
>> Am 15.01.25 um 11:58 schrieb Tomi Valkeinen:
>> [...]
>>>> These are all good points. Did you read my discussion with Andy on 
>>>> patch 2? I think it resolves all the points you have. The current 
>>>> CREATE_DUMB 
>>>
>>> I had missed the discussion, and, indeed, the patch you attached 
>>> fixes the problem on Xilinx.
>>
>> Great. Thanks for testing.
>>
>>>
>>>> ioctl is unsuited for anything but the simple RGB formats. The bpp 
>>>
>>> It's a bit difficult to use, but is it really unsuited? 
>>> bitsperpixel, width and height do give an exact pitch and size, do 
>>> they not? It does require the userspace to handle the subsampling 
>>> and planes, though, so far from perfect.
>>
>> The bpp value sets the number of bits per pixel; except for bpp==15 
>> (XRGB1555), where it sets the color depth. OR bpp is the color depth; 
>> except for bpp==32 (XRGB8888), where it is the number of bits per 
>> pixel. It's therefore best to interpret it like a color-mode enum.
>
> Ah, right... That's horrible =).
>
> And I assume it's not really possible to define the bpp to mean bits 
> per pixel, except for a few special cases like 15?
>
> Why do we even really care about color depth here? We're just 
> allocating memory. Doesn't DIV_ROUND_UP(args->bpp, SZ_8) work fine for 
> XRGB1555 too?

Drivers always did that, but it does not work correctly for (bpp < 8). 
As we already have helpers to deal with bpp, it makes sense to use 
them.  This also aligns dumb buffers with the kernel's video= parameter, 
which as the same odd semantics. The fallback that uses bpp directly 
will hopefully be the exception.

>
>>> So, I'm all for a new ioctl, but I don't right away see why the 
>>> current ioctl couldn't be used. Which makes me wonder about the 
>>> drm_warn() in your patch, and the "userspace throws in arbitrary 
>>> values for bpp and relies on the kernel to figure it out". Maybe I'm 
>>> missing something here.
>>
>> I was unsure about the drm_warn() as well. It's not really wrong to 
>> have odd bpp values, but handing in an unknown bpp value might point 
>> to a user-space error. At least there should be a drm_dbg().
>>
>>>
>>>> parameter is not very precise. The solution would be a new ioctl 
>>>> call that receives the DRM format and returns a buffer for each 
>>>> individual plane.
>>>
>>> Yes, I think that makes sense. That's a long road, though =). So my 
>>> question is, is CREATE_DUMB really unsuitable for other than simple 
>>> RGB formats, or can it be suitable if we just define how the 
>>> userspace should use it for multiplanar, subsampled formats?
>>
>> That would duplicate format and hardware information in user-space. Some 
>
> But we already have that, don't we? We have drivers and userspace that 
> support, say, NV12 via dumb buffers. But (correct me if I'm wrong) we 
> don't document how CREATE_DUMB has to be used to allocate multiplanar 
> subsampled buffers, so the userspace devs have to "guess".

Yeah, there are constrains in the scanline and buffer alignments and 
orientation. And if we say that bpp==12 means NV12, it will be a problem 
for all other cases where bpp==12 makes sense.

Best regards
Thomas

>
>> hardware might have odd per-plane limitations that only the driver 
>> knows about. For example, there's another discussion on dri-devel 
>> about pitch- alignment requirements of DRM_FORMAT_MOD_LINEAR on 
>> various hardware. That affects dumb buffers as well. I don't think 
>> that there's an immediate need for a CREATE_DUMB2, but it seems worth 
>> to keep in mind.
>
> Yes, the current CREATE_DUMB can't cover all the hardware. We do need 
> CREATE_DUMB2, sooner or later. I just hope we can define and document 
> a set of rules that allows using CREATE_DUMB for the cases where it 
> sensibly works (and is already being used).
>
>  Tomi
>
Tomi Valkeinen Jan. 15, 2025, 1:33 p.m. UTC | #7
Hi,

On 15/01/2025 14:34, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 15.01.25 um 13:06 schrieb Tomi Valkeinen:
>> Hi,
>>
>> On 15/01/2025 13:37, Thomas Zimmermann wrote:
>>> Hi
>>>
>>>
>>> Am 15.01.25 um 11:58 schrieb Tomi Valkeinen:
>>> [...]
>>>>> These are all good points. Did you read my discussion with Andy on 
>>>>> patch 2? I think it resolves all the points you have. The current 
>>>>> CREATE_DUMB 
>>>>
>>>> I had missed the discussion, and, indeed, the patch you attached 
>>>> fixes the problem on Xilinx.
>>>
>>> Great. Thanks for testing.
>>>
>>>>
>>>>> ioctl is unsuited for anything but the simple RGB formats. The bpp 
>>>>
>>>> It's a bit difficult to use, but is it really unsuited? 
>>>> bitsperpixel, width and height do give an exact pitch and size, do 
>>>> they not? It does require the userspace to handle the subsampling 
>>>> and planes, though, so far from perfect.
>>>
>>> The bpp value sets the number of bits per pixel; except for bpp==15 
>>> (XRGB1555), where it sets the color depth. OR bpp is the color depth; 
>>> except for bpp==32 (XRGB8888), where it is the number of bits per 
>>> pixel. It's therefore best to interpret it like a color-mode enum.
>>
>> Ah, right... That's horrible =).
>>
>> And I assume it's not really possible to define the bpp to mean bits 
>> per pixel, except for a few special cases like 15?
>>
>> Why do we even really care about color depth here? We're just 
>> allocating memory. Doesn't DIV_ROUND_UP(args->bpp, SZ_8) work fine for 
>> XRGB1555 too?
> 
> Drivers always did that, but it does not work correctly for (bpp < 8). 
> As we already have helpers to deal with bpp, it makes sense to use 
> them.  This also aligns dumb buffers with the kernel's video= parameter, 
> which as the same odd semantics. The fallback that uses bpp directly 
> will hopefully be the exception.

Hmm, well... If we had a 64-bit RGB format in the list of "legacy fb 
formats", I wouldn't have noticed anything. And if I would just use 32 
as the bpp, and adjust width accordingly, it would also have worked. So, 
I expect the fallback to be an exception,

And by working I mean that I can run my apps fine, but the internal 
operation would sure be odd: allocating any YUV buffer will cause 
drm_mode_size_dumb() to get an RGB format from 
drm_driver_color_mode_format(), and get a drm_format_info_min_pitch() 
for an RGB format.

>>>> So, I'm all for a new ioctl, but I don't right away see why the 
>>>> current ioctl couldn't be used. Which makes me wonder about the 
>>>> drm_warn() in your patch, and the "userspace throws in arbitrary 
>>>> values for bpp and relies on the kernel to figure it out". Maybe I'm 
>>>> missing something here.
>>>
>>> I was unsure about the drm_warn() as well. It's not really wrong to 
>>> have odd bpp values, but handing in an unknown bpp value might point 
>>> to a user-space error. At least there should be a drm_dbg().
>>>
>>>>
>>>>> parameter is not very precise. The solution would be a new ioctl 
>>>>> call that receives the DRM format and returns a buffer for each 
>>>>> individual plane.
>>>>
>>>> Yes, I think that makes sense. That's a long road, though =). So my 
>>>> question is, is CREATE_DUMB really unsuitable for other than simple 
>>>> RGB formats, or can it be suitable if we just define how the 
>>>> userspace should use it for multiplanar, subsampled formats?
>>>
>>> That would duplicate format and hardware information in user-space. Some 
>>
>> But we already have that, don't we? We have drivers and userspace that 
>> support, say, NV12 via dumb buffers. But (correct me if I'm wrong) we 
>> don't document how CREATE_DUMB has to be used to allocate multiplanar 
>> subsampled buffers, so the userspace devs have to "guess".
> 
> Yeah, there are constrains in the scanline and buffer alignments and 
> orientation. And if we say that bpp==12 means NV12, it will be a problem 
> for all other cases where bpp==12 makes sense.

I feel I still don't quite understand. Can't we define and document 
CREATE_DUMB like this:

If (bpp < 8 || is_power_of_two(bpp))
	bpp means bitsperpixel
	pitch is args->width * args->bpp / 8, aligned up to driver-specific-align
else
	bpp is a legacy parameter, and we deal with it case by case.
	list the cases and what they mean

And describe that when allocating subsampled buffers, the caller must 
adjust the width and height accordingly. And that the bpp and width can 
also refer to pixel groups.

Or if the currently existing code prevents the above for 16 and 32 bpps, 
how about defining that any non-RGB or not-simple buffer has to be 
allocated with bpp=8, and the userspace has to align the pitch correctly 
according to the format and platform's hw restrictions?

  Tomi
Thomas Zimmermann Jan. 15, 2025, 1:45 p.m. UTC | #8
Hi


Am 15.01.25 um 14:33 schrieb Tomi Valkeinen:
[...]
>> Yeah, there are constrains in the scanline and buffer alignments and 
>> orientation. And if we say that bpp==12 means NV12, it will be a 
>> problem for all other cases where bpp==12 makes sense.
>
> I feel I still don't quite understand. Can't we define and document 
> CREATE_DUMB like this:
>
> If (bpp < 8 || is_power_of_two(bpp))
>     bpp means bitsperpixel
>     pitch is args->width * args->bpp / 8, aligned up to 
> driver-specific-align
> else
>     bpp is a legacy parameter, and we deal with it case by case.
>     list the cases and what they mean
>
> And describe that when allocating subsampled buffers, the caller must 
> adjust the width and height accordingly. And that the bpp and width 
> can also refer to pixel groups.
>
> Or if the currently existing code prevents the above for 16 and 32 
> bpps, how about defining that any non-RGB or not-simple buffer has to 
> be allocated with bpp=8, and the userspace has to align the pitch 
> correctly according to the format and platform's hw restrictions?

What if a hardware requires certain per-format alignments? Or requires 
certain alignments for each plane? Or only supports tile modes? Or has 
strict limits on the maximum buffer size?

It is not possible to encode all this in a simple 32-bit value. So 
user-space code has to be aware of all this and tweak bpp-based 
allocation to make it work. Obviously you can use the current UAPI for 
your use case. It's just not optimal or future proof.

Best regards
Thomas

>
>
>  Tomi
>
Tomi Valkeinen Jan. 15, 2025, 2:20 p.m. UTC | #9
On 15/01/2025 15:45, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 15.01.25 um 14:33 schrieb Tomi Valkeinen:
> [...]
>>> Yeah, there are constrains in the scanline and buffer alignments and 
>>> orientation. And if we say that bpp==12 means NV12, it will be a 
>>> problem for all other cases where bpp==12 makes sense.
>>
>> I feel I still don't quite understand. Can't we define and document 
>> CREATE_DUMB like this:
>>
>> If (bpp < 8 || is_power_of_two(bpp))
>>     bpp means bitsperpixel
>>     pitch is args->width * args->bpp / 8, aligned up to driver- 
>> specific-align
>> else
>>     bpp is a legacy parameter, and we deal with it case by case.
>>     list the cases and what they mean
>>
>> And describe that when allocating subsampled buffers, the caller must 
>> adjust the width and height accordingly. And that the bpp and width 
>> can also refer to pixel groups.
>>
>> Or if the currently existing code prevents the above for 16 and 32 
>> bpps, how about defining that any non-RGB or not-simple buffer has to 
>> be allocated with bpp=8, and the userspace has to align the pitch 
>> correctly according to the format and platform's hw restrictions?
> 
> What if a hardware requires certain per-format alignments? Or requires 
> certain alignments for each plane? Or only supports tile modes? Or has 
> strict limits on the maximum buffer size?
> 
> It is not possible to encode all this in a simple 32-bit value. So user- 
> space code has to be aware of all this and tweak bpp-based allocation to 
> make it work. Obviously you can use the current UAPI for your use case. 
> It's just not optimal or future proof.

No disagreement there, we need CREATE_DUMB2.

My point is that we have the current UAPI, and we have userspace using 
it, but we don't have clear rules what the ioctl does with specific 
parameters, and we don't document how it has to be used.

Perhaps the situation is bad, and all we can really say is that 
CREATE_DUMB only works for use with simple RGB formats, and the behavior 
for all other formats is platform specific. But I think even that would 
be valuable in the UAPI docs.

Thinking about this, I wonder if this change is good for omapdrm or 
xilinx (probably other platforms too that support non-simple non-RGB 
formats via dumb buffers): without this patch, in both drivers, the 
pitch calculations just take the bpp as bit-per-pixels, align it up, and 
that's it.

With this patch we end up using drm_driver_color_mode_format(), and 
aligning buffers according to RGB formats figured out via heuristics. It 
does happen to work, for the formats I tested, but it sounds like 
something that might easily not work, as it's doing adjustments based on 
wrong format.

Should we have another version of drm_mode_size_dumb() which just 
calculates using the bpp, without the drm_driver_color_mode_format() 
path? Or does the drm_driver_color_mode_format() path provide some value 
for the drivers that do not currently do anything similar?

  Tomi
Daniel Stone Jan. 15, 2025, 2:34 p.m. UTC | #10
On Wed, 15 Jan 2025 at 14:20, Tomi Valkeinen
<tomi.valkeinen@ideasonboard.com> wrote:
> No disagreement there, we need CREATE_DUMB2.
>
> My point is that we have the current UAPI, and we have userspace using
> it, but we don't have clear rules what the ioctl does with specific
> parameters, and we don't document how it has to be used.
>
> Perhaps the situation is bad, and all we can really say is that
> CREATE_DUMB only works for use with simple RGB formats, and the behavior
> for all other formats is platform specific. But I think even that would
> be valuable in the UAPI docs.

Yeah, CREATE_DUMB only works for use with simple RGB formats in a
linear layout. Not monochrome or YUV or tiled or displayed rotated or
whatever.

If it happens to accidentally work for other uses, that's fine, but
it's not generically reliable for anything other than simple linear
RGB. It's intended to let you do splash screens, consoles, recovery
password entries, and software-rendered compositors if you really
want. Anything more than that isn't 'dumb'.

Cheers,
Daniel
Thomas Zimmermann Jan. 16, 2025, 8:09 a.m. UTC | #11
Hi


Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
[...]
>
> My point is that we have the current UAPI, and we have userspace using 
> it, but we don't have clear rules what the ioctl does with specific 
> parameters, and we don't document how it has to be used.
>
> Perhaps the situation is bad, and all we can really say is that 
> CREATE_DUMB only works for use with simple RGB formats, and the 
> behavior for all other formats is platform specific. But I think even 
> that would be valuable in the UAPI docs.

To be honest, I would not want to specify behavior for anything but the 
linear RGB formats. If anything, I'd take Daniel's reply mail for 
documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.

>
> Thinking about this, I wonder if this change is good for omapdrm or 
> xilinx (probably other platforms too that support non-simple non-RGB 
> formats via dumb buffers): without this patch, in both drivers, the 
> pitch calculations just take the bpp as bit-per-pixels, align it up, 
> and that's it.
>
> With this patch we end up using drm_driver_color_mode_format(), and 
> aligning buffers according to RGB formats figured out via heuristics. 
> It does happen to work, for the formats I tested, but it sounds like 
> something that might easily not work, as it's doing adjustments based 
> on wrong format.
>
> Should we have another version of drm_mode_size_dumb() which just 
> calculates using the bpp, without the drm_driver_color_mode_format() 
> path? Or does the drm_driver_color_mode_format() path provide some 
> value for the drivers that do not currently do anything similar?

With the RGB-only rule, using drm_driver_color_mode_format() makes 
sense. It aligns dumb buffers and video=, provides error checking, and 
overall harmonizes code. The fallback is only required because of the 
existing odd cases that already bend the UAPI's rules.

Best regards
Thomas

>
>  Tomi
>
Laurent Pinchart Jan. 16, 2025, 8:43 a.m. UTC | #12
On Wed, Jan 15, 2025 at 02:34:26PM +0000, Daniel Stone wrote:
> On Wed, 15 Jan 2025 at 14:20, Tomi Valkeinen wrote:
> > No disagreement there, we need CREATE_DUMB2.
> >
> > My point is that we have the current UAPI, and we have userspace using
> > it, but we don't have clear rules what the ioctl does with specific
> > parameters, and we don't document how it has to be used.
> >
> > Perhaps the situation is bad, and all we can really say is that
> > CREATE_DUMB only works for use with simple RGB formats, and the behavior
> > for all other formats is platform specific. But I think even that would
> > be valuable in the UAPI docs.
> 
> Yeah, CREATE_DUMB only works for use with simple RGB formats in a
> linear layout. Not monochrome or YUV or tiled or displayed rotated or
> whatever.
> 
> If it happens to accidentally work for other uses, that's fine, but
> it's not generically reliable for anything other than simple linear
> RGB. It's intended to let you do splash screens, consoles, recovery
> password entries, and software-rendered compositors if you really
> want. Anything more than that isn't 'dumb'.

We have lots of software out there that rely on CREATE_DUMB supporting
YUV linear formats, and lots of drivers (mostly on Arm I suppose) that
implement YUV support in CREATE_DUMB. I'm fine replacing it with
something better, but I think we need a standard ioctl that can create
linear YUV buffers. I've been told many times that DRM doesn't want to
standardize buffer allocation further than what CREATE_DUMB is made for.
Can we reconsider this rule then ?
Laurent Pinchart Jan. 16, 2025, 9:38 a.m. UTC | #13
On Thu, Jan 16, 2025 at 10:43:40AM +0200, Laurent Pinchart wrote:
> On Wed, Jan 15, 2025 at 02:34:26PM +0000, Daniel Stone wrote:
> > On Wed, 15 Jan 2025 at 14:20, Tomi Valkeinen wrote:
> > > No disagreement there, we need CREATE_DUMB2.
> > >
> > > My point is that we have the current UAPI, and we have userspace using
> > > it, but we don't have clear rules what the ioctl does with specific
> > > parameters, and we don't document how it has to be used.
> > >
> > > Perhaps the situation is bad, and all we can really say is that
> > > CREATE_DUMB only works for use with simple RGB formats, and the behavior
> > > for all other formats is platform specific. But I think even that would
> > > be valuable in the UAPI docs.
> > 
> > Yeah, CREATE_DUMB only works for use with simple RGB formats in a
> > linear layout. Not monochrome or YUV or tiled or displayed rotated or
> > whatever.
> > 
> > If it happens to accidentally work for other uses, that's fine, but
> > it's not generically reliable for anything other than simple linear
> > RGB. It's intended to let you do splash screens, consoles, recovery
> > password entries, and software-rendered compositors if you really
> > want. Anything more than that isn't 'dumb'.
> 
> We have lots of software out there that rely on CREATE_DUMB supporting
> YUV linear formats, and lots of drivers (mostly on Arm I suppose) that
> implement YUV support in CREATE_DUMB. I'm fine replacing it with
> something better, but I think we need a standard ioctl that can create
> linear YUV buffers. I've been told many times that DRM doesn't want to
> standardize buffer allocation further than what CREATE_DUMB is made for.
> Can we reconsider this rule then ?

Actually... Instead of adding a CREATE_DUMB2, it would be best on trying
to leverage DMA heaps and deprecate allocating from the KMS device.
Tomi Valkeinen Jan. 16, 2025, 10:03 a.m. UTC | #14
Hi,

On 16/01/2025 10:09, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
> [...]
>>
>> My point is that we have the current UAPI, and we have userspace using 
>> it, but we don't have clear rules what the ioctl does with specific 
>> parameters, and we don't document how it has to be used.
>>
>> Perhaps the situation is bad, and all we can really say is that 
>> CREATE_DUMB only works for use with simple RGB formats, and the 
>> behavior for all other formats is platform specific. But I think even 
>> that would be valuable in the UAPI docs.
> 
> To be honest, I would not want to specify behavior for anything but the 
> linear RGB formats. If anything, I'd take Daniel's reply mail for 
> documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
> 
>>
>> Thinking about this, I wonder if this change is good for omapdrm or 
>> xilinx (probably other platforms too that support non-simple non-RGB 
>> formats via dumb buffers): without this patch, in both drivers, the 
>> pitch calculations just take the bpp as bit-per-pixels, align it up, 
>> and that's it.
>>
>> With this patch we end up using drm_driver_color_mode_format(), and 
>> aligning buffers according to RGB formats figured out via heuristics. 
>> It does happen to work, for the formats I tested, but it sounds like 
>> something that might easily not work, as it's doing adjustments based 
>> on wrong format.
>>
>> Should we have another version of drm_mode_size_dumb() which just 
>> calculates using the bpp, without the drm_driver_color_mode_format() 
>> path? Or does the drm_driver_color_mode_format() path provide some 
>> value for the drivers that do not currently do anything similar?
> 
> With the RGB-only rule, using drm_driver_color_mode_format() makes 
> sense. It aligns dumb buffers and video=, provides error checking, and 
> overall harmonizes code. The fallback is only required because of the 
> existing odd cases that already bend the UAPI's rules.

I have to disagree here.

On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb 
buffers are the only buffers you can get from the DRM driver. The dumb 
buffers have been used to allocate linear and multiplanar YUV buffers 
for a very long time on those platforms.

I tried to look around, but I did not find any mentions that CREATE_DUMB 
should only be used for RGB buffers. Is anyone outside the core 
developers even aware of it?

If we don't use dumb buffers there, where do we get the buffers? Maybe 
from a v4l2 device or from a gpu device, but often you don't have those. 
DMA_HEAP is there, of course.

So we have the option to get DMA_HEAP buffers, specifying just the size 
of the buffer. Here we only specify the size, so the userspace has to 
understand the requirements for the format and the platform.

Or we can use CREATE_DUMB, specifying the width, height and 
bitsperpixel, and if we don't have any heuristics about figuring out the 
pixel format (as it has been), the end result is exactly the same as 
with DMA_HEAP (i.e. we essentially define the size of the buffer).

So, on these platforms (omap, tidss, xilinx, rcar), the CREATE_DUMB has 
always been just "give me X amount of memory that can be used for 
scanout". With this series, the meaning of the ioctl changes, and it's 
now "give me an memory buffer buffer that works with an RGB format with 
this width, height, bpp".

In practice I believe that doesn't cause regressions, as aligning 
buffers according to RGB pixel format rules happens to be fine for YUV 
formats too, but I'm not sure (and it already almost caused a regression 
with bpp=64). And I'm having trouble seeing the upside.

Aligning video= and dumb buffers almost sounds like going backwards. 
video= parameter is bad, so let's also make dumb buffers bad?

Harmonizing code is fine, but I think that can be done with a function 
that only does the fallback-case.

So... I can only speak for the platforms I'm using and maintaining, but 
I'd rather keep the old behavior for CREATE_DUMB that we've had for ages.

  Tomi
Tomi Valkeinen Jan. 16, 2025, 10:07 a.m. UTC | #15
Hi,

On 16/01/2025 11:38, Laurent Pinchart wrote:
> On Thu, Jan 16, 2025 at 10:43:40AM +0200, Laurent Pinchart wrote:
>> On Wed, Jan 15, 2025 at 02:34:26PM +0000, Daniel Stone wrote:
>>> On Wed, 15 Jan 2025 at 14:20, Tomi Valkeinen wrote:
>>>> No disagreement there, we need CREATE_DUMB2.
>>>>
>>>> My point is that we have the current UAPI, and we have userspace using
>>>> it, but we don't have clear rules what the ioctl does with specific
>>>> parameters, and we don't document how it has to be used.
>>>>
>>>> Perhaps the situation is bad, and all we can really say is that
>>>> CREATE_DUMB only works for use with simple RGB formats, and the behavior
>>>> for all other formats is platform specific. But I think even that would
>>>> be valuable in the UAPI docs.
>>>
>>> Yeah, CREATE_DUMB only works for use with simple RGB formats in a
>>> linear layout. Not monochrome or YUV or tiled or displayed rotated or
>>> whatever.
>>>
>>> If it happens to accidentally work for other uses, that's fine, but
>>> it's not generically reliable for anything other than simple linear
>>> RGB. It's intended to let you do splash screens, consoles, recovery
>>> password entries, and software-rendered compositors if you really
>>> want. Anything more than that isn't 'dumb'.
>>
>> We have lots of software out there that rely on CREATE_DUMB supporting
>> YUV linear formats, and lots of drivers (mostly on Arm I suppose) that
>> implement YUV support in CREATE_DUMB. I'm fine replacing it with
>> something better, but I think we need a standard ioctl that can create
>> linear YUV buffers. I've been told many times that DRM doesn't want to
>> standardize buffer allocation further than what CREATE_DUMB is made for.
>> Can we reconsider this rule then ?
> 
> Actually... Instead of adding a CREATE_DUMB2, it would be best on trying
> to leverage DMA heaps and deprecate allocating from the KMS device.

If we allocate from DMA heaps, I think we then need a DRM ioctl which 
will tell you how big buffer(s) you need, based on the size and format.

  Tomi
Geert Uytterhoeven Jan. 16, 2025, 10:17 a.m. UTC | #16
On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
<tomi.valkeinen@ideasonboard.com> wrote:
> On 16/01/2025 10:09, Thomas Zimmermann wrote:
> > Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
> > [...]
> >>
> >> My point is that we have the current UAPI, and we have userspace using
> >> it, but we don't have clear rules what the ioctl does with specific
> >> parameters, and we don't document how it has to be used.
> >>
> >> Perhaps the situation is bad, and all we can really say is that
> >> CREATE_DUMB only works for use with simple RGB formats, and the
> >> behavior for all other formats is platform specific. But I think even
> >> that would be valuable in the UAPI docs.
> >
> > To be honest, I would not want to specify behavior for anything but the
> > linear RGB formats. If anything, I'd take Daniel's reply mail for
> > documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
> >
> >> Thinking about this, I wonder if this change is good for omapdrm or
> >> xilinx (probably other platforms too that support non-simple non-RGB
> >> formats via dumb buffers): without this patch, in both drivers, the
> >> pitch calculations just take the bpp as bit-per-pixels, align it up,
> >> and that's it.
> >>
> >> With this patch we end up using drm_driver_color_mode_format(), and
> >> aligning buffers according to RGB formats figured out via heuristics.
> >> It does happen to work, for the formats I tested, but it sounds like
> >> something that might easily not work, as it's doing adjustments based
> >> on wrong format.
> >>
> >> Should we have another version of drm_mode_size_dumb() which just
> >> calculates using the bpp, without the drm_driver_color_mode_format()
> >> path? Or does the drm_driver_color_mode_format() path provide some
> >> value for the drivers that do not currently do anything similar?
> >
> > With the RGB-only rule, using drm_driver_color_mode_format() makes
> > sense. It aligns dumb buffers and video=, provides error checking, and
> > overall harmonizes code. The fallback is only required because of the
> > existing odd cases that already bend the UAPI's rules.
>
> I have to disagree here.
>
> On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
> buffers are the only buffers you can get from the DRM driver. The dumb
> buffers have been used to allocate linear and multiplanar YUV buffers
> for a very long time on those platforms.
>
> I tried to look around, but I did not find any mentions that CREATE_DUMB
> should only be used for RGB buffers. Is anyone outside the core
> developers even aware of it?
>
> If we don't use dumb buffers there, where do we get the buffers? Maybe
> from a v4l2 device or from a gpu device, but often you don't have those.
> DMA_HEAP is there, of course.

Why can't there be a variant that takes a proper fourcc format instead of
an imprecise bpp value?

Gr{oetje,eeting}s,

                        Geert
Tomi Valkeinen Jan. 16, 2025, 10:26 a.m. UTC | #17
Hi,

On 16/01/2025 12:17, Geert Uytterhoeven wrote:
> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
> <tomi.valkeinen@ideasonboard.com> wrote:
>> On 16/01/2025 10:09, Thomas Zimmermann wrote:
>>> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
>>> [...]
>>>>
>>>> My point is that we have the current UAPI, and we have userspace using
>>>> it, but we don't have clear rules what the ioctl does with specific
>>>> parameters, and we don't document how it has to be used.
>>>>
>>>> Perhaps the situation is bad, and all we can really say is that
>>>> CREATE_DUMB only works for use with simple RGB formats, and the
>>>> behavior for all other formats is platform specific. But I think even
>>>> that would be valuable in the UAPI docs.
>>>
>>> To be honest, I would not want to specify behavior for anything but the
>>> linear RGB formats. If anything, I'd take Daniel's reply mail for
>>> documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
>>>
>>>> Thinking about this, I wonder if this change is good for omapdrm or
>>>> xilinx (probably other platforms too that support non-simple non-RGB
>>>> formats via dumb buffers): without this patch, in both drivers, the
>>>> pitch calculations just take the bpp as bit-per-pixels, align it up,
>>>> and that's it.
>>>>
>>>> With this patch we end up using drm_driver_color_mode_format(), and
>>>> aligning buffers according to RGB formats figured out via heuristics.
>>>> It does happen to work, for the formats I tested, but it sounds like
>>>> something that might easily not work, as it's doing adjustments based
>>>> on wrong format.
>>>>
>>>> Should we have another version of drm_mode_size_dumb() which just
>>>> calculates using the bpp, without the drm_driver_color_mode_format()
>>>> path? Or does the drm_driver_color_mode_format() path provide some
>>>> value for the drivers that do not currently do anything similar?
>>>
>>> With the RGB-only rule, using drm_driver_color_mode_format() makes
>>> sense. It aligns dumb buffers and video=, provides error checking, and
>>> overall harmonizes code. The fallback is only required because of the
>>> existing odd cases that already bend the UAPI's rules.
>>
>> I have to disagree here.
>>
>> On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
>> buffers are the only buffers you can get from the DRM driver. The dumb
>> buffers have been used to allocate linear and multiplanar YUV buffers
>> for a very long time on those platforms.
>>
>> I tried to look around, but I did not find any mentions that CREATE_DUMB
>> should only be used for RGB buffers. Is anyone outside the core
>> developers even aware of it?
>>
>> If we don't use dumb buffers there, where do we get the buffers? Maybe
>> from a v4l2 device or from a gpu device, but often you don't have those.
>> DMA_HEAP is there, of course.
> 
> Why can't there be a variant that takes a proper fourcc format instead of
> an imprecise bpp value?

There can, but it's somewhat a different topic, although it's been 
covered a bit in this thread.

My specific concern here is the current CREATE_DUMB, and (not) changing 
how it behaves.

  Tomi
Dmitry Baryshkov Jan. 16, 2025, 10:35 a.m. UTC | #18
On Thu, Jan 16, 2025 at 11:17:50AM +0100, Geert Uytterhoeven wrote:
> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
> <tomi.valkeinen@ideasonboard.com> wrote:
> > On 16/01/2025 10:09, Thomas Zimmermann wrote:
> > > Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
> > > [...]
> > >>
> > >> My point is that we have the current UAPI, and we have userspace using
> > >> it, but we don't have clear rules what the ioctl does with specific
> > >> parameters, and we don't document how it has to be used.
> > >>
> > >> Perhaps the situation is bad, and all we can really say is that
> > >> CREATE_DUMB only works for use with simple RGB formats, and the
> > >> behavior for all other formats is platform specific. But I think even
> > >> that would be valuable in the UAPI docs.
> > >
> > > To be honest, I would not want to specify behavior for anything but the
> > > linear RGB formats. If anything, I'd take Daniel's reply mail for
> > > documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
> > >
> > >> Thinking about this, I wonder if this change is good for omapdrm or
> > >> xilinx (probably other platforms too that support non-simple non-RGB
> > >> formats via dumb buffers): without this patch, in both drivers, the
> > >> pitch calculations just take the bpp as bit-per-pixels, align it up,
> > >> and that's it.
> > >>
> > >> With this patch we end up using drm_driver_color_mode_format(), and
> > >> aligning buffers according to RGB formats figured out via heuristics.
> > >> It does happen to work, for the formats I tested, but it sounds like
> > >> something that might easily not work, as it's doing adjustments based
> > >> on wrong format.
> > >>
> > >> Should we have another version of drm_mode_size_dumb() which just
> > >> calculates using the bpp, without the drm_driver_color_mode_format()
> > >> path? Or does the drm_driver_color_mode_format() path provide some
> > >> value for the drivers that do not currently do anything similar?
> > >
> > > With the RGB-only rule, using drm_driver_color_mode_format() makes
> > > sense. It aligns dumb buffers and video=, provides error checking, and
> > > overall harmonizes code. The fallback is only required because of the
> > > existing odd cases that already bend the UAPI's rules.
> >
> > I have to disagree here.
> >
> > On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
> > buffers are the only buffers you can get from the DRM driver. The dumb
> > buffers have been used to allocate linear and multiplanar YUV buffers
> > for a very long time on those platforms.
> >
> > I tried to look around, but I did not find any mentions that CREATE_DUMB
> > should only be used for RGB buffers. Is anyone outside the core
> > developers even aware of it?
> >
> > If we don't use dumb buffers there, where do we get the buffers? Maybe
> > from a v4l2 device or from a gpu device, but often you don't have those.
> > DMA_HEAP is there, of course.
> 
> Why can't there be a variant that takes a proper fourcc format instead of
> an imprecise bpp value?

Backwards compatibility. We can add an IOCTL for YUV / etc. But
userspace must be able to continue allocating YUV buffers through
CREATE_DUMB.

> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> -- 
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds
Daniel Stone Jan. 16, 2025, 12:24 p.m. UTC | #19
On Thu, 16 Jan 2025 at 10:35, Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
> On Thu, Jan 16, 2025 at 11:17:50AM +0100, Geert Uytterhoeven wrote:
> > On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
> > <tomi.valkeinen@ideasonboard.com> wrote:
> > > On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
> > > buffers are the only buffers you can get from the DRM driver. The dumb
> > > buffers have been used to allocate linear and multiplanar YUV buffers
> > > for a very long time on those platforms.
> > >
> > > I tried to look around, but I did not find any mentions that CREATE_DUMB
> > > should only be used for RGB buffers. Is anyone outside the core
> > > developers even aware of it?
> > >
> > > If we don't use dumb buffers there, where do we get the buffers? Maybe
> > > from a v4l2 device or from a gpu device, but often you don't have those.
> > > DMA_HEAP is there, of course.
> >
> > Why can't there be a variant that takes a proper fourcc format instead of
> > an imprecise bpp value?
>
> Backwards compatibility. We can add an IOCTL for YUV / etc. But
> userspace must be able to continue allocating YUV buffers through
> CREATE_DUMB.

Right. If allocating YUYV dumb buffers works on AM68 today, then we
need to keep that working. But it doesn't mean we should go out of our
way to make CREATE_DUMB work for every YUV format on every device.

Currently, drivers are free to implement their own ioctls for anything
specific they have. But like Laurent said, standardising on heaps and
how to communicate requirements to userspace wrt heap selection / size
/ alignment / etc is imo a better path forward for something generic.

Cheers,
Daniel
Sui Jingfeng Jan. 19, 2025, 11:29 a.m. UTC | #20
Hi,

On 2025/1/16 18:35, Dmitry Baryshkov wrote:
> On Thu, Jan 16, 2025 at 11:17:50AM +0100, Geert Uytterhoeven wrote:
>> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
>> <tomi.valkeinen@ideasonboard.com> wrote:
>>> On 16/01/2025 10:09, Thomas Zimmermann wrote:
>>>> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
>>>> [...]
>>>>> My point is that we have the current UAPI, and we have userspace using
>>>>> it, but we don't have clear rules what the ioctl does with specific
>>>>> parameters, and we don't document how it has to be used.
>>>>>
>>>>> Perhaps the situation is bad, and all we can really say is that
>>>>> CREATE_DUMB only works for use with simple RGB formats, and the
>>>>> behavior for all other formats is platform specific. But I think even
>>>>> that would be valuable in the UAPI docs.
>>>> To be honest, I would not want to specify behavior for anything but the
>>>> linear RGB formats. If anything, I'd take Daniel's reply mail for
>>>> documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
>>>>
>>>>> Thinking about this, I wonder if this change is good for omapdrm or
>>>>> xilinx (probably other platforms too that support non-simple non-RGB
>>>>> formats via dumb buffers): without this patch, in both drivers, the
>>>>> pitch calculations just take the bpp as bit-per-pixels, align it up,
>>>>> and that's it.
>>>>>
>>>>> With this patch we end up using drm_driver_color_mode_format(), and
>>>>> aligning buffers according to RGB formats figured out via heuristics.
>>>>> It does happen to work, for the formats I tested, but it sounds like
>>>>> something that might easily not work, as it's doing adjustments based
>>>>> on wrong format.
>>>>>
>>>>> Should we have another version of drm_mode_size_dumb() which just
>>>>> calculates using the bpp, without the drm_driver_color_mode_format()
>>>>> path? Or does the drm_driver_color_mode_format() path provide some
>>>>> value for the drivers that do not currently do anything similar?
>>>> With the RGB-only rule, using drm_driver_color_mode_format() makes
>>>> sense. It aligns dumb buffers and video=, provides error checking, and
>>>> overall harmonizes code. The fallback is only required because of the
>>>> existing odd cases that already bend the UAPI's rules.
>>> I have to disagree here.
>>>
>>> On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
>>> buffers are the only buffers you can get from the DRM driver. The dumb
>>> buffers have been used to allocate linear and multiplanar YUV buffers
>>> for a very long time on those platforms.
>>>
>>> I tried to look around, but I did not find any mentions that CREATE_DUMB
>>> should only be used for RGB buffers. Is anyone outside the core
>>> developers even aware of it?
>>>
>>> If we don't use dumb buffers there, where do we get the buffers? Maybe
>>> from a v4l2 device or from a gpu device, but often you don't have those.
>>> DMA_HEAP is there, of course.
>> Why can't there be a variant that takes a proper fourcc format instead of
>> an imprecise bpp value?
> Backwards compatibility. We can add an IOCTL for YUV / etc.

[...]

> But userspace must be able to continue allocating YUV buffers through
> CREATE_DUMB.

I think, allocating YUV buffers through CREATE_DUMB interface is just
an *abuse* and *misuse* of this API for now.

Take the NV12 format as an example, NV12 is YUV420 planar format, have
two planar: the Y-planar and the UV-planar. The Y-planar appear first
in memory as an array of unsigned char values. The Y-planar is followed
immediately by the UV-planar, which is also an array of unsigned char
values that contains packed U (Cb) and V (Cr) samples.

But the 'drm_mode_create_dumb' structure is only intend to provide
descriptions for *one* planar.

struct drm_mode_create_dumb {
     __u32 height;
     __u32 width;
     __u32 bpp;
     __u32 flags;
     __u32 handle;
     __u32 pitch;
     __u64 size;
};

An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) bytes.

So we can allocate an *equivalent* sized buffer to store the NV12 raw data.

Either 'width * (height * 3/2)' where each pixel take up 8 bits,
or just 'with * height' where each pixels take up 12 bits.

However, all those math are just equivalents description to the original
NV12 format, neither are concrete correct physical description.

Therefore, allocating YUV buffers through the dumb interface is just an
abuse for that API. We certainly can abuse more by allocating two dumb
buffers, one for Y-planer, another one for the UV-planer. But again,dumb buffers can be (and must be) used for *scanout* directly. What will yield if I commit the YUV buffers you allocated to the CRTC directly?

In other words, You can allocated buffers via the dumb APIs to store anything,
but the key point is that how can we interpret it.

As Daniel puts it, the semantics of that API is well defined for simple RGB
formats. Usages on non linear RGB dumb buffers are considered as undefined
behavior.

Peoples can still abusing it at the user-space though, but the kernel don't
have to guarantee that the user-space *must* to be able to continue doing
balabala..., That's it.


Best regards,
Sui

>> Gr{oetje,eeting}s,
>>
>>                          Geert
>>
>> -- 
>> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
>>
>> In personal conversations with technical people, I call myself a hacker. But
>> when I'm talking to journalists I just say "programmer" or something like that.
>>                                  -- Linus Torvalds
Tomi Valkeinen Jan. 19, 2025, 12:18 p.m. UTC | #21
Hi,

On 19/01/2025 13:29, Sui Jingfeng wrote:
> Hi,
> 
> On 2025/1/16 18:35, Dmitry Baryshkov wrote:
>> On Thu, Jan 16, 2025 at 11:17:50AM +0100, Geert Uytterhoeven wrote:
>>> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
>>> <tomi.valkeinen@ideasonboard.com> wrote:
>>>> On 16/01/2025 10:09, Thomas Zimmermann wrote:
>>>>> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
>>>>> [...]
>>>>>> My point is that we have the current UAPI, and we have userspace 
>>>>>> using
>>>>>> it, but we don't have clear rules what the ioctl does with specific
>>>>>> parameters, and we don't document how it has to be used.
>>>>>>
>>>>>> Perhaps the situation is bad, and all we can really say is that
>>>>>> CREATE_DUMB only works for use with simple RGB formats, and the
>>>>>> behavior for all other formats is platform specific. But I think even
>>>>>> that would be valuable in the UAPI docs.
>>>>> To be honest, I would not want to specify behavior for anything but 
>>>>> the
>>>>> linear RGB formats. If anything, I'd take Daniel's reply mail for
>>>>> documentation as-is. Anyone stretching the UAPI beyond RGB is on 
>>>>> their own.
>>>>>
>>>>>> Thinking about this, I wonder if this change is good for omapdrm or
>>>>>> xilinx (probably other platforms too that support non-simple non-RGB
>>>>>> formats via dumb buffers): without this patch, in both drivers, the
>>>>>> pitch calculations just take the bpp as bit-per-pixels, align it up,
>>>>>> and that's it.
>>>>>>
>>>>>> With this patch we end up using drm_driver_color_mode_format(), and
>>>>>> aligning buffers according to RGB formats figured out via heuristics.
>>>>>> It does happen to work, for the formats I tested, but it sounds like
>>>>>> something that might easily not work, as it's doing adjustments based
>>>>>> on wrong format.
>>>>>>
>>>>>> Should we have another version of drm_mode_size_dumb() which just
>>>>>> calculates using the bpp, without the drm_driver_color_mode_format()
>>>>>> path? Or does the drm_driver_color_mode_format() path provide some
>>>>>> value for the drivers that do not currently do anything similar?
>>>>> With the RGB-only rule, using drm_driver_color_mode_format() makes
>>>>> sense. It aligns dumb buffers and video=, provides error checking, and
>>>>> overall harmonizes code. The fallback is only required because of the
>>>>> existing odd cases that already bend the UAPI's rules.
>>>> I have to disagree here.
>>>>
>>>> On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
>>>> buffers are the only buffers you can get from the DRM driver. The dumb
>>>> buffers have been used to allocate linear and multiplanar YUV buffers
>>>> for a very long time on those platforms.
>>>>
>>>> I tried to look around, but I did not find any mentions that 
>>>> CREATE_DUMB
>>>> should only be used for RGB buffers. Is anyone outside the core
>>>> developers even aware of it?
>>>>
>>>> If we don't use dumb buffers there, where do we get the buffers? Maybe
>>>> from a v4l2 device or from a gpu device, but often you don't have 
>>>> those.
>>>> DMA_HEAP is there, of course.
>>> Why can't there be a variant that takes a proper fourcc format 
>>> instead of
>>> an imprecise bpp value?
>> Backwards compatibility. We can add an IOCTL for YUV / etc.
> 
> [...]
> 
>> But userspace must be able to continue allocating YUV buffers through
>> CREATE_DUMB.
> 
> I think, allocating YUV buffers through CREATE_DUMB interface is just
> an *abuse* and *misuse* of this API for now.
> 
> Take the NV12 format as an example, NV12 is YUV420 planar format, have
> two planar: the Y-planar and the UV-planar. The Y-planar appear first
> in memory as an array of unsigned char values. The Y-planar is followed
> immediately by the UV-planar, which is also an array of unsigned char
> values that contains packed U (Cb) and V (Cr) samples.
> 
> But the 'drm_mode_create_dumb' structure is only intend to provide
> descriptions for *one* planar.
> 
> struct drm_mode_create_dumb {
>      __u32 height;
>      __u32 width;
>      __u32 bpp;
>      __u32 flags;
>      __u32 handle;
>      __u32 pitch;
>      __u64 size;
> };
> 
> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) bytes.
> 
> So we can allocate an *equivalent* sized buffer to store the NV12 raw data.
> 
> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
> or just 'with * height' where each pixels take up 12 bits.
> 
> However, all those math are just equivalents description to the original
> NV12 format, neither are concrete correct physical description.

I don't see the problem. Allocating dumb buffers, if we don't have any 
heuristics related to RGB behind it, is essentially just allocating a 
specific amount of memory, defined by width, height and bitsperpixel.

If I want to create an NV12 framebuffer, I allocate two dumb buffers, 
one for Y and one for UV planes, and size them accordingly. And then 
create the DRM framebuffer with those.

> Therefore, allocating YUV buffers through the dumb interface is just an
> abuse for that API. We certainly can abuse more by allocating two dumb
> buffers, one for Y-planer, another one for the UV-planer. But again,dumb 
> buffers can be (and must be) used for *scanout* directly. What will 
> yield if I commit the YUV buffers you allocated to the CRTC directly?

You'll see it on the screen? I don't understand your point here...

> In other words, You can allocated buffers via the dumb APIs to store 
> anything,
> but the key point is that how can we interpret it.
> 
> As Daniel puts it, the semantics of that API is well defined for simple RGB
> formats. Usages on non linear RGB dumb buffers are considered as undefined
> behavior.
> 
> Peoples can still abusing it at the user-space though, but the kernel don't
> have to guarantee that the user-space *must* to be able to continue doing
> balabala..., That's it.

I have hard time understanding the "abuse" argument. But in any case, 
the API has been working like this for who knows how long, and used 
widely (afaik). The question is, do we break it or not. Granted, this 
series doesn't break it as such, but it adds heuristics that wasn't 
there before, and it could affect the behavior. If we still want to do 
that, I want to understand what is the benefit, because there's a 
potential to cause regressions.

  Tomi
Sui Jingfeng Jan. 19, 2025, 2:59 p.m. UTC | #22
Hi,

On 2025/1/19 20:18, Tomi Valkeinen wrote:
> Hi,
>
> On 19/01/2025 13:29, Sui Jingfeng wrote:
>> Hi,
>>
>> On 2025/1/16 18:35, Dmitry Baryshkov wrote:
>>> On Thu, Jan 16, 2025 at 11:17:50AM +0100, Geert Uytterhoeven wrote:
>>>> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
>>>> <tomi.valkeinen@ideasonboard.com> wrote:
>>>>> On 16/01/2025 10:09, Thomas Zimmermann wrote:
>>>>>> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
>>>>>> [...]
>>>>>>> My point is that we have the current UAPI, and we have userspace 
>>>>>>> using
>>>>>>> it, but we don't have clear rules what the ioctl does with specific
>>>>>>> parameters, and we don't document how it has to be used.
>>>>>>>
>>>>>>> Perhaps the situation is bad, and all we can really say is that
>>>>>>> CREATE_DUMB only works for use with simple RGB formats, and the
>>>>>>> behavior for all other formats is platform specific. But I think 
>>>>>>> even
>>>>>>> that would be valuable in the UAPI docs.
>>>>>> To be honest, I would not want to specify behavior for anything 
>>>>>> but the
>>>>>> linear RGB formats. If anything, I'd take Daniel's reply mail for
>>>>>> documentation as-is. Anyone stretching the UAPI beyond RGB is on 
>>>>>> their own.
>>>>>>
>>>>>>> Thinking about this, I wonder if this change is good for omapdrm or
>>>>>>> xilinx (probably other platforms too that support non-simple 
>>>>>>> non-RGB
>>>>>>> formats via dumb buffers): without this patch, in both drivers, the
>>>>>>> pitch calculations just take the bpp as bit-per-pixels, align it 
>>>>>>> up,
>>>>>>> and that's it.
>>>>>>>
>>>>>>> With this patch we end up using drm_driver_color_mode_format(), and
>>>>>>> aligning buffers according to RGB formats figured out via 
>>>>>>> heuristics.
>>>>>>> It does happen to work, for the formats I tested, but it sounds 
>>>>>>> like
>>>>>>> something that might easily not work, as it's doing adjustments 
>>>>>>> based
>>>>>>> on wrong format.
>>>>>>>
>>>>>>> Should we have another version of drm_mode_size_dumb() which just
>>>>>>> calculates using the bpp, without the 
>>>>>>> drm_driver_color_mode_format()
>>>>>>> path? Or does the drm_driver_color_mode_format() path provide some
>>>>>>> value for the drivers that do not currently do anything similar?
>>>>>> With the RGB-only rule, using drm_driver_color_mode_format() makes
>>>>>> sense. It aligns dumb buffers and video=, provides error 
>>>>>> checking, and
>>>>>> overall harmonizes code. The fallback is only required because of 
>>>>>> the
>>>>>> existing odd cases that already bend the UAPI's rules.
>>>>> I have to disagree here.
>>>>>
>>>>> On the platforms I have been using (omap, tidss, xilinx, rcar) the 
>>>>> dumb
>>>>> buffers are the only buffers you can get from the DRM driver. The 
>>>>> dumb
>>>>> buffers have been used to allocate linear and multiplanar YUV buffers
>>>>> for a very long time on those platforms.
>>>>>
>>>>> I tried to look around, but I did not find any mentions that 
>>>>> CREATE_DUMB
>>>>> should only be used for RGB buffers. Is anyone outside the core
>>>>> developers even aware of it?
>>>>>
>>>>> If we don't use dumb buffers there, where do we get the buffers? 
>>>>> Maybe
>>>>> from a v4l2 device or from a gpu device, but often you don't have 
>>>>> those.
>>>>> DMA_HEAP is there, of course.
>>>> Why can't there be a variant that takes a proper fourcc format 
>>>> instead of
>>>> an imprecise bpp value?
>>> Backwards compatibility. We can add an IOCTL for YUV / etc.
>>
>> [...]
>>
>>> But userspace must be able to continue allocating YUV buffers through
>>> CREATE_DUMB.
>>
>> I think, allocating YUV buffers through CREATE_DUMB interface is just
>> an *abuse* and *misuse* of this API for now.
>>
>> Take the NV12 format as an example, NV12 is YUV420 planar format, have
>> two planar: the Y-planar and the UV-planar. The Y-planar appear first
>> in memory as an array of unsigned char values. The Y-planar is followed
>> immediately by the UV-planar, which is also an array of unsigned char
>> values that contains packed U (Cb) and V (Cr) samples.
>>
>> But the 'drm_mode_create_dumb' structure is only intend to provide
>> descriptions for *one* planar.
>>
>> struct drm_mode_create_dumb {
>>      __u32 height;
>>      __u32 width;
>>      __u32 bpp;
>>      __u32 flags;
>>      __u32 handle;
>>      __u32 pitch;
>>      __u64 size;
>> };
>>
>> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) 
>> bytes.
>>
>> So we can allocate an *equivalent* sized buffer to store the NV12 raw 
>> data.
>>
>> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
>> or just 'with * height' where each pixels take up 12 bits.
>>
>> However, all those math are just equivalents description to the original
>> NV12 format, neither are concrete correct physical description.
>
> I don't see the problem. Allocating dumb buffers, if we don't have any 
> heuristics related to RGB behind it, is essentially just allocating a 
> specific amount of memory, defined by width, height and bitsperpixel.
>
I think, the problem will be that the 'width', 'height' and 'bpp'
are originally used to describe one plane. Those three parameters
has perfectly defined physical semantics.

But with multi planar formats, take NV12 image as an example,
for a 2×2 square of pixels, there are 4 Y samples but only 1 U
sample and 1 V sample. This format requires 4x8+1x8+1x8=48 bits
to store the 2x2 square.

So its depth is 12 bits per pixel (48 / (2 * 2)).

so my problem is that the mentioned 12bpp in this example only
make sense in mathematics, it doesn't has a good physical
interpret. Do you agree with me on this technique point?
     

> If I want to create an NV12 framebuffer, I allocate two dumb buffers, 
> one for Y and one for UV planes, and size them accordingly. And then 
> create the DRM framebuffer with those.
>
Then how you fill the value of the 'width', 'height' and 'bpp' of each dumb buffers?

Why not allocate storage for the whole on one shoot?

The modetest in libdrm can be an good example, send it[1] to you as an reference.

[1] https://gitlab.freedesktop.org/mesa/drm/-/blob/main/tests/modetest/buffers.c?ref_type=heads#L114
Tomi Valkeinen Jan. 19, 2025, 3:22 p.m. UTC | #23
On 19/01/2025 16:59, Sui Jingfeng wrote:

>>>> But userspace must be able to continue allocating YUV buffers through
>>>> CREATE_DUMB.
>>>
>>> I think, allocating YUV buffers through CREATE_DUMB interface is just
>>> an *abuse* and *misuse* of this API for now.
>>>
>>> Take the NV12 format as an example, NV12 is YUV420 planar format, have
>>> two planar: the Y-planar and the UV-planar. The Y-planar appear first
>>> in memory as an array of unsigned char values. The Y-planar is followed
>>> immediately by the UV-planar, which is also an array of unsigned char
>>> values that contains packed U (Cb) and V (Cr) samples.
>>>
>>> But the 'drm_mode_create_dumb' structure is only intend to provide
>>> descriptions for *one* planar.
>>>
>>> struct drm_mode_create_dumb {
>>>      __u32 height;
>>>      __u32 width;
>>>      __u32 bpp;
>>>      __u32 flags;
>>>      __u32 handle;
>>>      __u32 pitch;
>>>      __u64 size;
>>> };
>>>
>>> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) 
>>> bytes.
>>>
>>> So we can allocate an *equivalent* sized buffer to store the NV12 raw 
>>> data.
>>>
>>> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
>>> or just 'with * height' where each pixels take up 12 bits.
>>>
>>> However, all those math are just equivalents description to the original
>>> NV12 format, neither are concrete correct physical description.
>>
>> I don't see the problem. Allocating dumb buffers, if we don't have any 
>> heuristics related to RGB behind it, is essentially just allocating a 
>> specific amount of memory, defined by width, height and bitsperpixel.
>>
> I think, the problem will be that the 'width', 'height' and 'bpp'
> are originally used to describe one plane. Those three parameters
> has perfectly defined physical semantics.
> 
> But with multi planar formats, take NV12 image as an example,
> for a 2×2 square of pixels, there are 4 Y samples but only 1 U
> sample and 1 V sample. This format requires 4x8+1x8+1x8=48 bits
> to store the 2x2 square.
> 
> So its depth is 12 bits per pixel (48 / (2 * 2)).
> 
> so my problem is that the mentioned 12bpp in this example only
> make sense in mathematics, it doesn't has a good physical
> interpret. Do you agree with me on this technique point?
> 
>> If I want to create an NV12 framebuffer, I allocate two dumb buffers, 
>> one for Y and one for UV planes, and size them accordingly. And then 
>> create the DRM framebuffer with those.
>>
> Then how you fill the value of the 'width', 'height' and 'bpp' of each 
> dumb buffers?

For 640x480-NV12:
plane 0: width = 640, height = 480, bpp = 8
plane 1: width = 640 / 2, height = 480 / 2, bpp = 16

> Why not allocate storage for the whole on one shoot?

You can, if you adjust the parameters accordingly. However, if the 
strides of the planes are not equal, I guess it might cause problems on 
some platforms.

But I think it's usually simpler to allocate one buffer per plane, and 
perhaps even better as it doesn't require as large contiguous memory area.

> The modetest in libdrm can be an good example, send it[1] to you as an 
> reference.

Right, so modetest already does it successfully. So... What is the issue?

Everyone agrees that CREATE_DUMB is not the best ioctl to allocate 
buffers, and one can't consider it to work identically across the 
platforms. But it's what we have and what has been used for ages.

  Tomi
Sui Jingfeng Jan. 19, 2025, 4:26 p.m. UTC | #24
Hi,

On 2025/1/19 23:22, Tomi Valkeinen wrote:
> On 19/01/2025 16:59, Sui Jingfeng wrote:
>
>>>>> But userspace must be able to continue allocating YUV buffers through
>>>>> CREATE_DUMB.
>>>>
>>>> I think, allocating YUV buffers through CREATE_DUMB interface is just
>>>> an *abuse* and *misuse* of this API for now.
>>>>
>>>> Take the NV12 format as an example, NV12 is YUV420 planar format, have
>>>> two planar: the Y-planar and the UV-planar. The Y-planar appear first
>>>> in memory as an array of unsigned char values. The Y-planar is 
>>>> followed
>>>> immediately by the UV-planar, which is also an array of unsigned char
>>>> values that contains packed U (Cb) and V (Cr) samples.
>>>>
>>>> But the 'drm_mode_create_dumb' structure is only intend to provide
>>>> descriptions for *one* planar.
>>>>
>>>> struct drm_mode_create_dumb {
>>>>      __u32 height;
>>>>      __u32 width;
>>>>      __u32 bpp;
>>>>      __u32 flags;
>>>>      __u32 handle;
>>>>      __u32 pitch;
>>>>      __u64 size;
>>>> };
>>>>
>>>> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) 
>>>> bytes.
>>>>
>>>> So we can allocate an *equivalent* sized buffer to store the NV12 
>>>> raw data.
>>>>
>>>> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
>>>> or just 'with * height' where each pixels take up 12 bits.
>>>>
>>>> However, all those math are just equivalents description to the 
>>>> original
>>>> NV12 format, neither are concrete correct physical description.
>>>
>>> I don't see the problem. Allocating dumb buffers, if we don't have 
>>> any heuristics related to RGB behind it, is essentially just 
>>> allocating a specific amount of memory, defined by width, height and 
>>> bitsperpixel.
>>>
>> I think, the problem will be that the 'width', 'height' and 'bpp'
>> are originally used to describe one plane. Those three parameters
>> has perfectly defined physical semantics.
>>
>> But with multi planar formats, take NV12 image as an example,
>> for a 2×2 square of pixels, there are 4 Y samples but only 1 U
>> sample and 1 V sample. This format requires 4x8+1x8+1x8=48 bits
>> to store the 2x2 square.
>>
>> So its depth is 12 bits per pixel (48 / (2 * 2)).
>>
>> so my problem is that the mentioned 12bpp in this example only
>> make sense in mathematics, it doesn't has a good physical
>> interpret. Do you agree with me on this technique point?
>>
>>> If I want to create an NV12 framebuffer, I allocate two dumb 
>>> buffers, one for Y and one for UV planes, and size them accordingly. 
>>> And then create the DRM framebuffer with those.
>>>
>> Then how you fill the value of the 'width', 'height' and 'bpp' of 
>> each dumb buffers?
>
> For 640x480-NV12:
> plane 0: width = 640, height = 480, bpp = 8
> plane 1: width = 640 / 2, height = 480 / 2, bpp = 16
>
But i think this should be hardware dependent. The hardware I'm using
load NV12  raw data as a whole. I only need to feed gpuva of the backing
memory to the hardware register once.

Not familiar with your hardware, so I can't talk more on this software
design. Perhaps someone know more could have a comment on this.

>> Why not allocate storage for the whole on one shoot?
>
> You can, if you adjust the parameters accordingly. However, if the 
> strides of the planes are not equal, I guess it might cause problems 
> on some platforms.
>
> But I think it's usually simpler to allocate one buffer per plane, and 
> perhaps even better as it doesn't require as large contiguous memory 
> area.
>
>> The modetest in libdrm can be an good example, send it[1] to you as 
>> an reference.
>
> Right, so modetest already does it successfully. So... What is the issue?
>
But then, the problem will become that it override the 'height' parameter.
What's the physical interpretation of the 'height' parameter when creating
an NV12 image with the dump API then?

I guess, solving complex problems with simple APIs may see the limitation,
sooner or later. But I not very sure and might be wrong. So other peoples
can override me words.


> Everyone agrees that CREATE_DUMB is not the best ioctl to allocate 
> buffers, and one can't consider it to work identically across the 
> platforms. But it's what we have and what has been used for ages.
>
Yeah, your request are not unreasonable. It can be seen as a kind of rigid demand.
Since GEM DMA helpers doesn't export an more advanced interface to userspace so far.
As a result, drivers that employing GEM DMA has no other choice, but to abuse the
dumb buffer API to do allocation for the more complex format buffers.

The dumb buffer API doesn't support to specify buffer format, tile status and
placement etc. The more advance drivers has been exposed the xxx_create_gem()
to user-space. It seems that a few more experienced programmers hint us to
create an new ioctl at above thread, so that we can keep employing simple API
to do simple things and to suit complex needs with the more advanced APIs.


>  Tomi
>
Laurent Pinchart Jan. 19, 2025, 5:08 p.m. UTC | #25
On Thu, Jan 16, 2025 at 12:07:49PM +0200, Tomi Valkeinen wrote:
> On 16/01/2025 11:38, Laurent Pinchart wrote:
> > On Thu, Jan 16, 2025 at 10:43:40AM +0200, Laurent Pinchart wrote:
> >> On Wed, Jan 15, 2025 at 02:34:26PM +0000, Daniel Stone wrote:
> >>> On Wed, 15 Jan 2025 at 14:20, Tomi Valkeinen wrote:
> >>>> No disagreement there, we need CREATE_DUMB2.
> >>>>
> >>>> My point is that we have the current UAPI, and we have userspace using
> >>>> it, but we don't have clear rules what the ioctl does with specific
> >>>> parameters, and we don't document how it has to be used.
> >>>>
> >>>> Perhaps the situation is bad, and all we can really say is that
> >>>> CREATE_DUMB only works for use with simple RGB formats, and the behavior
> >>>> for all other formats is platform specific. But I think even that would
> >>>> be valuable in the UAPI docs.
> >>>
> >>> Yeah, CREATE_DUMB only works for use with simple RGB formats in a
> >>> linear layout. Not monochrome or YUV or tiled or displayed rotated or
> >>> whatever.
> >>>
> >>> If it happens to accidentally work for other uses, that's fine, but
> >>> it's not generically reliable for anything other than simple linear
> >>> RGB. It's intended to let you do splash screens, consoles, recovery
> >>> password entries, and software-rendered compositors if you really
> >>> want. Anything more than that isn't 'dumb'.
> >>
> >> We have lots of software out there that rely on CREATE_DUMB supporting
> >> YUV linear formats, and lots of drivers (mostly on Arm I suppose) that
> >> implement YUV support in CREATE_DUMB. I'm fine replacing it with
> >> something better, but I think we need a standard ioctl that can create
> >> linear YUV buffers. I've been told many times that DRM doesn't want to
> >> standardize buffer allocation further than what CREATE_DUMB is made for.
> >> Can we reconsider this rule then ?
> > 
> > Actually... Instead of adding a CREATE_DUMB2, it would be best on trying
> > to leverage DMA heaps and deprecate allocating from the KMS device.
> 
> If we allocate from DMA heaps, I think we then need a DRM ioctl which 
> will tell you how big buffer(s) you need, based on the size and format.

We will at least a kernel API to expose constraints to userspace.
Whether that should calculate the buffer size for a given format, or
expose information to userspace to enable that calculation, I'm not
sure. But regardless of which option we pick, I agree we likely need a
new API to enable usage of DMA heaps as allocators.
Laurent Pinchart Jan. 19, 2025, 8:14 p.m. UTC | #26
On Mon, Jan 20, 2025 at 12:26:30AM +0800, Sui Jingfeng wrote:
> On 2025/1/19 23:22, Tomi Valkeinen wrote:
> > On 19/01/2025 16:59, Sui Jingfeng wrote:
> >
> >>>>> But userspace must be able to continue allocating YUV buffers through
> >>>>> CREATE_DUMB.
> >>>>
> >>>> I think, allocating YUV buffers through CREATE_DUMB interface is just
> >>>> an *abuse* and *misuse* of this API for now.
> >>>>
> >>>> Take the NV12 format as an example, NV12 is YUV420 planar format, have
> >>>> two planar: the Y-planar and the UV-planar. The Y-planar appear first
> >>>> in memory as an array of unsigned char values. The Y-planar is followed
> >>>> immediately by the UV-planar, which is also an array of unsigned char
> >>>> values that contains packed U (Cb) and V (Cr) samples.
> >>>>
> >>>> But the 'drm_mode_create_dumb' structure is only intend to provide
> >>>> descriptions for *one* planar.
> >>>>
> >>>> struct drm_mode_create_dumb {
> >>>>      __u32 height;
> >>>>      __u32 width;
> >>>>      __u32 bpp;
> >>>>      __u32 flags;
> >>>>      __u32 handle;
> >>>>      __u32 pitch;
> >>>>      __u64 size;
> >>>> };
> >>>>
> >>>> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4) 
> >>>> bytes.
> >>>>
> >>>> So we can allocate an *equivalent* sized buffer to store the NV12 
> >>>> raw data.
> >>>>
> >>>> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
> >>>> or just 'with * height' where each pixels take up 12 bits.
> >>>>
> >>>> However, all those math are just equivalents description to the original
> >>>> NV12 format, neither are concrete correct physical description.
> >>>
> >>> I don't see the problem. Allocating dumb buffers, if we don't have 
> >>> any heuristics related to RGB behind it, is essentially just 
> >>> allocating a specific amount of memory, defined by width, height and 
> >>> bitsperpixel.
> >>>
> >> I think, the problem will be that the 'width', 'height' and 'bpp'
> >> are originally used to describe one plane. Those three parameters
> >> has perfectly defined physical semantics.
> >>
> >> But with multi planar formats, take NV12 image as an example,
> >> for a 2×2 square of pixels, there are 4 Y samples but only 1 U
> >> sample and 1 V sample. This format requires 4x8+1x8+1x8=48 bits
> >> to store the 2x2 square.
> >>
> >> So its depth is 12 bits per pixel (48 / (2 * 2)).
> >>
> >> so my problem is that the mentioned 12bpp in this example only
> >> make sense in mathematics, it doesn't has a good physical
> >> interpret. Do you agree with me on this technique point?
> >>
> >>> If I want to create an NV12 framebuffer, I allocate two dumb 
> >>> buffers, one for Y and one for UV planes, and size them accordingly. 
> >>> And then create the DRM framebuffer with those.
> >>>
> >> Then how you fill the value of the 'width', 'height' and 'bpp' of 
> >> each dumb buffers?
> >
> > For 640x480-NV12:
> > plane 0: width = 640, height = 480, bpp = 8
> > plane 1: width = 640 / 2, height = 480 / 2, bpp = 16
>
> But i think this should be hardware dependent. The hardware I'm using
> load NV12  raw data as a whole. I only need to feed gpuva of the backing
> memory to the hardware register once.
> 
> Not familiar with your hardware, so I can't talk more on this software
> design. Perhaps someone know more could have a comment on this.

Layout of planes in memory is just one hardware constraint, the same way
we have constraints on alignment and strides. Some devices require the
planes to be contiguous (likely with some alignment constraints), some
can work with planes being in discontiguous pieces of memory, and even
require them to be discontiguous and located in separate DRAM banks.

> >> Why not allocate storage for the whole on one shoot?
> >
> > You can, if you adjust the parameters accordingly. However, if the 
> > strides of the planes are not equal, I guess it might cause problems 
> > on some platforms.
> >
> > But I think it's usually simpler to allocate one buffer per plane, and 
> > perhaps even better as it doesn't require as large contiguous memory 
> > area.
> >
> >> The modetest in libdrm can be an good example, send it[1] to you as 
> >> an reference.
> >
> > Right, so modetest already does it successfully. So... What is the issue?
>
> But then, the problem will become that it override the 'height' parameter.
> What's the physical interpretation of the 'height' parameter when creating
> an NV12 image with the dump API then?

I wouldn't be too concerned about physical interpretations. Yes, the
height, width and bpp parameters were likely designed with RGB formats
in mind. Yes, using DUMB_CREATE for YUV formats is probably something
that the original authors didn't envision. And yes, from that point of
view, it could be seen by the original authors as an abuse of the API.
But I don't think that's a problem as such.

An API is just an API. True, it would be nicer if the usage of the ioctl
parameters was more intuitive for YUV formats, but I believe we could
still standardize how the existing parameters map to linear scanout YUV
formats without causing the world to end. As has been said before, lots
of drivers are using DUMB_CREATE for this purpose, and we can't change
that.

This doesn't mean we shouldn't work on improving memory allocation, but
I see that as a separate issue.

> I guess, solving complex problems with simple APIs may see the limitation,
> sooner or later. But I not very sure and might be wrong. So other peoples
> can override me words.
> 
> > Everyone agrees that CREATE_DUMB is not the best ioctl to allocate 
> > buffers, and one can't consider it to work identically across the 
> > platforms. But it's what we have and what has been used for ages.
>
> Yeah, your request are not unreasonable. It can be seen as a kind of rigid demand.
> Since GEM DMA helpers doesn't export an more advanced interface to userspace so far.
> As a result, drivers that employing GEM DMA has no other choice, but to abuse the
> dumb buffer API to do allocation for the more complex format buffers.
> 
> The dumb buffer API doesn't support to specify buffer format, tile status and
> placement etc. The more advance drivers has been exposed the xxx_create_gem()
> to user-space. It seems that a few more experienced programmers hint us to
> create an new ioctl at above thread, so that we can keep employing simple API
> to do simple things and to suit complex needs with the more advanced APIs.

I'd really like to explore adding new ioctls to exposure memory
allocation constraints, and allocating the memory itself from DMA heaps.
Sui Jingfeng Jan. 20, 2025, 3:34 a.m. UTC | #27
Hi,

On 2025/1/16 18:17, Geert Uytterhoeven wrote:
> On Thu, Jan 16, 2025 at 11:03 AM Tomi Valkeinen
> <tomi.valkeinen@ideasonboard.com> wrote:
>> On 16/01/2025 10:09, Thomas Zimmermann wrote:
>>> Am 15.01.25 um 15:20 schrieb Tomi Valkeinen:
>>> [...]
>>>> My point is that we have the current UAPI, and we have userspace using
>>>> it, but we don't have clear rules what the ioctl does with specific
>>>> parameters, and we don't document how it has to be used.
>>>>
>>>> Perhaps the situation is bad, and all we can really say is that
>>>> CREATE_DUMB only works for use with simple RGB formats, and the
>>>> behavior for all other formats is platform specific. But I think even
>>>> that would be valuable in the UAPI docs.
>>> To be honest, I would not want to specify behavior for anything but the
>>> linear RGB formats. If anything, I'd take Daniel's reply mail for
>>> documentation as-is. Anyone stretching the UAPI beyond RGB is on their own.
>>>
>>>> Thinking about this, I wonder if this change is good for omapdrm or
>>>> xilinx (probably other platforms too that support non-simple non-RGB
>>>> formats via dumb buffers): without this patch, in both drivers, the
>>>> pitch calculations just take the bpp as bit-per-pixels, align it up,
>>>> and that's it.
>>>>
>>>> With this patch we end up using drm_driver_color_mode_format(), and
>>>> aligning buffers according to RGB formats figured out via heuristics.
>>>> It does happen to work, for the formats I tested, but it sounds like
>>>> something that might easily not work, as it's doing adjustments based
>>>> on wrong format.
>>>>
>>>> Should we have another version of drm_mode_size_dumb() which just
>>>> calculates using the bpp, without the drm_driver_color_mode_format()
>>>> path? Or does the drm_driver_color_mode_format() path provide some
>>>> value for the drivers that do not currently do anything similar?
>>> With the RGB-only rule, using drm_driver_color_mode_format() makes
>>> sense. It aligns dumb buffers and video=, provides error checking, and
>>> overall harmonizes code. The fallback is only required because of the
>>> existing odd cases that already bend the UAPI's rules.
>> I have to disagree here.
>>
>> On the platforms I have been using (omap, tidss, xilinx, rcar) the dumb
>> buffers are the only buffers you can get from the DRM driver. The dumb
>> buffers have been used to allocate linear and multiplanar YUV buffers
>> for a very long time on those platforms.
>>
>> I tried to look around, but I did not find any mentions that CREATE_DUMB
>> should only be used for RGB buffers. Is anyone outside the core
>> developers even aware of it?
>>
>> If we don't use dumb buffers there, where do we get the buffers? Maybe
>> from a v4l2 device or from a gpu device, but often you don't have those.
>> DMA_HEAP is there, of course.
> Why can't there be a variant that takes a proper fourcc format instead of
> an imprecise bpp value?

The 'flags' parameter of the 'struct drm_mode_create_dumb' doesn't gets
in used so far, I guess the situation will be much better if passing a
correct fourcc code from the user-space to kernel is allowed.


> Gr{oetje,eeting}s,
>
>                          Geert
>
Sui Jingfeng Jan. 20, 2025, 7:04 a.m. UTC | #28
Hi,

On 2025/1/20 04:14, Laurent Pinchart wrote:
> On Mon, Jan 20, 2025 at 12:26:30AM +0800, Sui Jingfeng wrote:
>> On 2025/1/19 23:22, Tomi Valkeinen wrote:
>>> On 19/01/2025 16:59, Sui Jingfeng wrote:
>>>
>>>>>>> But userspace must be able to continue allocating YUV buffers through
>>>>>>> CREATE_DUMB.
>>>>>> I think, allocating YUV buffers through CREATE_DUMB interface is just
>>>>>> an *abuse* and *misuse* of this API for now.
>>>>>>
>>>>>> Take the NV12 format as an example, NV12 is YUV420 planar format, have
>>>>>> two planar: the Y-planar and the UV-planar. The Y-planar appear first
>>>>>> in memory as an array of unsigned char values. The Y-planar is followed
>>>>>> immediately by the UV-planar, which is also an array of unsigned char
>>>>>> values that contains packed U (Cb) and V (Cr) samples.
>>>>>>
>>>>>> But the 'drm_mode_create_dumb' structure is only intend to provide
>>>>>> descriptions for *one* planar.
>>>>>>
>>>>>> struct drm_mode_create_dumb {
>>>>>>       __u32 height;
>>>>>>       __u32 width;
>>>>>>       __u32 bpp;
>>>>>>       __u32 flags;
>>>>>>       __u32 handle;
>>>>>>       __u32 pitch;
>>>>>>       __u64 size;
>>>>>> };
>>>>>>
>>>>>> An width x height NV12 image takes up width*height*(1 + 1/4 + 1/4)
>>>>>> bytes.
>>>>>>
>>>>>> So we can allocate an *equivalent* sized buffer to store the NV12
>>>>>> raw data.
>>>>>>
>>>>>> Either 'width * (height * 3/2)' where each pixel take up 8 bits,
>>>>>> or just 'with * height' where each pixels take up 12 bits.
>>>>>>
>>>>>> However, all those math are just equivalents description to the original
>>>>>> NV12 format, neither are concrete correct physical description.
>>>>> I don't see the problem. Allocating dumb buffers, if we don't have
>>>>> any heuristics related to RGB behind it, is essentially just
>>>>> allocating a specific amount of memory, defined by width, height and
>>>>> bitsperpixel.
>>>>>
>>>> I think, the problem will be that the 'width', 'height' and 'bpp'
>>>> are originally used to describe one plane. Those three parameters
>>>> has perfectly defined physical semantics.
>>>>
>>>> But with multi planar formats, take NV12 image as an example,
>>>> for a 2×2 square of pixels, there are 4 Y samples but only 1 U
>>>> sample and 1 V sample. This format requires 4x8+1x8+1x8=48 bits
>>>> to store the 2x2 square.
>>>>
>>>> So its depth is 12 bits per pixel (48 / (2 * 2)).
>>>>
>>>> so my problem is that the mentioned 12bpp in this example only
>>>> make sense in mathematics, it doesn't has a good physical
>>>> interpret. Do you agree with me on this technique point?
>>>>
>>>>> If I want to create an NV12 framebuffer, I allocate two dumb
>>>>> buffers, one for Y and one for UV planes, and size them accordingly.
>>>>> And then create the DRM framebuffer with those.
>>>>>
>>>> Then how you fill the value of the 'width', 'height' and 'bpp' of
>>>> each dumb buffers?
>>> For 640x480-NV12:
>>> plane 0: width = 640, height = 480, bpp = 8
>>> plane 1: width = 640 / 2, height = 480 / 2, bpp = 16
>> But i think this should be hardware dependent. The hardware I'm using
>> load NV12  raw data as a whole. I only need to feed gpuva of the backing
>> memory to the hardware register once.
>>
>> Not familiar with your hardware, so I can't talk more on this software
>> design. Perhaps someone know more could have a comment on this.
> Layout of planes in memory is just one hardware constraint, the same way
> we have constraints on alignment and strides. Some devices require the
> planes to be contiguous (likely with some alignment constraints), some
> can work with planes being in discontiguous pieces of memory, and even
> require them to be discontiguous and located in separate DRAM banks.

Right.


>>>> Why not allocate storage for the whole on one shoot?
>>> You can, if you adjust the parameters accordingly. However, if the
>>> strides of the planes are not equal, I guess it might cause problems
>>> on some platforms.
>>>
>>> But I think it's usually simpler to allocate one buffer per plane, and
>>> perhaps even better as it doesn't require as large contiguous memory
>>> area.
>>>
>>>> The modetest in libdrm can be an good example, send it[1] to you as
>>>> an reference.
>>> Right, so modetest already does it successfully. So... What is the issue?
>> But then, the problem will become that it override the 'height' parameter.
>> What's the physical interpretation of the 'height' parameter when creating
>> an NV12 image with the dump API then?
> I wouldn't be too concerned about physical interpretations. Yes, the
> height, width and bpp parameters were likely designed with RGB formats
> in mind. Yes, using DUMB_CREATE for YUV formats is probably something
> that the original authors didn't envision. And yes, from that point of
> view, it could be seen by the original authors as an abuse of the API.
> But I don't think that's a problem as such.

Sometimes there may have 2D GPU or Image Process Unit get involved.
Setting the dimension, the clip window and pitch etc parameters to the
hardware are needed.

The value of bpp affects pitch, bigger pitch may cause the hardware load
more bytes than it should on one line. While the 'height' may affect the
size of the clip window, depend on the calculation method.

But I have no strong opinion toward with this and agree with you in overall.

> An API is just an API. True, it would be nicer if the usage of the ioctl
> parameters was more intuitive for YUV formats, but I believe we could
> still standardize how the existing parameters map to linear scanout YUV
> formats without causing the world to end. As has been said before, lots
> of drivers are using DUMB_CREATE for this purpose, and we can't change
> that.
>
> This doesn't mean we shouldn't work on improving memory allocation, but
> I see that as a separate issue.
>
>> I guess, solving complex problems with simple APIs may see the limitation,
>> sooner or later. But I not very sure and might be wrong. So other peoples
>> can override me words.
>>
>>> Everyone agrees that CREATE_DUMB is not the best ioctl to allocate
>>> buffers, and one can't consider it to work identically across the
>>> platforms. But it's what we have and what has been used for ages.
>> Yeah, your request are not unreasonable. It can be seen as a kind of rigid demand.
>> Since GEM DMA helpers doesn't export an more advanced interface to userspace so far.
>> As a result, drivers that employing GEM DMA has no other choice, but to abuse the
>> dumb buffer API to do allocation for the more complex format buffers.
>>
>> The dumb buffer API doesn't support to specify buffer format, tile status and
>> placement etc. The more advance drivers has been exposed the xxx_create_gem()
>> to user-space. It seems that a few more experienced programmers hint us to
>> create an new ioctl at above thread, so that we can keep employing simple API
>> to do simple things and to suit complex needs with the more advanced APIs.
> I'd really like to explore adding new ioctls to exposure memory
> allocation constraints, and allocating the memory itself from DMA heaps.
>
Thomas Zimmermann Jan. 20, 2025, 7:49 a.m. UTC | #29
Hi


Am 16.01.25 um 11:03 schrieb Tomi Valkeinen:
[...]
> Aligning video= and dumb buffers almost sounds like going backwards. 
> video= parameter is bad,

Who told you that? Video= is still the way to specify an initial display 
mode to the kernel and it will remain so.

Of course, it is better to auto-detect settings, but that's for a 
different use case.

Best regards
Thomas

>
> Harmonizing code is fine, but I think that can be done with a function 
> that only does the fallback-case.
>
> So... I can only speak for the platforms I'm using and maintaining, 
> but I'd rather keep the old behavior for CREATE_DUMB that we've had 
> for ages.
>
>  Tomi
>
Thomas Zimmermann Jan. 20, 2025, 7:54 a.m. UTC | #30
Hi


Am 16.01.25 um 11:03 schrieb Tomi Valkeinen:
[...]
>
> Harmonizing code is fine, but I think that can be done with a function 
> that only does the fallback-case.
>
> So... I can only speak for the platforms I'm using and maintaining, 
> but I'd rather keep the old behavior for CREATE_DUMB that we've had 
> for ages.

And we're not going to change that. I'll also include documentation of 
the intended behavior and semantics in the series' next update.

Whatever else is being discussed here, such as new ioctls, is a topic 
for a different series.

Best regards
Thomas

>
>  Tomi
>
Tomi Valkeinen Jan. 20, 2025, 8:51 a.m. UTC | #31
Hi,

On 20/01/2025 09:49, Thomas Zimmermann wrote:
> Hi
> 
> 
> Am 16.01.25 um 11:03 schrieb Tomi Valkeinen:
> [...]
>> Aligning video= and dumb buffers almost sounds like going backwards. 
>> video= parameter is bad,
> 
> Who told you that? Video= is still the way to specify an initial display 
> mode to the kernel and it will remain so.

You did =). "It aligns dumb buffers and video=". I understand the need 
for drm_driver_color_mode_format() for video=. But I think it's bad for 
CREATE_DUMB, at least for the platforms which have never aimed for 
"RGB-only".

So you're not in favor of a drm_mode_size_dumb() version that does not 
use drm_driver_color_mode_format(), for these platforms? I'm still at 
loss as to why we would want to change the behavior of CREATE_DUMB. I 
see no upside, but I see the chance of regressions.

  Tomi
Thomas Zimmermann Jan. 20, 2025, 8:57 a.m. UTC | #32
Hi


Am 20.01.25 um 09:51 schrieb Tomi Valkeinen:
> Hi,
>
> On 20/01/2025 09:49, Thomas Zimmermann wrote:
>> Hi
>>
>>
>> Am 16.01.25 um 11:03 schrieb Tomi Valkeinen:
>> [...]
>>> Aligning video= and dumb buffers almost sounds like going backwards. 
>>> video= parameter is bad,
>>
>> Who told you that? Video= is still the way to specify an initial 
>> display mode to the kernel and it will remain so.
>
> You did =). "It aligns dumb buffers and video=". 

I did not tell you "video= parameter is bad".

Best regards
Thomas

> I understand the need for drm_driver_color_mode_format() for video=. 
> But I think it's bad for CREATE_DUMB, at least for the platforms which 
> have never aimed for "RGB-only".
>
> So you're not in favor of a drm_mode_size_dumb() version that does not 
> use drm_driver_color_mode_format(), for these platforms? I'm still at 
> loss as to why we would want to change the behavior of CREATE_DUMB. I 
> see no upside, but I see the chance of regressions.
>
>  Tomi
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/xlnx/zynqmp_kms.c b/drivers/gpu/drm/xlnx/zynqmp_kms.c
index b47463473472..7ea0cd4f71d3 100644
--- a/drivers/gpu/drm/xlnx/zynqmp_kms.c
+++ b/drivers/gpu/drm/xlnx/zynqmp_kms.c
@@ -19,6 +19,7 @@ 
 #include <drm/drm_crtc.h>
 #include <drm/drm_device.h>
 #include <drm/drm_drv.h>
+#include <drm/drm_dumb_buffers.h>
 #include <drm/drm_encoder.h>
 #include <drm/drm_fbdev_dma.h>
 #include <drm/drm_fourcc.h>
@@ -363,10 +364,12 @@  static int zynqmp_dpsub_dumb_create(struct drm_file *file_priv,
 				    struct drm_mode_create_dumb *args)
 {
 	struct zynqmp_dpsub *dpsub = to_zynqmp_dpsub(drm);
-	unsigned int pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
+	int ret;
 
 	/* Enforce the alignment constraints of the DMA engine. */
-	args->pitch = ALIGN(pitch, dpsub->dma_align);
+	ret = drm_mode_size_dumb(drm, args, dpsub->dma_align, 0);
+	if (ret)
+		return ret;
 
 	return drm_gem_dma_dumb_create_internal(file_priv, drm, args);
 }