Message ID | 20250216183524.12095-1-ryan@testtoast.com (mailing list archive) |
---|---|
Headers | show |
Series | drm: sun4i: add Display Engine 3.3 (DE33) support | expand |
Hi Ryan, sorry for very late review, but here we go... Dne nedelja, 16. februar 2025 ob 19:31:59 Srednjeevropski standardni čas je Ryan Walklin napisal(a): > Hi All, > > v7 of this patch adding support for the Allwinner DE33 display engine, used in the H616 family of SoCs. Apologies for the short interval between versions but a compile error due to a missed helper function in patch 11 snuck by me. v7 only differs from v6 in adding this back. > > v6 includes some small fixes to the device tree documentation, improves naming of an enum type, moves colorspace configuration from the sunxi engine object to the mixer object, and a handful of very small style and whitespace changes. All comments/tags from previous versions addressed. No functional change from v5. > > A v1 patch to enable LCD output for the Anbernic RGnnXX family of devices which use this SoC with an RGB LCD will be submitted shortly. > > Thanks to those who have reviewed and tested previous versions, and to Jernej for the initial patch. > > Original blurb below: > > There is existing mainline support for the DE2 and DE3 AllWinner display pipeline IP blocks, used in the A64 and H6 among others, however the H700 (as well as the H616/H618 and the T507 automotive SoC) have a newer version of the Display Engine (v3.3/DE33) which adds additional high-resolution support as well as YUV colour formats and AFBC compression support. > > This patch set adds DE33 support, following up from the previous RFC [1], with significant rework to break down the previous relatively complex set into more logical steps, detailed below. > > 1. Refactor the existing DE2/DE3 code in readiness to support YUV colour formats in the DE3 engine (patches 1-4). > 2. Add YUV420 colour format support in the DE3 driver (patches 5-13). > 3. Replace the is_de3 mixer flag with an enum to support multiple DE versions (patch 14). > 4. Refactor the mixer, vi_scaler and some register code to merge common init code and more easily support multiple DE versions (patches 15-18). > 5. Add Arm Frame Buffer Compression (AFBC) compressed buffer support to the DE3 driver. This is currently only supported for VI layers (for HW-decoded video output) but is well integrated into these changes and a subsequent patchset to enable the Video Engine is planned. (patch 19). > 6. Add DT bindings for the DE33 engine. (patches 20-22). > 7. Extend the DE2/3 driver for the DE33, comprising clock, mixer, vi_scaler, fmt and csc module support (patches 23-27). This patchset actually introduces 3 disticnt features, which should IMO be separated and thus making reviewing patches easier. 1. native 10-bit YUV420 output (without YUV->RGB->YUV conversions) - this is used on HDMI for proper 10-bit 4K@60 HDR output 2. Display Engine 3.3 (DE33) support 3. AFBC modifier/format support (used for more efficient GPU or VPU output rendering) If I'm not mistaken, your goal is only DE33 support. There is are two reasons why I've never sent these patches myself: 1. scaling YUV images doesn't work Not a problem for any user who doesn't use video player with DRM plane rendering, which I assume is most of them. We can get around of this issue to deny scaling YUV buffers until the issue is sorted out. 2. plane allocations are hackish DE33 for the first time introduced option to move planes between the mixers. DRM has also support for this, but for time being static allocation is acceptable and tbh, even dynamic support need appropriate initial setting. As you might guessed this is done in DE33 clock driver using magic values. Proper way would be to introduce some kind of connection between mixer/plane and clock driver, so initial configuration can be moved to appropriate module instead of being thrown into clock driver. I need to check where to put it and how to properly connect DT nodes. Maybe syscon phandle? I'll do some research. There is one glaring issue (when you work on driver and test every aspect of it). DE33 introduced RCQ, which is basically DMA, which copies shadowed registers from memory buffer directly to hw registers. IIRC it does that at vblank time. This tells you that current concept of writing register values at any time userspace feels to do commit is wrong and we've been lucky that it works as well as it does. So, in order to fix this, driver would need quite a big reorganization. Introducing shadow buffers would solve most of the issues, most likely also with YUV scaling. Implementing RCQ would be then relatively simple and even improve timings. Note that even DE3 has occasional issue with YUV scaler and also read-modify-read access to some of its register can produce bogus value and thus corrupt image until next commit. This is not to say that these issues must be solved with this effort, I'm just stating them to make people aware. Best regards, Jernej > > Further patchsets are planned to support HDMI and the LCD timing controller present in these SoCs. > > Regards, > > Ryan > > Jernej Skrabec (21): > drm: sun4i: de2/de3: Change CSC argument > drm: sun4i: de2/de3: Merge CSC functions into one > drm: sun4i: de2/de3: call csc setup also for UI layer > drm: sun4i: de2: Initialize layer fields earlier > drm: sun4i: de3: Add YUV formatter module > drm: sun4i: de3: add format enumeration function to engine > drm: sun4i: de3: add formatter flag to mixer config > drm: sun4i: de3: add YUV support to the DE3 mixer > drm: sun4i: de3: pass mixer reference to ccsc setup function > drm: sun4i: de3: add YUV support to the color space correction module > drm: sun4i: de3: add YUV support to the TCON > drm: sun4i: support YUV formats in VI scaler > drm: sun4i: de2/de3: add mixer version enum > drm: sun4i: de2/de3: refactor mixer initialisation > drm: sun4i: vi_scaler refactor vi_scaler enablement > drm: sun4i: de2/de3: add generic blender register reference function > drm: sun4i: de2/de3: use generic register reference function for layer > configuration > drm: sun4i: de3: Implement AFBC support > drm: sun4i: de33: mixer: add Display Engine 3.3 (DE33) support > drm: sun4i: de33: vi_scaler: add Display Engine 3.3 (DE33) support > drm: sun4i: de33: fmt: add Display Engine 3.3 (DE33) support > > Ryan Walklin (6): > drm: sun4i: de3: refactor YUV formatter module setup > dt-bindings: allwinner: add H616 DE33 bus binding > dt-bindings: allwinner: add H616 DE33 clock binding > dt-bindings: allwinner: add H616 DE33 mixer binding > clk: sunxi-ng: ccu: add Display Engine 3.3 (DE33) support > drm: sun4i: de33: csc: add Display Engine 3.3 (DE33) support > > .../bus/allwinner,sun50i-a64-de2.yaml | 7 +- > .../clock/allwinner,sun8i-a83t-de2-clk.yaml | 1 + > .../allwinner,sun8i-a83t-de2-mixer.yaml | 21 +- > drivers/clk/sunxi-ng/ccu-sun8i-de2.c | 25 ++ > drivers/gpu/drm/sun4i/Makefile | 3 +- > drivers/gpu/drm/sun4i/sun4i_tcon.c | 28 +- > drivers/gpu/drm/sun4i/sun50i_afbc.c | 250 +++++++++++++ > drivers/gpu/drm/sun4i/sun50i_afbc.h | 87 +++++ > drivers/gpu/drm/sun4i/sun50i_fmt.c | 100 +++++ > drivers/gpu/drm/sun4i/sun50i_fmt.h | 32 ++ > drivers/gpu/drm/sun4i/sun8i_csc.c | 345 +++++++++++++++--- > drivers/gpu/drm/sun4i/sun8i_csc.h | 20 +- > drivers/gpu/drm/sun4i/sun8i_mixer.c | 226 +++++++++--- > drivers/gpu/drm/sun4i/sun8i_mixer.h | 53 ++- > drivers/gpu/drm/sun4i/sun8i_ui_layer.c | 41 ++- > drivers/gpu/drm/sun4i/sun8i_ui_scaler.c | 2 +- > drivers/gpu/drm/sun4i/sun8i_vi_layer.c | 133 +++++-- > drivers/gpu/drm/sun4i/sun8i_vi_scaler.c | 115 ++++-- > drivers/gpu/drm/sun4i/sun8i_vi_scaler.h | 2 +- > drivers/gpu/drm/sun4i/sunxi_engine.h | 29 ++ > 20 files changed, 1306 insertions(+), 214 deletions(-) > create mode 100644 drivers/gpu/drm/sun4i/sun50i_afbc.c > create mode 100644 drivers/gpu/drm/sun4i/sun50i_afbc.h > create mode 100644 drivers/gpu/drm/sun4i/sun50i_fmt.c > create mode 100644 drivers/gpu/drm/sun4i/sun50i_fmt.h > >
On Sat, 22 Feb 2025, at 7:57 AM, Jernej Škrabec wrote: > Hi Ryan, > > sorry for very late review, but here we go... No problem, thanks for the review! > This patchset actually introduces 3 disticnt features, which should IMO > be separated > and thus making reviewing patches easier. > > 1. native 10-bit YUV420 output (without YUV->RGB->YUV conversions) - > this is used on > HDMI for proper 10-bit 4K@60 HDR output > 2. Display Engine 3.3 (DE33) support > 3. AFBC modifier/format support (used for more efficient GPU or VPU > output rendering) > > If I'm not mistaken, your goal is only DE33 support. This is my initial goal, but I would still like good HDMI and video support (including HW decode). It took some refactoring and resequencing of (your) existing driver work to get to this series, and I have left out the HDMI and separated the TCON code for the same reason, that initially I am intending to just support RGB operation to an LCD. I do intend however to use the HDMI code either in or out of tree but think that will be a much bigger/more complex change to mainline given the current HDMI code is more invasive to the generic driver. The YUV and AFBC changes seemed more self-contained and seemed reasonable approaches given that they were both features of the DE3 and up. I am happy either to split these into either 4 or 5 separate patches ie: 1a. refactor CSC code to prepare for YUV 1b. add YUV for DE3 2. refactor mixer enumeration and regmaps to support DE3+ 3. add DE33 4. Add AFBC My only reluctance is that that adds more work to manage multiple patches which are logically grouped and in terms of ease of review, all these 4 steps are in the current set in that order (apart from AFBC which is completely standalone), and I don't think submitting them separately reduces complexity. It however will reduce reviewer burden as you suggest, but this has been a slow process already. I am happy to accept either process but this has been some time already now with lots of stylistic feedback but not much on the correctness of the approach and code, and I am keen to get this landed. There is are two > reasons why > I've never sent these patches myself: > > 1. scaling YUV images doesn't work > > Not a problem for any user who doesn't use video player with DRM plane > rendering, > which I assume is most of them. We can get around of this issue to deny > scaling > YUV buffers until the issue is sorted out. Good to know, I hadn't appreciated that. Mostly relevant for video as you say and it would be good to land YUV support without scaling, then extend subsequently, possibly when we get to the video engine? > 2. plane allocations are hackish > > DE33 for the first time introduced option to move planes between the > mixers. DRM > has also support for this, but for time being static allocation is > acceptable and tbh, > even dynamic support need appropriate initial setting. > As you might guessed this is done in DE33 clock driver using magic > values. Proper > way would be to introduce some kind of connection between mixer/plane > and clock > driver, so initial configuration can be moved to appropriate module > instead of > being thrown into clock driver. I need to check where to put it and how > to properly > connect DT nodes. Maybe syscon phandle? I'll do some research. Thanks, I was not aware of this either. > There is one glaring issue (when you work on driver and test every > aspect of it). > DE33 introduced RCQ, which is basically DMA, which copies shadowed > registers > from memory buffer directly to hw registers. IIRC it does that at > vblank time. This > tells you that current concept of writing register values at any time > userspace feels > to do commit is wrong and we've been lucky that it works as well as it > does. So, > in order to fix this, driver would need quite a big reorganization. > Introducing > shadow buffers would solve most of the issues, most likely also with > YUV scaling. > Implementing RCQ would be then relatively simple and even improve > timings. > Note that even DE3 has occasional issue with YUV scaler and also > read-modify-read > access to some of its register can produce bogus value and thus corrupt > image > until next commit. Thanks, again I wasn't aware. All my testing has shown remarkable stability and there are some downstream users out-of-tree with good feedback, but would be happy to support an effort to improve this. Regards, Ryan
Dne sobota, 22. februar 2025 ob 03:48:01 Srednjeevropski standardni čas je Ryan Walklin napisal(a): > On Sat, 22 Feb 2025, at 7:57 AM, Jernej Škrabec wrote: > > Hi Ryan, > > > > sorry for very late review, but here we go... > > No problem, thanks for the review! > > > This patchset actually introduces 3 disticnt features, which should IMO > > be separated > > and thus making reviewing patches easier. > > > > 1. native 10-bit YUV420 output (without YUV->RGB->YUV conversions) - > > this is used on > > HDMI for proper 10-bit 4K@60 HDR output > > 2. Display Engine 3.3 (DE33) support > > 3. AFBC modifier/format support (used for more efficient GPU or VPU > > output rendering) > > > > If I'm not mistaken, your goal is only DE33 support. > > This is my initial goal, but I would still like good HDMI and video support (including HW decode). > > It took some refactoring and resequencing of (your) existing driver work to get to this series, and I have left out the HDMI and separated the TCON code for the same reason, that initially I am intending to just support RGB operation to an LCD. I do intend however to use the HDMI code either in or out of tree but think that will be a much bigger/more complex change to mainline given the current HDMI code is more invasive to the generic driver. > > The YUV and AFBC changes seemed more self-contained and seemed reasonable approaches given that they were both features of the DE3 and up. I am happy either to split these into either 4 or 5 separate patches ie: > > 1a. refactor CSC code to prepare for YUV > 1b. add YUV for DE3 > 2. refactor mixer enumeration and regmaps to support DE3+ > 3. add DE33 > 4. Add AFBC > > My only reluctance is that that adds more work to manage multiple patches which are logically grouped and in terms of ease of review, all these 4 steps are in the current set in that order (apart from AFBC which is completely standalone), and I don't think submitting them separately reduces complexity. It however will reduce reviewer burden as you suggest, but this has been a slow process already. Sorry, completely forgot. YUV420 HDMI code relies on my previous work, with which Maxime wasn't happy with: https://lore.kernel.org/linux-sunxi/20230924192604.3262187-1-jernej.skrabec@gmail.com/ So unless switching HDMI to bridge ops is implemented, which also brings format, YUV formatter and some other patches just add unused code, which isn't ideal, especially if we decide to rework driver before that code can be put in acceptable state for all involved. From quick look, patches 5-13, 26 should be dropped for now. Not sure about 1-4. I'm fine with AFBC support going in, it's just one patch. > > I am happy to accept either process but this has been some time already now with lots of stylistic feedback but not much on the correctness of the approach and code, and I am keen to get this landed. > > There is are two > > reasons why > > I've never sent these patches myself: > > > > 1. scaling YUV images doesn't work > > > > Not a problem for any user who doesn't use video player with DRM plane > > rendering, > > which I assume is most of them. We can get around of this issue to deny > > scaling > > YUV buffers until the issue is sorted out. > > Good to know, I hadn't appreciated that. Mostly relevant for video as you say and it would be good to land YUV support without scaling, then extend subsequently, possibly when we get to the video engine? Correct. Just be aware of one thing. Even if YUV buffer is rendered in 1:1 scale, vi scaler still needs to be configured. That's because U and V planes are subsampled and need to be scaled to Y plane size. For unknown reason, that works just fine, but if Y scale is bigger than 1, it all falls apart. This should be implemented in atomic check. > > > 2. plane allocations are hackish > > > > DE33 for the first time introduced option to move planes between the > > mixers. DRM > > has also support for this, but for time being static allocation is > > acceptable and tbh, > > even dynamic support need appropriate initial setting. > > As you might guessed this is done in DE33 clock driver using magic > > values. Proper > > way would be to introduce some kind of connection between mixer/plane > > and clock > > driver, so initial configuration can be moved to appropriate module > > instead of > > being thrown into clock driver. I need to check where to put it and how > > to properly > > connect DT nodes. Maybe syscon phandle? I'll do some research. > > Thanks, I was not aware of this either. Here you have some pointers how this values are actually set: https://github.com/orangepi-xunlong/linux-orangepi/blob/orange-pi-5.15-sun55iw3/bsp/drivers/video/sunxi/disp2/disp/de/lowlevel_v33x/de330/de_top.c#L232-L260 I think this is the biggest issue of this code. As soon as we solve it properly, we have an ability to implement all remaining features at a later date. > > > There is one glaring issue (when you work on driver and test every > > aspect of it). > > DE33 introduced RCQ, which is basically DMA, which copies shadowed > > registers > > from memory buffer directly to hw registers. IIRC it does that at > > vblank time. This > > tells you that current concept of writing register values at any time > > userspace feels > > to do commit is wrong and we've been lucky that it works as well as it > > does. So, > > in order to fix this, driver would need quite a big reorganization. > > Introducing > > shadow buffers would solve most of the issues, most likely also with > > YUV scaling. > > Implementing RCQ would be then relatively simple and even improve > > timings. > > Note that even DE3 has occasional issue with YUV scaler and also > > read-modify-read > > access to some of its register can produce bogus value and thus corrupt > > image > > until next commit. > > Thanks, again I wasn't aware. All my testing has shown remarkable stability and there are some downstream users out-of-tree with good feedback, but would be happy to support an effort to improve this. Let's land DE33 support first since it's long overdue and then I'm happy to discuss plans for moving forward. Best regards, Jernej > > Regards, > > Ryan >