mbox series

[v18,0/2] Add memory bandwidth management to NVIDIA Tegra DRM driver

Message ID 20210601042108.1942-1-digetx@gmail.com (mailing list archive)
Headers show
Series Add memory bandwidth management to NVIDIA Tegra DRM driver | expand

Message

Dmitry Osipenko June 1, 2021, 4:21 a.m. UTC
This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
which is done using interconnect framework. It fixes display corruption that
happens due to insufficient memory bandwidth.

Changelog:

v18: - Moved total peak bandwidth from CRTC state to plane state and removed
       dummy plane bandwidth state initialization from T186+ plane hub. This
       was suggested by Thierry Reding to v17.

     - I haven't done anything about the cursor's plane bandwidth which
       doesn't contribute to overlapping bandwidths for a small sized
       window because it works okay as-is.

v17: - No changes, re-sending for v5.14.

v16: - Implemented suggestions that were given by Michał Mirosław to v15.

     - Added r-b from Michał Mirosław to the debug-stats patch.

     - Rebased on top of a recent linux-next.

     - Removed bandwidth scaling based on width difference of src/dst
       windows since it's not actual anymore. Apparently the recent memory
       driver changes fixed problems that I witnessed before.

     - Average bandwidth calculation now won't overflow for 4k resolutions.

     - Average bandwidth calculation now uses the size of the visible
       area instead of the src area since debug stats of the memory
       controller clearly show that downscaled window takes less bandwidth,
       proportionally to the scaled size.

     - Bandwidth calculation now uses "adjusted mode" of the CRTC, which
       is what used for h/w programming, instead of the mode that was
       requested by userspace, although the two usually match in practice.

v15: - Corrected tegra_plane_icc_names[] NULL-check that was partially lost
       by accident in v14 after unsuccessful rebase.

v14: - Made improvements that were suggested by Michał Mirosław to v13:

       - Changed 'unsigned int' to 'bool'.
       - Renamed functions which calculate bandwidth state.
       - Reworked comment in the code that explains why downscaled plane
         require higher bandwidth.
       - Added round-up to bandwidth calculation.
       - Added sanity checks of the plane index and fixed out-of-bounds
         access which happened on T124 due to the cursor plane index.

v13: - No code changes. Patches missed v5.12, re-sending them for v5.13.

Dmitry Osipenko (2):
  drm/tegra: dc: Support memory bandwidth management
  drm/tegra: dc: Extend debug stats with total number of events

 drivers/gpu/drm/tegra/Kconfig |   1 +
 drivers/gpu/drm/tegra/dc.c    | 358 +++++++++++++++++++++++++++++++++-
 drivers/gpu/drm/tegra/dc.h    |  17 ++
 drivers/gpu/drm/tegra/drm.c   |  14 ++
 drivers/gpu/drm/tegra/plane.c | 117 +++++++++++
 drivers/gpu/drm/tegra/plane.h |  16 ++
 6 files changed, 521 insertions(+), 2 deletions(-)

Comments

Dmitry Osipenko June 6, 2021, 10:40 p.m. UTC | #1
01.06.2021 07:21, Dmitry Osipenko пишет:
> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
> which is done using interconnect framework. It fixes display corruption that
> happens due to insufficient memory bandwidth.
> 
> Changelog:
> 
> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
>        dummy plane bandwidth state initialization from T186+ plane hub. This
>        was suggested by Thierry Reding to v17.
> 
>      - I haven't done anything about the cursor's plane bandwidth which
>        doesn't contribute to overlapping bandwidths for a small sized
>        window because it works okay as-is.

Thierry, will you take these patches for 5.14?
Dmitry Osipenko June 21, 2021, 4:19 a.m. UTC | #2
07.06.2021 01:40, Dmitry Osipenko пишет:
> 01.06.2021 07:21, Dmitry Osipenko пишет:
>> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
>> which is done using interconnect framework. It fixes display corruption that
>> happens due to insufficient memory bandwidth.
>>
>> Changelog:
>>
>> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
>>        dummy plane bandwidth state initialization from T186+ plane hub. This
>>        was suggested by Thierry Reding to v17.
>>
>>      - I haven't done anything about the cursor's plane bandwidth which
>>        doesn't contribute to overlapping bandwidths for a small sized
>>        window because it works okay as-is.
> 
> Thierry, will you take these patches for 5.14?
> 

The display controller does _NOT_WORK_ properly without bandwidth
management. Can we get this patch into 5.14? What is the problem?
Thierry Reding June 21, 2021, 11:01 a.m. UTC | #3
On Mon, Jun 21, 2021 at 07:19:15AM +0300, Dmitry Osipenko wrote:
> 07.06.2021 01:40, Dmitry Osipenko пишет:
> > 01.06.2021 07:21, Dmitry Osipenko пишет:
> >> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
> >> which is done using interconnect framework. It fixes display corruption that
> >> happens due to insufficient memory bandwidth.
> >>
> >> Changelog:
> >>
> >> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
> >>        dummy plane bandwidth state initialization from T186+ plane hub. This
> >>        was suggested by Thierry Reding to v17.
> >>
> >>      - I haven't done anything about the cursor's plane bandwidth which
> >>        doesn't contribute to overlapping bandwidths for a small sized
> >>        window because it works okay as-is.
> > 
> > Thierry, will you take these patches for 5.14?
> > 
> 
> The display controller does _NOT_WORK_ properly without bandwidth
> management.

That's surprising. So either it has never worked before (which I think
I'd know) or something has caused this regression recently. In the
latter case we need to identify what that was and revert (or fix) it.

> Can we get this patch into 5.14? What is the problem?

There was not enough time to review and test this, so I didn't feel
comfortable picking it up so close to the -rc6 cut-off. I plan to pick
this up early in the v5.14 release cycle and target v5.15.

Thierry
Dmitry Osipenko June 21, 2021, 11:43 a.m. UTC | #4
21.06.2021 14:01, Thierry Reding пишет:
> On Mon, Jun 21, 2021 at 07:19:15AM +0300, Dmitry Osipenko wrote:
>> 07.06.2021 01:40, Dmitry Osipenko пишет:
>>> 01.06.2021 07:21, Dmitry Osipenko пишет:
>>>> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
>>>> which is done using interconnect framework. It fixes display corruption that
>>>> happens due to insufficient memory bandwidth.
>>>>
>>>> Changelog:
>>>>
>>>> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
>>>>        dummy plane bandwidth state initialization from T186+ plane hub. This
>>>>        was suggested by Thierry Reding to v17.
>>>>
>>>>      - I haven't done anything about the cursor's plane bandwidth which
>>>>        doesn't contribute to overlapping bandwidths for a small sized
>>>>        window because it works okay as-is.
>>>
>>> Thierry, will you take these patches for 5.14?
>>>
>>
>> The display controller does _NOT_WORK_ properly without bandwidth
>> management.
> 
> That's surprising. So either it has never worked before (which I think
> I'd know) or something has caused this regression recently. In the
> latter case we need to identify what that was and revert (or fix) it.

The problem is caused by the support of dynamic memory frequency scaling
which does a good job at keeping memory in a low power state during idle
time. So display controller may not get enough bandwidth at the start of
scanout if it won't request BW beforehand. This problem existed for many
years on T124 and now T20/30 are also affected. The DC of T20 is the
least tolerant to memory bandwidth troubles.

This problem is not critical, but it hurts user experience since high
resolution modes may not work at all and display output may become
distorted, requiring a DC reset.

>> Can we get this patch into 5.14? What is the problem?
> 
> There was not enough time to review and test this, so I didn't feel
> comfortable picking it up so close to the -rc6 cut-off. I plan to pick
> this up early in the v5.14 release cycle and target v5.15.

Thank you for the explanation! It's not uncommon to forget about
patches, so the silence worries me. I hoped that both the dynamic freq
scaling and display BW support would be merged around the same time, but
apparently we got a disconnect here.
Dmitry Osipenko July 18, 2021, 8:59 p.m. UTC | #5
21.06.2021 14:43, Dmitry Osipenko пишет:
> 21.06.2021 14:01, Thierry Reding пишет:
>> On Mon, Jun 21, 2021 at 07:19:15AM +0300, Dmitry Osipenko wrote:
>>> 07.06.2021 01:40, Dmitry Osipenko пишет:
>>>> 01.06.2021 07:21, Dmitry Osipenko пишет:
>>>>> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
>>>>> which is done using interconnect framework. It fixes display corruption that
>>>>> happens due to insufficient memory bandwidth.
>>>>>
>>>>> Changelog:
>>>>>
>>>>> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
>>>>>        dummy plane bandwidth state initialization from T186+ plane hub. This
>>>>>        was suggested by Thierry Reding to v17.
>>>>>
>>>>>      - I haven't done anything about the cursor's plane bandwidth which
>>>>>        doesn't contribute to overlapping bandwidths for a small sized
>>>>>        window because it works okay as-is.
>>>>
>>>> Thierry, will you take these patches for 5.14?
>>>>
>>>
>>> The display controller does _NOT_WORK_ properly without bandwidth
>>> management.
>>
>> That's surprising. So either it has never worked before (which I think
>> I'd know) or something has caused this regression recently. In the
>> latter case we need to identify what that was and revert (or fix) it.
> 
> The problem is caused by the support of dynamic memory frequency scaling
> which does a good job at keeping memory in a low power state during idle
> time. So display controller may not get enough bandwidth at the start of
> scanout if it won't request BW beforehand. This problem existed for many
> years on T124 and now T20/30 are also affected. The DC of T20 is the
> least tolerant to memory bandwidth troubles.
> 
> This problem is not critical, but it hurts user experience since high
> resolution modes may not work at all and display output may become
> distorted, requiring a DC reset.
> 
>>> Can we get this patch into 5.14? What is the problem?
>>
>> There was not enough time to review and test this, so I didn't feel
>> comfortable picking it up so close to the -rc6 cut-off. I plan to pick
>> this up early in the v5.14 release cycle and target v5.15.
> 
> Thank you for the explanation! It's not uncommon to forget about
> patches, so the silence worries me. I hoped that both the dynamic freq
> scaling and display BW support would be merged around the same time, but
> apparently we got a disconnect here.
> 

Reminder ping :)
Thierry Reding Aug. 13, 2021, 10:33 a.m. UTC | #6
On Mon, Jun 07, 2021 at 01:40:06AM +0300, Dmitry Osipenko wrote:
> 01.06.2021 07:21, Dmitry Osipenko пишет:
> > This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
> > which is done using interconnect framework. It fixes display corruption that
> > happens due to insufficient memory bandwidth.
> > 
> > Changelog:
> > 
> > v18: - Moved total peak bandwidth from CRTC state to plane state and removed
> >        dummy plane bandwidth state initialization from T186+ plane hub. This
> >        was suggested by Thierry Reding to v17.
> > 
> >      - I haven't done anything about the cursor's plane bandwidth which
> >        doesn't contribute to overlapping bandwidths for a small sized
> >        window because it works okay as-is.
> 
> Thierry, will you take these patches for 5.14?

As discussed offline, I've picked these up for v5.15 with a small patch
squashed in to unbreak the Tegra186 and later support.

Thanks,
Thierry
Dmitry Osipenko Aug. 13, 2021, 1:27 p.m. UTC | #7
13.08.2021 13:33, Thierry Reding пишет:
> On Mon, Jun 07, 2021 at 01:40:06AM +0300, Dmitry Osipenko wrote:
>> 01.06.2021 07:21, Dmitry Osipenko пишет:
>>> This series adds memory bandwidth management to the NVIDIA Tegra DRM driver,
>>> which is done using interconnect framework. It fixes display corruption that
>>> happens due to insufficient memory bandwidth.
>>>
>>> Changelog:
>>>
>>> v18: - Moved total peak bandwidth from CRTC state to plane state and removed
>>>        dummy plane bandwidth state initialization from T186+ plane hub. This
>>>        was suggested by Thierry Reding to v17.
>>>
>>>      - I haven't done anything about the cursor's plane bandwidth which
>>>        doesn't contribute to overlapping bandwidths for a small sized
>>>        window because it works okay as-is.
>>
>> Thierry, will you take these patches for 5.14?
> 
> As discussed offline, I've picked these up for v5.15 with a small patch
> squashed in to unbreak the Tegra186 and later support.

Cool, thanks.