drm/omap: gem: Fix tearing with BO_TILED
diff mbox series

Message ID 20191221005711.47314-1-tony@atomide.com
State New
Headers show
Series
  • drm/omap: gem: Fix tearing with BO_TILED
Related show

Commit Message

Tony Lindgren Dec. 21, 2019, 12:57 a.m. UTC
On my droid4 I noticed bad constant tearing on the LCD with stellarium in
landscape mode with xorg-video-omap rotated with xrandr --rotate right.
Every second or so update gets squeezed down in size to only the top half
of the LCD panel.

This issue does not happen with xrandr --rotate normal, or when HDMI
display is also connected. Looks like xorg-video-omap switches to tiled
mode in the the rotated case.

Looking around what might affect BO_TILED, I noticed Matthijs had this
change in his earlier pyra tiler patches. The earlier patch "XXX omapdrm:
force tiled buffers to be pinned and page-aligned" has no commit log
though, so I'm not sure what other issues this might fix.

This is with the old pvr-omap4 driver, but presumably the same issue
exists in all cases.

Cc: H. Nikolaus Schaller <hns@goldelico.com>
Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Matthijs van Duin <matthijsvanduin@gmail.com>
Cc: Merlijn Wajer <merlijn@wizzup.org>
Cc: Sebastian Reichel <sre@kernel.org>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
---


Matthijs, do you have some more info to add to the description?


 drivers/gpu/drm/omapdrm/omap_gem.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Tony Lindgren Dec. 21, 2019, 4:41 p.m. UTC | #1
* Tony Lindgren <tony@atomide.com> [191220 16:57]:
> Looking around what might affect BO_TILED, I noticed Matthijs had this
> change in his earlier pyra tiler patches. The earlier patch "XXX omapdrm:
> force tiled buffers to be pinned and page-aligned" has no commit log
> though, so I'm not sure what other issues this might fix.
..
> Matthijs, do you have some more info to add to the description?

Also, I'm wondering if this change makes sense alone without the pinning
changes for a fix, or if also the pinning changes are needed.

For reference, the original related patch(es) are available at [0] below.

Regards,

Tony

[0] https://github.com/mvduin/linux/commit/70593563f531a7ac4a3f6ebed0fc98ef86742b12
Matthijs van Duin Jan. 4, 2020, 4:52 a.m. UTC | #2
On Fri, Dec 20, 2019 at 04:57:11PM -0800, Tony Lindgren wrote:
> On my droid4 I noticed bad constant tearing on the LCD with stellarium in
> landscape mode with xorg-video-omap rotated with xrandr --rotate right.
> Every second or so update gets squeezed down in size to only the top half
> of the LCD panel.

Odd, there's not really a good reason for rotation to have any effect on
whether tearing happens or not.

BTW, with "top half", I assume you mean the top of the screen (i.e.
right side of the display), not the top of the display (i.e. left side
of the screen) ?


> This issue does not happen with xrandr --rotate normal, or when HDMI
> display is also connected.

Ehhhh, mirroring onto HDMI fixes the problem?  Strange


> Looking around what might affect BO_TILED, I noticed Matthijs had this
> change in his earlier pyra tiler patches. The earlier patch "XXX omapdrm:
> force tiled buffers to be pinned and page-aligned" has no commit log
> though, so I'm not sure what other issues this might fix.

This is just part of a hacky patch series to improve performance for
userspace access to tiled buffers.  Page alignment has no effect by
itself, but it's necessary to allow the tiled memory allocated by
tiled_reserve_2d() to be mapped directly into userspace instead of using
the really slow "usergart" mechanism.

You can find the full patch series in github.com/mvduin/linux branch
4.15/patch/tiler-fast (based on mainline 4.15-rc6):

ae664249050b ARM: introduce pgprot_device()
fc1e8ffd1334 drm: omapdrm: improve choice of memory type for tiled memory
   these improve performance on omap5/dra7 by mapping tiled buffers as
   "device" memory by default instead of the pointlessly slow "strongly
   ordered" which is currently used as a workaround for the
   incompatibility between TILER and the bizarre way the ARM Cortex-A15
   implements loads from normal non-cacheable memory.

3d4c98cc47dd XXX omapdrm: factor out _omap_gem_(un)pin
70593563f531 XXX omapdrm: force tiled buffers to be pinned and page-aligned
e061e454afd5 XXX omapdrm: fast userspace mapping of tiled buffers
   these greatly improve performance of userspace access to tiled
   buffers (on all devices that use tiler) at the expense of using more
   tiler virtual memory.  note that tiler virtual memory is a less
   scarce resource on omap5/dra7 where 2d and 1d mappings have separate
   page tables than on omap4 where they share a page table.

None of this should have any impact on tearing.


Matthijs
Matthijs van Duin Jan. 4, 2020, 5:09 a.m. UTC | #3
On Sat, Dec 21, 2019 at 08:41:41AM -0800, Tony Lindgren wrote:
> Also, I'm wondering if this change makes sense alone without the pinning
> changes for a fix, or if also the pinning changes are needed.

Both pinning and page-alignment are done just to support the direct
userspace mapping.  By themselves they mostly just waste tiler memory
but otherwise don't really have much impact.

It's not really clear to me why you have such interest in this specific
patch.  Does my patch series fix the tearing issue you've described?  If
so maybe without my patches tiled memory is just so slow that the frame
is being submitted too late, which perhaps causes an async page flip
(does it? I'm not sure) thus resulting in tearing?

Matthijs
Tony Lindgren Jan. 4, 2020, 5:50 a.m. UTC | #4
* Matthijs van Duin <matthijsvanduin@gmail.com> [200104 05:10]:
> On Sat, Dec 21, 2019 at 08:41:41AM -0800, Tony Lindgren wrote:
> > Also, I'm wondering if this change makes sense alone without the pinning
> > changes for a fix, or if also the pinning changes are needed.
> 
> Both pinning and page-alignment are done just to support the direct
> userspace mapping.  By themselves they mostly just waste tiler memory
> but otherwise don't really have much impact.

OK thanks, so no specific fix, that's the part I was wondering about :)

> It's not really clear to me why you have such interest in this specific
> patch.  Does my patch series fix the tearing issue you've described?  If
> so maybe without my patches tiled memory is just so slow that the frame
> is being submitted too late, which perhaps causes an async page flip
> (does it? I'm not sure) thus resulting in tearing?

Just changing the alingment fixes the issue. Looks like the minimum
alignment we currently allow is 128, I think 512 was the minimum
that worked for me, so maybe the right fix would be to just change
the minimum to 512 with no specific need to use 4096 for now.

I did not see any issue with frame updates being too slow, I just
picked that alignment change manually without trying the rest of
your series.

Regards,

Tony
Tony Lindgren Jan. 4, 2020, 5:53 a.m. UTC | #5
* Matthijs van Duin <matthijsvanduin@gmail.com> [200104 04:53]:
> On Fri, Dec 20, 2019 at 04:57:11PM -0800, Tony Lindgren wrote:
> > On my droid4 I noticed bad constant tearing on the LCD with stellarium in
> > landscape mode with xorg-video-omap rotated with xrandr --rotate right.
> > Every second or so update gets squeezed down in size to only the top half
> > of the LCD panel.
> 
> Odd, there's not really a good reason for rotation to have any effect on
> whether tearing happens or not.
> 
> BTW, with "top half", I assume you mean the top of the screen (i.e.
> right side of the display), not the top of the display (i.e. left side
> of the screen) ?

Correct right side of the display that appears on top after rotate right.

> > This issue does not happen with xrandr --rotate normal, or when HDMI
> > display is also connected.
> 
> Ehhhh, mirroring onto HDMI fixes the problem?  Strange

Yup just connecting an additional HDMI panel fixes the issue on the LCD :)

> > Looking around what might affect BO_TILED, I noticed Matthijs had this
> > change in his earlier pyra tiler patches. The earlier patch "XXX omapdrm:
> > force tiled buffers to be pinned and page-aligned" has no commit log
> > though, so I'm not sure what other issues this might fix.
> 
> This is just part of a hacky patch series to improve performance for
> userspace access to tiled buffers.  Page alignment has no effect by
> itself, but it's necessary to allow the tiled memory allocated by
> tiled_reserve_2d() to be mapped directly into userspace instead of using
> the really slow "usergart" mechanism.

OK

> You can find the full patch series in github.com/mvduin/linux branch
> 4.15/patch/tiler-fast (based on mainline 4.15-rc6):
> 
> ae664249050b ARM: introduce pgprot_device()
> fc1e8ffd1334 drm: omapdrm: improve choice of memory type for tiled memory
>    these improve performance on omap5/dra7 by mapping tiled buffers as
>    "device" memory by default instead of the pointlessly slow "strongly
>    ordered" which is currently used as a workaround for the
>    incompatibility between TILER and the bizarre way the ARM Cortex-A15
>    implements loads from normal non-cacheable memory.
> 
> 3d4c98cc47dd XXX omapdrm: factor out _omap_gem_(un)pin
> 70593563f531 XXX omapdrm: force tiled buffers to be pinned and page-aligned
> e061e454afd5 XXX omapdrm: fast userspace mapping of tiled buffers
>    these greatly improve performance of userspace access to tiled
>    buffers (on all devices that use tiler) at the expense of using more
>    tiler virtual memory.  note that tiler virtual memory is a less
>    scarce resource on omap5/dra7 where 2d and 1d mappings have separate
>    page tables than on omap4 where they share a page table.
> 
> None of this should have any impact on tearing.

OK so the alignment change just happens to fix the issue then.

Regards,

Tony
Tony Lindgren Jan. 5, 2020, 8:37 p.m. UTC | #6
Hi,

* Tony Lindgren <tony@atomide.com> [200104 05:51]:
> 
> Just changing the alingment fixes the issue. Looks like the minimum
> alignment we currently allow is 128, I think 512 was the minimum
> that worked for me, so maybe the right fix would be to just change
> the minimum to 512 with no specific need to use 4096 for now.

So Matthijs and I chatted about this a bit on irc, and here's what
we concluded so far:

1. We have at least three different alignment needs for tiler

- Linux use of tiler aligned to 128 bytes

- SGX use of tiler aligned to 4096 bytes (or 512 bytes?)

- Fast userspace mapping aligned to 4096 bytes

2. The alignment need may need to be configured by the tiler consumer
   in #1 above

3. The alignment need for SGX seems to be based on SGX MMU page size

4. The issue I'm seeing with stellarium on droid4 may be a stride
   issue as about one out of 3 or 4 frames is OK and aligning to
   512 also fixes the issue maybe because it happens to make
   multiple frames align to 4096

So let's wait on this patch until we have more info and know how
the different alignments should be handled.

Regards,

Tony
Matthijs van Duin Jan. 6, 2020, 6:06 p.m. UTC | #7
On Sun, Jan 05, 2020 at 12:37:04PM -0800, Tony Lindgren wrote:
> - Linux use of tiler aligned to 128 bytes

which is presumably a simplification of the alignment imposed by hardware:
-  64 bytes (64 pixels) for 8bpp
- 128 bytes (64 pixels) for 16bpp
- 128 bytes (32 pixels) for 32bpp


> - Fast userspace mapping aligned to 4096 bytes

Also for kernel mapping (vmap), e.g. for supporting a rotated fbdev


> 2. The alignment need may need to be configured by the tiler consumer

I guess omap_gem_pin should check for the alignment requirement by
iterating over all dma_buf attachments and intersecting their
attach->dev->dma_parms.segment_boundary_mask or something like that?
Since I feel like this is something more dma_buf exporters presumably
need, is there no existing utility function for this?


Of course in principle it ought to be possible (modulo bugs and API
restrictions) to use buffers that aren't aligned to MMU page size if
either

a. hardware is trusted to not read or touch data from unrelated buffers
that share MMU pages, or

b. buffers are only permitted to share MMU pages if they belong to the
same security context (though this is something that could also be done
in userspace: e.g. instead of allocating three separate 720-pixel wide
buffers, each one page per line, allocate one buffer that's 720*3 pixels
wide (2 pages per line) and use three slices thereof).


> - SGX use of tiler aligned to 4096 bytes (or 512 bytes?)
> 
> ...
> 
> 3. The alignment need for SGX seems to be based on SGX MMU page size

which is 4096 bytes apparently:

#define SGX_MMU_PAGE_SHIFT	(12)
#define SGX_MMU_PAGE_SIZE	(1U<<SGX_MMU_PAGE_SHIFT)

or rather, this is the smallest page size it supports:
#define SGX_MMU_PDE_PAGE_SIZE_4K	(0x00000000U)
#define SGX_MMU_PDE_PAGE_SIZE_16K	(0x00000002U)
#define SGX_MMU_PDE_PAGE_SIZE_64K	(0x00000004U)
#define SGX_MMU_PDE_PAGE_SIZE_256K	(0x00000006U)
#define SGX_MMU_PDE_PAGE_SIZE_1M	(0x00000008U)
#define SGX_MMU_PDE_PAGE_SIZE_4M	(0x0000000AU)
#define SGX_MMU_PDE_PAGE_SIZE_MASK	(0x0000000EU)


> 4. The issue I'm seeing with stellarium on droid4 may be a stride
>    issue as about one out of 3 or 4 frames is OK and aligning to
>    512 also fixes the issue maybe because it happens to make
>    multiple frames align to 4096

Yeah if your buffers are 960 pixels wide (assuming the droid4's screen
is natively portrait) and 32bpp then 512-byte alignment suffices to
automatically make them 4KB alignment.

The most obvious thing I can think of that could do wrong is that it
might contiguously map the pages that cover each line, which is what
will happen if they use e.g. for_each_sg_page, but subsequently assume
that the stride in sgx virtual memory is ALIGN( width * cpp, PAGE_SIZE )
without taking the offset of the buffer inside the mapping into account.

If each line is at most 4KB (i.e. 1024 pixels @ 32bpp) but each line
straddles an MMU page boundary, then the result would be that the even
lines of the frame are written to the top half of the buffer, causing it
to be scaled to 50% vertically, while the odd lines are "lost" (written
outside the buffer, either to a different buffer or unmapped tiler
memory).  This sounds like what you described on irc?


Matthijs
Tomi Valkeinen Jan. 7, 2020, 1:30 p.m. UTC | #8
On 05/01/2020 22:37, Tony Lindgren wrote:
> Hi,
> 
> * Tony Lindgren <tony@atomide.com> [200104 05:51]:
>>
>> Just changing the alingment fixes the issue. Looks like the minimum
>> alignment we currently allow is 128, I think 512 was the minimum
>> that worked for me, so maybe the right fix would be to just change
>> the minimum to 512 with no specific need to use 4096 for now.
> 
> So Matthijs and I chatted about this a bit on irc, and here's what
> we concluded so far:
> 
> 1. We have at least three different alignment needs for tiler
> 
> - Linux use of tiler aligned to 128 bytes
> 
> - SGX use of tiler aligned to 4096 bytes (or 512 bytes?)
> 
> - Fast userspace mapping aligned to 4096 bytes
> 
> 2. The alignment need may need to be configured by the tiler consumer
>     in #1 above
> 
> 3. The alignment need for SGX seems to be based on SGX MMU page size
> 
> 4. The issue I'm seeing with stellarium on droid4 may be a stride
>     issue as about one out of 3 or 4 frames is OK and aligning to
>     512 also fixes the issue maybe because it happens to make
>     multiple frames align to 4096
> 
> So let's wait on this patch until we have more info and know how
> the different alignments should be handled.

I don't know if these are related to the issue you have, and it's been a while since I looked at 
TILER, but two thoughts after looking at the thread:

- If the usergart is not used, the width of each line has to be expanded to be page size aligned, 
right? Do the patches from Matthijs do this?

- Afaik, there's no safe way to use TILER 2D buffers with other IPs. dmabuf exposes the buffers as 
contiguous, i.e. including also the area outside the buffer itself (the TILER "stride"). The IPs 
have no idea that they should not be touching the are outside the buffer itself. My experience with 
this is with the capture drivers, and I don't know if the SGX driver deals with the buffers the same 
way, though. If there's special TILER 2D support code in the SGX driver, then it might be safe.

  Tomi
Tomi Valkeinen Jan. 7, 2020, 1:31 p.m. UTC | #9
On 07/01/2020 15:30, Tomi Valkeinen wrote:
> On 05/01/2020 22:37, Tony Lindgren wrote:
>> Hi,
>>
>> * Tony Lindgren <tony@atomide.com> [200104 05:51]:
>>>
>>> Just changing the alingment fixes the issue. Looks like the minimum
>>> alignment we currently allow is 128, I think 512 was the minimum
>>> that worked for me, so maybe the right fix would be to just change
>>> the minimum to 512 with no specific need to use 4096 for now.
>>
>> So Matthijs and I chatted about this a bit on irc, and here's what
>> we concluded so far:
>>
>> 1. We have at least three different alignment needs for tiler
>>
>> - Linux use of tiler aligned to 128 bytes
>>
>> - SGX use of tiler aligned to 4096 bytes (or 512 bytes?)
>>
>> - Fast userspace mapping aligned to 4096 bytes
>>
>> 2. The alignment need may need to be configured by the tiler consumer
>>     in #1 above
>>
>> 3. The alignment need for SGX seems to be based on SGX MMU page size
>>
>> 4. The issue I'm seeing with stellarium on droid4 may be a stride
>>     issue as about one out of 3 or 4 frames is OK and aligning to
>>     512 also fixes the issue maybe because it happens to make
>>     multiple frames align to 4096
>>
>> So let's wait on this patch until we have more info and know how
>> the different alignments should be handled.
> 
> I don't know if these are related to the issue you have, and it's been a while since I looked at 
> TILER, but two thoughts after looking at the thread:
> 
> - If the usergart is not used, the width of each line has to be expanded to be page size aligned, 
> right? Do the patches from Matthijs do this?

And immediately after pressing "send", I realized that aligning the start of the buffers to page 
size already accomplishes this...

  Tomi
Tony Lindgren Jan. 8, 2020, 4:57 p.m. UTC | #10
* Matthijs van Duin <matthijsvanduin@gmail.com> [200106 10:07]:
> On Sun, Jan 05, 2020 at 12:37:04PM -0800, Tony Lindgren wrote:
> > 4. The issue I'm seeing with stellarium on droid4 may be a stride
> >    issue as about one out of 3 or 4 frames is OK and aligning to
> >    512 also fixes the issue maybe because it happens to make
> >    multiple frames align to 4096
> 
> Yeah if your buffers are 960 pixels wide (assuming the droid4's screen
> is natively portrait) and 32bpp then 512-byte alignment suffices to
> automatically make them 4KB alignment.

Hmm sounds like I need to retest this. But doesn't 512-byte alignment
only make the 960 pixels case 2KB aligned?

> The most obvious thing I can think of that could do wrong is that it
> might contiguously map the pages that cover each line, which is what
> will happen if they use e.g. for_each_sg_page, but subsequently assume
> that the stride in sgx virtual memory is ALIGN( width * cpp, PAGE_SIZE )
> without taking the offset of the buffer inside the mapping into account.

OK

> If each line is at most 4KB (i.e. 1024 pixels @ 32bpp) but each line
> straddles an MMU page boundary, then the result would be that the even
> lines of the frame are written to the top half of the buffer, causing it
> to be scaled to 50% vertically, while the odd lines are "lost" (written
> outside the buffer, either to a different buffer or unmapped tiler
> memory).  This sounds like what you described on irc?

Yes this sounds like what I've been seeing.

Regards,

Tony

Patch
diff mbox series

diff --git a/drivers/gpu/drm/omapdrm/omap_gem.c b/drivers/gpu/drm/omapdrm/omap_gem.c
--- a/drivers/gpu/drm/omapdrm/omap_gem.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem.c
@@ -789,7 +789,7 @@  int omap_gem_pin(struct drm_gem_object *obj, dma_addr_t *dma_addr)
 			if (omap_obj->flags & OMAP_BO_TILED_MASK) {
 				block = tiler_reserve_2d(fmt,
 						omap_obj->width,
-						omap_obj->height, 0);
+						omap_obj->height, PAGE_SIZE);
 			} else {
 				block = tiler_reserve_1d(obj->size);
 			}