mbox series

[00/10] rm/bochs: Modernize driver

Message ID 20240823124422.286989-1-tzimmermann@suse.de (mailing list archive)
Headers show
Series rm/bochs: Modernize driver | expand

Message

Thomas Zimmermann Aug. 23, 2024, 12:28 p.m. UTC
Bochs is lagging behind the overall state of DRM. This series gives
the driver an update.

Patch 1 removes duplicated functionality that is already handled
by the DRM core.

Patches 2 and 3 streamlines driver cleanup. Patch 2 reworks EDID
handling to follow current best practices. All buffers with EDID
data will now be cleaned up automatically. Patch 3 adds managed
cleanups for I/O resources. No fini function is needed.

Patches 4 to 6 embed struct drm_device in struct bochs_device and
remove all uses of dev_private.

Patch 7 replaces simple display helpers with regular atomic helpers.
The former are a mid-layer that is more often 'in the way' than helping.
Regular atomic helpers are composable with each other. Simple-pipe
is not.

Patch 8 replaces GEM VRAM with GEM SHMEM. The new memory manager
is more reliable and allows for larger resolutions. Display updates
were so slow that Gnome was unmanageable with a flickering cursor and
single FPS. The new memory management makes Gnome at least useable.

Patch 9 implements display-mode validation against the available video
memory. Modes should now be useable iff they passed mode_valid.

Patch 10 removes code from GEM VRAM helpers that is now no longer in
use.

Tested with qemu emulation.

Thomas Zimmermann (10):
  drm/bochs: Remove manual format test from fb_create
  drm/bochs: Use helpers for struct drm_edid
  drm/bochs: Do managed resource cleanup
  drm/bochs: Pass bochs device to various functions
  drm/bochs: Upcast with to_bochs_device()
  drm/bochs: Allocate DRM device in struct bochs_device
  drm/bochs: Use regular atomic helpers
  drm/bochs: Use GEM SHMEM helpers for memory management
  drm/bochs: Validate display modes against available video memory
  drm/gem-vram: Remove support for simple display pipelines

 drivers/gpu/drm/drm_gem_vram_helper.c |  45 ---
 drivers/gpu/drm/tiny/Kconfig          |   4 +-
 drivers/gpu/drm/tiny/bochs.c          | 384 +++++++++++++++-----------
 include/drm/drm_gem_vram_helper.h     |  13 -
 4 files changed, 224 insertions(+), 222 deletions(-)

Comments

Gerd Hoffmann Aug. 23, 2024, 2:34 p.m. UTC | #1
Hi,

> Patch 8 replaces GEM VRAM with GEM SHMEM. The new memory manager
> is more reliable and allows for larger resolutions.

Valid point.

> Display updates were so slow that Gnome was unmanageable with a
> flickering cursor and single FPS. The new memory management makes
> Gnome at least useable.

Hmm?  I'm wondering where this huge improvement comes from?

With enough video memory VRAM performance should be ok.

If video memory is tight and ttm is forced to shuffle around
framebuffers between vram and system memory on each page flip (touching
much of vram along the way which causes additional overhead on the qemu
side), that is obviously very bad for performance.  One of the reasons
why cirrus uses SHMEM + shadowing since years.

Shadow buffering comes with some overhead too, so the switch isn't an
obvious win (assuming enough vram).  Hiding the page flips from qemu
might reduce the work qemu has to do though, especially if the shadowing
uses dirty tracking and only touches the vram pages which have actually
changed content.  So there is a fair chance that this outweighs the
shadowing overhead and ends up being a net win.  I don't expect the
difference being very big though.  Also different display usage patterns
might yield different results (fbcon vs. gnome for example).

So this probably makes sense, but I'd like to see a bit more background
information ...

On vram sizes:  The default qemu vram size (16M) should be fine for the
default display resolution (1280x800).  For FullHD vram size should be
doubled (-device VGA,vgamem_mb=32).


Skimmed the other patches, looks sane overall, but I don't follow drm
close enough any more to do an full review.  So I leave this here:

Acked-by: Gerd Hoffmann <kraxel@redhat.com>

take care,
  Gerd
Thomas Zimmermann Aug. 23, 2024, 3 p.m. UTC | #2
Hi

Am 23.08.24 um 16:34 schrieb Gerd Hoffmann:
>    Hi,
>
>> Patch 8 replaces GEM VRAM with GEM SHMEM. The new memory manager
>> is more reliable and allows for larger resolutions.
> Valid point.
>
>> Display updates were so slow that Gnome was unmanageable with a
>> flickering cursor and single FPS. The new memory management makes
>> Gnome at least useable.
> Hmm?  I'm wondering where this huge improvement comes from?
>
> With enough video memory VRAM performance should be ok.
>
> If video memory is tight and ttm is forced to shuffle around
> framebuffers between vram and system memory on each page flip (touching
> much of vram along the way which causes additional overhead on the qemu
> side), that is obviously very bad for performance.  One of the reasons
> why cirrus uses SHMEM + shadowing since years.
>
> Shadow buffering comes with some overhead too, so the switch isn't an
> obvious win (assuming enough vram).  Hiding the page flips from qemu
> might reduce the work qemu has to do though, especially if the shadowing
> uses dirty tracking and only touches the vram pages which have actually
> changed content.  So there is a fair chance that this outweighs the
> shadowing overhead and ends up being a net win.  I don't expect the
> difference being very big though.  Also different display usage patterns
> might yield different results (fbcon vs. gnome for example).
>
> So this probably makes sense, but I'd like to see a bit more background
> information ...

The difference is in damage handling.

The old code had two BOs in video memory and flipped between them. IDK 
the details of the old rendering, but from the massive flickering of the 
cursor, I assume that X11's internal either copies a full buffer during 
each redraw, or doesn't really handle damage well. It could also happen 
that X didn't use a shadow buffer for rendering. Bochs didn't request 
one. Without, drawing to I/O memory is really slow. If that applies to 
virtual I/O memory as well IDK.

The new driver code only copies areas that have been changed from 
rendering. The flickering is gone and the overall update performance is 
acceptable.

>
> On vram sizes:  The default qemu vram size (16M) should be fine for the
> default display resolution (1280x800).  For FullHD vram size should be
> doubled (-device VGA,vgamem_mb=32).

Right. Bochs never really tested that. So I saw something like 5k by 3k 
resolutions on my test setup with 16 MiB. Now that video-memory 
requirements for each mode can be calculated easily, we can sort out the 
invalid modes.

>
>
> Skimmed the other patches, looks sane overall, but I don't follow drm
> close enough any more to do an full review.  So I leave this here:
>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>

Thank you so much.

Best regards
Thomas

>
> take care,
>    Gerd
>
Gerd Hoffmann Aug. 26, 2024, 7:32 a.m. UTC | #3
Hi,

> > So this probably makes sense, but I'd like to see a bit more background
> > information ...
> 
> The difference is in damage handling.
> 
> The old code had two BOs in video memory and flipped between them. IDK the
> details of the old rendering, but from the massive flickering of the cursor,
> I assume that X11's internal either copies a full buffer during each redraw,
> or doesn't really handle damage well. It could also happen that X didn't use
> a shadow buffer for rendering. Bochs didn't request one. Without, drawing to
> I/O memory is really slow. If that applies to virtual I/O memory as well
> IDK.
> 
> The new driver code only copies areas that have been changed from rendering.
> The flickering is gone and the overall update performance is acceptable.

Thanks.

Have you tried wayland and fbcon too?

> > On vram sizes:  The default qemu vram size (16M) should be fine for the
> > default display resolution (1280x800).  For FullHD vram size should be
> > doubled (-device VGA,vgamem_mb=32).
> 
> Right. Bochs never really tested that.  So I saw something like 5k by
> 3k resolutions on my test setup with 16 MiB.

IIRC there used to be a check in the past, limiting resolutions to
buffer sizes which fit into vram twice (to allow for double buffering).

Crude heuristic.  I do not remember when and why it went away.  Also
I've seen wayland use three not two buffers ...

> Now that video-memory requirements for each mode can be calculated
> easily, we can sort out the invalid modes.

Yes.  Also I think trading higher main memory (shmem) usage for lower
vram usage is good overall.  Main memory can be uses for something else
if not needed whereas vram sits around unused.

take care,
  Gerd
Thomas Zimmermann Aug. 26, 2024, 8:08 a.m. UTC | #4
Hi

Am 26.08.24 um 09:32 schrieb Gerd Hoffmann:
>    Hi,
>
>>> So this probably makes sense, but I'd like to see a bit more background
>>> information ...
>> The difference is in damage handling.
>>
>> The old code had two BOs in video memory and flipped between them. IDK the
>> details of the old rendering, but from the massive flickering of the cursor,
>> I assume that X11's internal either copies a full buffer during each redraw,
>> or doesn't really handle damage well. It could also happen that X didn't use
>> a shadow buffer for rendering. Bochs didn't request one. Without, drawing to
>> I/O memory is really slow. If that applies to virtual I/O memory as well
>> IDK.
>>
>> The new driver code only copies areas that have been changed from rendering.
>> The flickering is gone and the overall update performance is acceptable.
> Thanks.
>
> Have you tried wayland and fbcon too?

Fbcon is ok for text at least. I've yet to try wayland.

>
>>> On vram sizes:  The default qemu vram size (16M) should be fine for the
>>> default display resolution (1280x800).  For FullHD vram size should be
>>> doubled (-device VGA,vgamem_mb=32).
>> Right. Bochs never really tested that.  So I saw something like 5k by
>> 3k resolutions on my test setup with 16 MiB.
> IIRC there used to be a check in the past, limiting resolutions to
> buffer sizes which fit into vram twice (to allow for double buffering).
>
> Crude heuristic.  I do not remember when and why it went away.  Also
> I've seen wayland use three not two buffers ...

IDK where that test went.

The problem with TTM pinning/unpinning and eviction is that it requires 
3 times the maximum consumption of physical video memory, because there 
is fragmentation in some corner cases. So the 16 MiB default size seem 
really low.

Best regards
Thomas

>
>> Now that video-memory requirements for each mode can be calculated
>> easily, we can sort out the invalid modes.
> Yes.  Also I think trading higher main memory (shmem) usage for lower
> vram usage is good overall.  Main memory can be uses for something else
> if not needed whereas vram sits around unused.
>
> take care,
>    Gerd
>