Message ID | 20240823124422.286989-1-tzimmermann@suse.de (mailing list archive) |
---|---|
Series | drm/bochs: Modernize driver |
Hi,

> Patch 8 replaces GEM VRAM with GEM SHMEM. The new memory manager
> is more reliable and allows for larger resolutions.

Valid point.

> Display updates were so slow that Gnome was unmanageable with a
> flickering cursor and single FPS. The new memory management makes
> Gnome at least useable.

Hmm? I'm wondering where this huge improvement comes from?

With enough video memory, VRAM performance should be OK.

If video memory is tight and TTM is forced to shuffle framebuffers
between vram and system memory on each page flip (touching much of vram
along the way, which causes additional overhead on the qemu side), that
is obviously very bad for performance. That is one of the reasons why
cirrus has been using SHMEM + shadowing for years.

Shadow buffering comes with some overhead too, so the switch isn't an
obvious win (assuming enough vram). Hiding the page flips from qemu
might reduce the work qemu has to do, though, especially if the
shadowing uses dirty tracking and only touches the vram pages whose
content has actually changed. So there is a fair chance that this
outweighs the shadowing overhead and ends up being a net win. I don't
expect the difference to be very big, though. Also, different display
usage patterns might yield different results (fbcon vs. Gnome, for
example).

So this probably makes sense, but I'd like to see a bit more background
information ...

On vram sizes: the default qemu vram size (16M) should be fine for the
default display resolution (1280x800). For FullHD, vram size should be
doubled (-device VGA,vgamem_mb=32).

I skimmed the other patches; they look sane overall, but I don't follow
drm closely enough any more to do a full review. So I'll leave this
here:

Acked-by: Gerd Hoffmann <kraxel@redhat.com>

take care,
  Gerd
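The dirty-tracking idea Gerd describes can be sketched in plain C. This is an illustrative model only, not the actual cirrus or bochs kernel code; the names `damage_rect` and `damage_blit` are invented for the sketch. The point is that a flip from a system-memory shadow buffer only copies the damaged rectangle into vram, so the host only sees writes to pages whose content actually changed:

```c
#include <stdint.h>
#include <string.h>

/*
 * Illustrative sketch only (not the cirrus/bochs kernel code):
 * with a system-memory shadow buffer plus dirty tracking, a flip
 * copies only the damaged rectangle into vram.
 */
struct damage_rect {
    int x1, y1;  /* inclusive top-left corner, in pixels */
    int x2, y2;  /* exclusive bottom-right corner */
};

/* Copy one XRGB8888 rectangle from the shadow buffer into vram. */
static void damage_blit(uint32_t *vram, const uint32_t *shadow,
                        int pitch_px, struct damage_rect d)
{
    for (int y = d.y1; y < d.y2; y++)
        memcpy(&vram[(size_t)y * pitch_px + d.x1],
               &shadow[(size_t)y * pitch_px + d.x1],
               (size_t)(d.x2 - d.x1) * sizeof(uint32_t));
}
```

Everything outside the damage rectangle stays untouched, which is what keeps the qemu-side overhead proportional to the changed area rather than to the full framebuffer.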
Hi

Am 23.08.24 um 16:34 schrieb Gerd Hoffmann:
> Hi,
>
>> Patch 8 replaces GEM VRAM with GEM SHMEM. The new memory manager
>> is more reliable and allows for larger resolutions.
> Valid point.
>
>> Display updates were so slow that Gnome was unmanageable with a
>> flickering cursor and single FPS. The new memory management makes
>> Gnome at least useable.
> Hmm? I'm wondering where this huge improvement comes from?
>
> With enough video memory VRAM performance should be ok.
>
> If video memory is tight and ttm is forced to shuffle around
> framebuffers between vram and system memory on each page flip (touching
> much of vram along the way which causes additional overhead on the qemu
> side), that is obviously very bad for performance. One of the reasons
> why cirrus uses SHMEM + shadowing since years.
>
> Shadow buffering comes with some overhead too, so the switch isn't an
> obvious win (assuming enough vram). Hiding the page flips from qemu
> might reduce the work qemu has to do though, especially if the shadowing
> uses dirty tracking and only touches the vram pages which have actually
> changed content. So there is a fair chance that this outweighs the
> shadowing overhead and ends up being a net win. I don't expect the
> difference being very big though. Also different display usage patterns
> might yield different results (fbcon vs. gnome for example).
>
> So this probably makes sense, but I'd like to see a bit more background
> information ...

The difference is in damage handling.

The old code had two BOs in video memory and flipped between them. I
don't know the details of the old rendering path, but from the massive
flickering of the cursor I assume that X11's internals either copy a
full buffer during each redraw or don't really handle damage well. It
could also be that X didn't use a shadow buffer for rendering; bochs
didn't request one. Without a shadow buffer, drawing to I/O memory is
really slow. Whether that applies to virtual I/O memory as well, I
don't know.

The new driver code only copies the areas that have been changed by
rendering. The flickering is gone and the overall update performance is
acceptable.

> On vram sizes: The default qemu vram size (16M) should be fine for the
> default display resolution (1280x800). For FullHD vram size should be
> doubled (-device VGA,vgamem_mb=32).

Right. Bochs never really tested that, so I saw something like 5k-by-3k
resolutions on my test setup with 16 MiB. Now that the video-memory
requirements for each mode can be calculated easily, we can sort out
the invalid modes.

> Skimmed the other patches, looks sane overall, but I don't follow drm
> close enough any more to do an full review. So I leave this here:
>
> Acked-by: Gerd Hoffmann <kraxel@redhat.com>

Thank you so much.

Best regards
Thomas

> take care,
>   Gerd
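The per-mode video-memory calculation Thomas mentions really is simple to state. Here is a hypothetical illustration in plain C, not code from the patch series; `mode_vram_size` and `mode_is_valid` are invented names, and 4 bytes per pixel (XRGB8888) is an assumption:

```c
#include <stddef.h>

/*
 * Hypothetical sketch (not the patch-series code): with a known color
 * depth, the vram a mode needs is just width * cpp * height, so
 * invalid modes can be filtered against the available vram.
 */
static size_t mode_vram_size(int hdisplay, int vdisplay, int cpp)
{
    return (size_t)hdisplay * (size_t)cpp * (size_t)vdisplay;
}

static int mode_is_valid(int hdisplay, int vdisplay, size_t vram_size)
{
    /* cpp = 4 for XRGB8888 */
    return mode_vram_size(hdisplay, vdisplay, 4) <= vram_size;
}
```

With the default 16 MiB of vram, 1280x800 (4,096,000 bytes) and even a single FullHD buffer (8,294,400 bytes) pass, while a 5k-class mode (5120x2880 needs about 59 MB) is sorted out.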
Hi,

> > So this probably makes sense, but I'd like to see a bit more background
> > information ...
>
> The difference is in damage handling.
>
> The old code had two BOs in video memory and flipped between them. IDK the
> details of the old rendering, but from the massive flickering of the cursor,
> I assume that X11's internal either copies a full buffer during each redraw,
> or doesn't really handle damage well. It could also happen that X didn't use
> a shadow buffer for rendering. Bochs didn't request one. Without, drawing to
> I/O memory is really slow. If that applies to virtual I/O memory as well
> IDK.
>
> The new driver code only copies areas that have been changed from rendering.
> The flickering is gone and the overall update performance is acceptable.

Thanks.

Have you tried wayland and fbcon too?

> > On vram sizes: The default qemu vram size (16M) should be fine for the
> > default display resolution (1280x800). For FullHD vram size should be
> > doubled (-device VGA,vgamem_mb=32).
>
> Right. Bochs never really tested that. So I saw something like 5k by
> 3k resolutions on my test setup with 16 MiB.

IIRC there used to be a check in the past, limiting resolutions to
buffer sizes which fit into vram twice (to allow for double buffering).

A crude heuristic. I do not remember when and why it went away. Also,
I've seen wayland use three buffers, not two ...

> Now that video-memory requirements for each mode can be calculated
> easily, we can sort out the invalid modes.

Yes. Also, I think trading higher main-memory (shmem) usage for lower
vram usage is good overall. Main memory can be used for something else
if not needed, whereas vram sits around unused.

take care,
  Gerd
Hi

Am 26.08.24 um 09:32 schrieb Gerd Hoffmann:
> Hi,
>
>>> So this probably makes sense, but I'd like to see a bit more background
>>> information ...
>> The difference is in damage handling.
>>
>> The old code had two BOs in video memory and flipped between them. IDK the
>> details of the old rendering, but from the massive flickering of the cursor,
>> I assume that X11's internal either copies a full buffer during each redraw,
>> or doesn't really handle damage well. It could also happen that X didn't use
>> a shadow buffer for rendering. Bochs didn't request one. Without, drawing to
>> I/O memory is really slow. If that applies to virtual I/O memory as well
>> IDK.
>>
>> The new driver code only copies areas that have been changed from rendering.
>> The flickering is gone and the overall update performance is acceptable.
> Thanks.
>
> Have you tried wayland and fbcon too?

Fbcon is OK, for text at least. I've yet to try wayland.

>>> On vram sizes: The default qemu vram size (16M) should be fine for the
>>> default display resolution (1280x800). For FullHD vram size should be
>>> doubled (-device VGA,vgamem_mb=32).
>> Right. Bochs never really tested that. So I saw something like 5k by
>> 3k resolutions on my test setup with 16 MiB.
> IIRC there used to be a check in the past, limiting resolutions to
> buffer sizes which fit into vram twice (to allow for double buffering).
>
> Crude heuristic. I do not remember when and why it went away. Also
> I've seen wayland use three not two buffers ...

I don't know where that test went. The problem with TTM
pinning/unpinning and eviction is that it can require up to three times
the maximum consumption in physical video memory, because of
fragmentation in some corner cases. So the 16 MiB default size seems
really low.

Best regards
Thomas

>> Now that video-memory requirements for each mode can be calculated
>> easily, we can sort out the invalid modes.
> Yes. Also I think trading higher main memory (shmem) usage for lower
> vram usage is good overall. Main memory can be uses for something else
> if not needed whereas vram sits around unused.
>
> take care,
>   Gerd