[RFC,00/13] gfxstream + rutabaga_gfx: a surprising delight or startling epiphany?

Message ID 20230421011223.718-1-gurchetansingh@chromium.org

Message

Gurchetan Singh April 21, 2023, 1:12 a.m. UTC
From: Gurchetan Singh <gurchetansingh@google.com>

Rationale:

- gfxstream [a] is good for the Android Emulator/upstream QEMU
  alignment
- Wayland passthrough [b] via the cross-domain context type is good
  for Linux on Linux display virtualization
- rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
  virglrenderer
- This series ports rutabaga_gfx to QEMU

Feedback requested:

- How is everyone feeling about gfxstream/rutabaga_gfx, especially UI
  maintainers?  I've been assuming it is a definite win, so if there's
  a divergence of opinion on that, we should resolve that quickly.

- Need help from memory region API experts on "HACK: use memory region
  API to inject memory to guest"

- Need help from QEMU multi-threaded experts on "HACK: schedule fence
  return on main AIO context"

----------
| Longer |
----------

Dear all,

The people have demanded it, and we have listened.  Just the other
day, some kids came up to me on the street -- hat in hand, teardrops
in their eyes -- and asked "please sir, can you perchance port
gfxstream and rutabaga_gfx to upstream QEMU?".  I honestly can't take
it anymore.

In a way, I can understand the fanaticism of the gfxstreamists -- the
benefits of gfxstream + rutabaga_gfx in upstream QEMU are massive for
all involved:

(i) Android Emulator aligned with QEMU

The biggest use case is no doubt the Android Emulator.  Although used
by millions of developers around the world [d][e], the Android Emulator
itself currently uses a forked QEMU 2.12.  The initial fork
happened in the early days of Android (circa 2006 [f]) and while the
situation has improved, a QEMU update inside the Android Emulator only
happens once every 3-5 years. Indeed, most Android Emulator developers
aren't even subscribed to qemu-devel@ given this situation.  Their
task is often to get the next foldable config working or fix that UI
bug, but long term technical debt is something that is rarely
prioritized.

This is one of those years when the QEMU version will be bumped,
though.  Soon, the emulator will be based on QEMU 7.2 and new controls
will be instituted to make QEMU modifications harder.  Things that can
be upstreamed will be upstreamed.

One of the biggest downstream pieces of the Android Emulator is the
gfxstream graphics stack, and it has some nontrivial features that
aren't easy to implement elsewhere [g].

The lore of gfxstream is detailed in patch 10, but suffice to say
getting gfxstream mainlined would help move the Android Emulator out
of its downstream mud hut into the light, love and compassion of
upstream.

(ii) Wayland passthrough

For the Linux guest on Linux host use case, we've elected to port
rutabaga_gfx into QEMU rather than gfxstream.  rutabaga_gfx sits on
top of gfxstream, virglrenderer, and the cross-domain context type.
With the cross-domain context type, one can avoid a guest compositor
pass and display VM windows like normal host windows.  It's now
possible to run the examples found in the crosvm book [h] with this
patchset.  There are a few problems [i], but fixing them is O(days).

This use case is less strong than the Android Emulator one, since
anyone who would play a game in a Linux guest via QEMU would be able
to run it natively.  But it could be good for developers who need to
test code in a virtual machine.

------------------
| Issues         |
------------------

The two biggest unsolved issues are the last two "HACK:" patches.
Feedback from QEMU memory management and threading experts would be
greatly appreciated.

------------------
| UI integration |
------------------

This patchset intentionally uses the simplest KMS display integration
possible: framebuffer copies to Pixman.  The reason is Linux guests
are expected to use Wayland Passthrough, and the Android Emulator UI
integration is very complex.  gfxstream doesn't have a "context 0"
like virglrenderer that can force synchronization between QEMU's and
gfxstream's GL code.

Initially, we just want to run the Android Emulator in headless mode,
and we have a few follow-up ideas in mind for UI integration (all with
the goal of being minimally invasive for QEMU).  Note: even with
Android in headless mode, upstream QEMU will be used in production and
not just be a developer toy.

--------------------------
| Packaging / Versioning |
--------------------------

We have to build QEMU from sources due to compliance reasons, so we
haven't created Debian packages for either gfxstream or rutabaga_gfx
yet.  QEMU is upstream of Debian/Portage anyways.  Let us know the
standard on packaging and we should be able to follow it.

Versioning would be keyed on initial merge into QEMU.

--------------------------
| Testing                |
--------------------------

A document on how to test the patchset is available on QEMU Gitlab [j].

[a] https://android.googlesource.com/device/generic/vulkan-cereal/
[b] https://www.youtube.com/watch?v=OZJiHMtIQ2M
[c] https://github.com/google/crosvm/blob/main/rutabaga_gfx/ffi/src/include/rutabaga_gfx_ffi.h
[d] https://www.youtube.com/watch?v=LgRRmgfrFQM
[e] https://maltewolfcastle.medium.com/how-to-setup-an-automotive-android-emulator-f287a4061b19
[f] https://groups.google.com/g/android-emulator-dev/c/dltBnUW_HzU
[g] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
[h] https://crosvm.dev/book/devices/wayland.html
[i] https://github.com/talex5/wayland-proxy-virtwl/blob/master/virtio-spec.md#problem
[j] https://gitlab.com/qemu-project/qemu/-/issues/1611

Antonio Caggiano (2):
  virtio-gpu blob prep: improve decoding and add memory region
  virtio-gpu: CONTEXT_INIT feature

Dr. David Alan Gilbert (1):
  virtio: Add shared memory capability

Gerd Hoffmann (1):
  virtio-gpu: hostmem

Gurchetan Singh (9):
  gfxstream + rutabaga prep: virtio_gpu_gl -> virtio_gpu_virgl
  gfxstream + rutabaga prep: make GL device more library agnostic
  gfxstream + rutabaga prep: define callbacks in realize function
  gfxstream + rutabaga prep: added need defintions, fields, and options
  gfxstream + rutabaga: add required meson changes
  gfxstream + rutabaga: add initial support for gfxstream
  gfxstream + rutabaga: enable rutabaga
  HACK: use memory region API to inject memory to guest
  HACK: schedule fence return on main AIO context

 hw/display/meson.build                 |   40 +-
 hw/display/virtio-gpu-base.c           |    6 +-
 hw/display/virtio-gpu-gl.c             |  121 +--
 hw/display/virtio-gpu-pci.c            |   14 +
 hw/display/virtio-gpu-rutabaga-stubs.c |    8 +
 hw/display/virtio-gpu-rutabaga.c       | 1032 ++++++++++++++++++++++++
 hw/display/virtio-gpu-virgl-stubs.c    |    8 +
 hw/display/virtio-gpu-virgl.c          |  138 +++-
 hw/display/virtio-gpu.c                |   17 +-
 hw/display/virtio-vga.c                |   33 +-
 hw/virtio/virtio-pci.c                 |   18 +
 include/hw/virtio/virtio-gpu-bswap.h   |   18 +
 include/hw/virtio/virtio-gpu.h         |   38 +-
 include/hw/virtio/virtio-pci.h         |    4 +
 meson.build                            |    8 +
 meson_options.txt                      |    2 +
 scripts/meson-buildoptions.sh          |    3 +
 17 files changed, 1356 insertions(+), 152 deletions(-)
 create mode 100644 hw/display/virtio-gpu-rutabaga-stubs.c
 create mode 100644 hw/display/virtio-gpu-rutabaga.c
 create mode 100644 hw/display/virtio-gpu-virgl-stubs.c

Comments

Stefan Hajnoczi April 21, 2023, 4:02 p.m. UTC | #1
On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
<gurchetansingh@chromium.org> wrote:
>
> From: Gurchetan Singh <gurchetansingh@google.com>
>
> Rationale:
>
> - gfxstream [a] is good for the Android Emulator/upstream QEMU
>   alignment
> - Wayland passthrough [b] via the cross-domain context type is good
>   for Linux on Linux display virtualization
> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
>   virglrenderer
> - This series ports rutabaga_gfx to QEMU

What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
stack and how they build on or complement virtio-gpu and
virglrenderer?

Stefan

Gurchetan Singh April 21, 2023, 11:54 p.m. UTC | #2
On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>
> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> <gurchetansingh@chromium.org> wrote:
> >
> > From: Gurchetan Singh <gurchetansingh@google.com>
> >
> > Rationale:
> >
> > - gfxstream [a] is good for the Android Emulator/upstream QEMU
> >   alignment
> > - Wayland passthrough [b] via the cross-domain context type is good
> >   for Linux on Linux display virtualization
> > - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
> >   virglrenderer
> > - This series ports rutabaga_gfx to QEMU
>
> What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
> stack and how they build on or complement virtio-gpu and
> virglrenderer?

rutabaga_gfx and gfxstream are both libraries that implement the
virtio-gpu protocol.  There's a document available in the Gitlab issue
to see where they fit in the stack [a].

gfxstream grew out of the Android Emulator's need to virtualize
graphics for app developers.  There's a short history of gfxstream in
patch 10.  It complements virglrenderer in that it's a bit more
cross-platform and targets different use cases -- more detail here
[b].  The ultimate goal is to ditch out-of-tree kernel drivers in the
Android Emulator and adopt virtio, and porting gfxstream to QEMU would
speed up that transition.

rutabaga_gfx is a much smaller Rust library that sits on top of
gfxstream and even virglrenderer, but does a few extra things.  It
implements the cross-domain context type, which provides Wayland
passthrough.  This helps virtio-gpu by providing more modern display
virtualization.  For example, Microsoft for WSL2 also uses a similar
technique [c], but I believe it is not virtio-based nor open-source.
With this, we can have the same open-source Wayland passthrough
solution on crosvm, QEMU and even Fuchsia [d].  Also, there might be
an additional small Rust context type for security-sensitive use cases
in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
(since it's C++ based) in such cases.

Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
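
To make the layering concrete, here is a minimal sketch (not code from
this series) of how a layer such as rutabaga_gfx could choose a backend
from the context type the guest requests.  The capset numbers are
believed to match the virtio-gpu device description [e] and should be
verified against the spec revision in use; the enum, the function names
and the backend labels are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Capset IDs, assumed to match the virtio-gpu device description. */
enum capset_id {
    CAPSET_VIRGL        = 1,
    CAPSET_VIRGL2       = 2,
    CAPSET_GFXSTREAM    = 3,
    CAPSET_VENUS        = 4,
    CAPSET_CROSS_DOMAIN = 5,
};

/* Hypothetical backend identifiers, for illustration only. */
enum backend {
    BACKEND_NONE,
    BACKEND_VIRGLRENDERER,
    BACKEND_GFXSTREAM,
    BACKEND_CROSS_DOMAIN,
    BACKEND_COUNT,
};

/* Map the capset a guest context was created with to a backend. */
enum backend backend_for_capset(uint32_t capset_id)
{
    switch (capset_id) {
    case CAPSET_VIRGL:
    case CAPSET_VIRGL2:
    case CAPSET_VENUS:          /* venus lives inside virglrenderer */
        return BACKEND_VIRGLRENDERER;
    case CAPSET_GFXSTREAM:
        return BACKEND_GFXSTREAM;
    case CAPSET_CROSS_DOMAIN:   /* Wayland passthrough */
        return BACKEND_CROSS_DOMAIN;
    default:
        return BACKEND_NONE;    /* unknown context type */
    }
}

/* Human-readable label for a backend, e.g. for logging. */
const char *backend_name(enum backend b)
{
    static const char *const names[BACKEND_COUNT] = {
        "none", "virglrenderer", "gfxstream", "cross-domain",
    };
    assert(b < BACKEND_COUNT);
    return names[b];
}
```

The point of the sketch is that the VMM only sees one entry point and a
context type; which renderer actually services the context is decided
below that line.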

[a] https://gitlab.com/qemu-project/qemu/-/issues/1611
[b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
[c] https://www.youtube.com/watch?v=EkNBsBx501Q
[d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
[e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533

>
> Stefan

Akihiko Odaki April 22, 2023, 4:41 p.m. UTC | #3
On 2023/04/22 8:54, Gurchetan Singh wrote:
> On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
>>
>> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
>> <gurchetansingh@chromium.org> wrote:
>>>
>>> From: Gurchetan Singh <gurchetansingh@google.com>
>>>
>>> Rationale:
>>>
>>> - gfxstream [a] is good for the Android Emulator/upstream QEMU
>>>    alignment
>>> - Wayland passthrough [b] via the cross-domain context type is good
>>>    for Linux on Linux display virtualization
>>> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
>>>    virglrenderer
>>> - This series ports rutabaga_gfx to QEMU
>>
>> What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
>> stack and how they build on or complement virtio-gpu and
>> virglrenderer?
> 
> rutabaga_gfx and gfxstream are both libraries that implement the
> virtio-gpu protocol.  There's a document available in the Gitlab issue
> to see where they fit in the stack [a].
> 
> gfxstream grew out of the Android Emulator's need to virtualize
> graphics for app developers.  There's a short history of gfxstream in
> patch 10.  It complements virglrenderer in that it's a bit more
> cross-platform and targets different use cases -- more detail here
> [b].  The ultimate goal is to ditch out-of-tree kernel drivers in the
> Android Emulator and adopt virtio, and porting gfxstream to QEMU would
> speed up that transition.

I wonder what the motivation is for maintaining gfxstream instead of
switching to virglrenderer/venus.

> 
> rutabaga_gfx is a much smaller Rust library that sits on top of
> gfxstream and even virglrenderer, but does a few extra things.  It
> implements the cross-domain context type, which provides Wayland
> passthrough.  This helps virtio-gpu by providing more modern display
> virtualization.  For example, Microsoft for WSL2 also uses a similar
> technique [c], but I believe it is not virtio-based nor open-source.

The guest side components of WSLg are open-source, but the host side is 
not: https://github.com/microsoft/wslg
It also uses DirectX for acceleration, so it's not really portable
outside Windows.

> With this, we can have the same open-source Wayland passthrough
> solution on crosvm, QEMU and even Fuchsia [d].  Also, there might be
> an additional small Rust context type for security-sensitive use cases
> in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
> (since it's C++ based) in such cases.
> 
> Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
> 
> [a] https://gitlab.com/qemu-project/qemu/-/issues/1611
> [b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
> [c] https://www.youtube.com/watch?v=EkNBsBx501Q
> [d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
> [e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533
> 
>>
>> Stefan

Gurchetan Singh April 25, 2023, 12:16 a.m. UTC | #4
On Sat, Apr 22, 2023 at 9:41 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote:
>
> On 2023/04/22 8:54, Gurchetan Singh wrote:
> > On Fri, Apr 21, 2023 at 9:02 AM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> >>
> >> On Thu, 20 Apr 2023 at 21:13, Gurchetan Singh
> >> <gurchetansingh@chromium.org> wrote:
> >>>
> >>> From: Gurchetan Singh <gurchetansingh@google.com>
> >>>
> >>> Rationale:
> >>>
> >>> - gfxstream [a] is good for the Android Emulator/upstream QEMU
> >>>    alignment
> >>> - Wayland passthrough [b] via the cross-domain context type is good
> >>>    for Linux on Linux display virtualization
> >>> - rutabaga_gfx [c] sits on top of gfxstream, cross-domain and even
> >>>    virglrenderer
> >>> - This series ports rutabaga_gfx to QEMU
> >>
> >> What are rutabaga_gfx and gfxstream? Can you explain where they sit in the
> >> stack and how they build on or complement virtio-gpu and
> >> virglrenderer?
> >
> > rutabaga_gfx and gfxstream are both libraries that implement the
> > virtio-gpu protocol.  There's a document available in the Gitlab issue
> > to see where they fit in the stack [a].
> >
> > gfxstream grew out of the Android Emulator's need to virtualize
> > graphics for app developers.  There's a short history of gfxstream in
> > patch 10.  It complements virglrenderer in that it's a bit more
> > cross-platform and targets different use cases -- more detail here
> > [b].  The ultimate goal is to ditch out-of-tree kernel drivers in the
> > Android Emulator and adopt virtio, and porting gfxstream to QEMU would
> > speed up that transition.
>
> I wonder what the motivation is for maintaining gfxstream instead of
> switching to virglrenderer/venus.

gfxstream GLES has features that would require significant redesign to
implement in virgl: multi-threading, live migration, widespread CTS
conformance (virgl only works well on FOSS Linux due to TGSI issues),
and memory management, to name a few.

Re: gfxstream VK and venus, it's a question of minimizing technical
risk.  Going from upstream to a shipping product that works on
MacOS/Windows/Linux means there's always going to be a long tail of
bugs.

The Android Emulator is still on QEMU 2.12 and the update won't be
easy (there are other things that need to be upstreamed besides GPU).
In addition, cross-platform API layering over Vulkan is expected to
take 1+ year, Vulkan doesn't work on many GPUs due to KVM issues [a],
and no Vulkan support at all has landed in upstream QEMU.

Probably the most pragmatic way to do this is to take it step by step,
and align over time by sharing components.  There might be a few
proposals to mesa-dev on that front, but getting upstream QEMU working
is a higher priority right now.  A bulk transition from one stack to
the other would be more difficult to pull off.

The great thing about context types/rutabaga_gfx is that
gfxstream/virglrenderer details are largely hidden from QEMU and
present little maintenance burden.  Yes, a dependency on a new Rust
library is added, but moving towards Rust makes a ton of sense
security-wise long-term anyways.

[a] https://lore.kernel.org/all/20230330085802.2414466-1-stevensd@google.com/
-- even if this patch lands today, users will still need 1-2 years to
update

>
> >
> > rutabaga_gfx is a much smaller Rust library that sits on top of
> > gfxstream and even virglrenderer, but does a few extra things.  It
> > implements the cross-domain context type, which provides Wayland
> > passthrough.  This helps virtio-gpu by providing more modern display
> > virtualization.  For example, Microsoft for WSL2 also uses a similar
> > technique [c], but I believe it is not virtio-based nor open-source.
>
> The guest side components of WSLg are open-source, but the host side is
> not: https://github.com/microsoft/wslg
> It also uses DirectX for acceleration, so it's not really portable
> outside Windows.
>
> > With this, we can have the same open-source Wayland passthrough
> > solution on crosvm, QEMU and even Fuchsia [d].  Also, there might be
> > an additional small Rust context type for security-sensitive use cases
> > in the future -- rutabaga_gfx wouldn't compile its gfxstream bindings
> > (since it's C++ based) in such cases.
> >
> > Both gfxstream and rutabaga_gfx are a part of the virtio spec [e] now too.
> >
> > [a] https://gitlab.com/qemu-project/qemu/-/issues/1611
> > [b] https://lists.gnu.org/archive/html/qemu-devel/2023-03/msg04271.html
> > [c] https://www.youtube.com/watch?v=EkNBsBx501Q
> > [d] https://fuchsia-review.googlesource.com/c/fuchsia/+/778764
> > [e] https://github.com/oasis-tcs/virtio-spec/blob/master/device-types/gpu/description.tex#L533
> >
> >>
> >> Stefan