Message ID | 20240519212712.2605419-1-dmitry.osipenko@collabora.com (mailing list archive) |
---|---|
Series | Support blob memory and venus on qemu |
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> Hello,
>
> This series enables Vulkan Venus context support on virtio-gpu.
>
> All virglrender and almost all Linux kernel prerequisite changes
> needed by Venus are already in upstream. For kernel there is a pending
> KVM patchset that fixes mapping of compound pages needed for DRM drivers
> using TTM [1], otherwise hostmem blob mapping will fail with a KVM error
> from Qemu.
>
> [1] https://lore.kernel.org/kvm/20240229025759.1187910-1-stevensd@google.com/
>
> You'll need to use recent Mesa version containing patch that removes
> dependency on cross-device feature from Venus that isn't supported by
> Qemu [2].
>
> [2] https://gitlab.freedesktop.org/mesa/mesa/-/commit/087e9a96d13155e26987befae78b6ccbb7ae242b
>
> Example Qemu cmdline that enables Venus:
>
>   qemu-system-x86_64 -device virtio-vga-gl,hostmem=4G,blob=true,venus=true \
>       -machine q35,accel=kvm,memory-backend=mem1 \
>       -object memory-backend-memfd,id=mem1,size=8G -m 8G

What is the correct device for non-x86 guests? We have virtio-gpu-gl-pci
but when doing that I get:

  -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true
  qemu-system-aarch64: -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true: opengl is not available

According to 37f86af087 (virtio-gpu: move virgl realize + properties):

  Drop the virgl property, the virtio-gpu-gl-device has virgl enabled no
  matter what. Just use virtio-gpu-device instead if you don't want
  enable virgl and opengl. This simplifies the logic and reduces the test
  matrix.

but that's not a good solution because that needs virtio-mmio and there
are reasons to have a PCI device (for one thing no ambiguity about
discovery).
Alex Bennée <alex.bennee@linaro.org> writes:

> What is the correct device for non-x86 guests? We have virtio-gpu-gl-pci
> but when doing that I get:
>
>   -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true
>   qemu-system-aarch64: -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true: opengl is not available

Oops, my mistake, I forgot:

  --display gtk,gl=on

Although I do see a lot of eglMakeCurrent failures.
On 5/21/24 17:57, Alex Bennée wrote:
> Oops my mistake forgetting:
>
>   --display gtk,gl=on
>
> Although I do see a lot of eglMakeCurrent failures.

Please post the full Qemu cmdline you're using.
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 5/21/24 17:57, Alex Bennée wrote:
>> Although I do see a lot of eglMakeCurrent failures.
>
> Please post the full Qemu cmdline you're using

With:

  ./qemu-system-aarch64 \
      -machine type=virt,virtualization=on,pflash0=rom,pflash1=efivars \
      -cpu neoverse-n1 \
      -smp 4 \
      -accel tcg \
      -device virtio-net-pci,netdev=unet \
      -device virtio-scsi-pci \
      -device scsi-hd,drive=hd \
      -netdev user,id=unet,hostfwd=tcp::2222-:22 \
      -blockdev driver=raw,node-name=hd,file.driver=host_device,file.filename=/dev/zen-ssd2/trixie-arm64,discard=unmap \
      -serial mon:stdio \
      -blockdev node-name=rom,driver=file,filename=(pwd)/pc-bios/edk2-aarch64-code.fd,read-only=true \
      -blockdev node-name=efivars,driver=file,filename=$HOME/images/qemu-arm64-efivars \
      -m 8192 \
      -object memory-backend-memfd,id=mem,size=8G,share=on \
      -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true \
      -display gtk,gl=on,show-cursor=on -vga none \
      -device qemu-xhci -device usb-kbd -device usb-tablet

I get a boot up with a lot of:

  (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
  (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
  (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
  (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed

In the guest I run:

  meson devenv -C /root/lsrc/graphics/mesa.git/build fish

to bring in the latest Mesa (with virtio enabled). Running vulkaninfo
reports two cards:

  ==========
  VULKANINFO
  ==========

  Vulkan Instance Version: 1.3.280

  Instance Extensions: count = 14
  -------------------------------
  VK_EXT_debug_report                    : extension revision 10
  VK_EXT_debug_utils                     : extension revision 2
  VK_EXT_headless_surface                : extension revision 1
  VK_KHR_device_group_creation           : extension revision 1
  VK_KHR_external_fence_capabilities     : extension revision 1
  VK_KHR_external_memory_capabilities    : extension revision 1
  VK_KHR_external_semaphore_capabilities : extension revision 1
  VK_KHR_get_physical_device_properties2 : extension revision 2
  VK_KHR_get_surface_capabilities2       : extension revision 1
  VK_KHR_portability_enumeration         : extension revision 1
  VK_KHR_surface                         : extension revision 25
  VK_KHR_surface_protected_capabilities  : extension revision 1
  VK_KHR_wayland_surface                 : extension revision 6
  VK_LUNARG_direct_driver_loading        : extension revision 1

  Instance Layers: count = 2
  --------------------------
  VK_LAYER_MESA_device_select Linux device selection layer 1.3.211 version 1
  VK_LAYER_MESA_overlay       Mesa Overlay layer           1.3.211 version 1

  Devices:
  ========
  GPU0:
          apiVersion         = 1.3.230
          driverVersion      = 24.1.99
          vendorID           = 0x8086
          deviceID           = 0xa780
          deviceType         = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
          deviceName         = Virtio-GPU Venus (Intel(R) Graphics (RPL-S))
          driverID           = DRIVER_ID_MESA_VENUS
          driverName         = venus
          driverInfo         = Mesa 24.2.0-devel (git-0b582449f0)
          conformanceVersion = 1.3.0.0
          deviceUUID         = 29d2e940-a1a0-3054-0f9a-9f7dec52a084
          driverUUID         = 3694c390-f245-612a-12ce-7d3a99127622
  GPU1:
          apiVersion         = 1.2.0
          driverVersion      = 24.1.99
          vendorID           = 0x10005
          deviceID           = 0x0000
          deviceType         = PHYSICAL_DEVICE_TYPE_CPU
          deviceName         = Virtio-GPU Venus (llvmpipe (LLVM 15.0.6, 256 bits))
          driverID           = DRIVER_ID_MESA_VENUS
          driverName         = venus
          driverInfo         = Mesa 24.2.0-devel (git-0b582449f0)
          conformanceVersion = 1.3.0.0
          deviceUUID         = 5fb5c03f-c537-f0fe-a7e6-9cd5866acb8d
          driverUUID         = 3694c390-f245-612a-12ce-7d3a99127622

Running weston and then vkcube-wayland reports it's selecting "GPU 0:
Virtio-GPU Venus (Intel(R) Graphics (RPL-S))" but otherwise produces no
output.

If I run with "-display sdl,gl=on,show-cursor=on" and otherwise the same
command line options, the results for vulkaninfo are the same. However
vkcube-wayland gets a little further and draws the initial cube on the
screen before locking up with:

  MESA-VIRTIO: debug: stuck in fence wait with iter at xxxx

where xxxx grows each time it prints. On shutting down I see some virgl
errors interspersed with the systemd logs:

  [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [ OK ] Stopped systemd-logind.service - User Login Management.
  virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200
  [ 475.257111] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [ OK ] Stopped target network-online.target - Network is Online.
  [ OK ] Stopped target remote-fs.target - Remote File Systems.
  [ OK ] Stopped NetworkManager-wait-online…vice - Network Manager Wait Online.
  Stopping avahi-daemon.service - Avahi mDNS/DNS-SD Stack...
  Stopping cups.service - CUPS Scheduler...
  Stopping user-runtime-dir@0.servic…er Runtime Directory /run/user/0...
  [ OK ] Stopped avahi-daemon.service - Avahi mDNS/DNS-SD Stack.
  [ OK ] Stopped cups.service - CUPS Scheduler.
  virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200
  [ 475.357543] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [ OK ] Stopped target network.target - Network.
  [ OK ] Stopped target nss-user-lookup.target - User and Group Name Lookups.
  Stopping NetworkManager.service - Network Manager...
  Stopping networking.service - Raise network interfaces...
  Stopping wpa_supplicant.service - WPA supplicant...
  [ OK ] Stopped wpa_supplicant.service - WPA supplicant.
  virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200
  [ 493.585261] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
  [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
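For anyone reading the shutdown errors in the report above, the numeric codes can be looked up in the virtio-gpu protocol header. Below is a minimal sketch in Python, assuming the tables match the enums in the Linux UAPI header include/uapi/linux/virtio_gpu.h; the helper name decode_ctrl_error and the regular expression are made up for illustration and are not part of QEMU, Mesa, or the kernel.

```python
import re

# 3D control commands, assumed to match the enum in
# include/uapi/linux/virtio_gpu.h (consecutive values starting at 0x0200).
CTRL_3D_COMMANDS = {
    0x0200: "VIRTIO_GPU_CMD_CTX_CREATE",
    0x0201: "VIRTIO_GPU_CMD_CTX_DESTROY",
    0x0202: "VIRTIO_GPU_CMD_CTX_ATTACH_RESOURCE",
    0x0203: "VIRTIO_GPU_CMD_CTX_DETACH_RESOURCE",
    0x0204: "VIRTIO_GPU_CMD_RESOURCE_CREATE_3D",
    0x0205: "VIRTIO_GPU_CMD_TRANSFER_TO_HOST_3D",
    0x0206: "VIRTIO_GPU_CMD_TRANSFER_FROM_HOST_3D",
    0x0207: "VIRTIO_GPU_CMD_SUBMIT_3D",
    0x0208: "VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB",
    0x0209: "VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB",
}

# Error responses from the same header (consecutive values from 0x1200).
ERROR_RESPONSES = {
    0x1200: "VIRTIO_GPU_RESP_ERR_UNSPEC",
    0x1201: "VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY",
    0x1202: "VIRTIO_GPU_RESP_ERR_INVALID_SCANOUT_ID",
    0x1203: "VIRTIO_GPU_RESP_ERR_INVALID_RESOURCE_ID",
    0x1204: "VIRTIO_GPU_RESP_ERR_INVALID_CONTEXT_ID",
    0x1205: "VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER",
}

# Matches the guest kernel message format seen in the log above.
_GUEST_LINE = re.compile(
    r"response (0x[0-9a-fA-F]+) \(command (0x[0-9a-fA-F]+)\)"
)


def decode_ctrl_error(line):
    """Return (command_name, response_name) for a guest log line, or None."""
    match = _GUEST_LINE.search(line)
    if match is None:
        return None
    response = int(match.group(1), 16)
    command = int(match.group(2), 16)
    # Fall back to the raw hex value for codes missing from the tables.
    return (
        CTRL_3D_COMMANDS.get(command, hex(command)),
        ERROR_RESPONSES.get(response, hex(response)),
    )


print(decode_ctrl_error(
    "[drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] "
    "*ERROR* response 0x1200 (command 0x209)"
))
```

If those tables are right, the repeated message is the host answering a blob unmap request (command 0x209) with an unspecified error (0x1200), which would point at hostmem blob handling during teardown rather than at command submission.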
On 5/22/24 12:00, Alex Bennée wrote:
> If I run with "-display sdl,gl=on,show-cursor=on" and the same other
> command line options the results for vulkaninfo are the same. However
> vkcube-wayland gets a little further and draws the initial cube on the
> screen before locking up with:
>
>   MESA-VIRTIO: debug: stuck in fence wait with iter at xxxx
>
> where xxxx grows each time it prints.

I've reproduced this with qemu-system-aarch64. Vkcube works for a second
and then stops, and Qemu freezes completely after closing and re-running
vkcube. This doesn't feel like a problem with venus, but rather with arm64.
For now I don't know where the bug is; I will take a closer look.
On 5/22/24 12:00, Alex Bennée wrote:
> I get a boot up with a lot of:
>
>   (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
>   (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
>   (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed
>   (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed

I have the same problem with GTK and arm64/UEFI. Something is resetting
the virtio-gpu device during boot (maybe the UEFI firmware) and it doesn't
work properly with GTK. I'd expect x86 to have the same issue, but I don't
recall x86 having it.
On 5/27/24 02:46, Dmitry Osipenko wrote:
> I've reproduced this with qemu-system-aarch64. Vkcube works for a second
> and then stops, Qemu completely gets frozen after closing and re-running
> vkcube. Doesn't feel like this is a problem with venus, but with arm64.
> For now don't know where is the bug, will take a closer look.

Interestingly, on another try vkcube now works with no issues.
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes: > On 5/22/24 12:00, Alex Bennée wrote: >> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes: >> >>> On 5/21/24 17:57, Alex Bennée wrote: >>>> Alex Bennée <alex.bennee@linaro.org> writes: >>>> >>>>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes: >>>>> >>>>>> Hello, >>>>>> >>>>>> This series enables Vulkan Venus context support on virtio-gpu. >>>>>> >>>>>> All virglrender and almost all Linux kernel prerequisite changes >>>>>> needed by Venus are already in upstream. For kernel there is a pending >>>>>> KVM patchset that fixes mapping of compound pages needed for DRM drivers >>>>>> using TTM [1], othewrwise hostmem blob mapping will fail with a KVM error >>>>>> from Qemu. >>>>>> >>>>>> [1] https://lore.kernel.org/kvm/20240229025759.1187910-1-stevensd@google.com/ >>>>>> >>>>>> You'll need to use recent Mesa version containing patch that removes >>>>>> dependency on cross-device feature from Venus that isn't supported by >>>>>> Qemu [2]. >>>>>> >>>>>> [2] https://gitlab.freedesktop.org/mesa/mesa/-/commit/087e9a96d13155e26987befae78b6ccbb7ae242b >>>>>> >>>>>> Example Qemu cmdline that enables Venus: >>>>>> >>>>>> qemu-system-x86_64 -device virtio-vga-gl,hostmem=4G,blob=true,venus=true \ >>>>>> -machine q35,accel=kvm,memory-backend=mem1 \ >>>>>> -object memory-backend-memfd,id=mem1,size=8G -m 8G >>>>> >>>>> What is the correct device for non-x86 guests? We have virtio-gpu-gl-pci >>>>> but when doing that I get: >>>>> >>>>> -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true >>>>> qemu-system-aarch64: -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true: opengl is not available >>>>> >>>>> According to 37f86af087 (virtio-gpu: move virgl realize + properties): >>>>> >>>>> Drop the virgl property, the virtio-gpu-gl-device has virgl enabled no >>>>> matter what. Just use virtio-gpu-device instead if you don't want >>>>> enable virgl and opengl. 
This simplifies the logic and reduces the test >>>>> matrix. >>>>> >>>>> but that's not a good solution because that needs virtio-mmio and there >>>>> are reasons to have a PCI device (for one thing no ambiguity about >>>>> discovery). >>>> >>>> Oops my mistake forgetting: >>>> >>>> --display gtk,gl=on >>>> >>>> Although I do see a lot of eglMakeContext failures. >>> >>> Please post the full Qemu cmdline you're using >> >> With: >> >> ./qemu-system-aarch64 \ >> -machine type=virt,virtualization=on,pflash0=rom,pflash1=efivars \ >> -cpu neoverse-n1 \ >> -smp 4 \ >> -accel tcg \ >> -device virtio-net-pci,netdev=unet \ >> -device virtio-scsi-pci \ >> -device scsi-hd,drive=hd \ >> -netdev user,id=unet,hostfwd=tcp::2222-:22 \ >> -blockdev driver=raw,node-name=hd,file.driver=host_device,file.filename=/dev/zen-ssd2/trixie-arm64,discard=unmap \ >> -serial mon:stdio \ >> -blockdev node-name=rom,driver=file,filename=(pwd)/pc-bios/edk2-aarch64-code.fd,read-only=true \ >> -blockdev node-name=efivars,driver=file,filename=$HOME/images/qemu-arm64-efivars \ >> -m 8192 \ >> -object memory-backend-memfd,id=mem,size=8G,share=on \ >> -device virtio-gpu-gl-pci,hostmem=4G,blob=true,venus=true \ >> -display gtk,gl=on,show-cursor=on -vga none \ >> -device qemu-xhci -device usb-kbd -device usb-tablet >> >> I get a boot up with a lot of: >> >> >> (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed >> >> (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed >> >> (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed >> >> (qemu:1545322): Gdk-WARNING **: 09:26:09.470: eglMakeCurrent failed >> >> In the guest I run: >> >> meson devenv -C /root/lsrc/graphics/mesa.git/build fish >> >> to bring in the latest Mesa (with virtio enabled). 
Running vulkaninfo >> reports two cards: >> >> ========== >> VULKANINFO >> ========== >> >> Vulkan Instance Version: 1.3.280 >> >> >> Instance Extensions: count = 14 >> ------------------------------- >> VK_EXT_debug_report : extension revision 10 >> VK_EXT_debug_utils : extension revision 2 >> VK_EXT_headless_surface : extension revision 1 >> VK_KHR_device_group_creation : extension revision 1 >> VK_KHR_external_fence_capabilities : extension revision 1 >> VK_KHR_external_memory_capabilities : extension revision 1 >> VK_KHR_external_semaphore_capabilities : extension revision 1 >> VK_KHR_get_physical_device_properties2 : extension revision 2 >> VK_KHR_get_surface_capabilities2 : extension revision 1 >> VK_KHR_portability_enumeration : extension revision 1 >> VK_KHR_surface : extension revision 25 >> VK_KHR_surface_protected_capabilities : extension revision 1 >> VK_KHR_wayland_surface : extension revision 6 >> VK_LUNARG_direct_driver_loading : extension revision 1 >> >> Instance Layers: count = 2 >> -------------------------- >> VK_LAYER_MESA_device_select Linux device selection layer 1.3.211 version 1 >> VK_LAYER_MESA_overlay Mesa Overlay layer 1.3.211 >> version 1 >> >> Devices: >> ======== >> GPU0: >> apiVersion = 1.3.230 >> driverVersion = 24.1.99 >> vendorID = 0x8086 >> deviceID = 0xa780 >> deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU >> deviceName = Virtio-GPU Venus (Intel(R) Graphics (RPL-S)) >> driverID = DRIVER_ID_MESA_VENUS >> driverName = venus >> driverInfo = Mesa 24.2.0-devel (git-0b582449f0) >> conformanceVersion = 1.3.0.0 >> deviceUUID = 29d2e940-a1a0-3054-0f9a-9f7dec52a084 >> driverUUID = 3694c390-f245-612a-12ce-7d3a99127622 >> GPU1: >> apiVersion = 1.2.0 >> driverVersion = 24.1.99 >> vendorID = 0x10005 >> deviceID = 0x0000 >> deviceType = PHYSICAL_DEVICE_TYPE_CPU >> deviceName = Virtio-GPU Venus (llvmpipe (LLVM 15.0.6, 256 bits)) >> driverID = DRIVER_ID_MESA_VENUS >> driverName = venus >> driverInfo = Mesa 24.2.0-devel (git-0b582449f0) >> 
conformanceVersion = 1.3.0.0 >> deviceUUID = 5fb5c03f-c537-f0fe-a7e6-9cd5866acb8d >> driverUUID = 3694c390-f245-612a-12ce-7d3a99127622 >> >> Running weston and then vkcube-wayland reports its selecting "GPU 0: >> Virtio-GPU Venus (Intel(R) Graphics (RPL-S))" but otherwise produces no >> output. >> >> If I run with "-display sdl,gl=on,show-cursor=on" and the same other >> command line options the results for vulkaninfo are the same. However >> vkcube-wayland gets a little further and draws the initial cube on the >> screen before locking up with: >> >> MESA-VIRTIO: debug: stuck in fence wait with iter at xxxx >> >> where xxxx grows each time it prints. On shutting down I see some virgl >> errors interspersed with the systemd logs: >> >> [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209) >> [ OK ] Stopped systemd-logind.service - User Login Management. >> virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200 >> [ 475.257111] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209) >> [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209) >> [ OK ] Stopped target network-online.target - Network is Online. >> [ OK ] Stopped target remote-fs.target - Remote File Systems. >> [ OK ] Stopped NetworkManager-wait-online…vice - Network Manager Wait Online. >> Stopping avahi-daemon.service - Avahi mDNS/DNS-SD Stack... >> Stopping cups.service - CUPS Scheduler... >> Stopping user-runtime-dir@0.servic…er Runtime Directory /run/user/0... >> [ OK ] Stopped avahi-daemon.service - Avahi mDNS/DNS-SD Stack. >> [ OK ] Stopped cups.service - CUPS Scheduler. >> virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200 >> [ 475.357543] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209) >> [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209) >> [ OK ] Stopped target network.target - Network. 
>> [ OK ] Stopped target nss-user-lookup.target - User and Group Name Lookups.
>> Stopping NetworkManager.service - Network Manager...
>> Stopping networking.service - Raise network interfaces...
>> Stopping wpa_supplicant.service - WPA supplicant...
>> [ OK ] Stopped wpa_supplicant.service - WPA supplicant.
>> virtio_gpu_virgl_process_cmd: ctrl 0x209, error 0x1200
>> [ 493.585261] [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
>> [drm:virtio_gpu_dequeue_ctrl_func [virtio_gpu]] *ERROR* response 0x1200 (command 0x209)
>>
>
> I've reproduced this with qemu-system-aarch64. Vkcube works for a second
> and then stops, Qemu compeltely gets frozen after closing and re-running
> vkcube. Doesn't feel like this is a problem with venus, but with arm64.
> For now don't know where is the bug, will take a closer look.

I'm guessing some sort of resource leak: if I run vkcube-wayland in the
guest it complains about being stuck on a fence, with the iterator going
up. However, on the host I see:

  virtio_gpu_fence_ctrl fence 0x13f1, type 0x207
  virtio_gpu_fence_ctrl fence 0x13f2, type 0x207
  virtio_gpu_fence_resp fence 0x13f1
  virtio_gpu_fence_resp fence 0x13f2
  virtio_gpu_fence_ctrl fence 0x13f3, type 0x207
  virtio_gpu_fence_ctrl fence 0x13f4, type 0x207
  virtio_gpu_fence_resp fence 0x13f3
  virtio_gpu_fence_resp fence 0x13f4
  virtio_gpu_fence_ctrl fence 0x13f5, type 0x207
  virtio_gpu_fence_ctrl fence 0x13f6, type 0x207
  virtio_gpu_fence_resp fence 0x13f5
  virtio_gpu_fence_resp fence 0x13f6
  virtio_gpu_fence_ctrl fence 0x13f7, type 0x207
  virtio_gpu_fence_ctrl fence 0x13f8, type 0x207
  virtio_gpu_fence_resp fence 0x13f7
  virtio_gpu_fence_resp fence 0x13f8
  virtio_gpu_fence_ctrl fence 0x13f9, type 0x204
  virtio_gpu_fence_resp fence 0x13f9

which looks like it's going OK.
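For a longer trace, one way to check that every submitted fence was answered is to pair the ctrl and resp lines mechanically. This is only a minimal Python sketch, assuming the trace-event format quoted above; the name unanswered_fences is made up for illustration and is not part of QEMU:

```python
import re

# Matches the two trace-line shapes quoted above, e.g.
#   virtio_gpu_fence_ctrl fence 0x13f1, type 0x207
#   virtio_gpu_fence_resp fence 0x13f1
FENCE_LINE = re.compile(r"virtio_gpu_fence_(ctrl|resp) fence (0x[0-9a-fA-F]+)")


def unanswered_fences(trace_text):
    """Return fence ids seen in a ctrl line but never in a resp line, sorted."""
    pending = set()
    for kind, fence in FENCE_LINE.findall(trace_text):
        fence_id = int(fence, 16)
        if kind == "ctrl":
            pending.add(fence_id)
        else:
            pending.discard(fence_id)
    return sorted(pending)


sample = """\
virtio_gpu_fence_ctrl fence 0x13f7, type 0x207
virtio_gpu_fence_ctrl fence 0x13f8, type 0x207
virtio_gpu_fence_resp fence 0x13f7
virtio_gpu_fence_resp fence 0x13f8
virtio_gpu_fence_ctrl fence 0x13f9, type 0x204
virtio_gpu_fence_resp fence 0x13f9
"""
print(unanswered_fences(sample))  # prints []
```

An empty list means every fence in the captured window got a response on the host side, so a guest that is still stuck in a fence wait is presumably waiting on something these trace points do not show.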
However, when I hit Ctrl-C in the guest it kills QEMU:

  virtio_gpu_fence_ctrl fence 0x13fc, type 0x207
  virtio_gpu_fence_ctrl fence 0x13fd, type 0x207
  virtio_gpu_fence_ctrl fence 0x13fe, type 0x204
  virtio_gpu_fence_ctrl fence 0x13ff, type 0x207
  virtio_gpu_fence_ctrl fence 0x1400, type 0x207
  virtio_gpu_fence_resp fence 0x13fc
  virtio_gpu_fence_resp fence 0x13fd
  virtio_gpu_fence_resp fence 0x13fe
  virtio_gpu_fence_resp fence 0x13ff
  virtio_gpu_fence_resp fence 0x1400
  qemu-system-aarch64: ../../subprojects/virglrenderer/src/virglrenderer.c:1282: virgl_renderer_resource_unmap: Assertion `!ret' failed.
  fish: Job 1, './qemu-system-aarch64 \' terminated by signal -machine type=virt,virtuali… ( -cpu neoverse-n1 \)
  fish: Job -smp 4 \, ' -accel tcg \' terminated by signal -device virtio-net-pci,netd… ( -device virtio-scsi-pci \)
  fish: Job -device scsi-hd,drive=hd \, ' -netdev user,id=unet,hostfw…' terminated by signal -blockdev driver=raw,node-n… ( -serial mon:stdio \)
  fish: Job -blockdev node-name=rom,dri…, ' -blockdev node-name=efivars…' terminated by signal -m 8192 \ ( -object memory-backend-memf…)
  fish: Job -device virtio-gpu-gl-pci,h…, ' -display sdl,gl=on,show-cur…' terminated by signal -device qemu-xhci -device u… ( -kernel /home/alex/lsrc/lin…)
  fish: Job -d guest_errors,unimp,trace…, 'SIGABRT' terminated by signal Abort ()

The backtrace (and the 18G size of the core file!)
indicates a leak:

  (gdb) bt
  #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
  #1 0x00007f0fa68a9e8f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
  #2 0x00007f0fa685afb2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
  #3 0x00007f0fa6845472 in __GI_abort () at ./stdlib/abort.c:79
  #4 0x00007f0fa6845395 in __assert_fail_base (fmt=0x7f0fa69b9a90 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=assertion@entry=0x55c3e1b0762d "!ret", file=file@entry=0x55c3e1d306f0 "../../subprojects/virglrenderer/src/virglrenderer.c", line=line@entry=1282, function=function@entry=0x55c3e1d30910 <__PRETTY_FUNCTION__.2> "virgl_renderer_resource_unmap") at ./assert/assert.c:92
  #5 0x00007f0fa6853eb2 in __GI___assert_fail (assertion=assertion@entry=0x55c3e1b0762d "!ret", file=file@entry=0x55c3e1d306f0 "../../subprojects/virglrenderer/src/virglrenderer.c", line=line@entry=1282, function=function@entry=0x55c3e1d30910 <__PRETTY_FUNCTION__.2> "virgl_renderer_resource_unmap") at ./assert/assert.c:101
  #6 0x000055c3e1958b50 in virgl_renderer_resource_unmap (res_handle=<optimized out>) at ../../subprojects/virglrenderer/src/virglrenderer.c:1282
  #7 0x000055c3e13d8507 in virtio_gpu_virgl_unmap_resource_blob (g=g@entry=0x55c3e5fed600, res=0x55c3e6e67b60, cmd_suspended=cmd_suspended@entry=0x7ffd5d720aaf) at ../../hw/display/virtio-gpu-virgl.c:188
  #8 0x000055c3e13d9af4 in virgl_cmd_resource_unmap_blob (cmd_suspended=0x7ffd5d720aaf, cmd=0x55c3e5bd9710, g=0x55c3e5fed600) at ../../hw/display/virtio-gpu-virgl.c:797
  #9 virtio_gpu_virgl_process_cmd (g=0x55c3e5fed600, cmd=0x55c3e5bd9710) at ../../hw/display/virtio-gpu-virgl.c:979
  #10 0x000055c3e13d6019 in virtio_gpu_process_cmdq (g=0x55c3e5fed600) at ../../hw/display/virtio-gpu.c:1055
  #11 0x000055c3e190c646 in aio_bh_poll (ctx=ctx@entry=0x55c3e4c03710) at ../../util/async.c:218
  #12 0x000055c3e18f562e in aio_dispatch (ctx=0x55c3e4c03710) at ../../util/aio-posix.c:423
  #13 0x000055c3e190c2ce in aio_ctx_dispatch (source=<optimized out>, callback=<optimized out>, user_data=<optimized out>) at ../../util/async.c:360
  #14 0x00007f0fa8b047a9 in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
  #15 0x000055c3e190db78 in glib_pollfds_poll () at ../../util/main-loop.c:287
  #16 os_host_main_loop_wait (timeout=1882878) at ../../util/main-loop.c:310
  #17 main_loop_wait (nonblocking=nonblocking@entry=0) at ../../util/main-loop.c:589
  #18 0x000055c3e1348ac9 in qemu_main_loop () at ../../system/runstate.c:796
  #19 0x000055c3e174f786 in qemu_default_main () at ../../system/main.c:37
  #20 0x00007f0fa684624a in __libc_start_call_main (main=main@entry=0x55c3e10286e0 <main>, argc=argc@entry=47, argv=argv@entry=0x7ffd5d720f18) at ../sysdeps/nptl/libc_start_call_main.h:58
  #21 0x00007f0fa6846305 in __libc_start_main_impl (main=0x55c3e10286e0 <main>, argc=47, argv=0x7ffd5d720f18, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd5d720f08) at ../csu/libc-start.c:360
  #22 0x000055c3e102a3f1 in _start ()
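The numeric codes in the logs and traces in this thread (type 0x204 and 0x207, command 0x209, response 0x1200) can be read with a small lookup table. The sketch below is illustrative only: the values are taken to match the Linux UAPI header virtio_gpu.h, which is an assumption to verify against your own copy, and the describe() helper is hypothetical.

```python
# Assumed to match include/uapi/linux/virtio_gpu.h; only the codes that
# appear in this thread (plus their close neighbours) are listed.
VIRTIO_GPU_CODES = {
    0x0204: "VIRTIO_GPU_CMD_RESOURCE_CREATE_3D",
    0x0207: "VIRTIO_GPU_CMD_SUBMIT_3D",
    0x0208: "VIRTIO_GPU_CMD_RESOURCE_MAP_BLOB",
    0x0209: "VIRTIO_GPU_CMD_RESOURCE_UNMAP_BLOB",
    0x1100: "VIRTIO_GPU_RESP_OK_NODATA",
    0x1200: "VIRTIO_GPU_RESP_ERR_UNSPEC",
    0x1201: "VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY",
}


def describe(code):
    """Return the symbolic name for a known code, else a hex placeholder."""
    return VIRTIO_GPU_CODES.get(code, f"unknown(0x{code:04x})")


for code in (0x207, 0x204, 0x209, 0x1200):
    print(f"0x{code:04x} -> {describe(code)}")
```

Under that assumption, "response 0x1200 (command 0x209)" reads as a RESOURCE_UNMAP_BLOB command failing with an unspecified error, which lines up with virgl_cmd_resource_unmap_blob appearing in frame #8 of the backtrace.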
On 6/5/24 17:47, Alex Bennée wrote: .... > I'm guessing some sort of resource leak, if I run vkcube-wayland in the > guest it complains about being stuck on a fence with the iterator going > up. However on the host I see: > > virtio_gpu_fence_ctrl fence 0x13f1, type 0x207 > virtio_gpu_fence_ctrl fence 0x13f2, type 0x207 > virtio_gpu_fence_resp fence 0x13f1 > virtio_gpu_fence_resp fence 0x13f2 > virtio_gpu_fence_ctrl fence 0x13f3, type 0x207 > virtio_gpu_fence_ctrl fence 0x13f4, type 0x207 > virtio_gpu_fence_resp fence 0x13f3 > virtio_gpu_fence_resp fence 0x13f4 > virtio_gpu_fence_ctrl fence 0x13f5, type 0x207 > virtio_gpu_fence_ctrl fence 0x13f6, type 0x207 > virtio_gpu_fence_resp fence 0x13f5 > virtio_gpu_fence_resp fence 0x13f6 > virtio_gpu_fence_ctrl fence 0x13f7, type 0x207 > virtio_gpu_fence_ctrl fence 0x13f8, type 0x207 > virtio_gpu_fence_resp fence 0x13f7 > virtio_gpu_fence_resp fence 0x13f8 > virtio_gpu_fence_ctrl fence 0x13f9, type 0x204 > virtio_gpu_fence_resp fence 0x13f9 > > which looks like its going ok. However when I git Ctrl-C in the guest it > kills QEMU: > > virtio_gpu_fence_ctrl fence 0x13fc, type 0x207 > virtio_gpu_fence_ctrl fence 0x13fd, type 0x207 > virtio_gpu_fence_ctrl fence 0x13fe, type 0x204 > virtio_gpu_fence_ctrl fence 0x13ff, type 0x207 > virtio_gpu_fence_ctrl fence 0x1400, type 0x207 > virtio_gpu_fence_resp fence 0x13fc > virtio_gpu_fence_resp fence 0x13fd > virtio_gpu_fence_resp fence 0x13fe > virtio_gpu_fence_resp fence 0x13ff > virtio_gpu_fence_resp fence 0x1400 > qemu-system-aarch64: ../../subprojects/virglrenderer/src/virglrenderer.c:1282: virgl_renderer_resource_unmap: Assertion `!ret' failed. 
> fish: Job 1, './qemu-system-aarch64 \' terminated by signal -machine type=virt,virtuali… ( -cpu neoverse-n1 \)
> fish: Job -smp 4 \, ' -accel tcg \' terminated by signal -device virtio-net-pci,netd… ( -device virtio-scsi-pci \)
> fish: Job -device scsi-hd,drive=hd \, ' -netdev user,id=unet,hostfw…' terminated by signal -blockdev driver=raw,node-n… ( -serial mon:stdio \)
> fish: Job -blockdev node-name=rom,dri…, ' -blockdev node-name=efivars…' terminated by signal -m 8192 \ ( -object memory-backend-memf…)
> fish: Job -device virtio-gpu-gl-pci,h…, ' -display sdl,gl=on,show-cur…' terminated by signal -device qemu-xhci -device u… ( -kernel /home/alex/lsrc/lin…)
> fish: Job -d guest_errors,unimp,trace…, 'SIGABRT' terminated by signal Abort ()
>
> The backtrace (and the 18G size of the core file!) indicates a leak:

The debug assert in unmap says that the BO wasn't mapped because the
mapping failed, likely due to OOM. You won't hit this abort with a
release build of libvirglrenderer.

The leak is likely caused by an unsignalled fence.

Please try running vkcube with the fence-feedback feature disabled:

  # VN_PERF_NO_FENCE_FEEDBACK=1 vkcube-wayland

It fixes the hang for me. We have had problems with the combination of
this Venus optimization feature and the Intel ANV driver for a long time
and hoped they were fixed by now, but apparently the issue was only masked.
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 6/5/24 17:47, Alex Bennée wrote:
> ....
>> I'm guessing some sort of resource leak, if I run vkcube-wayland in the
>> guest it complains about being stuck on a fence with the iterator going
>> up. However on the host I see:
>>
<snip>
>>
>> The backtrace (and the 18G size of the core file!) indicates a leak:
>
> The unmap debug-assert tells that BO wasn't mapped because mapping
> failed, likely due to OOM. You won't hit this abort with a release build
> of libvirglrenderer.

AFAIK I should be building a release build (or at least I hope that is
what the wrapper I posted does):

  Message-Id: <20240605133527.529950-1-alex.bennee@linaro.org>
  Date: Wed, 5 Jun 2024 14:35:27 +0100
  Subject: [RFC PATCH] subprojects: add a wrapper for libvirglrenderer
  From: Alex Bennée <alex.bennee@linaro.org>

Maybe I need to explicitly set buildtype=release in the default options?

> The leak likely happens due to unsignalled fence.
>
> Please try to run vkcube with disabled fence-feedback feature:
>
> # VN_PERF_NO_FENCE_FEEDBACK=1 vkcube-wayland
>
> It fixes hang for me. We had problems with combination of this Venus
> optimization feature + Intel ANV driver for a long time and hoped that
> it's fixed by now, apparently the issue was only masked.
That doesn't help; it still causes the crash:

  virtio_gpu_fence_ctrl fence 0xdfd, type 0x204
  virtio_gpu_fence_ctrl fence 0xdfe, type 0x207
  virtio_gpu_fence_ctrl fence 0xdff, type 0x207
  virtio_gpu_fence_ctrl fence 0xe00, type 0x207
  virtio_gpu_fence_ctrl fence 0xe01, type 0x207
  virtio_gpu_fence_ctrl fence 0xe02, type 0x207
  virtio_gpu_fence_ctrl fence 0xe03, type 0x207
  virtio_gpu_fence_resp fence 0xdfd
  virtio_gpu_fence_resp fence 0xdfe
  virtio_gpu_fence_resp fence 0xdff
  virtio_gpu_fence_resp fence 0xe00
  virtio_gpu_fence_resp fence 0xe01
  virtio_gpu_fence_resp fence 0xe02
  virtio_gpu_fence_resp fence 0xe03
  stats: vq req 100, 7 -- 3D 25 (19560)
  vrend_renderer_resource_unmap: invalid bits 0x83
  virgl_renderer_resource_unmap: unexpected ret = -22, pipe:0x555559e5d0c0 fd_type:0

  Thread 1 "qemu-system-aar" received signal SIGABRT, Aborted.
  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
  44 ./nptl/pthread_kill.c: No such file or directory.

I think 0x83 means VREND_STORAGE_GL_MEMOBJ | VREND_STORAGE_GL_TEXTURE |
VREND_STORAGE_GUEST_MEMORY (I note that the argument order of has_bits is
meant to be mask, bit, but I don't think that makes any difference).
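The reading of "invalid bits 0x83" above can be spelled out as a bit decode. This is a rough Python sketch, not virglrenderer code: the bit positions are an assumption chosen to agree with the three flags named above (0x83 == 0x80 | 0x02 | 0x01) and should be checked against the virglrenderer headers; decode_storage_bits is a made-up helper name.

```python
import errno

# Assumed bit positions, consistent with the reading of 0x83 given above;
# verify against the VREND_STORAGE_* definitions in virglrenderer.
ASSUMED_STORAGE_BITS = {
    0x01: "VREND_STORAGE_GUEST_MEMORY",
    0x02: "VREND_STORAGE_GL_TEXTURE",
    0x80: "VREND_STORAGE_GL_MEMOBJ",
}


def decode_storage_bits(value):
    """Name each assumed bit set in value, highest bit first; flag the rest."""
    names = []
    leftover = value
    for bit in sorted(ASSUMED_STORAGE_BITS, reverse=True):
        if value & bit:
            names.append(ASSUMED_STORAGE_BITS[bit])
            leftover &= ~bit
    if leftover:
        names.append(f"unknown(0x{leftover:x})")
    return names


print(" | ".join(decode_storage_bits(0x83)))
# errno name for the "unexpected ret = -22" in the log
print(errno.errorcode.get(22))
```

On Linux, errno 22 is EINVAL, so "unexpected ret = -22" is the unmap path rejecting the resource as invalid rather than reporting an out-of-memory condition.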