From patchwork Tue Nov 26 03:13:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Kasireddy
X-Patchwork-Id: 13885465
From: Vivek Kasireddy
To: dri-devel@lists.freedesktop.org
Cc: Vivek Kasireddy, Gerd Hoffmann, Dongwon Kim, Simona Vetter,
 Christian Koenig, Dmitry Osipenko, Rob Clark, Gurchetan Singh,
 Chia-I Wu
Subject: [PATCH v5 0/5] drm/virtio: Import scanout buffers from other devices
Date: Mon, 25 Nov 2024 19:13:41 -0800
Message-ID: <20241126031643.3490496-1-vivek.kasireddy@intel.com>
X-Mailer: git-send-email 2.45.1
List-Id: Direct Rendering Infrastructure - Development

Having virtio-gpu import scanout buffers (via prime) from other
devices means that we'd be adding a head to headless GPUs assigned
to a Guest VM or additional heads to regular GPU devices that are
passthrough'd to the Guest. In these cases, the Guest compositor
can render into the scanout buffer using a primary GPU and have the
secondary GPU (virtio-gpu) import it for display purposes.

The main advantage with this is that the imported scanout buffer can
either be displayed locally on the Host (e.g., using Qemu + GTK UI)
or encoded and streamed to a remote client (e.g., Qemu + Spice UI).
Note that since Qemu uses the udmabuf driver, no copies of the
scanout buffer are made as it is displayed. This should be possible
even when the buffer resides in device memory such as VRAM.

The specific use-case that can be supported with this series is when
running Weston or other guest compositors with the
"additional-devices" feature
(./weston --drm-device=card1 --additional-devices=card0).
More info about this feature can be found at:
https://gitlab.freedesktop.org/wayland/weston/-/merge_requests/736

In the above scenario, card1 could be a dGPU or an iGPU and card0
would be virtio-gpu in KMS only mode. However, the case where this
patch series could be particularly useful is when card1 is a GPU VF
that needs to share its scanout buffer (in a zero-copy way) with the
GPU PF on the Host. It can also be useful when the scanout buffer
needs to be shared between any two GPU devices (assuming one of them
is assigned to a Guest VM) as long as they are P2P DMA compatible.

As part of the import, the virtio-gpu driver shares the DMA addresses
and lengths with Qemu, which then determines whether the memory region
they belong to is owned by a VFIO device or is part of the Guest's
system RAM. If it is the former, Qemu can use the
VFIO_DEVICE_FEATURE_DMA_BUF feature flag while invoking the ioctl
against the VFIO device fd and get a dmabuf fd in return. In the
latter case, Qemu obtains the dmabuf fd using the udmabuf driver.

Note that the virtio-gpu driver registers a move_notify() callback
to track location changes associated with the scanout buffer and
sends attach/detach backing cmds to Qemu when appropriate.
Synchronization (that is, ensuring that Guest and Host are not using
the scanout buffer at the same time) is ensured by pinning/unpinning
the dmabuf as part of prepare/cleanup fb and by using a fence in the
resource_flush cmd.
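
The ownership check described above (deciding, from the shared DMA
addresses and lengths, whether a buffer is backed by a VFIO device or
by Guest system RAM) can be illustrated with a small userspace sketch.
This is not Qemu's code: the types, the names classify_entry() and
classify_buffer(), and the rule that a buffer spanning more than one
kind of region is rejected are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification result; these names are not Qemu's. */
enum backing_kind {
    BACKING_UNKNOWN,     /* not fully inside one known region, or mixed */
    BACKING_SYSTEM_RAM,  /* candidate for export through udmabuf */
    BACKING_VFIO_DEVICE  /* candidate for VFIO_DEVICE_FEATURE_DMA_BUF */
};

/* Hypothetical description of one guest-physical memory region. */
struct mem_region {
    uint64_t base;
    uint64_t size;
    enum backing_kind kind;
};

/* One (address, length) pair as shared by the guest driver. */
struct dma_entry {
    uint64_t addr;
    uint64_t len;
};

/* Return the kind of the region that fully contains the entry. */
enum backing_kind classify_entry(const struct mem_region *regions,
                                 size_t nregions,
                                 const struct dma_entry *e)
{
    for (size_t i = 0; i < nregions; i++) {
        const struct mem_region *r = &regions[i];

        /* Containment test written so that it cannot overflow. */
        if (e->len != 0 && e->len <= r->size && e->addr >= r->base &&
            e->addr - r->base <= r->size - e->len)
            return r->kind;
    }
    return BACKING_UNKNOWN;
}

/*
 * Classify a whole buffer. Assumption: every entry must resolve to
 * the same kind of backing, otherwise the buffer is rejected.
 */
enum backing_kind classify_buffer(const struct mem_region *regions,
                                  size_t nregions,
                                  const struct dma_entry *entries,
                                  size_t nentries)
{
    assert(regions != NULL || nregions == 0);

    if (entries == NULL || nentries == 0)
        return BACKING_UNKNOWN;

    enum backing_kind kind = classify_entry(regions, nregions, &entries[0]);

    for (size_t i = 1; i < nentries; i++) {
        if (classify_entry(regions, nregions, &entries[i]) != kind)
            return BACKING_UNKNOWN;
    }
    return kind;
}
```

In this sketch a caller would build the region table from whatever
memory map it already tracks and then pick the export path (VFIO
dmabuf feature or udmabuf) based on the returned kind.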

Changelog:

v4 -> v5 (changes suggested by Dmitry):
- Replace the variable detached with attached and use it in
  virtio_gpu_object_attach/detach to track a BO's backing
- Use the unlocked version of dma_buf_unmap_attachment() to avoid
  having to hold dma resv lock while freeing the object

v3 -> v4 (changes suggested by Dmitry):
- Change the return type of virtgpu_dma_buf_import_sgt() from long
  to int
- Add missing virtio_gpu_detach_object_fenced() while trying to free
  the obj in virtgpu_dma_buf_free_obj()
- Remove the extra newline added at the end of the file in patch 4

v2 -> v3:
- Rebase on 6.12

v1 -> v2:
- Use a fenced version of VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING cmd
  (Dmitry)

RFC -> v1:
- Use virtio_gpu_cleanup_object() to clean up the imported obj
- Do pin/unpin as part of prepare and cleanup fb for the imported
  dmabuf obj instead of doing it as part of plane update
- Tested with gnome-shell/mutter (wayland backend)

Patchset overview:

Patch 1:   Implement VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING cmd
Patch 2-3: Helpers to initialize, import, free imported object
Patch 4-5: Import and use buffers from other devices for scanout

This series is tested using the following method:
- Run Qemu with the following relevant options:
  qemu-system-x86_64 -m 4096m ....
  -device vfio-pci,host=0000:03:00.0
  -device virtio-vga,max_outputs=1,blob=true,xres=1920,yres=1080
  -display gtk,gl=on
  -object memory-backend-memfd,id=mem1,size=4096M
  -machine memory-backend=mem1 ...
- Run upstream Weston with the following options in the Guest VM:
  ./weston --drm-device=card1 --additional-devices=card0
- Or run Gnome-shell/Mutter (wayland backend) with this additional
  patch:
  https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/3745
  XDG_SESSION_TYPE=wayland dbus-run-session -- /usr/bin/gnome-shell --wayland --no-x11

where card1 is a DG2 dGPU (passthrough'd and using the i915 driver in
the Guest VM), card0 is virtio-gpu and the Host is using an RPL iGPU.

Cc: Gerd Hoffmann
Cc: Dongwon Kim
Cc: Simona Vetter
Cc: Christian Koenig
Cc: Dmitry Osipenko
Cc: Rob Clark
Cc: Gurchetan Singh
Cc: Chia-I Wu

Vivek Kasireddy (5):
  drm/virtio: Implement VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING cmd
  drm/virtio: Add a helper to map and note the dma addrs and lengths
  drm/virtio: Add helpers to initialize and free the imported object
  drm/virtio: Import prime buffers from other devices as guest blobs
  drm/virtio: Add prepare and cleanup routines for imported dmabuf obj

 drivers/gpu/drm/virtio/virtgpu_drv.h    |  10 ++
 drivers/gpu/drm/virtio/virtgpu_object.c |  24 ++++
 drivers/gpu/drm/virtio/virtgpu_plane.c  |  65 ++++++++-
 drivers/gpu/drm/virtio/virtgpu_prime.c  | 173 +++++++++++++++++++++++-
 drivers/gpu/drm/virtio/virtgpu_vq.c     |  35 +++++
 5 files changed, 305 insertions(+), 2 deletions(-)