From patchwork Sun Apr 16 11:52:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Osipenko X-Patchwork-Id: 13212874 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3A85DC77B7A for ; Sun, 16 Apr 2023 11:53:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5DD1A10E0C1; Sun, 16 Apr 2023 11:53:31 +0000 (UTC) Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3878010E0B7 for ; Sun, 16 Apr 2023 11:53:24 +0000 (UTC) Received: from workpc.. (unknown [IPv6:2a00:1370:817e:4eb4:c5e6:4b85:1e3f:55e4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: dmitry.osipenko) by madras.collabora.co.uk (Postfix) with ESMTPSA id E1FBE66031BE; Sun, 16 Apr 2023 12:53:19 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1681646000; bh=ACjBY6sLl7U6/q+14CXYboBMLohgskA6O9zZS8tRJ+w=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PhtHn62Ar9HHXucUolMk7t5FZHQKne7Q8s6rIAgd805CJozy18gKvkP75X4Pu7rE7 dJ6rQkEibuSxgFmSxNDb6u8KQ3sU6Np+jrFRw0qmMMfLnwgu9A/8p7aV5P2lpCMVz7 wpmNRkBa/a0vkBtKAkL0M7KRe1iO81Rrbg8lL/Cwe/Hcufe90sn+0Xe231urFUfUeg Zhn8bASfnGH/y9Fddihw/fDe3iDQbu5h/tDSSFMFbJc5v4WToOD7Uaau69GpiuZGA8 fqw4hf9ugWCMiyY8Eieu3zFU8fb3sNcugUU2zyyaEu1ejfZ0f6D1wJcv1i8ls3QIkA vtj2B+TGj4GIg== From: Dmitry Osipenko To: David Airlie , Gerd Hoffmann , Gurchetan Singh , Chia-I Wu , Daniel Vetter , Rob Clark , =?utf-8?b?TWFyZWsgT2zFocOhaw==?= , Pierre-Eric Pelloux-Prayer , Emil Velikov Subject: [PATCH v6 1/3] drm/virtio: Refactor and optimize job submission code path Date: Sun, 16 Apr 2023 14:52:35 +0300 Message-Id: <20230416115237.798604-2-dmitry.osipenko@collabora.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230416115237.798604-1-dmitry.osipenko@collabora.com> References: <20230416115237.798604-1-dmitry.osipenko@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kernel@collabora.com, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, virtualization@lists.linux-foundation.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Move virtio_gpu_execbuffer_ioctl() into separate virtgpu_submit.c file, refactoring and optimizing the code along the way to ease addition of new features to the ioctl. The optimization is done by using optimal ordering of the job's submission steps, reducing code path from the start of the ioctl to the point of pushing job to virtio queue. Job's initialization is now performed before in-fence is awaited and out-fence setup is made after sending out job to virtio. Reviewed-by: Rob Clark Reviewed-by; Emil Velikov Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/virtio/Makefile | 2 +- drivers/gpu/drm/virtio/virtgpu_drv.h | 4 + drivers/gpu/drm/virtio/virtgpu_ioctl.c | 182 --------------- drivers/gpu/drm/virtio/virtgpu_submit.c | 295 ++++++++++++++++++++++++ 4 files changed, 300 insertions(+), 183 deletions(-) create mode 100644 drivers/gpu/drm/virtio/virtgpu_submit.c diff --git a/drivers/gpu/drm/virtio/Makefile b/drivers/gpu/drm/virtio/Makefile index b99fa4a73b68..d2e1788a8227 100644 --- a/drivers/gpu/drm/virtio/Makefile +++ b/drivers/gpu/drm/virtio/Makefile @@ -6,6 +6,6 @@ virtio-gpu-y := virtgpu_drv.o virtgpu_kms.o virtgpu_gem.o virtgpu_vram.o \ virtgpu_display.o virtgpu_vq.o \ virtgpu_fence.o virtgpu_object.o virtgpu_debugfs.o virtgpu_plane.o \ - virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o + virtgpu_ioctl.o virtgpu_prime.o virtgpu_trace_points.o virtgpu_submit.o obj-$(CONFIG_DRM_VIRTIO_GPU) += virtio-gpu.o diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h index af6ffb696086..4126c384286b 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.h +++ b/drivers/gpu/drm/virtio/virtgpu_drv.h @@ -486,4 +486,8 @@ void virtio_gpu_vram_unmap_dma_buf(struct device *dev, struct sg_table *sgt, enum dma_data_direction dir); +/* virtgpu_submit.c */ +int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); + #endif diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c b/drivers/gpu/drm/virtio/virtgpu_ioctl.c index da45215a933d..b24b11f25197 100644 --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c @@ -38,36 +38,6 @@ VIRTGPU_BLOB_FLAG_USE_SHAREABLE | \ VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE) -static int virtio_gpu_fence_event_create(struct drm_device *dev, - struct drm_file *file, - struct virtio_gpu_fence *fence, - uint32_t ring_idx) -{ - struct virtio_gpu_fpriv *vfpriv = file->driver_priv; - struct virtio_gpu_fence_event *e = NULL; - int ret; - - if (!(vfpriv->ring_idx_mask & BIT_ULL(ring_idx))) - return 0; - - e = kzalloc(sizeof(*e), GFP_KERNEL); - if (!e) - return -ENOMEM; - - e->event.type = VIRTGPU_EVENT_FENCE_SIGNALED; - e->event.length = sizeof(e->event); - - ret = drm_event_reserve_init(dev, file, &e->base, &e->event); - if (ret) - goto free; - - fence->e = e; - return 0; -free: - kfree(e); - return ret; -} - /* Must be called with &virtio_gpu_fpriv.struct_mutex held. */ static void virtio_gpu_create_context_locked(struct virtio_gpu_device *vgdev, struct virtio_gpu_fpriv *vfpriv) @@ -108,158 +78,6 @@ static int virtio_gpu_map_ioctl(struct drm_device *dev, void *data, &virtio_gpu_map->offset); } -/* - * Usage of execbuffer: - * Relocations need to take into account the full VIRTIO_GPUDrawable size. - * However, the command as passed from user space must *not* contain the initial - * VIRTIO_GPUReleaseInfo struct (first XXX bytes) - */ -static int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, - struct drm_file *file) -{ - struct drm_virtgpu_execbuffer *exbuf = data; - struct virtio_gpu_device *vgdev = dev->dev_private; - struct virtio_gpu_fpriv *vfpriv = file->driver_priv; - struct virtio_gpu_fence *out_fence; - int ret; - uint32_t *bo_handles = NULL; - void __user *user_bo_handles = NULL; - struct virtio_gpu_object_array *buflist = NULL; - struct sync_file *sync_file; - int out_fence_fd = -1; - void *buf; - uint64_t fence_ctx; - uint32_t ring_idx; - - fence_ctx = vgdev->fence_drv.context; - ring_idx = 0; - - if (vgdev->has_virgl_3d == false) - return -ENOSYS; - - if ((exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS)) - return -EINVAL; - - if ((exbuf->flags & VIRTGPU_EXECBUF_RING_IDX)) { - if (exbuf->ring_idx >= vfpriv->num_rings) - return -EINVAL; - - if (!vfpriv->base_fence_ctx) - return -EINVAL; - - fence_ctx = vfpriv->base_fence_ctx; - ring_idx = exbuf->ring_idx; - } - - virtio_gpu_create_context(dev, file); - if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_IN) { - struct dma_fence *in_fence; - - in_fence = sync_file_get_fence(exbuf->fence_fd); - - if (!in_fence) - return -EINVAL; - - /* - * Wait if the fence is from a foreign context, or if the fence - * array contains any fence from a foreign context. - */ - ret = 0; - if (!dma_fence_match_context(in_fence, fence_ctx + ring_idx)) - ret = dma_fence_wait(in_fence, true); - - dma_fence_put(in_fence); - if (ret) - return ret; - } - - if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) { - out_fence_fd = get_unused_fd_flags(O_CLOEXEC); - if (out_fence_fd < 0) - return out_fence_fd; - } - - if (exbuf->num_bo_handles) { - bo_handles = kvmalloc_array(exbuf->num_bo_handles, - sizeof(uint32_t), GFP_KERNEL); - if (!bo_handles) { - ret = -ENOMEM; - goto out_unused_fd; - } - - user_bo_handles = u64_to_user_ptr(exbuf->bo_handles); - if (copy_from_user(bo_handles, user_bo_handles, - exbuf->num_bo_handles * sizeof(uint32_t))) { - ret = -EFAULT; - goto out_unused_fd; - } - - buflist = virtio_gpu_array_from_handles(file, bo_handles, - exbuf->num_bo_handles); - if (!buflist) { - ret = -ENOENT; - goto out_unused_fd; - } - kvfree(bo_handles); - bo_handles = NULL; - } - - buf = vmemdup_user(u64_to_user_ptr(exbuf->command), exbuf->size); - if (IS_ERR(buf)) { - ret = PTR_ERR(buf); - goto out_unused_fd; - } - - if (buflist) { - ret = virtio_gpu_array_lock_resv(buflist); - if (ret) - goto out_memdup; - } - - out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); - if(!out_fence) { - ret = -ENOMEM; - goto out_unresv; - } - - ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx); - if (ret) - goto out_unresv; - - if (out_fence_fd >= 0) { - sync_file = sync_file_create(&out_fence->f); - if (!sync_file) { - dma_fence_put(&out_fence->f); - ret = -ENOMEM; - goto out_unresv; - } - - exbuf->fence_fd = out_fence_fd; - fd_install(out_fence_fd, sync_file->file); - } - - virtio_gpu_cmd_submit(vgdev, buf, exbuf->size, - vfpriv->ctx_id, buflist, out_fence); - dma_fence_put(&out_fence->f); - virtio_gpu_notify(vgdev); - return 0; - -out_unresv: - if (buflist) - virtio_gpu_array_unlock_resv(buflist); -out_memdup: - kvfree(buf); -out_unused_fd: - kvfree(bo_handles); - if (buflist) - virtio_gpu_array_put_free(buflist); - - if (out_fence_fd >= 0) - put_unused_fd(out_fence_fd); - - return ret; -} - static int virtio_gpu_getparam_ioctl(struct drm_device *dev, void *data, struct drm_file *file) { diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c new file mode 100644 index 000000000000..84e7c4d9d8c7 --- /dev/null +++ b/drivers/gpu/drm/virtio/virtgpu_submit.c @@ -0,0 +1,295 @@ +// SPDX-License-Identifier: MIT +/* + * Copyright (C) 2015 Red Hat, Inc. + * All Rights Reserved. + * + * Authors: + * Dave Airlie + * Alon Levy + */ + +#include +#include +#include +#include + +#include +#include + +#include "virtgpu_drv.h" + +struct virtio_gpu_submit { + struct virtio_gpu_object_array *buflist; + struct drm_virtgpu_execbuffer *exbuf; + struct virtio_gpu_fence *out_fence; + struct virtio_gpu_fpriv *vfpriv; + struct virtio_gpu_device *vgdev; + struct sync_file *sync_file; + struct drm_file *file; + int out_fence_fd; + u64 fence_ctx; + u32 ring_idx; + void *buf; +}; + +static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit, + struct dma_fence *in_fence) +{ + u32 context = submit->fence_ctx + submit->ring_idx; + + if (dma_fence_match_context(in_fence, context)) + return 0; + + return dma_fence_wait(in_fence, true); +} + +static int virtio_gpu_fence_event_create(struct drm_device *dev, + struct drm_file *file, + struct virtio_gpu_fence *fence, + u32 ring_idx) +{ + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct virtio_gpu_fence_event *e = NULL; + int ret; + + if (!(vfpriv->ring_idx_mask & BIT_ULL(ring_idx))) + return 0; + + e = kzalloc(sizeof(*e), GFP_KERNEL); + if (!e) + return -ENOMEM; + + e->event.type = VIRTGPU_EVENT_FENCE_SIGNALED; + e->event.length = sizeof(e->event); + + ret = drm_event_reserve_init(dev, file, &e->base, &e->event); + if (ret) { + kfree(e); + return ret; + } + + fence->e = e; + + return 0; +} + +static int virtio_gpu_init_submit_buflist(struct virtio_gpu_submit *submit) +{ + struct drm_virtgpu_execbuffer *exbuf = submit->exbuf; + u32 *bo_handles; + + if (!exbuf->num_bo_handles) + return 0; + + bo_handles = kvmalloc_array(exbuf->num_bo_handles, sizeof(*bo_handles), + GFP_KERNEL); + if (!bo_handles) + return -ENOMEM; + + if (copy_from_user(bo_handles, u64_to_user_ptr(exbuf->bo_handles), + exbuf->num_bo_handles * sizeof(*bo_handles))) { + kvfree(bo_handles); + return -EFAULT; + } + + submit->buflist = virtio_gpu_array_from_handles(submit->file, bo_handles, + exbuf->num_bo_handles); + if (!submit->buflist) { + kvfree(bo_handles); + return -ENOENT; + } + + kvfree(bo_handles); + + return 0; +} + +static void virtio_gpu_cleanup_submit(struct virtio_gpu_submit *submit) +{ + if (!IS_ERR(submit->buf)) + kvfree(submit->buf); + + if (submit->buflist) + virtio_gpu_array_put_free(submit->buflist); + + if (submit->out_fence_fd >= 0) + put_unused_fd(submit->out_fence_fd); + + if (submit->out_fence) + dma_fence_put(&submit->out_fence->f); + + if (submit->sync_file) + fput(submit->sync_file->file); +} + +static void virtio_gpu_submit(struct virtio_gpu_submit *submit) +{ + virtio_gpu_cmd_submit(submit->vgdev, submit->buf, submit->exbuf->size, + submit->vfpriv->ctx_id, submit->buflist, + submit->out_fence); + virtio_gpu_notify(submit->vgdev); +} + +static void virtio_gpu_complete_submit(struct virtio_gpu_submit *submit) +{ + submit->buf = NULL; + submit->buflist = NULL; + submit->sync_file = NULL; + submit->out_fence = NULL; + submit->out_fence_fd = -1; +} + +static int virtio_gpu_init_submit(struct virtio_gpu_submit *submit, + struct drm_virtgpu_execbuffer *exbuf, + struct drm_device *dev, + struct drm_file *file, + u64 fence_ctx, u32 ring_idx) +{ + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_fence *out_fence; + int err; + + memset(submit, 0, sizeof(*submit)); + + out_fence = virtio_gpu_fence_alloc(vgdev, fence_ctx, ring_idx); + if (!out_fence) + return -ENOMEM; + + err = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx); + if (err) { + dma_fence_put(&out_fence->f); + return err; + } + + submit->out_fence = out_fence; + submit->fence_ctx = fence_ctx; + submit->ring_idx = ring_idx; + submit->out_fence_fd = -1; + submit->vfpriv = vfpriv; + submit->vgdev = vgdev; + submit->exbuf = exbuf; + submit->file = file; + + err = virtio_gpu_init_submit_buflist(submit); + if (err) + return err; + + submit->buf = vmemdup_user(u64_to_user_ptr(exbuf->command), exbuf->size); + if (IS_ERR(submit->buf)) + return PTR_ERR(submit->buf); + + if (exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_OUT) { + err = get_unused_fd_flags(O_CLOEXEC); + if (err < 0) + return err; + + submit->out_fence_fd = err; + + submit->sync_file = sync_file_create(&out_fence->f); + if (!submit->sync_file) + return -ENOMEM; + } + + return 0; +} + +static int virtio_gpu_wait_in_fence(struct virtio_gpu_submit *submit) +{ + int ret = 0; + + if (submit->exbuf->flags & VIRTGPU_EXECBUF_FENCE_FD_IN) { + struct dma_fence *in_fence = + sync_file_get_fence(submit->exbuf->fence_fd); + if (!in_fence) + return -EINVAL; + + /* + * Wait if the fence is from a foreign context, or if the + * fence array contains any fence from a foreign context. + */ + ret = virtio_gpu_dma_fence_wait(submit, in_fence); + + dma_fence_put(in_fence); + } + + return ret; +} + +static void virtio_gpu_install_out_fence_fd(struct virtio_gpu_submit *submit) +{ + if (submit->sync_file) { + submit->exbuf->fence_fd = submit->out_fence_fd; + fd_install(submit->out_fence_fd, submit->sync_file->file); + } +} + +static int virtio_gpu_lock_buflist(struct virtio_gpu_submit *submit) +{ + if (submit->buflist) + return virtio_gpu_array_lock_resv(submit->buflist); + + return 0; +} + +int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct virtio_gpu_device *vgdev = dev->dev_private; + struct virtio_gpu_fpriv *vfpriv = file->driver_priv; + u64 fence_ctx = vgdev->fence_drv.context; + struct drm_virtgpu_execbuffer *exbuf = data; + struct virtio_gpu_submit submit; + u32 ring_idx = 0; + int ret = -EINVAL; + + if (!vgdev->has_virgl_3d) + return -ENOSYS; + + if (exbuf->flags & ~VIRTGPU_EXECBUF_FLAGS) + return ret; + + if (exbuf->flags & VIRTGPU_EXECBUF_RING_IDX) { + if (exbuf->ring_idx >= vfpriv->num_rings) + return ret; + + if (!vfpriv->base_fence_ctx) + return ret; + + fence_ctx = vfpriv->base_fence_ctx; + ring_idx = exbuf->ring_idx; + } + + virtio_gpu_create_context(dev, file); + + ret = virtio_gpu_init_submit(&submit, exbuf, dev, file, + fence_ctx, ring_idx); + if (ret) + goto cleanup; + + /* + * Await in-fences in the end of the job submission path to + * optimize the path by proceeding directly to the submission + * to virtio after the waits. + */ + ret = virtio_gpu_wait_in_fence(&submit); + if (ret) + goto cleanup; + + ret = virtio_gpu_lock_buflist(&submit); + if (ret) + goto cleanup; + + virtio_gpu_submit(&submit); + + /* + * Set up usr-out data after submitting the job to optimize + * the job submission path. + */ + virtio_gpu_install_out_fence_fd(&submit); + virtio_gpu_complete_submit(&submit); +cleanup: + virtio_gpu_cleanup_submit(&submit); + + return ret; +} From patchwork Sun Apr 16 11:52:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Osipenko X-Patchwork-Id: 13212871 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7D025C77B61 for ; Sun, 16 Apr 2023 11:53:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4024110E0BB; Sun, 16 Apr 2023 11:53:24 +0000 (UTC) Received: from madras.collabora.co.uk (madras.collabora.co.uk [46.235.227.172]) by gabe.freedesktop.org (Postfix) with ESMTPS id 57B8D10E0B7 for ; Sun, 16 Apr 2023 11:53:23 +0000 (UTC) Received: from workpc.. (unknown [IPv6:2a00:1370:817e:4eb4:c5e6:4b85:1e3f:55e4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: dmitry.osipenko) by madras.collabora.co.uk (Postfix) with ESMTPSA id 2D3BB6603234; Sun, 16 Apr 2023 12:53:21 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1681646002; bh=fwT8AKV/EdqplpLFzWlWVv7m8/aj98/oTicpTA0Rbz4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oMxpQFyz3KOewQkhNGyNtbZlH8KUH+MUBOIxKxD+TX3NCNKuGqTaGGpbY1V1SiVa+ VFOLzIPfnQSzQWsYplt2PGzWPjkcsdgAYQ7JJCZrsUsvZf1irZ/HaWT1j0Q0GQv5/k 6YOEO396U+X7FUgs1XOmqanti7Ip5mUNdaIHEx0RDHYu0MNJGfPCxkZx14yv2riJsp pDIT2llV8OzMarEibdu5YJ+WfLKkAwJ9vMfQdFpU6GgKprQmdwUt7Lrs4tWr6s65pH wZkkdJdoYSIbvb+wZqio4n20Unw6Iu4Rfy69XwJitXR+fhRPGm2hv63lPkHa3M05Rl 7NLdXak1+ZKZQ== From: Dmitry Osipenko To: David Airlie , Gerd Hoffmann , Gurchetan Singh , Chia-I Wu , Daniel Vetter , Rob Clark , =?utf-8?b?TWFyZWsgT2zFocOhaw==?= , Pierre-Eric Pelloux-Prayer , Emil Velikov Subject: [PATCH v6 2/3] drm/virtio: Wait for each dma-fence of in-fence array individually Date: Sun, 16 Apr 2023 14:52:36 +0300 Message-Id: <20230416115237.798604-3-dmitry.osipenko@collabora.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230416115237.798604-1-dmitry.osipenko@collabora.com> References: <20230416115237.798604-1-dmitry.osipenko@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kernel@collabora.com, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, virtualization@lists.linux-foundation.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Use dma-fence-unwrap API for waiting each dma-fence of the in-fence array individually. Sync file's in-fence array always has a non-matching fence context ID, which doesn't allow to skip waiting of fences with a matching context ID in a case of a merged sync file fence. Suggested-by: Rob Clark Reviewed-by; Emil Velikov Signed-off-by: Dmitry Osipenko --- drivers/gpu/drm/virtio/virtgpu_submit.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c index 84e7c4d9d8c7..cf3c04b16a7a 100644 --- a/drivers/gpu/drm/virtio/virtgpu_submit.c +++ b/drivers/gpu/drm/virtio/virtgpu_submit.c @@ -32,8 +32,8 @@ struct virtio_gpu_submit { void *buf; }; -static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit, - struct dma_fence *in_fence) +static int virtio_gpu_do_fence_wait(struct virtio_gpu_submit *submit, + struct dma_fence *in_fence) { u32 context = submit->fence_ctx + submit->ring_idx; @@ -43,6 +43,22 @@ static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit, return dma_fence_wait(in_fence, true); } +static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit, + struct dma_fence *fence) +{ + struct dma_fence_unwrap itr; + struct dma_fence *f; + int err; + + dma_fence_unwrap_for_each(f, &itr, fence) { + err = virtio_gpu_do_fence_wait(submit, f); + if (err) + return err; + } + + return 0; +} + static int virtio_gpu_fence_event_create(struct drm_device *dev, struct drm_file *file, struct virtio_gpu_fence *fence, From patchwork Sun Apr 16 11:52:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dmitry Osipenko X-Patchwork-Id: 13212873 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B6512C77B77 for ; Sun, 16 Apr 2023 11:53:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4510410E0B7; Sun, 16 Apr 2023 11:53:31 +0000 (UTC) Received: from madras.collabora.co.uk (madras.collabora.co.uk [46.235.227.172]) by gabe.freedesktop.org (Postfix) with ESMTPS id 76B1A10E0BE for ; Sun, 16 Apr 2023 11:53:24 +0000 (UTC) Received: from workpc.. (unknown [IPv6:2a00:1370:817e:4eb4:c5e6:4b85:1e3f:55e4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: dmitry.osipenko) by madras.collabora.co.uk (Postfix) with ESMTPSA id 5668B6603242; Sun, 16 Apr 2023 12:53:22 +0100 (BST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1681646003; bh=sqddcWUiE06YeLZPkiTfcRceF34Y3LsZ4KnYjbnMu0s=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UP5xeBqJS4VQFtlc3ur/90p2DlVGY8uUUhnfanMKiH/ddpQ2Wnkap9Da+mu4s+ANz 9o2IOsTkg5zPnh4zen7K2adK3W/Wte4Or3lousanFKSTaesD60PezgWqu8yD4/WuKL eKeYwjzW7JRVRivG4EhJ08BFyxLvcwfWDxNmHNNTnDtj7hV4JrheI6HyEkCgh/BBaX C3iJ6fu3/NeIHP2tyUIICtDmCU+48R/TVJP7OCYgJ7V8FGmV3375W0SVUZGtrNr5h+ 5CAf+WC77jKUoyDz62cb0ESVDtkUCwB9ztxUHzS2al3TXZHNuOPyLZ0zI5pyax8ufj G6gORnanW/SDA== From: Dmitry Osipenko To: David Airlie , Gerd Hoffmann , Gurchetan Singh , Chia-I Wu , Daniel Vetter , Rob Clark , =?utf-8?b?TWFyZWsgT2zFocOhaw==?= , Pierre-Eric Pelloux-Prayer , Emil Velikov Subject: [PATCH v6 3/3] drm/virtio: Support sync objects Date: Sun, 16 Apr 2023 14:52:37 +0300 Message-Id: <20230416115237.798604-4-dmitry.osipenko@collabora.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230416115237.798604-1-dmitry.osipenko@collabora.com> References: <20230416115237.798604-1-dmitry.osipenko@collabora.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kernel@collabora.com, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, virtualization@lists.linux-foundation.org Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add sync object DRM UAPI support to VirtIO-GPU driver. Sync objects support is needed by native context VirtIO-GPU Mesa drivers, it also will be used by Venus and Virgl contexts. Reviewed-by; Emil Velikov Signed-off-by: Dmitry Osipenko Tested-by: Pierre-Eric Pelloux-Prayer --- drivers/gpu/drm/virtio/virtgpu_drv.c | 3 +- drivers/gpu/drm/virtio/virtgpu_submit.c | 219 ++++++++++++++++++++++++ include/uapi/drm/virtgpu_drm.h | 16 +- 3 files changed, 236 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c index add075681e18..a22155577152 100644 --- a/drivers/gpu/drm/virtio/virtgpu_drv.c +++ b/drivers/gpu/drm/virtio/virtgpu_drv.c @@ -176,7 +176,8 @@ static const struct drm_driver driver = { * If KMS is disabled DRIVER_MODESET and DRIVER_ATOMIC are masked * out via drm_device::driver_features: */ - .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC, + .driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_RENDER | DRIVER_ATOMIC | + DRIVER_SYNCOBJ | DRIVER_SYNCOBJ_TIMELINE, .open = virtio_gpu_driver_open, .postclose = virtio_gpu_driver_postclose, diff --git a/drivers/gpu/drm/virtio/virtgpu_submit.c b/drivers/gpu/drm/virtio/virtgpu_submit.c index cf3c04b16a7a..5a0f2526c1a0 100644 --- a/drivers/gpu/drm/virtio/virtgpu_submit.c +++ b/drivers/gpu/drm/virtio/virtgpu_submit.c @@ -14,11 +14,24 @@ #include #include +#include #include #include "virtgpu_drv.h" +struct virtio_gpu_submit_post_dep { + struct drm_syncobj *syncobj; + struct dma_fence_chain *chain; + u64 point; +}; + struct virtio_gpu_submit { + struct virtio_gpu_submit_post_dep *post_deps; + unsigned int num_out_syncobjs; + + struct drm_syncobj **in_syncobjs; + unsigned int num_in_syncobjs; + struct virtio_gpu_object_array *buflist; struct drm_virtgpu_execbuffer *exbuf; struct virtio_gpu_fence *out_fence; @@ -59,6 +72,199 @@ static int virtio_gpu_dma_fence_wait(struct virtio_gpu_submit *submit, return 0; } +static void virtio_gpu_free_syncobjs(struct drm_syncobj **syncobjs, + u32 nr_syncobjs) +{ + u32 i = nr_syncobjs; + + while (i--) { + if (syncobjs[i]) + drm_syncobj_put(syncobjs[i]); + } + + kvfree(syncobjs); +} + +static int +virtio_gpu_parse_deps(struct virtio_gpu_submit *submit) +{ + struct drm_virtgpu_execbuffer *exbuf = submit->exbuf; + struct drm_virtgpu_execbuffer_syncobj syncobj_desc; + size_t syncobj_stride = exbuf->syncobj_stride; + u32 num_in_syncobjs = exbuf->num_in_syncobjs; + struct drm_syncobj **syncobjs; + int ret = 0, i; + + if (!num_in_syncobjs) + return 0; + + /* + * kvalloc at first tries to allocate memory using kmalloc and + * falls back to vmalloc only on failure. It also uses GFP_NOWARN + * internally for allocations larger than a page size, preventing + * storm of KMSG warnings. + */ + syncobjs = kvcalloc(num_in_syncobjs, sizeof(*syncobjs), GFP_KERNEL); + if (!syncobjs) + return -ENOMEM; + + for (i = 0; i < num_in_syncobjs; i++) { + u64 address = exbuf->in_syncobjs + i * syncobj_stride; + struct dma_fence *fence; + + memset(&syncobj_desc, 0, sizeof(syncobj_desc)); + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + if (syncobj_desc.flags & ~VIRTGPU_EXECBUF_SYNCOBJ_FLAGS) { + ret = -EINVAL; + break; + } + + ret = drm_syncobj_find_fence(submit->file, syncobj_desc.handle, + syncobj_desc.point, 0, &fence); + if (ret) + break; + + ret = virtio_gpu_dma_fence_wait(submit, fence); + + dma_fence_put(fence); + if (ret) + break; + + if (syncobj_desc.flags & VIRTGPU_EXECBUF_SYNCOBJ_RESET) { + syncobjs[i] = drm_syncobj_find(submit->file, + syncobj_desc.handle); + if (!syncobjs[i]) { + ret = -EINVAL; + break; + } + } + } + + if (ret) { + virtio_gpu_free_syncobjs(syncobjs, i); + return ret; + } + + submit->num_in_syncobjs = num_in_syncobjs; + submit->in_syncobjs = syncobjs; + + return ret; +} + +static void virtio_gpu_reset_syncobjs(struct drm_syncobj **syncobjs, + u32 nr_syncobjs) +{ + u32 i; + + for (i = 0; i < nr_syncobjs; i++) { + if (syncobjs[i]) + drm_syncobj_replace_fence(syncobjs[i], NULL); + } +} + +static void +virtio_gpu_free_post_deps(struct virtio_gpu_submit_post_dep *post_deps, + u32 nr_syncobjs) +{ + u32 i = nr_syncobjs; + + while (i--) { + kfree(post_deps[i].chain); + drm_syncobj_put(post_deps[i].syncobj); + } + + kvfree(post_deps); +} + +static int virtio_gpu_parse_post_deps(struct virtio_gpu_submit *submit) +{ + struct drm_virtgpu_execbuffer *exbuf = submit->exbuf; + struct drm_virtgpu_execbuffer_syncobj syncobj_desc; + struct virtio_gpu_submit_post_dep *post_deps; + u32 num_out_syncobjs = exbuf->num_out_syncobjs; + size_t syncobj_stride = exbuf->syncobj_stride; + int ret = 0, i; + + if (!num_out_syncobjs) + return 0; + + post_deps = kvcalloc(num_out_syncobjs, sizeof(*post_deps), GFP_KERNEL); + if (!post_deps) + return -ENOMEM; + + for (i = 0; i < num_out_syncobjs; i++) { + u64 address = exbuf->out_syncobjs + i * syncobj_stride; + + memset(&syncobj_desc, 0, sizeof(syncobj_desc)); + + if (copy_from_user(&syncobj_desc, + u64_to_user_ptr(address), + min(syncobj_stride, sizeof(syncobj_desc)))) { + ret = -EFAULT; + break; + } + + post_deps[i].point = syncobj_desc.point; + + if (syncobj_desc.flags) { + ret = -EINVAL; + break; + } + + if (syncobj_desc.point) { + post_deps[i].chain = dma_fence_chain_alloc(); + if (!post_deps[i].chain) { + ret = -ENOMEM; + break; + } + } + + post_deps[i].syncobj = drm_syncobj_find(submit->file, + syncobj_desc.handle); + if (!post_deps[i].syncobj) { + kfree(post_deps[i].chain); + ret = -EINVAL; + break; + } + } + + if (ret) { + virtio_gpu_free_post_deps(post_deps, i); + return ret; + } + + submit->num_out_syncobjs = num_out_syncobjs; + submit->post_deps = post_deps; + + return 0; +} + +static void +virtio_gpu_process_post_deps(struct virtio_gpu_submit *submit) +{ + struct virtio_gpu_submit_post_dep *post_deps = submit->post_deps; + struct dma_fence *fence = &submit->out_fence->f; + u32 i; + + for (i = 0; i < submit->num_out_syncobjs; i++) { + if (post_deps[i].chain) { + drm_syncobj_add_point(post_deps[i].syncobj, + post_deps[i].chain, + fence, post_deps[i].point); + post_deps[i].chain = NULL; + } else { + drm_syncobj_replace_fence(post_deps[i].syncobj, fence); + } + } +} + static int virtio_gpu_fence_event_create(struct drm_device *dev, struct drm_file *file, struct virtio_gpu_fence *fence, @@ -122,6 +328,10 @@ static int virtio_gpu_init_submit_buflist(struct virtio_gpu_submit *submit) static void virtio_gpu_cleanup_submit(struct virtio_gpu_submit *submit) { + virtio_gpu_reset_syncobjs(submit->in_syncobjs, submit->num_in_syncobjs); + virtio_gpu_free_syncobjs(submit->in_syncobjs, submit->num_in_syncobjs); + virtio_gpu_free_post_deps(submit->post_deps, submit->num_out_syncobjs); + if (!IS_ERR(submit->buf)) kvfree(submit->buf); @@ -288,6 +498,14 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, * optimize the path by proceeding directly to the submission * to virtio after the waits. */ + ret = virtio_gpu_parse_post_deps(&submit); + if (ret) + goto cleanup; + + ret = virtio_gpu_parse_deps(&submit); + if (ret) + goto cleanup; + ret = virtio_gpu_wait_in_fence(&submit); if (ret) goto cleanup; @@ -303,6 +521,7 @@ int virtio_gpu_execbuffer_ioctl(struct drm_device *dev, void *data, * the job submission path. */ virtio_gpu_install_out_fence_fd(&submit); + virtio_gpu_process_post_deps(&submit); virtio_gpu_complete_submit(&submit); cleanup: virtio_gpu_cleanup_submit(&submit); diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h index 7b158fcb02b4..b1d0e56565bc 100644 --- a/include/uapi/drm/virtgpu_drm.h +++ b/include/uapi/drm/virtgpu_drm.h @@ -64,6 +64,16 @@ struct drm_virtgpu_map { __u32 pad; }; +#define VIRTGPU_EXECBUF_SYNCOBJ_RESET 0x01 +#define VIRTGPU_EXECBUF_SYNCOBJ_FLAGS ( \ + VIRTGPU_EXECBUF_SYNCOBJ_RESET | \ + 0) +struct drm_virtgpu_execbuffer_syncobj { + __u32 handle; + __u32 flags; + __u64 point; +}; + /* fence_fd is modified on success if VIRTGPU_EXECBUF_FENCE_FD_OUT flag is set. */ struct drm_virtgpu_execbuffer { __u32 flags; @@ -73,7 +83,11 @@ struct drm_virtgpu_execbuffer { __u32 num_bo_handles; __s32 fence_fd; /* in/out fence fd (see VIRTGPU_EXECBUF_FENCE_FD_IN/OUT) */ __u32 ring_idx; /* command ring index (see VIRTGPU_EXECBUF_RING_IDX) */ - __u32 pad; + __u32 syncobj_stride; /* size of @drm_virtgpu_execbuffer_syncobj */ + __u32 num_in_syncobjs; + __u32 num_out_syncobjs; + __u64 in_syncobjs; + __u64 out_syncobjs; }; #define VIRTGPU_PARAM_3D_FEATURES 1 /* do we have 3D features in the hw */