From patchwork Wed Aug 16 04:13:55 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jason Ekstrand X-Patchwork-Id: 9902871 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id AD023603B5 for ; Wed, 16 Aug 2017 04:14:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9F813288BE for ; Wed, 16 Aug 2017 04:14:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 8E122288CA; Wed, 16 Aug 2017 04:14:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 4CC262897E for ; Wed, 16 Aug 2017 04:14:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6798D6E3B9; Wed, 16 Aug 2017 04:14:18 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-pg0-x244.google.com (mail-pg0-x244.google.com [IPv6:2607:f8b0:400e:c05::244]) by gabe.freedesktop.org (Postfix) with ESMTPS id 987A46E3B5 for ; Wed, 16 Aug 2017 04:14:16 +0000 (UTC) Received: by mail-pg0-x244.google.com with SMTP id 83so4241050pgb.4 for ; Tue, 15 Aug 2017 21:14:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=jlekstrand-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=S3WNWDAHjyABBlTrhJDu3LvwQp7nX67NFUj/1wGcRik=; b=aIOxD9MriOOVulPxb0uP1OpAhGHYubBD9Bu1WxZDOzBF5NJ8hRC5gX4RU5r1w/a8f0 SrJQpyuaMQ5MAwgzuab9EX9ebgZANZmIhwV4K8g0R7614tPfuRqPyPQl7vnJNDRbV0E1 6TonM5y5nITY1seqwAuY8fgvyZwU76yc5H+W5OutQp6sCNxvNAVZA35xHZqbtczzywV7 jCSbPmx4oATiH88u/MoHKcIvypvveaPiGcRxb7Ckbow9Fvaa5zSNuvjQfQzc+v8Z/8Kr smBmnuVVTIn+mpa7kYYVZmpS2IRBSO3ns0u5AHeoLQNo3ktFr7BSRjGdtLcMQgG1HEHx kgxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=S3WNWDAHjyABBlTrhJDu3LvwQp7nX67NFUj/1wGcRik=; b=D58FripDfSf1l9ekDS+CtRLBeHynsgHUD9BRvd8EFKULntE3vA/Iy7z0BNOBQhTx0g VdxEmWcvcNqkcWquesZCrkvLXa6nP3xHOaLKkQyNWIyaFlHFUj9IV+oaIzQpR1Ri6ouA R99dCHpxCoNWp/DfziIrWPKVtP4Uy5K80RHtk9pRwvabFAhWVr7fO2oByoSSYlf5IYCd CQOgnTs/FRsLEhvUdwpkpMh6Hz5+P8i9eW4Ym6DZTcOiBL9c/Wq7vKxvy1qNq5FtFBrC I3JdFLGkIHGEZYdBwyuzJJQpNzYeq7+6cJl/utuvy8dOTua3A3MpJbtok1A04HWinGoh TzJg== X-Gm-Message-State: AHYfb5ihjAeDXZRykPCI3poe+c89FokWvFNgBvEDjh6pLPjbGXUOxar8 tsH6Wd9+qKVChvSN0xfqxA== X-Received: by 10.84.248.69 with SMTP id e5mr486411pln.38.1502856855783; Tue, 15 Aug 2017 21:14:15 -0700 (PDT) Received: from omlet.jf.intel.com ([192.55.54.40]) by smtp.gmail.com with ESMTPSA id f88sm19684595pff.74.2017.08.15.21.14.13 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Tue, 15 Aug 2017 21:14:14 -0700 (PDT) From: Jason Ekstrand X-Google-Original-From: Jason Ekstrand To: dri-devel@lists.freedesktop.org Subject: [PATCH 7/7] drm/syncobj: Allow wait for submit and signal behavior (v5) Date: Tue, 15 Aug 2017 21:13:55 -0700 Message-Id: <1502856835-9433-8-git-send-email-jason.ekstrand@intel.com> X-Mailer: git-send-email 2.5.0.400.gff86faf In-Reply-To: <1502856835-9433-1-git-send-email-jason.ekstrand@intel.com> References: <1502856835-9433-1-git-send-email-jason.ekstrand@intel.com> MIME-Version: 1.0 Cc: Jason Ekstrand , Dave Airlie , =?UTF-8?q?Christian=20K=C3=B6nig?= , Jason Ekstrand X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Vulkan VkFence semantics require that the application be able to perform a CPU wait on work which may not yet have been submitted. This is perfectly safe because the CPU wait has a timeout which will get triggered eventually if no work is ever submitted. This behavior is advantageous for multi-threaded workloads because, so long as all of the threads agree on what fences to use up-front, you don't have the extra cross-thread synchronization cost of thread A telling thread B that it has submitted its dependent work and thread B is now free to wait. Within a single process, this can be implemented in the userspace driver by doing exactly the same kind of tracking the app would have to do using posix condition variables or similar. However, in order for this to work cross-process (as is required by VK_KHR_external_fence), we need to handle this in the kernel. This commit adds a WAIT_FOR_SUBMIT flag to DRM_IOCTL_SYNCOBJ_WAIT which instructs the IOCTL to wait for the syncobj to have a non-null fence and then wait on the fence. Combined with DRM_IOCTL_SYNCOBJ_RESET, you can easily get the Vulkan behavior. v2: - Fix a bug in the invalid syncobj error path - Unify the wait-all and wait-any cases v3: - Unify the timeout == 0 case a bit with the timeout > 0 case - Use wait_event_interruptible_timeout v4: - Use proxy fence v5: - Revert to a combination of v2 and v3 - Don't use proxy fences - Don't use wait_event_interruptible_timeout because it just adds an extra layer of callbacks Signed-off-by: Jason Ekstrand Cc: Dave Airlie Cc: Chris Wilson Cc: Christian König --- drivers/gpu/drm/drm_syncobj.c | 252 ++++++++++++++++++++++++++++++++++-------- include/uapi/drm/drm.h | 1 + 2 files changed, 208 insertions(+), 45 deletions(-) diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c index 24a5286..7dfdb98 100644 --- a/drivers/gpu/drm/drm_syncobj.c +++ b/drivers/gpu/drm/drm_syncobj.c @@ -51,6 +51,7 @@ #include #include #include +#include #include "drm_internal.h" #include @@ -88,6 +89,35 @@ static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj, list_add_tail(&cb->node, &syncobj->cb_list); } +static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj, + struct dma_fence **fence, + struct drm_syncobj_cb *cb, + drm_syncobj_func_t func) +{ + int ret; + + *fence = drm_syncobj_fence_get(syncobj); + if (*fence) + return 1; + + spin_lock(&syncobj->lock); + /* We've already tried once to get a fence and failed. Now that we + * have the lock, try one more time just to be sure we don't add a + * callback when a fence has already been set. + */ + if (syncobj->fence) { + *fence = dma_fence_get(syncobj->fence); + ret = 1; + } else { + *fence = NULL; + drm_syncobj_add_callback_locked(syncobj, cb, func); + ret = 0; + } + spin_unlock(&syncobj->lock); + + return ret; +} + /** * drm_syncobj_add_callback - adds a callback to syncobj::cb_list * @syncobj: Sync object to which to add the callback @@ -509,6 +539,160 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data, &args->handle); } +struct syncobj_wait_entry { + struct task_struct *task; + struct dma_fence *fence; + struct dma_fence_cb fence_cb; + struct drm_syncobj_cb syncobj_cb; +}; + +static void syncobj_wait_fence_func(struct dma_fence *fence, + struct dma_fence_cb *cb) +{ + struct syncobj_wait_entry *wait = + container_of(cb, struct syncobj_wait_entry, fence_cb); + + wake_up_process(wait->task); +} + +static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj, + struct drm_syncobj_cb *cb) +{ + struct syncobj_wait_entry *wait = + container_of(cb, struct syncobj_wait_entry, syncobj_cb); + + /* This happens inside the syncobj lock */ + wait->fence = dma_fence_get(syncobj->fence); + wake_up_process(wait->task); +} + +static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs, + uint32_t count, + uint32_t flags, + signed long timeout, + uint32_t *idx) +{ + struct syncobj_wait_entry *entries; + struct dma_fence *fence; + signed long ret; + uint32_t signaled_count, i; + + entries = kcalloc(count, sizeof(*entries), GFP_KERNEL); + if (!entries) + return -ENOMEM; + + /* Walk the list of sync objects and initialize entries. We do + * this up-front so that we can properly return -EINVAL if there is + * a syncobj with a missing fence and then never have the chance of + * returning -EINVAL again. + */ + signaled_count = 0; + for (i = 0; i < count; ++i) { + entries[i].task = current; + entries[i].fence = drm_syncobj_fence_get(syncobjs[i]); + if (!entries[i].fence) { + if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) { + continue; + } else { + ret = -EINVAL; + goto cleanup_entries; + } + } + + if (dma_fence_is_signaled(entries[i].fence)) { + if (signaled_count == 0 && idx) + *idx = i; + signaled_count++; + } + } + + /* Initialize ret to the max of timeout and 1. That way, the + * default return value indicates a successful wait and not a + * timeout. + */ + ret = max_t(signed long, timeout, 1); + + if (signaled_count == count || + (signaled_count > 0 && + !(flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL))) + goto cleanup_entries; + + /* There's a very annoying laxness in the dma_fence API here, in + * that backends are not required to automatically report when a + * fence is signaled prior to fence->ops->enable_signaling() being + * called. So here if we fail to match signaled_count, we need to + * fallthough and try a 0 timeout wait! + */ + + if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) { + for (i = 0; i < count; ++i) { + drm_syncobj_fence_get_or_add_callback(syncobjs[i], + &entries[i].fence, + &entries[i].syncobj_cb, + syncobj_wait_syncobj_func); + } + } + + do { + set_current_state(TASK_INTERRUPTIBLE); + + signaled_count = 0; + for (i = 0; i < count; ++i) { + fence = entries[i].fence; + if (!fence) + continue; + + if (dma_fence_is_signaled(fence) || + (!entries[i].fence_cb.func && + dma_fence_add_callback(fence, + &entries[i].fence_cb, + syncobj_wait_fence_func))) { + /* The fence has been signaled */ + if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) { + signaled_count++; + } else { + if (idx) + *idx = i; + goto done_waiting; + } + } + } + + if (signaled_count == count) + goto done_waiting; + + if (timeout == 0) { + /* If we are doing a 0 timeout wait and we got + * here, then we just timed out. + */ + ret = 0; + goto done_waiting; + } + + ret = schedule_timeout(ret); + + if (ret > 0 && signal_pending(current)) + ret = -ERESTARTSYS; + } while (ret > 0); + +done_waiting: + __set_current_state(TASK_RUNNING); + +cleanup_entries: + for (i = 0; i < count; ++i) { + if (entries[i].syncobj_cb.func) + drm_syncobj_remove_callback(syncobjs[i], + &entries[i].syncobj_cb); + if (entries[i].fence_cb.func) + dma_fence_remove_callback(entries[i].fence, + &entries[i].fence_cb); + dma_fence_put(entries[i].fence); + } + kfree(entries); + + return ret; +} + /** * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value * @@ -541,43 +725,19 @@ static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec) return timeout_jiffies64 + 1; } -static int drm_syncobj_wait_fences(struct drm_device *dev, - struct drm_file *file_private, - struct drm_syncobj_wait *wait, - struct dma_fence **fences) +static int drm_syncobj_array_wait(struct drm_device *dev, + struct drm_file *file_private, + struct drm_syncobj_wait *wait, + struct drm_syncobj **syncobjs) { signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec); signed long ret = 0; uint32_t first = ~0; - if (wait->flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) { - uint32_t i; - for (i = 0; i < wait->count_handles; i++) { - ret = dma_fence_wait_timeout(fences[i], true, timeout); - - /* Various dma_fence wait callbacks will return - * ENOENT to indicate that the fence has already - * been signaled. We need to sanitize this to 0 so - * we don't return early and the client doesn't see - * an unexpected error. - */ - if (ret == -ENOENT) - ret = 0; - - if (ret < 0) - return ret; - if (ret == 0) - break; - timeout = ret; - } - first = 0; - } else { - ret = dma_fence_wait_any_timeout(fences, - wait->count_handles, - true, timeout, - &first); - } - + ret = drm_syncobj_array_wait_timeout(syncobjs, + wait->count_handles, + wait->flags, + timeout, &first); if (ret < 0) return ret; @@ -593,14 +753,15 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data, { struct drm_syncobj_wait *args = data; uint32_t *handles; - struct dma_fence **fences; + struct drm_syncobj **syncobjs; int ret = 0; uint32_t i; if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ)) return -ENODEV; - if (args->flags != 0 && args->flags != DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) + if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL | + DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT)) return -EINVAL; if (args->count_handles == 0) @@ -619,27 +780,28 @@ drm_syncobj_wait_ioctl(struct drm_device *dev, void *data, goto err_free_handles; } - fences = kcalloc(args->count_handles, - sizeof(struct dma_fence *), GFP_KERNEL); - if (!fences) { + syncobjs = kcalloc(args->count_handles, + sizeof(struct drm_syncobj *), GFP_KERNEL); + if (!syncobjs) { ret = -ENOMEM; goto err_free_handles; } for (i = 0; i < args->count_handles; i++) { - ret = drm_syncobj_find_fence(file_private, handles[i], - &fences[i]); - if (ret) + syncobjs[i] = drm_syncobj_find(file_private, handles[i]); + if (!syncobjs[i]) { + ret = -ENOENT; goto err_free_fence_array; + } } - ret = drm_syncobj_wait_fences(dev, file_private, - args, fences); + ret = drm_syncobj_array_wait(dev, file_private, + args, syncobjs); err_free_fence_array: - for (i = 0; i < args->count_handles; i++) - dma_fence_put(fences[i]); - kfree(fences); + while (i-- > 0) + drm_syncobj_put(syncobjs[i]); + kfree(syncobjs); err_free_handles: kfree(handles); diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h index 4b301b4..f8ec8fe 100644 --- a/include/uapi/drm/drm.h +++ b/include/uapi/drm/drm.h @@ -719,6 +719,7 @@ struct drm_syncobj_handle { }; #define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0) +#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1) struct drm_syncobj_wait { __u64 handles; /* absolute timeout */