From patchwork Fri Nov 15 10:21:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13876055 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0B0AAD6DDF1 for ; Fri, 15 Nov 2024 10:22:08 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4D51F10E846; Fri, 15 Nov 2024 10:22:07 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="PHRk2kbX"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id D354E10E836 for ; Fri, 15 Nov 2024 10:21:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=2KNBYgktl52oWCVCDac2MxdPU0iGlTv9UYcQi5VUaNE=; b=PHRk2kbXuvckiGQhkU2L2KkDeA SSDTZkH7GwKylJVv6ZNj8WotPoiwR04Pa9xDkShcP7X4AZQj1K1JZUY14bf1u8K0wnphCT8A5tDmw ULdIlUqKBLupWt2LOGn7cfjb+BArX2OMPrNtlV+1lS/4sga6oyF0dCWk/0RSZ2f0AZOPenO4uTys2 7oJf9rf2SUGiyPpYIgoMauiOSZ8SPkYbTVR4FrZndeQ2yB2GJ9iLMcUrpInfrFgGR80PoRNIz6BBx Zh+W6XbgDcQcB8lvnyDskmIgp2Q3KlgN8o1TyWQi8vDMnfdJquIW9OJUil4X41WGLbwjmU6h0neKg pEy3n3MA==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tBtSe-007EAY-9T; Fri, 15 Nov 2024 11:21:56 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Daniel Vetter , Sumit Semwal , Gustavo Padovan , Friedrich Vock , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, stable@vger.kernel.org Subject: [PATCH 1/5] dma-fence: Fix reference leak on fence merge failure path Date: Fri, 15 Nov 2024 10:21:49 +0000 Message-ID: <20241115102153.1980-2-tursulin@igalia.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241115102153.1980-1-tursulin@igalia.com> References: <20241115102153.1980-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Release all fence references if the output dma-fence-array could not be allocated. Signed-off-by: Tvrtko Ursulin Fixes: 245a4a7b531c ("dma-buf: generalize dma_fence unwrap & merging v3") Cc: Christian König Cc: Daniel Vetter Cc: Sumit Semwal Cc: Gustavo Padovan Cc: Friedrich Vock Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: # v6.0+ --- drivers/dma-buf/dma-fence-unwrap.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c index 628af51c81af..b19d0adf6086 100644 --- a/drivers/dma-buf/dma-fence-unwrap.c +++ b/drivers/dma-buf/dma-fence-unwrap.c @@ -164,6 +164,8 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, dma_fence_context_alloc(1), 1, false); if (!result) { + for (i = 0; i < count; i++) + dma_fence_put(array[i]); tmp = NULL; goto return_tmp; } From patchwork Fri Nov 15 10:21:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13876053 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BED27D6DDED for ; Fri, 15 Nov 2024 10:22:05 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3756B10E839; Fri, 15 Nov 2024 10:22:02 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="d3JM5mkL"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id CD22F10E835 for ; Fri, 15 Nov 2024 10:21:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=eqW/t25YKHYvQCP61dOJTS86fGo5Twr3i+hAXKR6SFs=; b=d3JM5mkLrgNDMY7ZKVqxqOLAJR CfEMDlY4jokEozFgJ/fqeikoDKgRJ0e5Aj+ZiMC65Tj3MEz+n2Yl7c1y23D8ijJgqZrVSBZtfdgNL iX48Tzv9eUoeGV/D0Q5gL+Y5jLrufJB6JDhA3bg3iTYhPYNL/+oT4sU4R+10nV3DuWj/vP8j3w/I4 xMTyFupe7SFsDIzccpA49KYiVfayErsjZVeDbKjr4hdEei9E/si5Jn4i/I0geh2v4VnF2cE2gtByp 9fQj0m+E5RnlHezDIdcJyROyHM2WHQOQS3cUBCxrbGFgPyBlEhPcgl0wMgTjBGw9hetjXB1+z/17x R5FvykJA==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tBtSf-007EAe-2u; Fri, 15 Nov 2024 11:21:57 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Daniel Vetter , Sumit Semwal , Gustavo Padovan , Friedrich Vock , linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, stable@vger.kernel.org Subject: [PATCH 2/5] dma-fence: Use kernel's sort for merging fences Date: Fri, 15 Nov 2024 10:21:50 +0000 Message-ID: <20241115102153.1980-3-tursulin@igalia.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241115102153.1980-1-tursulin@igalia.com> References: <20241115102153.1980-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin One alternative to the fix Christian proposed in https://lore.kernel.org/dri-devel/20241024124159.4519-3-christian.koenig@amd.com/ is to replace the rather complex open coded sorting loops with the kernel standard sort followed by a context squashing pass. Proposed advantage of this would be readability but one concern Christian raised was that there could be many fences, that they are typically mostly sorted, and so the kernel's heap sort would be much worse by the proposed algorithm. I had a look running some games and vkcube to see what are the typical number of input fences. Tested scenarios: 1) Hogwarts Legacy under Gamescope 450 calls per second to __dma_fence_unwrap_merge. Percentages per number of fences buckets, before and after checking for signalled status, sorting and flattening: N Before After 0 0.91% 1 69.40% 2-3 28.72% 9.4% (90.6% resolved to one fence) 4-5 0.93% 6-9 0.03% 10+ 2) Cyberpunk 2077 under Gamescope 1050 calls per second, amounting to 0.01% CPU time according to perf top. N Before After 0 1.13% 1 52.30% 2-3 40.34% 55.57% 4-5 1.46% 0.50% 6-9 2.44% 10+ 2.34% 3) vkcube under Plasma 90 calls per second. N Before After 0 1 2-3 100% 0% (Ie. all resolved to a single fence) 4-5 6-9 10+ In the case of vkcube all invocations in the 2-3 bucket were actually just two input fences. From these numbers it looks like the heap sort should not be a disadvantage, given how the dominant case is <= 2 input fences which heap sort solves with just one compare and swap. (And for the case of one input fence we have a fast path in the previous patch.) A complementary possibility is to implement a different sorting algorithm under the same API as the kernel's sort() and so keep the simplicity, potentially moving the new sort under lib/ if it would be found more widely useful. v2: * Hold on to fence references and reduce commentary. (Christian) * Record and use latest signaled timestamp in the 2nd loop too. * Consolidate zero or one fences fast paths. v3: * Reverse the seqno sort order for a simpler squashing pass. (Christian) Signed-off-by: Tvrtko Ursulin Fixes: 245a4a7b531c ("dma-buf: generalize dma_fence unwrap & merging v3") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3617 Cc: Christian König Cc: Daniel Vetter Cc: Sumit Semwal Cc: Gustavo Padovan Cc: Friedrich Vock Cc: linux-media@vger.kernel.org Cc: dri-devel@lists.freedesktop.org Cc: linaro-mm-sig@lists.linaro.org Cc: # v6.0+ --- drivers/dma-buf/dma-fence-unwrap.c | 128 ++++++++++++++--------------- 1 file changed, 61 insertions(+), 67 deletions(-) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c index b19d0adf6086..6345062731f1 100644 --- a/drivers/dma-buf/dma-fence-unwrap.c +++ b/drivers/dma-buf/dma-fence-unwrap.c @@ -12,6 +12,7 @@ #include #include #include +#include /* Internal helper to start new array iteration, don't use directly */ static struct dma_fence * @@ -59,6 +60,25 @@ struct dma_fence *dma_fence_unwrap_next(struct dma_fence_unwrap *cursor) } EXPORT_SYMBOL_GPL(dma_fence_unwrap_next); + +static int fence_cmp(const void *_a, const void *_b) +{ + struct dma_fence *a = *(struct dma_fence **)_a; + struct dma_fence *b = *(struct dma_fence **)_b; + + if (a->context < b->context) + return -1; + else if (a->context > b->context) + return 1; + + if (dma_fence_is_later(b, a)) + return 1; + else if (dma_fence_is_later(a, b)) + return -1; + + return 0; +} + /* Implementation for the dma_fence_merge() marco, don't use directly */ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, struct dma_fence **fences, @@ -67,8 +87,7 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, struct dma_fence_array *result; struct dma_fence *tmp, **array; ktime_t timestamp; - unsigned int i; - size_t count; + int i, j, count; count = 0; timestamp = ns_to_ktime(0); @@ -96,80 +115,55 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, if (!array) return NULL; - /* - * This trashes the input fence array and uses it as position for the - * following merge loop. This works because the dma_fence_merge() - * wrapper macro is creating this temporary array on the stack together - * with the iterators. - */ - for (i = 0; i < num_fences; ++i) - fences[i] = dma_fence_unwrap_first(fences[i], &iter[i]); - count = 0; - do { - unsigned int sel; - -restart: - tmp = NULL; - for (i = 0; i < num_fences; ++i) { - struct dma_fence *next; - - while (fences[i] && dma_fence_is_signaled(fences[i])) - fences[i] = dma_fence_unwrap_next(&iter[i]); - - next = fences[i]; - if (!next) - continue; - - /* - * We can't guarantee that inpute fences are ordered by - * context, but it is still quite likely when this - * function is used multiple times. So attempt to order - * the fences by context as we pass over them and merge - * fences with the same context. - */ - if (!tmp || tmp->context > next->context) { - tmp = next; - sel = i; - - } else if (tmp->context < next->context) { - continue; - - } else if (dma_fence_is_later(tmp, next)) { - fences[i] = dma_fence_unwrap_next(&iter[i]); - goto restart; + for (i = 0; i < num_fences; ++i) { + dma_fence_unwrap_for_each(tmp, &iter[i], fences[i]) { + if (!dma_fence_is_signaled(tmp)) { + array[count++] = dma_fence_get(tmp); } else { - fences[sel] = dma_fence_unwrap_next(&iter[sel]); - goto restart; + ktime_t t = dma_fence_timestamp(tmp); + + if (ktime_after(t, timestamp)) + timestamp = t; } } - - if (tmp) { - array[count++] = dma_fence_get(tmp); - fences[sel] = dma_fence_unwrap_next(&iter[sel]); - } - } while (tmp); - - if (count == 0) { - tmp = dma_fence_allocate_private_stub(ktime_get()); - goto return_tmp; } - if (count == 1) { - tmp = array[0]; - goto return_tmp; - } + if (count == 0 || count == 1) + goto return_fastpath; + + sort(array, count, sizeof(*array), fence_cmp, NULL); - result = dma_fence_array_create(count, array, - dma_fence_context_alloc(1), - 1, false); - if (!result) { - for (i = 0; i < count; i++) + /* + * Only keep the most recent fence for each context. + */ + j = 0; + for (i = 1; i < count; i++) { + if (array[i]->context == array[j]->context) dma_fence_put(array[i]); - tmp = NULL; - goto return_tmp; + else + array[++j] = array[i]; } - return &result->base; + count = ++j; + + if (count > 1) { + result = dma_fence_array_create(count, array, + dma_fence_context_alloc(1), + 1, false); + if (!result) { + for (i = 0; i < count; i++) + dma_fence_put(array[i]); + tmp = NULL; + goto return_tmp; + } + return &result->base; + } + +return_fastpath: + if (count == 0) + tmp = dma_fence_allocate_private_stub(timestamp); + else + tmp = array[0]; return_tmp: kfree(array); From patchwork Fri Nov 15 10:21:51 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13876051 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8565D6DDEF for ; Fri, 15 Nov 2024 10:22:02 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3CB2010E835; Fri, 15 Nov 2024 10:22:01 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="q2oVlvKx"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 98A4610E835 for ; Fri, 15 Nov 2024 10:21:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=kogeFYUUnU8BfsJ9/rRtQz86Pgg1TlOwUm+J0HWGRiQ=; b=q2oVlvKxNfr1OxsDISy3EKNsyg DwQkmK//x+BbMU368UGd2aXWVJyV30mKUFwSgS/Uo+im+pYS3riN3bCVsMl7N1Uz25hTCFPkeZzOQ T1cH2HUpcPd97/vNg8RwSrzhen/Oz/l8mFvWDPzupTwYqhYXcGEF2SVOzjjn0siIVc9qCjWscM5jb /YtegtsYtvZn2YodQXZuds2vxQnsrEmy9lnOaJQ2naVAS1CQTCQEAb54jtQ5VptIm+e01L7ZA6jr4 ikeRrxbHQ9vI8BzSN43xI79iL2b/k0yhNUpdalidlGUswmLsxLZqN0Z+Ge6y+qg/ZPoCKt7Qk9X/E u0NHvWpA==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tBtSf-007EAg-Qi; Fri, 15 Nov 2024 11:21:57 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Friedrich Vock Subject: [PATCH 3/5] dma-fence: Add a single fence fast path for fence merging Date: Fri, 15 Nov 2024 10:21:51 +0000 Message-ID: <20241115102153.1980-4-tursulin@igalia.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241115102153.1980-1-tursulin@igalia.com> References: <20241115102153.1980-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Testing some workloads in two different scenarios, such as games running under Gamescope on a Steam Deck, or vkcube under a Plasma desktop, shows that in a significant portion of calls the dma_fence_unwrap_merge helper is called with just a single unsignalled fence. Therefore it is worthile to add a fast path for that case and so bypass the memory allocation and insertion sort attempts. Tested scenarios: 1) Hogwarts Legacy under Gamescope ~1500 calls per second to __dma_fence_unwrap_merge. Percentages per number of fences buckets, before and after checking for signalled status, sorting and flattening: N Before After 0 0.85% 1 69.80% -> The new fast path. 2-9 29.36% 9% (Ie. 91% of this bucket flattened to 1 fence) 10-19 20-40 50+ 2) Cyberpunk 2077 under Gamescope ~2400 calls per second. N Before After 0 0.71% 1 52.53% -> The new fast path. 2-9 44.38% 50.60% (Ie. half resolved to a single fence) 10-19 2.34% 20-40 0.06% 50+ 3) vkcube under Plasma 90 calls per second. N Before After 0 1 2-9 100% 0% (Ie. all resolved to a single fence) 10-19 20-40 50+ In the case of vkcube all invocations in the 2-9 bucket were actually just two input fences. v2: * Correct local variable name and hold on to unsignaled reference. (Chistian) Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Friedrich Vock --- drivers/dma-buf/dma-fence-unwrap.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-unwrap.c index 6345062731f1..2a059ac0ed27 100644 --- a/drivers/dma-buf/dma-fence-unwrap.c +++ b/drivers/dma-buf/dma-fence-unwrap.c @@ -84,8 +84,8 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, struct dma_fence **fences, struct dma_fence_unwrap *iter) { + struct dma_fence *tmp, *unsignaled = NULL, **array; struct dma_fence_array *result; - struct dma_fence *tmp, **array; ktime_t timestamp; int i, j, count; @@ -94,6 +94,8 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, for (i = 0; i < num_fences; ++i) { dma_fence_unwrap_for_each(tmp, &iter[i], fences[i]) { if (!dma_fence_is_signaled(tmp)) { + dma_fence_put(unsignaled); + unsignaled = dma_fence_get(tmp); ++count; } else { ktime_t t = dma_fence_timestamp(tmp); @@ -107,9 +109,16 @@ struct dma_fence *__dma_fence_unwrap_merge(unsigned int num_fences, /* * If we couldn't find a pending fence just return a private signaled * fence with the timestamp of the last signaled one. + * + * Or if there was a single unsignaled fence left we can return it + * directly and early since that is a major path on many workloads. */ if (count == 0) return dma_fence_allocate_private_stub(timestamp); + else if (count == 1) + return unsignaled; + + dma_fence_put(unsignaled); array = kmalloc_array(count, sizeof(*array), GFP_KERNEL); if (!array) From patchwork Fri Nov 15 10:21:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13876052 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6763BD6DDF1 for ; Fri, 15 Nov 2024 10:22:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 78A3710E836; Fri, 15 Nov 2024 10:22:01 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="PXaxHNCV"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id A77C910E835 for ; Fri, 15 Nov 2024 10:22:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=N1yj0jsAepEy9kVoLqQyfXKCQn4j4CPnFWv0BEBYmI0=; b=PXaxHNCVeg/poDKFKnWFEr1LL5 T2rwBy4JiNi6wPG6xca/TP3PsRI7/AFflnXeplwRCUBei/ux03kIKLbV+WQo2yx6SXFdy1o7+Z62k J4Ko+FWkVNPkIQoh3FQ1QqkqrtfwTu0B16Hi6BlLf7owxOeTCaDZaC2Ywxk1NJ/f+C7emoWyXbinJ NrqYGRsFCwyYY9lCGuOIZOcVa1YCbgMpo87k7hBDYV+XQnhRj/TPeN0C7/mgH6eSEmODbQ3p8Ejzn 0A4W6Rn80KofRKL2z0peg3uTMChdeQadu0junrMJJHyC6eVyL8VQQ84OQm+29yQcT/lRd0d3uYbYD Ubls1Kjg==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tBtSg-007EAk-Gn; Fri, 15 Nov 2024 11:21:58 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, =?utf-8?q?Christian_K=C3=B6nig?= , =?utf-8?q?Christian_K=C3=B6nig?= , Tvrtko Ursulin Subject: [PATCH 4/5] dma-buf: add selftest for fence order after merge Date: Fri, 15 Nov 2024 10:21:52 +0000 Message-ID: <20241115102153.1980-5-tursulin@igalia.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241115102153.1980-1-tursulin@igalia.com> References: <20241115102153.1980-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Christian König Add a test which double checks that fences are in the expected order after a merge. While at it also switch to using a mock array for the complex test instead of a merge. Signed-off-by: Christian König Signed-off-by: Tvrtko Ursulin Reviewed-by: Tvrtko Ursulin --- drivers/dma-buf/st-dma-fence-unwrap.c | 69 ++++++++++++++++++++++++++- 1 file changed, 68 insertions(+), 1 deletion(-) diff --git a/drivers/dma-buf/st-dma-fence-unwrap.c b/drivers/dma-buf/st-dma-fence-unwrap.c index f0cee984b6c7..876eabddb08f 100644 --- a/drivers/dma-buf/st-dma-fence-unwrap.c +++ b/drivers/dma-buf/st-dma-fence-unwrap.c @@ -304,6 +304,72 @@ static int unwrap_merge(void *arg) return err; } +static int unwrap_merge_order(void *arg) +{ + struct dma_fence *fence, *f1, *f2, *a1, *a2, *c1, *c2; + struct dma_fence_unwrap iter; + int err = 0; + + f1 = mock_fence(); + if (!f1) + return -ENOMEM; + + dma_fence_enable_sw_signaling(f1); + + f2 = mock_fence(); + if (!f2) { + dma_fence_put(f1); + return -ENOMEM; + } + + dma_fence_enable_sw_signaling(f2); + + a1 = mock_array(2, f1, f2); + if (!a1) + return -ENOMEM; + + c1 = mock_chain(NULL, dma_fence_get(f1)); + if (!c1) + goto error_put_a1; + + c2 = mock_chain(c1, dma_fence_get(f2)); + if (!c2) + goto error_put_a1; + + /* + * The fences in the chain are the same as in a1 but in oposite order, + * the dma_fence_merge() function should be able to handle that. + */ + a2 = dma_fence_unwrap_merge(a1, c2); + + dma_fence_unwrap_for_each(fence, &iter, a2) { + if (fence == f1) { + f1 = NULL; + if (!f2) + pr_err("Unexpected order!\n"); + } else if (fence == f2) { + f2 = NULL; + if (f1) + pr_err("Unexpected order!\n"); + } else { + pr_err("Unexpected fence!\n"); + err = -EINVAL; + } + } + + if (f1 || f2) { + pr_err("Not all fences seen!\n"); + err = -EINVAL; + } + + dma_fence_put(a2); + return err; + +error_put_a1: + dma_fence_put(a1); + return -ENOMEM; +} + static int unwrap_merge_complex(void *arg) { struct dma_fence *fence, *f1, *f2, *f3, *f4, *f5; @@ -327,7 +393,7 @@ static int unwrap_merge_complex(void *arg) goto error_put_f2; /* The resulting array has the fences in reverse */ - f4 = dma_fence_unwrap_merge(f2, f1); + f4 = mock_array(2, dma_fence_get(f2), dma_fence_get(f1)); if (!f4) goto error_put_f3; @@ -375,6 +441,7 @@ int dma_fence_unwrap(void) SUBTEST(unwrap_chain), SUBTEST(unwrap_chain_array), SUBTEST(unwrap_merge), + SUBTEST(unwrap_merge_order), SUBTEST(unwrap_merge_complex), }; From patchwork Fri Nov 15 10:21:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13876054 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 046A0D6DDF0 for ; Fri, 15 Nov 2024 10:22:07 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4991A10E83D; Fri, 15 Nov 2024 10:22:02 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="QML/eYUa"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0902210E835 for ; Fri, 15 Nov 2024 10:22:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=haEsapmNWnqpWWLObyxx1NShh+D3I8+2RUxr+TIQDDA=; b=QML/eYUapBRh8s5xWaV2/Pf+3X /PvC72qIYjxAOcrjKTlau3hV3gakSgP/m7ZiXOV3VSDhbQYqD4/+zLq08bV0KY3paWb8aZTJhp2BJ 5xpYl2g2gQcSrAHBImGwfB/2GHpcCvifEJLn85Tz71oSBZk8NrEOESvfd1FR1KzSU4xTDP4VlZw9y 9AznezHjIAO9ybuuod9wIA2HOznevpac4DhSH/9VczbyrqkkVOzA1OIg/9XLICiElwgUfjpw9ckQL Wd8hX3pOjsKOJvd+/9PouXT5H0g150QeX1XROMxwxsu1kluy2CmDgrh6eHW+FMTOL/Hzd4SZCtJCm 02Pz9OgA==; Received: from [90.241.98.187] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1tBtSh-007EAs-7E; Fri, 15 Nov 2024 11:21:59 +0100 From: Tvrtko Ursulin To: dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH 5/5] dma-fence: Add some more fence-merge-unwrap tests Date: Fri, 15 Nov 2024 10:21:53 +0000 Message-ID: <20241115102153.1980-6-tursulin@igalia.com> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241115102153.1980-1-tursulin@igalia.com> References: <20241115102153.1980-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin So far all tests use seqno one and only vary the context. Lets add some tests which vary the seqno too. Signed-off-by: Tvrtko Ursulin Cc: Christian König --- drivers/dma-buf/st-dma-fence-unwrap.c | 199 +++++++++++++++++++++++++- 1 file changed, 196 insertions(+), 3 deletions(-) diff --git a/drivers/dma-buf/st-dma-fence-unwrap.c b/drivers/dma-buf/st-dma-fence-unwrap.c index 876eabddb08f..a3be888ae2e8 100644 --- a/drivers/dma-buf/st-dma-fence-unwrap.c +++ b/drivers/dma-buf/st-dma-fence-unwrap.c @@ -28,7 +28,7 @@ static const struct dma_fence_ops mock_ops = { .get_timeline_name = mock_name, }; -static struct dma_fence *mock_fence(void) +static struct dma_fence *__mock_fence(u64 context, u64 seqno) { struct mock_fence *f; @@ -37,12 +37,16 @@ static struct dma_fence *mock_fence(void) return NULL; spin_lock_init(&f->lock); - dma_fence_init(&f->base, &mock_ops, &f->lock, - dma_fence_context_alloc(1), 1); + dma_fence_init(&f->base, &mock_ops, &f->lock, context, seqno); return &f->base; } +static struct dma_fence *mock_fence(void) +{ + return __mock_fence(dma_fence_context_alloc(1), 1); +} + static struct dma_fence *mock_array(unsigned int num_fences, ...) { struct dma_fence_array *array; @@ -304,6 +308,111 @@ static int unwrap_merge(void *arg) return err; } +static int unwrap_merge_duplicate(void *arg) +{ + struct dma_fence *fence, *f1, *f2; + struct dma_fence_unwrap iter; + int err = 0; + + f1 = mock_fence(); + if (!f1) + return -ENOMEM; + + dma_fence_enable_sw_signaling(f1); + + f2 = dma_fence_unwrap_merge(f1, f1); + if (!f2) { + err = -ENOMEM; + goto error_put_f1; + } + + dma_fence_unwrap_for_each(fence, &iter, f2) { + if (fence == f1) { + dma_fence_put(f1); + f1 = NULL; + } else { + pr_err("Unexpected fence!\n"); + err = -EINVAL; + } + } + + if (f1) { + pr_err("Not all fences seen!\n"); + err = -EINVAL; + } + + dma_fence_put(f2); +error_put_f1: + dma_fence_put(f1); + return err; +} + +static int unwrap_merge_seqno(void *arg) +{ + struct dma_fence *fence, *f1, *f2, *f3, *f4; + struct dma_fence_unwrap iter; + int err = 0; + u64 ctx[2]; + + ctx[0] = dma_fence_context_alloc(1); + ctx[1] = dma_fence_context_alloc(1); + + f1 = __mock_fence(ctx[1], 1); + if (!f1) + return -ENOMEM; + + dma_fence_enable_sw_signaling(f1); + + f2 = __mock_fence(ctx[1], 2); + if (!f2) { + err = -ENOMEM; + goto error_put_f1; + } + + dma_fence_enable_sw_signaling(f2); + + f3 = __mock_fence(ctx[0], 1); + if (!f3) { + err = -ENOMEM; + goto error_put_f2; + } + + dma_fence_enable_sw_signaling(f3); + + f4 = dma_fence_unwrap_merge(f1, f2, f3); + if (!f4) { + err = -ENOMEM; + goto error_put_f3; + } + + dma_fence_unwrap_for_each(fence, &iter, f4) { + if (fence == f3 && f2) { + dma_fence_put(f3); + f3 = NULL; + } else if (fence == f2 && !f3) { + dma_fence_put(f2); + f2 = NULL; + } else { + pr_err("Unexpected fence!\n"); + err = -EINVAL; + } + } + + if (f2 || f3) { + pr_err("Not all fences seen!\n"); + err = -EINVAL; + } + + dma_fence_put(f4); +error_put_f3: + dma_fence_put(f3); +error_put_f2: + dma_fence_put(f2); +error_put_f1: + dma_fence_put(f1); + return err; +} + static int unwrap_merge_order(void *arg) { struct dma_fence *fence, *f1, *f2, *a1, *a2, *c1, *c2; @@ -433,6 +542,87 @@ static int unwrap_merge_complex(void *arg) return err; } +static int unwrap_merge_complex_seqno(void *arg) +{ + struct dma_fence *fence, *f1, *f2, *f3, *f4, *f5, *f6, *f7; + struct dma_fence_unwrap iter; + int err = -ENOMEM; + u64 ctx[2]; + + ctx[0] = dma_fence_context_alloc(1); + ctx[1] = dma_fence_context_alloc(1); + + f1 = __mock_fence(ctx[0], 2); + if (!f1) + return -ENOMEM; + + dma_fence_enable_sw_signaling(f1); + + f2 = __mock_fence(ctx[1], 1); + if (!f2) + goto error_put_f1; + + dma_fence_enable_sw_signaling(f2); + + f3 = __mock_fence(ctx[0], 1); + if (!f3) + goto error_put_f2; + + dma_fence_enable_sw_signaling(f3); + + f4 = __mock_fence(ctx[1], 2); + if (!f4) + goto error_put_f3; + + dma_fence_enable_sw_signaling(f4); + + f5 = mock_array(2, dma_fence_get(f1), dma_fence_get(f2)); + if (!f5) + goto error_put_f4; + + f6 = mock_array(2, dma_fence_get(f3), dma_fence_get(f4)); + if (!f6) + goto error_put_f5; + + f7 = dma_fence_unwrap_merge(f5, f6); + if (!f7) + goto error_put_f6; + + err = 0; + dma_fence_unwrap_for_each(fence, &iter, f7) { + if (fence == f1 && f4) { + dma_fence_put(f1); + f1 = NULL; + } else if (fence == f4 && !f1) { + dma_fence_put(f4); + f4 = NULL; + } else { + pr_err("Unexpected fence!\n"); + err = -EINVAL; + } + } + + if (f1 || f4) { + pr_err("Not all fences seen!\n"); + err = -EINVAL; + } + + dma_fence_put(f7); +error_put_f6: + dma_fence_put(f6); +error_put_f5: + dma_fence_put(f5); +error_put_f4: + dma_fence_put(f4); +error_put_f3: + dma_fence_put(f3); +error_put_f2: + dma_fence_put(f2); +error_put_f1: + dma_fence_put(f1); + return err; +} + int dma_fence_unwrap(void) { static const struct subtest tests[] = { @@ -441,8 +631,11 @@ int dma_fence_unwrap(void) SUBTEST(unwrap_chain), SUBTEST(unwrap_chain_array), SUBTEST(unwrap_merge), + SUBTEST(unwrap_merge_duplicate), + SUBTEST(unwrap_merge_seqno), SUBTEST(unwrap_merge_order), SUBTEST(unwrap_merge_complex), + SUBTEST(unwrap_merge_complex_seqno), }; return subtests(tests, NULL);