From patchwork Thu May 6 17:30:48 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Brost
X-Patchwork-Id: 12242831
From: Matthew Brost
Date: Thu, 6 May 2021 10:30:48 -0700
Message-Id: <20210506173049.72503-5-matthew.brost@intel.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20210506173049.72503-1-matthew.brost@intel.com>
References: <20210506173049.72503-1-matthew.brost@intel.com>
Subject: [Intel-gfx] [RFC PATCH 4/5] drm/i915: Introduce 'set parallel submit' extension
List-Id: Intel graphics driver community testing & development
Cc: carl.zhang@intel.com, jason.ekstrand@intel.com, daniel.vetter@intel.com

i915_drm.h updates for 'set parallel submit' extension.
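
For illustration only, and not part of the patch: a minimal userspace-side
sketch of how the proposed extension block might be filled in and sanity
checked, assuming the struct layout and flag values from this RFC. The
`*_sketch` types and both helper functions are hypothetical stand-ins; the real
definitions would come from i915_drm.h, and the kernel's own validation is not
shown in this patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from this RFC; they are assumptions, not released uAPI. */
#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2
#define I915_PARALLEL_IMPLICT_BONDS (1<<0)
#define I915_PARALLEL_NO_PREEMPT_MID_BATCH (1<<1)
/* Every bit above the highest defined flag. */
#define I915_PARALLEL_UNKNOWN_FLAGS (-(I915_PARALLEL_NO_PREEMPT_MID_BATCH << 1))

/* Hypothetical local mirror of struct i915_user_extension. */
struct user_extension_sketch {
	uint64_t next_extension;
	uint32_t name;
	uint32_t flags;
	uint32_t rsvd[4];
};

/* Hypothetical local mirror of the struct proposed in this patch. */
struct parallel_submit_sketch {
	struct user_extension_sketch base;
	uint64_t flags;		/* all undefined flags must be zero */
	uint64_t mbz64[4];	/* reserved for future use; must be zero */
} __attribute__((packed));

/* Zero the whole block, then set the extension name and requested flags. */
static void parallel_submit_init(struct parallel_submit_sketch *ext,
				 uint64_t flags)
{
	memset(ext, 0, sizeof(*ext));
	ext->base.name = I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT;
	ext->flags = flags;
}

/* Check the "must be zero" rules stated in the struct comments. */
static bool parallel_submit_valid(const struct parallel_submit_sketch *ext)
{
	size_t i;

	if (ext->flags & I915_PARALLEL_UNKNOWN_FLAGS)
		return false;
	for (i = 0; i < 4; i++)
		if (ext->mbz64[i])
			return false;
	return true;
}
```

A caller would initialise the block with, for example,
I915_PARALLEL_IMPLICT_BONDS, link it into the `extensions` chain of the engines
parameter, and only then issue the context setparam call. Zeroing the whole
block first keeps the reserved fields compliant if the struct later grows.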
Cc: Tvrtko Ursulin
Cc: Tony Ye
Cc: Carl Zhang
Cc: Daniel Vetter
Cc: Jason Ekstrand
Signed-off-by: Matthew Brost
---
 include/uapi/drm/i915_drm.h | 126 ++++++++++++++++++++++++++++++++++++
 1 file changed, 126 insertions(+)

diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 26d2e135aa31..0175b12b33b8 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -1712,6 +1712,7 @@ struct drm_i915_gem_context_param {
  * Extensions:
  *   i915_context_engines_load_balance (I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE)
  *   i915_context_engines_bond (I915_CONTEXT_ENGINES_EXT_BOND)
+ *   i915_context_engines_parallel_submit (I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT)
  */
 #define I915_CONTEXT_PARAM_ENGINES	0xa
 
@@ -1894,9 +1895,134 @@ struct i915_context_param_engines {
 	__u64 extensions; /* linked chain of extension blocks, 0 terminates */
 #define I915_CONTEXT_ENGINES_EXT_LOAD_BALANCE 0 /* see i915_context_engines_load_balance */
 #define I915_CONTEXT_ENGINES_EXT_BOND 1 /* see i915_context_engines_bond */
+#define I915_CONTEXT_ENGINES_EXT_PARALLEL_SUBMIT 2 /* see i915_context_engines_parallel_submit */
 	struct i915_engine_class_instance engines[0];
 } __attribute__((packed));
 
+/*
+ * i915_context_engines_parallel_submit:
+ *
+ * Setup a gem context to allow multiple BBs to be submitted in a single execbuf
+ * IOCTL. Those BBs will then be scheduled to run on the GPU in parallel.
+ *
+ * All hardware contexts in the engine set are configured for parallel
+ * submission (i.e. once this gem context is configured for parallel submission,
+ * all the hardware contexts, regardless of whether a BB is available on each
+ * individual context, will be submitted to the GPU in parallel). A user can
+ * submit BBs to a subset of the hardware contexts, in a single execbuf IOCTL,
+ * but it is not recommended as it may reserve physical engines with nothing to
+ * run on them. It is highly recommended to configure the gem context with N
+ * hardware contexts then always submit N BBs in a single IOCTL.
+ *
+ * There are two currently defined ways to control the placement of the
+ * hardware contexts on physical engines: default behavior (no flags) and
+ * I915_PARALLEL_IMPLICT_BONDS (a flag). More flags may be added in the
+ * future as new hardware / use cases arise. Details of how to use this
+ * interface are given below, in the comments above the flags.
+ *
+ * Returns -EINVAL if the hardware context placement configuration is invalid or
+ * if the placement configuration isn't supported on the platform / submission
+ * interface.
+ * Returns -ENODEV if the extension isn't supported on the platform / submission
+ * interface.
+ */
+struct i915_context_engines_parallel_submit {
+	struct i915_user_extension base;
+
+/*
+ * Default placement behavior (currently unsupported):
+ *
+ * Rather than restricting parallel submission to a single class with a
+ * logically contiguous placement (I915_PARALLEL_IMPLICT_BONDS), add a mode that
+ * enables parallel submission across multiple engine classes. In this case each
+ * context's logical engine mask indicates where that context can be placed. It
+ * is implied in this mode that all contexts have mutually exclusive placement
+ * (e.g. if one context is running on CS0, no other contexts can run on CS0).
+ *
+ * Example 1 pseudo code:
+ * CSX[Y] = engine class X, logical instance Y
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID, INVALID)
+ * set_load_balance(engine_index=0, num_siblings=2, engines=CS0[0],CS0[1])
+ * set_load_balance(engine_index=1, num_siblings=2, engines=CS1[0],CS1[1])
+ * set_parallel()
+ *
+ * Results in the following valid placements:
+ * CS0[0], CS1[0]
+ * CS0[0], CS1[1]
+ * CS0[1], CS1[0]
+ * CS0[1], CS1[1]
+ *
+ * Example 2 pseudo code:
+ * CS[X] = generic engine of same class, logical instance X
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID, INVALID)
+ * set_load_balance(engine_index=0, num_siblings=3, engines=CS[0],CS[1],CS[2])
+ * set_load_balance(engine_index=1, num_siblings=3, engines=CS[0],CS[1],CS[2])
+ * set_parallel()
+ *
+ * Results in the following valid placements:
+ * CS[0], CS[1]
+ * CS[0], CS[2]
+ * CS[1], CS[0]
+ * CS[1], CS[2]
+ * CS[2], CS[0]
+ * CS[2], CS[1]
+ *
+ * This enables a use case where all engines are created equal; we don't care
+ * where they are scheduled, we just want a certain number of resources, for
+ * those resources to be scheduled in parallel, and possibly across multiple
+ * engine classes.
+ */
+
+/*
+ * I915_PARALLEL_IMPLICT_BONDS - Create implicit bonds between each context.
+ * Each context must have the same number of siblings, and bonds are implicitly
+ * created between the siblings.
+ *
+ * All of the below examples are in logical space.
+ *
+ * Example 1 pseudo code:
+ * CS[X] = generic engine of same class, logical instance X
+ * set_engines(CS[0], CS[1])
+ * set_parallel(flags=I915_PARALLEL_IMPLICT_BONDS)
+ *
+ * Results in the following valid placements:
+ * CS[0], CS[1]
+ *
+ * Example 2 pseudo code:
+ * CS[X] = generic engine of same class, logical instance X
+ * INVALID = I915_ENGINE_CLASS_INVALID, I915_ENGINE_CLASS_INVALID_NONE
+ * set_engines(INVALID, INVALID)
+ * set_load_balance(engine_index=0, num_siblings=2, engines=CS[0],CS[2])
+ * set_load_balance(engine_index=1, num_siblings=2, engines=CS[1],CS[3])
+ * set_parallel(flags=I915_PARALLEL_IMPLICT_BONDS)
+ *
+ * Results in the following valid placements:
+ * CS[0], CS[1]
+ * CS[2], CS[3]
+ *
+ * This enables a use case where all engines are not equal and certain placement
+ * rules are required (e.g. split-frame requires all contexts to be placed in a
+ * logically contiguous order on the VCS engines on gen11+ platforms). This use
+ * case (logically contiguous placement, within a single engine class) is
+ * supported when using GuC submission. Execlist mode could support all possible
+ * bonding configurations but currently doesn't support this extension.
+ */
+#define I915_PARALLEL_IMPLICT_BONDS (1<<0)
+/*
+ * Do not allow BBs to be preempted mid-BB; rather, insert coordinated
+ * preemption points on all hardware contexts between each set of BBs. An
+ * example use case of this feature is split-frame on gen11+ hardware. When using
+ * this feature a BB must be submitted on each hardware context in the parallel
+ * gem context. The execbuf2 IOCTL enforces that the user follows this policy.
+ */
+#define I915_PARALLEL_NO_PREEMPT_MID_BATCH (1<<1)
+#define I915_PARALLEL_UNKNOWN_FLAGS (-(I915_PARALLEL_NO_PREEMPT_MID_BATCH << 1))
+	__u64 flags; /* all undefined flags must be zero */
+	__u64 mbz64[4]; /* reserved for future use; must be zero */
+} __attribute__ ((packed));
+
 #define I915_DEFINE_CONTEXT_PARAM_ENGINES(name__, N__) struct { \
 	__u64 extensions; \
 	struct i915_engine_class_instance engines[N__]; \