From patchwork Mon Feb 26 17:57:11 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lionel Landwerlin X-Patchwork-Id: 10242909 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 49831602DC for ; Mon, 26 Feb 2018 17:57:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 313392A209 for ; Mon, 26 Feb 2018 17:57:25 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 25C112A210; Mon, 26 Feb 2018 17:57:25 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 7A1432A209 for ; Mon, 26 Feb 2018 17:57:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0C0B46E50F; Mon, 26 Feb 2018 17:57:23 +0000 (UTC) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6F8376E4EF for ; Mon, 26 Feb 2018 17:57:21 +0000 (UTC) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 26 Feb 2018 09:57:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.47,397,1515484800"; d="scan'208";a="203884200" Received: from delly.ld.intel.com ([10.103.238.202]) by orsmga005.jf.intel.com with ESMTP; 26 Feb 2018 09:57:20 -0800 From: Lionel Landwerlin To: intel-gfx@lists.freedesktop.org Date: Mon, 26 Feb 2018 17:57:11 +0000 Message-Id: <20180226175711.15380-7-lionel.g.landwerlin@intel.com> X-Mailer: git-send-email 2.16.1 In-Reply-To: <20180226175711.15380-1-lionel.g.landwerlin@intel.com> References: <20180226175711.15380-1-lionel.g.landwerlin@intel.com> Subject: [Intel-gfx] [PATCH v15 6/6] drm/i915: expose rcs topology through query uAPI X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Virus-Scanned: ClamAV using ClamSMTP With the introduction of asymmetric slices in CNL, we cannot rely on the previous SUBSLICE_MASK getparam to tell userspace what subslices are available. Here we introduce a more detailed way of querying the Gen's GPU topology that doesn't aggregate numbers. This is essential for monitoring parts of the GPU with the OA unit, because counters need to be normalized to the number of EUs/subslices/slices. The current aggregated numbers like EU_TOTAL do not gives us sufficient information. The Mesa series making use of this API is : https://patchwork.freedesktop.org/series/38795/ As a bonus we can draw representations of the GPU : https://imgur.com/a/vuqpa v2: Rename uapi struct s/_mask/_info/ (Tvrtko) Report max_slice/subslice/eus_per_subslice rather than strides (Tvrtko) Add uapi macros to read data from *_info structs (Tvrtko) v3: Use !!(v & DRM_I915_BIT()) for uapi macros instead of custom shifts (Tvrtko) v4: factorize query item writting (Tvrtko) tweak uapi struct/define names (Tvrtko) v5: Replace ALIGN() macro (Chris) v6: Updated uapi comments (Tvrtko) Moved flags != 0 checks into vfuncs (Tvrtko) v7: Use access_ok() before copying anything, to avoid overflows (Chris) Switch BUG_ON() to GEM_WARN_ON() (Tvrtko) v8: Tweak uapi comments style to match the coding style (Lionel) v9: Fix error in comment about computation of enabled subslice (Tvrtko) v10: Fix/update comments in uAPI (Sagar) v11: Drop drm_i915_query_(slice|subslice|eu)_info in favor of a single drm_i915_query_topology_info (Joonas) v12: Add subslice_stride/eu_stride in drm_i915_query_topology_info (Joonas) Signed-off-by: Lionel Landwerlin Acked-by: Chris Wilson --- drivers/gpu/drm/i915/i915_query.c | 82 +++++++++++++++++++++++++++++++++++++++ include/uapi/drm/i915_drm.h | 62 +++++++++++++++++++++++++++++ 2 files changed, 144 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_query.c b/drivers/gpu/drm/i915/i915_query.c index c23bfc9be617..2f11fabb6aff 100644 --- a/drivers/gpu/drm/i915/i915_query.c +++ b/drivers/gpu/drm/i915/i915_query.c @@ -7,8 +7,90 @@ #include "i915_drv.h" #include +static int query_topology_info(struct drm_i915_private *dev_priv, + struct drm_i915_query_item *query_item) +{ + const struct sseu_dev_info *sseu = &INTEL_INFO(dev_priv)->sseu; + struct drm_i915_query_topology_info topo_info; + u32 slice_length, subslice_length, eu_length, total_length; + + if (query_item->flags != 0) + return -EINVAL; + + if (sseu->max_slices == 0) + return -ENODEV; + + /* + * If we ever change the internal slice mask data type, we'll need to + * update this function. + */ + BUILD_BUG_ON(sizeof(u8) != sizeof(sseu->slice_mask)); + + slice_length = sizeof(sseu->slice_mask); + subslice_length = sseu->max_slices * + DIV_ROUND_UP(sseu->max_subslices, + sizeof(sseu->subslice_mask[0]) * BITS_PER_BYTE); + eu_length = sseu->max_slices * sseu->max_subslices * + DIV_ROUND_UP(sseu->max_eus_per_subslice, BITS_PER_BYTE); + + total_length = sizeof(topo_info) + slice_length + subslice_length + eu_length; + + if (query_item->length == 0) + return total_length; + + if (query_item->length < total_length) + return -EINVAL; + + if (copy_from_user(&topo_info, u64_to_user_ptr(query_item->data_ptr), + sizeof(topo_info))) + return -EFAULT; + + if (topo_info.flags != 0) + return -EINVAL; + + if (!access_ok(VERIFY_WRITE, u64_to_user_ptr(query_item->data_ptr), + total_length)) + return -EFAULT; + + memset(&topo_info, 0, sizeof(topo_info)); + topo_info.max_slices = sseu->max_slices; + topo_info.max_subslices = sseu->max_subslices; + topo_info.max_eus_per_subslice = sseu->max_eus_per_subslice; + + topo_info.subslice_offset = slice_length; + topo_info.subslice_stride = DIV_ROUND_UP(sseu->max_subslices, + BITS_PER_BYTE); + topo_info.eu_offset = slice_length + subslice_length; + topo_info.eu_stride = DIV_ROUND_UP(sseu->max_eus_per_subslice, + BITS_PER_BYTE); + + if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr), + &topo_info, sizeof(topo_info))) + return -EFAULT; + + if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr + + sizeof(topo_info)), + &sseu->slice_mask, slice_length)) + return -EFAULT; + + if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr + + sizeof(topo_info) + + slice_length), + sseu->subslice_mask, subslice_length)) + return -EFAULT; + + if (__copy_to_user(u64_to_user_ptr(query_item->data_ptr + + sizeof(topo_info) + + slice_length + subslice_length), + sseu->eu_mask, eu_length)) + return -EFAULT; + + return total_length; +} + static int (* const i915_query_funcs[])(struct drm_i915_private *dev_priv, struct drm_i915_query_item *query_item) = { + query_topology_info, }; int i915_query_ioctl(struct drm_device *dev, void *data, struct drm_file *file) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 1cfb5038c60d..5d80573a6ba6 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1619,6 +1619,7 @@ struct drm_i915_perf_oa_config { struct drm_i915_query_item { __u64 query_id; +#define DRM_I915_QUERY_TOPOLOGY_INFO 1 /* * When set to zero by userspace, this is filled with the size of the @@ -1655,6 +1656,67 @@ struct drm_i915_query { __u64 items_ptr; }; +/* + * Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO : + * + * data: contains the 3 pieces of information : + * + * - the slice mask with one bit per slice telling whether a slice is + * available. The availability of slice X can be queried with the following + * formula : + * + * (data[X / 8] >> (X % 8)) & 1 + * + * - the subslice mask for each slice with one bit per subslice telling + * whether a subslice is available. The availability of subslice Y in slice + * X can be queried with the following formula : + * + * (data[subslice_offset + + * X * subslice_stride + + * Y / 8] >> (Y % 8)) & 1 + * + * - the EU mask for each subslice in each slice with one bit per EU telling + * whether an EU is available. The availability of EU Z in subslice Y in + * slice X can be queried with the following formula : + * + * (data[eu_offset + + * (X * max_subslices Y) * eu_stride + + * Z / 8] >> (Z % 8)) & 1 + */ +struct drm_i915_query_topology_info { + /* + * Unused for now. + */ + __u16 flags; + + __u16 max_slices; + __u16 max_subslices; + __u16 max_eus_per_subslice; + + /* + * Offset in data[] at which the subslice masks are stored. + */ + __u16 subslice_offset; + + /* + * Stride at which each of the subslice masks for each slice are + * stored. + */ + __u16 subslice_stride; + + /* + * Offset in data[] at which the EU masks are stored. + */ + __u16 eu_offset; + + /* + * Stride at which each of the EU masks for each subslice are stored. + */ + __u16 eu_stride; + + __u8 data[]; +}; + #if defined(__cplusplus) } #endif