From patchwork Wed Jul  1 09:25:12 2015
X-Patchwork-Submitter: ankitprasad.r.sharma@intel.com
X-Patchwork-Id: 6701741
From: ankitprasad.r.sharma@intel.com
To: intel-gfx@lists.freedesktop.org
Date: Wed, 1 Jul 2015 14:55:12 +0530
Message-Id: <1435742713-3014-2-git-send-email-ankitprasad.r.sharma@intel.com>
In-Reply-To: <1435742713-3014-1-git-send-email-ankitprasad.r.sharma@intel.com>
References: <1435742713-3014-1-git-send-email-ankitprasad.r.sharma@intel.com>
Cc: akash.goel@intel.com,
 Rodrigo Vivi, shashidhar.hiremath@intel.com
Subject: [Intel-gfx] [PATCH 1/2] drm/i915: Extend GET_APERTURE ioctl to
 report available map space

From: Rodrigo Vivi

When constructing a batchbuffer, it is sometimes crucial to know the
largest hole into which we can fit a fenceable buffer (for example when
handling very large objects on gen2 and gen3). This depends on the
fragmentation of pinned buffers inside the aperture, a question only the
kernel can easily answer.

This patch extends the current DRM_I915_GEM_GET_APERTURE ioctl to
include a couple of new fields in its reply to userspace - the total
amount of space available in the mappable region of the aperture and
also the single largest block available.

This is not quite what userspace wants to answer the question of whether
this batch will fit, as fences are also required to meet severe
alignment constraints within the batch. For this purpose, a third,
conservative estimate of the largest fence available is also provided.
For when userspace needs more than one batch, we also provide the
cumulative space available for fences, so that it has some additional
guidance as to how much space it could allocate to fences. Conservatism
still wins.

The patch also adds a debugfs file for convenient testing and reporting.

v2: The first object cannot end at offset 0, so we can use last == 0 to
detect the empty list.

v3: Expand all values to 64 bit, just in case.
Report total mappable aperture size for userspace that cannot easily
determine it by inspecting the PCI device.

v4: (Rodrigo) Fixed rebase conflicts.

v5: Rebased to the latest drm-intel-nightly (Ankit)

Signed-off-by: Chris Wilson
Signed-off-by: Rodrigo Vivi
---
 drivers/gpu/drm/i915/i915_debugfs.c |  27 +++++++++
 drivers/gpu/drm/i915/i915_gem.c     | 116 ++++++++++++++++++++++++++++++++++--
 2 files changed, 139 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 31d8768..49ec438 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -512,6 +512,32 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	return 0;
 }
 
+static int i915_gem_aperture_info(struct seq_file *m, void *data)
+{
+	struct drm_info_node *node = m->private;
+	struct drm_i915_gem_get_aperture arg;
+	int ret;
+
+	ret = i915_gem_get_aperture_ioctl(node->minor->dev, &arg, NULL);
+	if (ret)
+		return ret;
+
+	seq_printf(m, "Total size of the GTT: %llu bytes\n",
+		   arg.aper_size);
+	seq_printf(m, "Available space in the GTT: %llu bytes\n",
+		   arg.aper_available_size);
+	seq_printf(m, "Available space in the mappable aperture: %llu bytes\n",
+		   arg.map_available_size);
+	seq_printf(m, "Single largest space in the mappable aperture: %llu bytes\n",
+		   arg.map_largest_size);
+	seq_printf(m, "Available space for fences: %llu bytes\n",
+		   arg.fence_available_size);
+	seq_printf(m, "Single largest fence available: %llu bytes\n",
+		   arg.fence_largest_size);
+
+	return 0;
+}
+
 static int i915_gem_gtt_info(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = m->private;
@@ -5030,6 +5056,7 @@ static int i915_debugfs_create(struct dentry *root,
 static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_capabilities", i915_capabilities, 0},
 	{"i915_gem_objects", i915_gem_object_info, 0},
+	{"i915_gem_aperture", i915_gem_aperture_info, 0},
 	{"i915_gem_gtt", i915_gem_gtt_info, 0},
 	{"i915_gem_pinned", i915_gem_gtt_info, 0, (void *) PINNED_LIST},
 	{"i915_gem_active", i915_gem_object_list_info, 0, (void *) ACTIVE_LIST},
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a2a4a27..ccfc8d3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -32,6 +32,7 @@
 #include "i915_vgpu.h"
 #include "i915_trace.h"
 #include "intel_drv.h"
+#include <linux/list_sort.h>
 #include
 #include
 #include
@@ -143,6 +144,55 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
+static inline bool
+i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
+{
+	return i915_gem_obj_bound_any(obj) && !obj->active;
+}
+
+static int obj_rank_by_ggtt(void *priv,
+			    struct list_head *A,
+			    struct list_head *B)
+{
+	struct drm_i915_gem_object *a = list_entry(A, typeof(*a), obj_exec_link);
+	struct drm_i915_gem_object *b = list_entry(B, typeof(*b), obj_exec_link);
+
+	return i915_gem_obj_ggtt_offset(a) - i915_gem_obj_ggtt_offset(b);
+}
+
+static u32 __fence_size(struct drm_i915_private *dev_priv, u32 start, u32 end)
+{
+	u32 size = end - start;
+	u32 fence_size;
+
+	if (INTEL_INFO(dev_priv)->gen < 4) {
+		u32 fence_max;
+		u32 fence_next;
+
+		if (IS_GEN3(dev_priv)) {
+			fence_max = I830_FENCE_MAX_SIZE_VAL << 20;
+			fence_next = 1024*1024;
+		} else {
+			fence_max = I830_FENCE_MAX_SIZE_VAL << 19;
+			fence_next = 512*1024;
+		}
+
+		fence_max = min(fence_max, size);
+		fence_size = 0;
+		while (fence_next <= fence_max) {
+			u32 base = ALIGN(start, fence_next);
+			if (base + fence_next > end)
+				break;
+
+			fence_size = fence_next;
+			fence_next <<= 1;
+		}
+	} else
+		fence_size = size;
+
+	return fence_size;
+}
+
 int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
@@ -150,17 +200,75 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_get_aperture *args = data;
 	struct drm_i915_gem_object *obj;
-	size_t pinned;
+	struct list_head map_list;
+	const u32 map_limit = dev_priv->gtt.mappable_end;
+	size_t pinned, map_space, map_largest, fence_space, fence_largest;
+	u32 last, size;
+
+	INIT_LIST_HEAD(&map_list);
 
 	pinned = 0;
+	map_space = map_largest = 0;
+	fence_space = fence_largest = 0;
+
 	mutex_lock(&dev->struct_mutex);
-	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list)
-		if (i915_gem_obj_is_pinned(obj))
-			pinned += i915_gem_obj_ggtt_size(obj);
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, global_list) {
+		struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
+
+		if (vma == NULL || !vma->pin_count)
+			continue;
+
+		pinned += vma->node.size;
+
+		if (vma->node.start < map_limit)
+			list_add(&obj->obj_exec_link, &map_list);
+	}
+
+	last = 0;
+	list_sort(NULL, &map_list, obj_rank_by_ggtt);
+	while (!list_empty(&map_list)) {
+		struct i915_vma *vma;
+
+		obj = list_first_entry(&map_list, typeof(*obj), obj_exec_link);
+		list_del_init(&obj->obj_exec_link);
+
+		vma = i915_gem_obj_to_ggtt(obj);
+		if (last == 0)
+			goto skip_first;
+
+		size = vma->node.start - last;
+		if (size > map_largest)
+			map_largest = size;
+		map_space += size;
+
+		size = __fence_size(dev_priv, last, vma->node.start);
+		if (size > fence_largest)
+			fence_largest = size;
+		fence_space += size;
+
+skip_first:
+		last = vma->node.start + vma->node.size;
+	}
+	if (last < map_limit) {
+		size = map_limit - last;
+		if (size > map_largest)
+			map_largest = size;
+		map_space += size;
+
+		size = __fence_size(dev_priv, last, map_limit);
+		if (size > fence_largest)
+			fence_largest = size;
+		fence_space += size;
+	}
 	mutex_unlock(&dev->struct_mutex);
 
 	args->aper_size = dev_priv->gtt.base.total;
 	args->aper_available_size = args->aper_size - pinned;
+	args->map_available_size = map_space;
+	args->map_largest_size = map_largest;
+	args->map_total_size = dev_priv->gtt.mappable_end;
+	args->fence_available_size = fence_space;
+	args->fence_largest_size = fence_largest;
 
 	return 0;
 }