From patchwork Mon Dec 14 05:46:04 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: ankitprasad.r.sharma@intel.com X-Patchwork-Id: 7841051 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 43FE39F387 for ; Mon, 14 Dec 2015 06:05:27 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 463AE20546 for ; Mon, 14 Dec 2015 06:05:26 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id 404212054B for ; Mon, 14 Dec 2015 06:05:25 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id AE2CF6E02D; Sun, 13 Dec 2015 22:05:24 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTP id 1FD736E02D for ; Sun, 13 Dec 2015 22:05:23 -0800 (PST) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP; 13 Dec 2015 22:05:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.20,426,1444719600"; d="scan'208";a="872985966" Received: from ankitprasad-desktop.iind.intel.com ([10.223.82.145]) by fmsmga002.fm.intel.com with ESMTP; 13 Dec 2015 22:05:22 -0800 From: ankitprasad.r.sharma@intel.com To: intel-gfx@lists.freedesktop.org Date: Mon, 14 Dec 2015 11:16:04 +0530 Message-Id: <1450071971-30321-3-git-send-email-ankitprasad.r.sharma@intel.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1450071971-30321-1-git-send-email-ankitprasad.r.sharma@intel.com> References: <1450071971-30321-1-git-send-email-ankitprasad.r.sharma@intel.com> Cc: Ankitprasad Sharma , akash.goel@intel.com, shashidhar.hiremath@intel.com Subject: [Intel-gfx] [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ankitprasad Sharma In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First, we try a nonblocking pin for the whole object (since that is fastest if reused), then failing that we try to grab one page in the mappable aperture. It also allows us to handle objects larger than the mappable aperture (e.g. if we need to pwrite with vGPU restricting the aperture to a measely 8MiB or something like that). v2: Pin pages before starting pwrite, Combined duplicate loops (Chris) v3: Combined loops based on local patch by Chris (Chris) v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris) Signed-off-by: Ankitprasad Sharma Signed-off-by: Chris Wilson --- drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++----------- 1 file changed, 64 insertions(+), 22 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index bf7f203..46c1e75 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) return obj->pin_display; } +static int +i915_gem_insert_node_in_range(struct drm_i915_private *i915, + struct drm_mm_node *node, u64 size, + unsigned alignment, u64 start, u64 end) +{ + int ret; + + ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node, + size, alignment, 0, start, + end, DRM_MM_SEARCH_DEFAULT, + DRM_MM_SEARCH_DEFAULT); + + return ret; +} + /* some bookkeeping */ static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv, size_t size) @@ -760,20 +775,29 @@ fast_user_write(struct io_mapping *mapping, * user into the GTT, uncached. */ static int -i915_gem_gtt_pwrite_fast(struct drm_device *dev, +i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915, struct drm_i915_gem_object *obj, struct drm_i915_gem_pwrite *args, struct drm_file *file) { - struct drm_i915_private *dev_priv = dev->dev_private; - ssize_t remain; - loff_t offset, page_base; + struct drm_mm_node node; + uint64_t remain, offset; char __user *user_data; - int page_offset, page_length, ret; + int ret; ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK); - if (ret) - goto out; + if (ret) { + memset(&node, 0, sizeof(node)); + ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0, + 0, i915->gtt.mappable_end); + if (ret) + goto out; + + i915_gem_object_pin_pages(obj); + } else { + node.start = i915_gem_obj_ggtt_offset(obj); + node.allocated = false; + } ret = i915_gem_object_set_to_gtt_domain(obj, true); if (ret) @@ -783,31 +807,39 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, if (ret) goto out_unpin; - user_data = to_user_ptr(args->data_ptr); - remain = args->size; - - offset = i915_gem_obj_ggtt_offset(obj) + args->offset; - intel_fb_obj_invalidate(obj, ORIGIN_GTT); + obj->dirty = true; - while (remain > 0) { + user_data = to_user_ptr(args->data_ptr); + offset = args->offset; + remain = args->size; + while (remain) { /* Operation in this page * * page_base = page offset within aperture * page_offset = offset within page * page_length = bytes to copy for this page */ - page_base = offset & PAGE_MASK; - page_offset = offset_in_page(offset); - page_length = remain; - if ((page_offset + remain) > PAGE_SIZE) - page_length = PAGE_SIZE - page_offset; - + u32 page_base = node.start; + unsigned page_offset = offset_in_page(offset); + unsigned page_length = PAGE_SIZE - page_offset; + page_length = remain < page_length ? remain : page_length; + if (node.allocated) { + wmb(); + i915->gtt.base.insert_page(&i915->gtt.base, + i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT), + node.start, + I915_CACHE_NONE, + 0); + wmb(); + } else { + page_base += offset & PAGE_MASK; + } /* If we get a fault while copying data, then (presumably) our * source page isn't available. Return the error and we'll * retry in the slow path. */ - if (fast_user_write(dev_priv->gtt.mappable, page_base, + if (fast_user_write(i915->gtt.mappable, page_base, page_offset, user_data, page_length)) { ret = -EFAULT; goto out_flush; @@ -821,7 +853,17 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev, out_flush: intel_fb_obj_flush(obj, false, ORIGIN_GTT); out_unpin: - i915_gem_object_ggtt_unpin(obj); + if (node.allocated) { + wmb(); + i915->gtt.base.clear_range(&i915->gtt.base, + node.start, node.size, + true); + drm_mm_remove_node(&node); + i915_gem_object_unpin_pages(obj); + } + else { + i915_gem_object_ggtt_unpin(obj); + } out: return ret; } @@ -1086,7 +1128,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, if (obj->tiling_mode == I915_TILING_NONE && obj->base.write_domain != I915_GEM_DOMAIN_CPU && cpu_write_needs_clflush(obj)) { - ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file); + ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file); /* Note that the gtt paths might fail with non-page-backed user * pointers (e.g. gtt mappings when moving data between * textures). Fallback to the shmem path in that case. */