From patchwork Wed Apr 20 11:17:35 2016
X-Patchwork-Submitter: ankitprasad.r.sharma@intel.com
X-Patchwork-Id: 8889081
From: ankitprasad.r.sharma@intel.com
To: intel-gfx@lists.freedesktop.org
Cc: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>, akash.goel@intel.com
Date: Wed, 20 Apr 2016 16:47:35 +0530
Message-Id: <1461151059-16361-9-git-send-email-ankitprasad.r.sharma@intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1461151059-16361-1-git-send-email-ankitprasad.r.sharma@intel.com>
References: <1461151059-16361-1-git-send-email-ankitprasad.r.sharma@intel.com>
Subject: [Intel-gfx] [PATCH 08/12] drm/i915: Support for pread/pwrite from/to non shmem backed objects

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

This patch adds support for extending the pread/pwrite functionality to
objects not backed by shmem. The access will be made through the GTT
interface. This covers objects backed by stolen memory as well as other
non-shmem backed objects.
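For illustration only (not part of the patch): a minimal userspace sketch of
how the new paths would be exercised, assuming "fd" is an open i915 DRM file
descriptor and "bo_handle" names a non-shmem backed object of at least 4 KiB
(for example one created from stolen memory by an earlier patch in this
series); both names are placeholders and header paths follow libdrm.

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

/* Write into a non-shmem backed object and read it back through the GEM
 * pwrite/pread ioctls, which this patch routes through the GTT. */
static int pwrite_then_pread(int fd, uint32_t bo_handle)
{
        char out[4096] = "non-shmem pread/pwrite test", in[4096];
        struct drm_i915_gem_pwrite pw = {
                .handle = bo_handle,
                .offset = 0,
                .size = sizeof(out),
                .data_ptr = (uintptr_t)out,
        };
        struct drm_i915_gem_pread pr = {
                .handle = bo_handle,
                .offset = 0,
                .size = sizeof(in),
                .data_ptr = (uintptr_t)in,
        };

        /* Before this series both ioctls failed with -EINVAL for objects
         * without a backing filp; with it they use the GTT copy paths. */
        if (ioctl(fd, DRM_IOCTL_I915_GEM_PWRITE, &pw))
                return -1;
        if (ioctl(fd, DRM_IOCTL_I915_GEM_PREAD, &pr))
                return -1;

        return memcmp(out, in, sizeof(out)) ? -1 : 0;
}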
v2: Drop locks around slow_user_access, prefault the pages before access
    (Chris)

v3: Rebased to the latest drm-intel-nightly (Ankit)

v4: Moved page base & offset calculations outside the copy loop, corrected
    data types for size and offset variables, corrected if-else braces format
    (Tvrtko/kerneldocs)

v5: Enabled pread/pwrite for all non-shmem backed objects including without
    tiling restrictions (Ankit)

v6: Using pwrite_fast for non-shmem backed objects as well (Chris)

v7: Updated commit message, Renamed i915_gem_gtt_read to i915_gem_gtt_copy,
    added pwrite slow path for non-shmem backed objects (Chris/Tvrtko)

v8: Updated v7 commit message, mutex unlock around pwrite slow path for
    non-shmem backed objects (Tvrtko)

v9: Corrected check during pread_ioctl, to avoid shmem_pread being called
    for non-shmem backed objects (Tvrtko)

v10: Moved the write_domain check to needs_clflush and tiling mode check to
     pwrite_fast (Chris)

v11: Use pwrite_fast fallback for all objects (shmem and non-shmem backed),
     call fast_user_write regardless of pagefault in previous iteration

v12: Use page-by-page copy for slow user access too (Chris)

v13: Handled EFAULT, Avoid use of WARN_ON, put_fence only if whole obj
     pinned (Chris)

v14: Corrected datatypes/initializations (Tvrtko)

Testcase: igt/gem_stolen, igt/gem_pread, igt/gem_pwrite

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_gem.c | 222 ++++++++++++++++++++++++++++++++++------
 1 file changed, 190 insertions(+), 32 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c5ba1bb..3e72737 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -54,6 +54,9 @@ static bool cpu_cache_is_coherent(struct drm_device *dev,
 
 static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
+        if (obj->base.write_domain == I915_GEM_DOMAIN_CPU)
+                return false;
+
         if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
                 return true;
 
@@ -641,6 +644,142 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
         return ret ? -EFAULT : 0;
 }
 
+static inline unsigned long
+slow_user_access(struct io_mapping *mapping,
+                 uint64_t page_base, int page_offset,
+                 char __user *user_data,
+                 unsigned long length, bool pwrite)
+{
+        void __iomem *ioaddr;
+        void *vaddr;
+        uint64_t unwritten;
+
+        ioaddr = io_mapping_map_wc(mapping, page_base);
+        /* We can use the cpu mem copy function because this is X86.
+         */
+        vaddr = (void __force *)ioaddr + page_offset;
+        if (pwrite)
+                unwritten = __copy_from_user(vaddr, user_data, length);
+        else
+                unwritten = __copy_to_user(user_data, vaddr, length);
+
+        io_mapping_unmap(ioaddr);
+        return unwritten;
+}
+
+static int
+i915_gem_gtt_pread(struct drm_device *dev,
+                   struct drm_i915_gem_object *obj, uint64_t size,
+                   uint64_t data_offset, uint64_t data_ptr)
+{
+        struct drm_i915_private *dev_priv = dev->dev_private;
+        struct i915_ggtt *ggtt = &dev_priv->ggtt;
+        struct drm_mm_node node;
+        char __user *user_data;
+        uint64_t remain;
+        uint64_t offset;
+        int ret;
+
+        ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE);
+        if (ret) {
+                ret = insert_mappable_node(dev_priv, &node, PAGE_SIZE);
+                if (ret)
+                        goto out;
+
+                ret = i915_gem_object_get_pages(obj);
+                if (ret) {
+                        remove_mappable_node(&node);
+                        goto out;
+                }
+
+                i915_gem_object_pin_pages(obj);
+        } else {
+                node.start = i915_gem_obj_ggtt_offset(obj);
+                node.allocated = false;
+                ret = i915_gem_object_put_fence(obj);
+                if (ret)
+                        goto out_unpin;
+        }
+
+        ret = i915_gem_object_set_to_gtt_domain(obj, false);
+        if (ret)
+                goto out_unpin;
+
+        user_data = to_user_ptr(data_ptr);
+        remain = size;
+        offset = data_offset;
+
+        mutex_unlock(&dev->struct_mutex);
+        if (likely(!i915.prefault_disable)) {
+                ret = fault_in_multipages_writeable(user_data, remain);
+                if (ret) {
+                        mutex_lock(&dev->struct_mutex);
+                        goto out_unpin;
+                }
+        }
+
+        while (remain > 0) {
+                /* Operation in this page
+                 *
+                 * page_base = page offset within aperture
+                 * page_offset = offset within page
+                 * page_length = bytes to copy for this page
+                 */
+                u32 page_base = node.start;
+                unsigned page_offset = offset_in_page(offset);
+                unsigned page_length = PAGE_SIZE - page_offset;
+                page_length = remain < page_length ? remain : page_length;
+                if (node.allocated) {
+                        wmb();
+                        ggtt->base.insert_page(&ggtt->base,
+                                               i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
+                                               node.start,
+                                               I915_CACHE_NONE, 0);
+                        wmb();
+                } else {
+                        page_base += offset & PAGE_MASK;
+                }
+                /* This is a slow read/write as it tries to read from
+                 * and write to user memory which may result into page
+                 * faults, and so we cannot perform this under struct_mutex.
+                 */
+                if (slow_user_access(ggtt->mappable, page_base,
+                                     page_offset, user_data,
+                                     page_length, false)) {
+                        ret = -EFAULT;
+                        break;
+                }
+
+                remain -= page_length;
+                user_data += page_length;
+                offset += page_length;
+        }
+
+        mutex_lock(&dev->struct_mutex);
+        if (ret == 0 && (obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0) {
+                /* The user has modified the object whilst we tried
+                 * reading from it, and we now have no idea what domain
+                 * the pages should be in. As we have just been touching
+                 * them directly, flush everything back to the GTT
+                 * domain.
+                 */
+                ret = i915_gem_object_set_to_gtt_domain(obj, false);
+        }
+
+out_unpin:
+        if (node.allocated) {
+                wmb();
+                ggtt->base.clear_range(&ggtt->base,
+                                       node.start, node.size,
+                                       true);
+                i915_gem_object_unpin_pages(obj);
+                remove_mappable_node(&node);
+        } else {
+                i915_gem_object_ggtt_unpin(obj);
+        }
+out:
+        return ret;
+}
+
 static int
 i915_gem_shmem_pread(struct drm_device *dev,
                      struct drm_i915_gem_object *obj,
@@ -656,6 +795,9 @@ i915_gem_shmem_pread(struct drm_device *dev,
         int needs_clflush = 0;
         struct sg_page_iter sg_iter;
 
+        if (!obj->base.filp)
+                return -ENODEV;
+
         user_data = to_user_ptr(args->data_ptr);
         remain = args->size;
 
@@ -764,18 +906,15 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
                 goto out;
         }
 
-        /* prime objects have no backing filp to GEM pread/pwrite
-         * pages from.
-         */
-        if (!obj->base.filp) {
-                ret = -EINVAL;
-                goto out;
-        }
-
         trace_i915_gem_object_pread(obj, args->offset, args->size);
 
         ret = i915_gem_shmem_pread(dev, obj, args, file);
 
+        /* pread for non shmem backed objects */
+        if (ret == -EFAULT || ret == -ENODEV)
+                ret = i915_gem_gtt_pread(dev, obj, args->size,
+                                         args->offset, args->data_ptr);
+
 out:
         drm_gem_object_unreference(&obj->base);
 unlock:
@@ -817,10 +956,15 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
                          struct drm_file *file)
 {
         struct i915_ggtt *ggtt = &i915->ggtt;
+        struct drm_device *dev = obj->base.dev;
         struct drm_mm_node node;
         uint64_t remain, offset;
         char __user *user_data;
         int ret;
+        bool hit_slow_path = false;
+
+        if (obj->tiling_mode != I915_TILING_NONE)
+                return -EFAULT;
 
         ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
         if (ret) {
@@ -838,16 +982,15 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
         } else {
                 node.start = i915_gem_obj_ggtt_offset(obj);
                 node.allocated = false;
+                ret = i915_gem_object_put_fence(obj);
+                if (ret)
+                        goto out_unpin;
         }
 
         ret = i915_gem_object_set_to_gtt_domain(obj, true);
         if (ret)
                 goto out_unpin;
 
-        ret = i915_gem_object_put_fence(obj);
-        if (ret)
-                goto out_unpin;
-
         intel_fb_obj_invalidate(obj, ORIGIN_GTT);
         obj->dirty = true;
 
@@ -866,22 +1009,34 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
                 unsigned page_length = PAGE_SIZE - page_offset;
                 page_length = remain < page_length ? remain : page_length;
                 if (node.allocated) {
-                        wmb();
+                        wmb(); /* flush the write before we modify the GGTT */
                         ggtt->base.insert_page(&ggtt->base,
                                                i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
                                                node.start,
                                                I915_CACHE_NONE, 0);
-                        wmb();
+                        wmb(); /* flush modifications to the GGTT (insert_page) */
                 } else {
                         page_base += offset & PAGE_MASK;
                 }
                 /* If we get a fault while copying data, then (presumably) our
                  * source page isn't available. Return the error and we'll
                  * retry in the slow path.
+                 * If the object is non-shmem backed, we retry again with the
+                 * path that handles page fault.
                  */
                 if (fast_user_write(ggtt->mappable, page_base,
                                     page_offset, user_data, page_length)) {
-                        ret = -EFAULT;
-                        goto out_flush;
+                        hit_slow_path = true;
+                        mutex_unlock(&dev->struct_mutex);
+                        if (slow_user_access(ggtt->mappable,
+                                             page_base,
+                                             page_offset, user_data,
+                                             page_length, true)) {
+                                ret = -EFAULT;
+                                mutex_lock(&dev->struct_mutex);
+                                goto out_flush;
+                        }
+
+                        mutex_lock(&dev->struct_mutex);
                 }
 
                 remain -= page_length;
@@ -890,6 +1045,19 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
         }
 
 out_flush:
+        if (hit_slow_path) {
+                if (ret == 0 &&
+                    (obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0) {
+                        /* The user has modified the object whilst we tried
+                         * reading from it, and we now have no idea what domain
+                         * the pages should be in. As we have just been touching
+                         * them directly, flush everything back to the GTT
+                         * domain.
+                         */
+                        ret = i915_gem_object_set_to_gtt_domain(obj, false);
+                }
+        }
+
         intel_fb_obj_flush(obj, false, ORIGIN_GTT);
 out_unpin:
         if (node.allocated) {
@@ -1146,14 +1314,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
                 goto out;
         }
 
-        /* prime objects have no backing filp to GEM pread/pwrite
-         * pages from.
-         */
-        if (!obj->base.filp) {
-                ret = -EINVAL;
-                goto out;
-        }
-
         trace_i915_gem_object_pwrite(obj, args->offset, args->size);
 
         ret = -EFAULT;
@@ -1163,20 +1323,20 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
          * pread/pwrite currently are reading and writing from the CPU
          * perspective, requiring manual detiling by the client.
          */
-        if (obj->tiling_mode == I915_TILING_NONE &&
-            obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
-            cpu_write_needs_clflush(obj)) {
+        if (!obj->base.filp || cpu_write_needs_clflush(obj)) {
                 ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
                 /* Note that the gtt paths might fail with non-page-backed user
                  * pointers (e.g. gtt mappings when moving data between
                  * textures). Fallback to the shmem path in that case. */
         }
 
-        if (ret == -EFAULT || ret == -ENOSPC) {
+        if (ret == -EFAULT) {
                 if (obj->phys_handle)
                         ret = i915_gem_phys_pwrite(obj, args, file);
-                else
+                else if (obj->base.filp)
                         ret = i915_gem_shmem_pwrite(dev, obj, args, file);
+                else
+                        ret = -ENODEV;
         }
 
 out:
@@ -4010,9 +4170,7 @@ out:
          * object is now coherent at its new cache level (with respect
          * to the access domain).
          */
-        if (obj->cache_dirty &&
-            obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
-            cpu_write_needs_clflush(obj)) {
+        if (obj->cache_dirty && cpu_write_needs_clflush(obj)) {
                 if (i915_gem_clflush_object(obj, true))
                         i915_gem_chipset_flush(obj->base.dev);
         }