From patchwork Thu Aug 6 16:43:39 2015
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 6961391
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Cc: Daniel Vetter, "Goel, Akash"
Date: Thu, 6 Aug 2015 17:43:39 +0100
Message-Id: <1438879419-15555-1-git-send-email-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH] drm/i915: Only move to the CPU write domain if keeping the GTT pages

We have for a long time been ultra-paranoid about the situation whereby
we hand back pages to the system that have been written to by the GPU
and potentially simultaneously by the user through a CPU mmapping. We
can relax this restriction when we know that the cache domain tracking
is true and there can be no stale cacheline invalidations. This is true
if the object has never been CPU mmapped, as all internal accesses
(i.e. kmap/iomap) are carefully flushed. For a CPU mmapping, one would
expect that the invalid cache lines are resolved on PTE/TLB shootdown
during munmap(), so the only situation we need to be paranoid about is
when such a CPU mmapping exists at the time of put_pages. Given that we
need to treat put_pages carefully, as we may return live data to the
system that we want to use again in the future (i.e. I915_MADV_WILLNEED
pages), we can simply treat a live CPU mmapping as a special case of
WILLNEED (which it is!).
Any I915_MADV_DONTNEED pages and their mmappings are shot down
immediately following put_pages.

Signed-off-by: Chris Wilson
Cc: "Goel, Akash"
Cc: Ville Syrjälä
Cc: Daniel Vetter
Cc: Jesse Barnes
---
 drivers/gpu/drm/i915/i915_gem.c | 49 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2dfe707f11d3..24deace364a5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2047,22 +2047,45 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 
         BUG_ON(obj->madv == __I915_MADV_PURGED);
 
-        ret = i915_gem_object_set_to_cpu_domain(obj, true);
-        if (ret) {
-                /* In the event of a disaster, abandon all caches and
-                 * hope for the best.
-                 */
-                WARN_ON(ret != -EIO);
-                i915_gem_clflush_object(obj, true);
-                obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-        }
-
         i915_gem_gtt_finish_object(obj);
 
-        if (i915_gem_object_needs_bit17_swizzle(obj))
-                i915_gem_object_save_bit_17_swizzle(obj);
+        /* If we need to access the data in the future, we need to
+         * be sure that the contents of the object is coherent with
+         * the CPU prior to releasing the pages back to the system.
+         * Once we unpin them, the mm is free to move them to different
+         * zones or even swap them out to disk - all without our
+         * intervention. (Though we could track such operations with our
+         * own gemfs, if we ever write one.) As such, if we want to keep
+         * the data, set it to the CPU domain now just in case someone
+         * else touches it.
+         *
+         * For a long time we have been paranoid about handing back
+         * pages to the system with stale cacheline invalidation. For
+         * all internal use (kmap/iomap), we know that the domain tracking is
+         * accurate. However, the userspace API is lax and the user can CPU
+         * mmap the object and invalidate cachelines without our accurate
+         * tracking. We have been paranoid to be sure that we always flushed
+         * the cachelines when we stopped using the pages. However, given
+         * that the CPU PTE/TLB shootdown must have invalidated the cachelines
+         * upon munmap(), we only need to be paranoid about a live CPU mmap
+         * now. For this, we need only treat it as live data, see
+         * discard_backing_storage().
+         */
+        if (obj->madv == I915_MADV_WILLNEED) {
+                ret = i915_gem_object_set_to_cpu_domain(obj, true);
+                if (ret) {
+                        /* In the event of a disaster, abandon all caches and
+                         * hope for the best.
+                         */
+                        WARN_ON(ret != -EIO);
+                        i915_gem_clflush_object(obj, true);
+                        obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+                        obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+                }
 
-        if (obj->madv == I915_MADV_DONTNEED)
+                if (i915_gem_object_needs_bit17_swizzle(obj))
+                        i915_gem_object_save_bit_17_swizzle(obj);
+        } else
                 obj->dirty = 0;
 
         st_for_each_page(&iter, obj->pages) {
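
For reference, here is a rough userspace sketch (not part of the patch) of the
sequence the commit message is defending against: a GEM object dirtied through
a CPU mmap and then marked I915_MADV_DONTNEED. The ioctls and structures are
the stock i915 uapi, but the open DRM fd, the header path and the helper name
are assumptions made for illustration, and error handling around the helper is
elided.

#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int cpu_write_then_dontneed(int drm_fd)
{
        struct drm_i915_gem_create create = { .size = 4096 };
        struct drm_i915_gem_mmap mmap_arg;
        struct drm_i915_gem_madvise madv;

        if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CREATE, &create))
                return -errno;

        /* CPU mmap the object: from here on the kernel's cache domain
         * tracking can no longer see every cacheline the user dirties.
         */
        memset(&mmap_arg, 0, sizeof(mmap_arg));
        mmap_arg.handle = create.handle;
        mmap_arg.size = create.size;
        if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_MMAP, &mmap_arg))
                return -errno;

        /* Dirty the pages through the CPU mapping. */
        memset((void *)(uintptr_t)mmap_arg.addr_ptr, 0xa5, create.size);

        /* Tell the kernel the contents may be discarded.  With this patch,
         * the later put_pages skips moving a DONTNEED object to the CPU
         * write domain; per the commit message, such pages and their
         * mmappings are shot down immediately following put_pages.
         */
        memset(&madv, 0, sizeof(madv));
        madv.handle = create.handle;
        madv.madv = I915_MADV_DONTNEED;
        if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_MADVISE, &madv))
                return -errno;

        /* Zero if the backing storage had already been purged. */
        return madv.retained;
}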