From patchwork Mon Jan 11 10:45:11 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 8001511
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Mon, 11 Jan 2016 10:45:11 +0000
Message-Id: <1452509174-16671-41-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.7.0.rc3
In-Reply-To: <1452509174-16671-1-git-send-email-chris@chris-wilson.co.uk>
References: <1452503961-14837-1-git-send-email-chris@chris-wilson.co.uk>
 <1452509174-16671-1-git-send-email-chris@chris-wilson.co.uk>
Subject: [Intel-gfx] [PATCH 127/190] drm/i915: Cache kmap between relocations
List-Id: Intel graphics driver community testing & development

When doing relocations, we have to obtain a mapping to the page
containing the target address. This is either a kmap or iomap depending
on GPU and its cache coherency. Neighbouring relocation entries are
typically within the same page and so we can cache our kmapping between
them and avoid those pesky TLB flushes.

Note that there is some sleight-of-hand in how the slow relocate works
as the reloc_cache implies pagefaults disabled (as we are inside a
kmap_atomic section). However, the slow relocate code is meant to be the
fallback from the atomic fast path failing.
Fortunately it works as we already have performed the copy_from_user for
the relocation array (no more pagefaults there) and the kmap_atomic
cache is enabled after we have waited upon an active buffer (so no more
sleeping in atomic). Magic!

Signed-off-by: Chris Wilson
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 152 +++++++++++++++++++----------
 1 file changed, 102 insertions(+), 50 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 4d15dd32e365..f1dfb51ae4e3 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -305,9 +305,50 @@ relocation_target(struct drm_i915_gem_relocation_entry *reloc,
 	return gen8_canonical_addr((int)reloc->delta + target_offset);
 }
 
+struct reloc_cache {
+	void *vaddr;
+	unsigned page;
+	enum { KMAP, IOMAP } type;
+};
+
+static void reloc_cache_init(struct reloc_cache *cache)
+{
+	cache->page = -1;
+	cache->vaddr = NULL;
+}
+
+static void reloc_cache_fini(struct reloc_cache *cache)
+{
+	if (cache->vaddr == NULL)
+		return;
+
+	switch (cache->type) {
+	case KMAP: kunmap_atomic(cache->vaddr); break;
+	case IOMAP: io_mapping_unmap_atomic(cache->vaddr); break;
+	}
+}
+
+static void *reloc_kmap(struct drm_i915_gem_object *obj,
+			struct reloc_cache *cache,
+			int page)
+{
+	if (cache->page == page)
+		return cache->vaddr;
+
+	if (cache->vaddr)
+		kunmap_atomic(cache->vaddr);
+
+	cache->page = page;
+	cache->vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj, page));
+	cache->type = KMAP;
+
+	return cache->vaddr;
+}
+
 static int
 relocate_entry_cpu(struct drm_i915_gem_object *obj,
 		   struct drm_i915_gem_relocation_entry *reloc,
+		   struct reloc_cache *cache,
 		   uint64_t target_offset)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -320,30 +361,44 @@ relocate_entry_cpu(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
-				reloc->offset >> PAGE_SHIFT));
+	vaddr = reloc_kmap(obj, cache, reloc->offset >> PAGE_SHIFT);
 	*(uint32_t *)(vaddr + page_offset) = lower_32_bits(delta);
 
 	if (INTEL_INFO(dev)->gen >= 8) {
-		page_offset = offset_in_page(page_offset + sizeof(uint32_t));
-
-		if (page_offset == 0) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
-				(reloc->offset + sizeof(uint32_t)) >> PAGE_SHIFT));
+		page_offset += sizeof(uint32_t);
+		if (page_offset == PAGE_SIZE) {
+			vaddr = reloc_kmap(obj, cache, cache->page + 1);
+			page_offset = 0;
 		}
-
 		*(uint32_t *)(vaddr + page_offset) = upper_32_bits(delta);
 	}
 
-	kunmap_atomic(vaddr);
-
 	return 0;
 }
 
+static void *reloc_iomap(struct drm_i915_private *i915,
+			 struct reloc_cache *cache,
+			 uint64_t offset)
+{
+	if (cache->page == offset >> PAGE_SHIFT)
+		return cache->vaddr;
+
+	if (cache->vaddr)
+		io_mapping_unmap_atomic(cache->vaddr);
+
+	cache->page = offset >> PAGE_SHIFT;
+	cache->vaddr =
+		io_mapping_map_atomic_wc(i915->gtt.mappable,
+					 offset & PAGE_MASK);
+	cache->type = IOMAP;
+
+	return cache->vaddr;
+}
+
 static int
 relocate_entry_gtt(struct drm_i915_gem_object *obj,
 		   struct drm_i915_gem_relocation_entry *reloc,
+		   struct reloc_cache *cache,
 		   uint64_t target_offset)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -369,28 +424,19 @@ relocate_entry_gtt(struct drm_i915_gem_object *obj,
 	/* Map the page containing the relocation we're going to perform.  */
 	offset = vma->node.start;
 	offset += reloc->offset;
-	reloc_page = io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
-					      offset & PAGE_MASK);
+	reloc_page = reloc_iomap(dev_priv, cache, offset);
 	iowrite32(lower_32_bits(delta), reloc_page + offset_in_page(offset));
 
 	if (INTEL_INFO(dev)->gen >= 8) {
 		offset += sizeof(uint32_t);
-
-		if (offset_in_page(offset) == 0) {
-			io_mapping_unmap_atomic(reloc_page);
-			reloc_page =
-				io_mapping_map_atomic_wc(dev_priv->gtt.mappable,
-							 offset);
-		}
-
+		if (offset_in_page(offset) == 0)
+			reloc_page = reloc_iomap(dev_priv, cache, offset);
 		iowrite32(upper_32_bits(delta),
 			  reloc_page + offset_in_page(offset));
 	}
 
-	io_mapping_unmap_atomic(reloc_page);
-
 unpin:
-	i915_vma_unpin(vma);
+	__i915_vma_unpin(vma);
 	return ret;
 }
 
@@ -406,6 +452,7 @@ clflush_write32(void *addr, uint32_t value)
 static int
 relocate_entry_clflush(struct drm_i915_gem_object *obj,
 		       struct drm_i915_gem_relocation_entry *reloc,
+		       struct reloc_cache *cache,
 		       uint64_t target_offset)
 {
 	struct drm_device *dev = obj->base.dev;
@@ -418,31 +465,26 @@ relocate_entry_clflush(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
-				reloc->offset >> PAGE_SHIFT));
+	vaddr = reloc_kmap(obj, cache, reloc->offset >> PAGE_SHIFT);
 	clflush_write32(vaddr + page_offset, lower_32_bits(delta));
 
 	if (INTEL_INFO(dev)->gen >= 8) {
-		page_offset = offset_in_page(page_offset + sizeof(uint32_t));
-
-		if (page_offset == 0) {
-			kunmap_atomic(vaddr);
-			vaddr = kmap_atomic(i915_gem_object_get_dirty_page(obj,
-				(reloc->offset + sizeof(uint32_t)) >> PAGE_SHIFT));
+		page_offset += sizeof(uint32_t);
+		if (page_offset == PAGE_SIZE) {
+			vaddr = reloc_kmap(obj, cache, cache->page + 1);
+			page_offset = 0;
 		}
-
 		clflush_write32(vaddr + page_offset, upper_32_bits(delta));
 	}
 
-	kunmap_atomic(vaddr);
-
 	return 0;
 }
 
 static int
 i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 				   struct eb_vmas *eb,
-				   struct drm_i915_gem_relocation_entry *reloc)
+				   struct drm_i915_gem_relocation_entry *reloc,
+				   struct reloc_cache *cache)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_gem_object *target_obj;
@@ -526,11 +568,11 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		return -EFAULT;
 
 	if (use_cpu_reloc(obj))
-		ret = relocate_entry_cpu(obj, reloc, target_offset);
+		ret = relocate_entry_cpu(obj, reloc, cache, target_offset);
 	else if (obj->map_and_fenceable)
-		ret = relocate_entry_gtt(obj, reloc, target_offset);
+		ret = relocate_entry_gtt(obj, reloc, cache, target_offset);
 	else if (cpu_has_clflush)
-		ret = relocate_entry_clflush(obj, reloc, target_offset);
+		ret = relocate_entry_clflush(obj, reloc, cache, target_offset);
 	else {
 		WARN_ONCE(1, "Impossible case in relocation handling\n");
 		ret = -ENODEV;
@@ -553,9 +595,11 @@ i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
 	struct drm_i915_gem_relocation_entry stack_reloc[N_RELOC(512)];
 	struct drm_i915_gem_relocation_entry __user *user_relocs;
 	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
-	int remain, ret;
+	struct reloc_cache cache;
+	int remain, ret = 0;
 
 	user_relocs = to_user_ptr(entry->relocs_ptr);
+	reloc_cache_init(&cache);
 
 	remain = entry->relocation_count;
 	while (remain) {
@@ -565,21 +609,24 @@ i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
 			count = ARRAY_SIZE(stack_reloc);
 		remain -= count;
 
-		if (__copy_from_user_inatomic(r, user_relocs, count*sizeof(r[0])))
-			return -EFAULT;
+		if (__copy_from_user_inatomic(r, user_relocs, count*sizeof(r[0]))) {
+			ret = -EFAULT;
+			goto out;
+		}
 
 		do {
 			u64 offset = r->presumed_offset;
 
-			ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, r);
+			ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, r, &cache);
 			if (ret)
-				return ret;
+				goto out;
 
 			if (r->presumed_offset != offset &&
 			    __copy_to_user_inatomic(&user_relocs->presumed_offset,
 						    &r->presumed_offset,
 						    sizeof(r->presumed_offset))) {
-				return -EFAULT;
+				ret = -EFAULT;
+				goto out;
 			}
 
 			user_relocs++;
@@ -587,7 +634,9 @@ i915_gem_execbuffer_relocate_vma(struct i915_vma *vma,
 		} while (--count);
 	}
 
-	return 0;
+out:
+	reloc_cache_fini(&cache);
+	return ret;
 #undef N_RELOC
 }
 
@@ -597,15 +646,18 @@ i915_gem_execbuffer_relocate_vma_slow(struct i915_vma *vma,
 				      struct drm_i915_gem_relocation_entry *relocs)
 {
 	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
-	int i, ret;
+	struct reloc_cache cache;
+	int i, ret = 0;
 
+	reloc_cache_init(&cache);
 	for (i = 0; i < entry->relocation_count; i++) {
-		ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, &relocs[i]);
+		ret = i915_gem_execbuffer_relocate_entry(vma->obj, eb, &relocs[i], &cache);
 		if (ret)
-			return ret;
+			break;
 	}
+	reloc_cache_fini(&cache);
 
-	return 0;
+	return ret;
 }
 
 static int
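
For readers who want to see the caching idea in isolation, here is a
minimal userspace sketch of a single-entry page-mapping cache. It is not
the driver code. Every demo_* name is hypothetical, demo_map() and
demo_unmap() are stand-ins that only count calls where the patch uses
kmap_atomic()/kunmap_atomic(), and the "pages" are plain arrays. It
assumes each 32-bit write fits inside one page, so it does not model the
gen8 case where the upper dword lands on the next page, nor the iomap
path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1ul << DEMO_PAGE_SHIFT)
#define DEMO_NPAGES     4

/* Hypothetical stand-ins for kmap_atomic()/kunmap_atomic(); they only count calls. */
static uint8_t demo_backing[DEMO_NPAGES][DEMO_PAGE_SIZE];
static unsigned int demo_maps, demo_unmaps;

static void *demo_map(unsigned long page)
{
	assert(page < DEMO_NPAGES);
	demo_maps++;
	return demo_backing[page];
}

static void demo_unmap(void *vaddr)
{
	(void)vaddr;
	demo_unmaps++;
}

/* Single-entry cache: remembers the last mapped page and its address. */
struct demo_cache {
	void *vaddr;
	unsigned long page;
};

static void demo_cache_init(struct demo_cache *cache)
{
	cache->vaddr = NULL;
	cache->page = ~0ul;
}

/* Return a mapping for @page, reusing the cached one on a hit. */
static void *demo_cache_map(struct demo_cache *cache, unsigned long page)
{
	if (cache->vaddr && cache->page == page)
		return cache->vaddr;

	if (cache->vaddr)
		demo_unmap(cache->vaddr);

	cache->page = page;
	cache->vaddr = demo_map(page);
	return cache->vaddr;
}

/* Drop whatever mapping is still held; call once after the last write. */
static void demo_cache_fini(struct demo_cache *cache)
{
	if (cache->vaddr)
		demo_unmap(cache->vaddr);
	cache->vaddr = NULL;
}

/* Write 32-bit values at the given byte offsets; return how many maps it took. */
static unsigned int demo_relocate(const unsigned long *offsets, size_t count)
{
	struct demo_cache cache;
	size_t i;

	demo_maps = demo_unmaps = 0;
	demo_cache_init(&cache);
	for (i = 0; i < count; i++) {
		uint8_t *vaddr = demo_cache_map(&cache, offsets[i] >> DEMO_PAGE_SHIFT);
		uint32_t value = (uint32_t)i;

		memcpy(vaddr + (offsets[i] & (DEMO_PAGE_SIZE - 1)), &value, sizeof(value));
	}
	demo_cache_fini(&cache);

	return demo_maps;
}
```

Calling demo_relocate() with the offsets 0x010, 0x020, 0x7f8, 0x1004,
0x1ff0, 0x2000 touches pages 0, 0, 0, 1, 1, 2, so the sketch maps three
times where a map/unmap per write would map six times. As in the patch,
the saving depends on neighbouring entries landing in the same page;
offsets that alternate between pages get no benefit.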