From patchwork Fri Oct 21 14:11:22 2016
X-Patchwork-Submitter: Tvrtko Ursulin
X-Patchwork-Id: 9389315
From: Tvrtko Ursulin
To: Intel-gfx@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org, linux-media@vger.kernel.org,
    Chris Wilson, Tvrtko Ursulin
Subject: [PATCH 4/5] drm/i915: Use __sg_alloc_table_from_pages for allocating object backing store
Date: Fri, 21 Oct 2016 15:11:22 +0100
Message-Id: <1477059083-3500-5-git-send-email-tvrtko.ursulin@linux.intel.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1477059083-3500-1-git-send-email-tvrtko.ursulin@linux.intel.com>
References: <1477059083-3500-1-git-send-email-tvrtko.ursulin@linux.intel.com>

From: Tvrtko Ursulin

The current way of allocating the backing store over-estimates the number
of sg entries required, which typically wastes around 1-6 MiB of memory on
unused sg entries at runtime. Instead, we can add an intermediate step
which stores the pages in an array, and then use __sg_alloc_table_from_pages
to build the most compact scatterlist possible.

Signed-off-by: Tvrtko Ursulin
---
 drivers/gpu/drm/i915/i915_gem.c | 72 ++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 37 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8ed8e24025ac..4bf675568a37 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2208,9 +2208,9 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 static unsigned int swiotlb_max_size(void)
 {
 #if IS_ENABLED(CONFIG_SWIOTLB)
-	return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
+	return swiotlb_nr_tbl() << IO_TLB_SHIFT;
 #else
-	return 0;
+	return UINT_MAX;
 #endif
 }
 
@@ -2221,11 +2221,8 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	int page_count, i;
 	struct address_space *mapping;
 	struct sg_table *st;
-	struct scatterlist *sg;
-	struct sgt_iter sgt_iter;
-	struct page *page;
-	unsigned long last_pfn = 0;	/* suppress gcc warning */
-	unsigned int max_segment;
+	struct page *page, **pages;
+	unsigned int max_segment = swiotlb_max_size();
 	int ret;
 	gfp_t gfp;
 
@@ -2236,18 +2233,16 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	max_segment = swiotlb_max_size();
-	if (!max_segment)
-		max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
-	st = kmalloc(sizeof(*st), GFP_KERNEL);
-	if (st == NULL)
-		return -ENOMEM;
-
 	page_count = obj->base.size / PAGE_SIZE;
-	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
-		kfree(st);
+	pages = drm_malloc_gfp(page_count, sizeof(struct page *),
+			       GFP_TEMPORARY | __GFP_ZERO);
+	if (!pages)
 		return -ENOMEM;
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL) {
+		ret = -ENOMEM;
+		goto err_st;
 	}
 
 	/* Get the list of pages out of our struct file. They'll be pinned
@@ -2258,8 +2253,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	mapping = obj->base.filp->f_mapping;
 	gfp = mapping_gfp_constraint(mapping, ~(__GFP_IO | __GFP_RECLAIM));
 	gfp |= __GFP_NORETRY | __GFP_NOWARN;
-	sg = st->sgl;
-	st->nents = 0;
 	for (i = 0; i < page_count; i++) {
 		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
 		if (IS_ERR(page)) {
@@ -2281,29 +2274,28 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 				goto err_pages;
 			}
 		}
-		if (!i ||
-		    sg->length >= max_segment ||
-		    page_to_pfn(page) != last_pfn + 1) {
-			if (i)
-				sg = sg_next(sg);
-			st->nents++;
-			sg_set_page(sg, page, PAGE_SIZE, 0);
-		} else {
-			sg->length += PAGE_SIZE;
-		}
-		last_pfn = page_to_pfn(page);
+
+		pages[i] = page;
 
 		/* Check that the i965g/gm workaround works. */
-		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
+		WARN_ON((gfp & __GFP_DMA32) &&
+			(page_to_pfn(page) >= 0x00100000UL));
 	}
-	if (sg) /* loop terminated early; short sg table */
-		sg_mark_end(sg);
+
+	ret = __sg_alloc_table_from_pages(st, pages, page_count, 0,
+					  obj->base.size, GFP_KERNEL,
+					  max_segment);
+	if (ret)
+		goto err_pages;
+
 	obj->pages = st;
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret)
 		goto err_pages;
 
+	drm_free_large(pages);
+
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj);
 
@@ -2314,10 +2306,13 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	return 0;
 
 err_pages:
-	sg_mark_end(sg);
-	for_each_sgt_page(page, sgt_iter, st)
-		put_page(page);
-	sg_free_table(st);
+	for (i = 0; i < page_count; i++) {
+		if (pages[i])
+			put_page(pages[i]);
+		else
+			break;
+	}
+
 	kfree(st);
 
 	/* shmemfs first checks if there is enough memory to allocate the page
@@ -2331,6 +2326,9 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (ret == -ENOSPC)
 		ret = -ENOMEM;
 
+err_st:
+	drm_free_large(pages);
+
 	return ret;
 }
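
For readers unfamiliar with the new helper, the sketch below shows the
allocation pattern the commit message describes, reduced to its essentials.
It is illustrative only and not part of the patch: the wrapper function and
its name are made up, and it assumes the __sg_alloc_table_from_pages()
argument order used in this series (table, page array, page count, offset,
total size, GFP flags, maximum segment size).

#include <linux/mm.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

/*
 * Illustrative sketch, not part of the patch: build the most compact
 * scatterlist possible from an array of already-populated pages.
 * Physically contiguous pages are merged into single sg entries, each
 * capped at max_segment bytes.
 */
static int example_pages_to_compact_sgt(struct page **pages,
					unsigned int n_pages,
					unsigned int max_segment,
					struct sg_table **out)
{
	struct sg_table *st;
	int ret;

	st = kmalloc(sizeof(*st), GFP_KERNEL);
	if (!st)
		return -ENOMEM;

	/*
	 * One call replaces the manual sg_set_page()/coalescing loop and
	 * avoids sizing the table for the worst case of one entry per page.
	 */
	ret = __sg_alloc_table_from_pages(st, pages, n_pages, 0,
					  (unsigned long)n_pages * PAGE_SIZE,
					  GFP_KERNEL, max_segment);
	if (ret) {
		kfree(st);
		return ret;
	}

	*out = st;
	return 0;
}

Once the table has been built the temporary page array is no longer needed,
which is why the patch frees it with drm_free_large(pages) as soon as
i915_gem_gtt_prepare_object() has succeeded.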