From patchwork Mon Sep 13 18:05:58 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8815C433EF for ; Mon, 13 Sep 2021 18:08:25 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 69A9C60EE9 for ; Mon, 13 Sep 2021 18:08:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 69A9C60EE9 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F21286E222; Mon, 13 Sep 2021 18:08:16 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0CE8D6E212; Mon, 13 Sep 2021 18:08:15 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792508" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792508" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:13 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796968" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:12 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 1/8] drm/i915/gem: Break out some shmem backend utils Date: Mon, 13 Sep 2021 19:05:58 +0100 Message-Id: <20210913180605.2778493-1-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Thomas Hellström Break out some shmem backend utils for future reuse by the TTM backend: shmem_alloc_st(), shmem_free_st() and __shmem_writeback() which we can use to provide a shmem-backed TTM page pool for cached-only TTM buffer objects. Main functional change here is that we now compute the page sizes using the dma segments rather than using the physical page address segments. Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld Signed-off-by: Matthew Auld Reported-by: kernel test robot --- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 182 +++++++++++++--------- 1 file changed, 107 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 11f072193f3b..ab02309f0752 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,46 +25,61 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); } -static int shmem_get_pages(struct drm_i915_gem_object *obj) +static void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct intel_memory_region *mem = obj->mm.region; - const unsigned long page_count = obj->base.size / PAGE_SIZE; + struct sgt_iter sgt_iter; + struct pagevec pvec; + struct page *page; + + mapping_clear_unevictable(mapping); + + pagevec_init(&pvec); + for_each_sgt_page(page, sgt_iter, st) { + if (dirty) + set_page_dirty(page); + + if (backup) + mark_page_accessed(page); + + if (!pagevec_add(&pvec, page)) + check_release_pagevec(&pvec); + } + if (pagevec_count(&pvec)) + check_release_pagevec(&pvec); + + sg_free_table(st); + kfree(st); +} + +static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) +{ + const unsigned long page_count = size / PAGE_SIZE; unsigned long i; - struct address_space *mapping; struct sg_table *st; struct scatterlist *sg; - struct sgt_iter sgt_iter; struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ - unsigned int max_segment = i915_sg_segment_size(); - unsigned int sg_page_sizes; gfp_t noreclaim; int ret; - /* - * Assert that the object is not currently in any GPU domain. As it - * wasn't in the GTT, there shouldn't be any way it could have been in - * a GPU cache - */ - GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); - GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); - /* * If there's no chance of allocating enough pages for the whole * object, bail early. */ - if (obj->base.size > resource_size(&mem->region)) - return -ENOMEM; + if (size > resource_size(&mr->region)) + return ERR_PTR(-ENOMEM); st = kmalloc(sizeof(*st), GFP_KERNEL); if (!st) - return -ENOMEM; + return ERR_PTR(-ENOMEM); -rebuild_st: if (sg_alloc_table(st, page_count, GFP_KERNEL)) { kfree(st); - return -ENOMEM; + return ERR_PTR(-ENOMEM); } /* @@ -73,14 +88,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * * Fail silently without starting the shrinker */ - mapping = obj->base.filp->f_mapping; mapping_set_unevictable(mapping); noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM); noreclaim |= __GFP_NORETRY | __GFP_NOWARN; sg = st->sgl; st->nents = 0; - sg_page_sizes = 0; for (i = 0; i < page_count; i++) { const unsigned int shrink[] = { I915_SHRINK_BOUND | I915_SHRINK_UNBOUND, @@ -135,10 +148,9 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (!i || sg->length >= max_segment || page_to_pfn(page) != last_pfn + 1) { - if (i) { - sg_page_sizes |= sg->length; + if (i) sg = sg_next(sg); - } + st->nents++; sg_set_page(sg, page, PAGE_SIZE, 0); } else { @@ -149,14 +161,66 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) /* Check that the i965g/gm workaround works. */ GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL); } - if (sg) { /* loop terminated early; short sg table */ - sg_page_sizes |= sg->length; + if (sg) /* loop terminated early; short sg table */ sg_mark_end(sg); - } /* Trim unused sg entries to avoid wasting memory. */ i915_sg_trim(st); + return st; +err_sg: + sg_mark_end(sg); + if (sg != st->sgl) { + shmem_free_st(st, mapping, false, false); + } else { + mapping_clear_unevictable(mapping); + sg_free_table(st); + kfree(st); + } + + /* + * shmemfs first checks if there is enough memory to allocate the page + * and reports ENOSPC should there be insufficient, along with the usual + * ENOMEM for a genuine allocation failure. + * + * We use ENOSPC in our driver to mean that we have run out of aperture + * space and so want to translate the error from shmemfs back to our + * usual understanding of ENOMEM. + */ + if (ret == -ENOSPC) + ret = -ENOMEM; + + return ERR_PTR(ret); +} + +static int shmem_get_pages(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_memory_region *mem = obj->mm.region; + const unsigned long page_count = obj->base.size / PAGE_SIZE; + struct address_space *mapping; + struct sg_table *st; + struct sgt_iter sgt_iter; + struct page *page; + unsigned int max_segment = i915_sg_segment_size(); + int ret; + + /* + * Assert that the object is not currently in any GPU domain. As it + * wasn't in the GTT, there shouldn't be any way it could have been in + * a GPU cache + */ + GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); + GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); + +rebuild_st: + st = shmem_alloc_st(i915, obj->base.size, mem, + obj->base.filp->f_mapping, max_segment); + if (IS_ERR(st)) { + ret = PTR_ERR(st); + goto err_st; + } + ret = i915_gem_gtt_prepare_pages(obj, st); if (ret) { /* @@ -168,6 +232,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) for_each_sgt_page(page, sgt_iter, st) put_page(page); sg_free_table(st); + kfree(st); max_segment = PAGE_SIZE; goto rebuild_st; @@ -200,28 +265,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER) obj->cache_dirty = true; - __i915_gem_object_set_pages(obj, st, sg_page_sizes); + __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); return 0; -err_sg: - sg_mark_end(sg); err_pages: - mapping_clear_unevictable(mapping); - if (sg != st->sgl) { - struct pagevec pvec; - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, st) { - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); - } - sg_free_table(st); - kfree(st); - + shmem_free_st(st, mapping, false, false); /* * shmemfs first checks if there is enough memory to allocate the page * and reports ENOSPC should there be insufficient, along with the usual @@ -231,6 +280,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * space and so want to translate the error from shmemfs back to our * usual understanding of ENOMEM. */ +err_st: if (ret == -ENOSPC) ret = -ENOMEM; @@ -251,10 +301,8 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); } -static void -shmem_writeback(struct drm_i915_gem_object *obj) +static void __shmem_writeback(size_t size, struct address_space *mapping) { - struct address_space *mapping; struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, .nr_to_write = SWAP_CLUSTER_MAX, @@ -270,10 +318,9 @@ shmem_writeback(struct drm_i915_gem_object *obj) * instead of invoking writeback so they are aged and paged out * as normal. */ - mapping = obj->base.filp->f_mapping; /* Begin writeback on each dirty page */ - for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) { + for (i = 0; i < size >> PAGE_SHIFT; i++) { struct page *page; page = find_lock_page(mapping, i); @@ -296,6 +343,12 @@ shmem_writeback(struct drm_i915_gem_object *obj) } } +static void +shmem_writeback(struct drm_i915_gem_object *obj) +{ + __shmem_writeback(obj->base.size, obj->base.filp->f_mapping); +} + void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, @@ -316,11 +369,6 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages) { - struct sgt_iter sgt_iter; - struct pagevec pvec; - struct page *page; - - GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev))); __i915_gem_object_release_shmem(obj, pages, true); i915_gem_gtt_finish_pages(obj, pages); @@ -328,25 +376,9 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_ if (i915_gem_object_needs_bit17_swizzle(obj)) i915_gem_object_save_bit_17_swizzle(obj, pages); - mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping); - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, pages) { - if (obj->mm.dirty) - set_page_dirty(page); - - if (obj->mm.madv == I915_MADV_WILLNEED) - mark_page_accessed(page); - - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); + shmem_free_st(pages, file_inode(obj->base.filp)->i_mapping, + obj->mm.dirty, obj->mm.madv == I915_MADV_WILLNEED); obj->mm.dirty = false; - - sg_free_table(pages); - kfree(pages); } static void From patchwork Mon Sep 13 18:05:59 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F882C433FE for ; Mon, 13 Sep 2021 18:08:20 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 289E4610D1 for ; Mon, 13 Sep 2021 18:08:20 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 289E4610D1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 125AB6E21A; Mon, 13 Sep 2021 18:08:16 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 423386E216; Mon, 13 Sep 2021 18:08:15 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792512" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792512" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:15 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796973" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:13 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH 2/8] drm/ttm: add TTM_PAGE_FLAG_SHMEM Date: Mon, 13 Sep 2021 19:05:59 +0100 Message-Id: <20210913180605.2778493-2-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add new flag to indicate special shmem based tt, which can directly handle swapping itself, and should be visible to some shrinker. As part of this we should skip the ttm_pages_allocated accounting, since such tt objects should already be reachable, and potentially reclaimable by some shrinker, if under memory pressure, and so shouldn't directly count towards the swap "watermark" level. We also need to stop touching the page->mapping and page->index for such objects, like in ttm_tt_add_mapping, since shmem already uses these. Some drivers seems to depend on the tt mapping/index behaviour for their own purposes, so directly using shmem tt likely won't be usable there as-is. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König --- drivers/gpu/drm/ttm/ttm_bo_vm.c | 4 ++-- drivers/gpu/drm/ttm/ttm_tt.c | 10 +++++----- include/drm/ttm/ttm_tt.h | 1 + 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index f56be5bc0861..e2131c73dcb6 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -346,8 +346,8 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf, } else if (unlikely(!page)) { break; } - page->index = drm_vma_node_start(&bo->base.vma_node) + - page_offset; + if (!(bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM)) + page->index = drm_vma_node_start(&bo->base.vma_node) + page_offset; pfn = page_to_pfn(page); } diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index dae52433beeb..cc4815c1f505 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -293,7 +293,7 @@ static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm) { pgoff_t i; - if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM)) return; for (i = 0; i < ttm->num_pages; ++i) @@ -311,7 +311,7 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ttm_tt_is_populated(ttm)) return 0; - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { atomic_long_add(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_add(ttm->num_pages, @@ -349,7 +349,7 @@ int ttm_tt_populate(struct ttm_device *bdev, return 0; error: - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, @@ -364,7 +364,7 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm) pgoff_t i; struct page **page = ttm->pages; - if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM)) return; for (i = 0; i < ttm->num_pages; ++i) { @@ -384,7 +384,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) else ttm_pool_free(&bdev->pool, ttm); - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & (TTM_PAGE_FLAG_SG | TTM_PAGE_FLAG_SHMEM))) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index 89b15d673b22..20d550185065 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -42,6 +42,7 @@ struct ttm_operation_ctx; #define TTM_PAGE_FLAG_ZERO_ALLOC (1 << 6) #define TTM_PAGE_FLAG_SG (1 << 8) #define TTM_PAGE_FLAG_NO_RETRY (1 << 9) +#define TTM_PAGE_FLAG_SHMEM (1 << 10) #define TTM_PAGE_FLAG_PRIV_POPULATED (1 << 31) From patchwork Mon Sep 13 18:06:00 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9CE5CC433F5 for ; Mon, 13 Sep 2021 18:08:34 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6A220610D1 for ; Mon, 13 Sep 2021 18:08:34 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6A220610D1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 40CE26ECD4; Mon, 13 Sep 2021 18:08:22 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id EF7CB6E221; Mon, 13 Sep 2021 18:08:16 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792519" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792519" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:16 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796978" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:15 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH 3/8] drm/i915/ttm: add tt shmem backend Date: Mon, 13 Sep 2021 19:06:00 +0100 Message-Id: <20210913180605.2778493-3-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 + drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 14 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 214 ++++++++++++++++++--- 3 files changed, 206 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 48112b9d76df..561d6bd0a5c9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -618,6 +618,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj, enum intel_memory_type type); +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment); +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup); +void __shmem_writeback(size_t size, struct address_space *mapping); + #ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index ab02309f0752..81d8da2bede4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); } -static void shmem_free_st(struct sg_table *st, struct address_space *mapping, - bool dirty, bool backup) +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { struct sgt_iter sgt_iter; struct pagevec pvec; @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping, kfree(st); } -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, - size_t size, struct intel_memory_region *mr, - struct address_space *mapping, - unsigned int max_segment) +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) { const unsigned long page_count = size / PAGE_SIZE; unsigned long i; @@ -301,7 +301,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); } -static void __shmem_writeback(size_t size, struct address_space *mapping) +void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 59ca53a3ef6a..8f022c2889c0 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -37,6 +37,9 @@ * @ttm: The base TTM page vector. * @dev: The struct device used for dma mapping and unmapping. * @cached_st: The cached scatter-gather table. + * @obj: The GEM object. Should be valid while we have a valid bo->ttm. + * @filp: The shmem file, if using shmem backend. + * @backup: Swap out the pages when unpopulating, if using shmem backend. * * Note that DMA may be going on right up to the point where the page- * vector is unpopulated in delayed destroy. Hence keep the @@ -48,6 +51,9 @@ struct i915_ttm_tt { struct ttm_tt ttm; struct device *dev; struct sg_table *cached_st; + struct drm_i915_gem_object *obj; + struct file *filp; + bool backup; }; static const struct ttm_place sys_placement_flags = { @@ -167,12 +173,105 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj, placement->busy_placement = busy; } +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev); + struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM]; + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + const unsigned int max_segment = i915_sg_segment_size(); + const size_t size = ttm->num_pages << PAGE_SHIFT; + struct drm_i915_gem_object *obj = i915_tt->obj; + struct file *filp = i915_tt->filp; + struct sgt_iter sgt_iter; + struct sg_table *st; + struct page *page; + unsigned long i; + int err; + + GEM_BUG_ON(obj->mm.madv != I915_MADV_WILLNEED); + + if (!filp) { + struct address_space *mapping; + gfp_t mask; + + filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE); + if (IS_ERR(filp)) + return PTR_ERR(filp); + + mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; + + mapping = filp->f_mapping; + mapping_set_gfp_mask(mapping, mask); + GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); + + i915_tt->filp = filp; + } + + st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment); + if (IS_ERR(st)) + return PTR_ERR(st); + + err = dma_map_sg_attrs(i915_tt->dev, + st->sgl, st->nents, + PCI_DMA_BIDIRECTIONAL, + DMA_ATTR_SKIP_CPU_SYNC | + DMA_ATTR_NO_KERNEL_MAPPING | + DMA_ATTR_NO_WARN); + if (err <= 0) { + err = -EINVAL; + goto err_free_st; + } + + i = 0; + for_each_sgt_page(page, sgt_iter, st) + ttm->pages[i++] = page; + + if (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) + ttm->page_flags &= ~TTM_PAGE_FLAG_SWAPPED; + + i915_tt->cached_st = st; + return 0; + +err_free_st: + shmem_free_st(st, filp->f_mapping, false, false); + return err; +} + +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + struct drm_i915_gem_object *obj = i915_tt->obj; + bool backup = i915_tt->backup; + + if (obj->mm.madv == I915_MADV_DONTNEED) { + obj->mm.dirty = false; + GEM_BUG_ON(backup); + } + + dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl, + i915_tt->cached_st->nents, + PCI_DMA_BIDIRECTIONAL); + + shmem_free_st(i915_tt->cached_st, + file_inode(i915_tt->filp)->i_mapping, + obj->mm.dirty, backup); + i915_tt->cached_st = NULL; + + obj->mm.dirty = false; + + if (backup) + ttm->page_flags |= TTM_PAGE_FLAG_SWAPPED; +} + static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret; @@ -184,36 +283,60 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, man->use_tt) page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; - ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, - i915_ttm_select_tt_caching(obj)); - if (ret) { - kfree(i915_tt); - return NULL; - } + if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) + page_flags |= TTM_PAGE_FLAG_SHMEM; + + ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching); + if (ret) + goto err_free; i915_tt->dev = obj->base.dev->dev; + i915_tt->obj = obj; return &i915_tt->ttm; + +err_free: + kfree(i915_tt); + return NULL; +} + +static int i915_ttm_tt_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + + if (ttm->page_flags & TTM_PAGE_FLAG_SHMEM) + return i915_ttm_tt_shmem_populate(bdev, ttm, ctx); + + return ttm_pool_alloc(&bdev->pool, ttm, ctx); } static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); - if (i915_tt->cached_st) { - dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, - DMA_BIDIRECTIONAL, 0); - sg_free_table(i915_tt->cached_st); - kfree(i915_tt->cached_st); - i915_tt->cached_st = NULL; + if (ttm->page_flags & TTM_PAGE_FLAG_SHMEM) { + i915_ttm_tt_shmem_unpopulate(ttm); + } else { + if (i915_tt->cached_st) { + dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, + DMA_BIDIRECTIONAL, 0); + sg_free_table(i915_tt->cached_st); + kfree(i915_tt->cached_st); + i915_tt->cached_st = NULL; + } + ttm_pool_free(&bdev->pool, ttm); } - ttm_pool_free(&bdev->pool, ttm); } static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + if (i915_tt->filp) + fput(i915_tt->filp); + ttm_tt_fini(ttm); kfree(i915_tt); } @@ -223,6 +346,10 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo, { struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + if (place->mem_type == TTM_PL_SYSTEM && + bo->ttm && bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM) + return false; + /* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -316,9 +443,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) i915_gem_object_set_cache_coherency(obj, cache_level); } -static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static void i915_ttm_writeback(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false, @@ -326,18 +455,52 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj) struct ttm_placement place = {}; int ret; - if (obj->mm.madv == __I915_MADV_PURGED) + if (!bo->ttm || !(bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM)) return; - /* TTM's purge interface. Note that we might be reentering. */ + i915_tt->backup = true; ret = ttm_bo_validate(bo, &place, &ctx); - if (!ret) { - obj->write_domain = 0; - obj->read_domains = 0; - i915_ttm_adjust_gem_after_move(obj); - i915_ttm_free_cached_io_st(obj); - obj->mm.madv = __I915_MADV_PURGED; + i915_tt->backup = false; + if (ret) + return; + + __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping); +} + +static void i915_ttm_purge(struct drm_i915_gem_object *obj) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct ttm_operation_ctx ctx = { + .interruptible = true, + .no_wait_gpu = false, + }; + struct ttm_placement place = {}; + + if (obj->mm.madv == __I915_MADV_PURGED) + return; + + if (ttm_bo_validate(bo, &place, &ctx)) + return; + + if (bo->ttm && bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM) { + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + + GEM_BUG_ON(i915_tt->backup); + + if (i915_tt->filp) { + shmem_truncate_range(file_inode(i915_tt->filp), + 0, (loff_t)-1); + fput(i915_tt->filp); + i915_tt->filp = NULL; + } } + + obj->write_domain = 0; + obj->read_domains = 0; + i915_ttm_adjust_gem_after_move(obj); + i915_ttm_free_cached_io_st(obj); + obj->mm.madv = __I915_MADV_PURGED; } static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -606,6 +769,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo, static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create, + .ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -679,6 +843,8 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, return PTR_ERR(st); __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); + if (!bo->ttm || !(bo->ttm->page_flags & TTM_PAGE_FLAG_SHMEM)) + i915_gem_object_make_unshrinkable(obj); } return ret; @@ -875,9 +1041,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, .truncate = i915_ttm_purge, + .writeback = i915_ttm_writeback, + .adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate, + .mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, }; @@ -919,7 +1088,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, drm_gem_private_object_init(&i915->drm, &obj->base, size); i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags); i915_gem_object_init_memory_region(obj, mem); - i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device : From patchwork Mon Sep 13 18:06:01 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FD3BC433F5 for ; Mon, 13 Sep 2021 18:08:38 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 49747610D1 for ; Mon, 13 Sep 2021 18:08:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 49747610D1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B97996ECD1; Mon, 13 Sep 2021 18:08:21 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1D3736E220; Mon, 13 Sep 2021 18:08:18 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792526" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792526" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:17 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796981" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:16 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 4/8] drm/i915/ttm: use cached system pages when evicting lmem Date: Mon, 13 Sep 2021 19:06:01 +0100 Message-Id: <20210913180605.2778493-4-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This should let us do an accelerated copy directly to the shmem pages when temporarily moving lmem-only objects, where the i915-gem shrinker can later kick in to swap out the pages, if needed. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 8f022c2889c0..08894f6a296b 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -123,11 +123,11 @@ static enum ttm_caching i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj) { /* - * Objects only allowed in system get cached cpu-mappings. - * Other objects get WC mapping for now. Even if in system. + * Objects only allowed in system get cached cpu-mappings, or when + * evicting lmem-only buffers to system for swapping. Other objects get + * WC mapping for now. Even if in system. */ - if (obj->mm.region->type == INTEL_MEMORY_SYSTEM && - obj->mm.n_placements <= 1) + if (obj->mm.n_placements <= 1) return ttm_cached; return ttm_write_combined; From patchwork Mon Sep 13 18:06:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 574F1C433FE for ; Mon, 13 Sep 2021 18:08:32 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2D7A6610D1 for ; Mon, 13 Sep 2021 18:08:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2D7A6610D1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 414696E221; Mon, 13 Sep 2021 18:08:21 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5BA876E220; Mon, 13 Sep 2021 18:08:19 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792530" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792530" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:19 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796987" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:18 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 5/8] drm/i915: remember to call i915_sw_fence_fini Date: Mon, 13 Sep 2021 19:06:02 +0100 Message-Id: <20210913180605.2778493-5-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Fixes some object-debug splat which appeared while debugging something unrelated. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gt/intel_context.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 745e84c72c90..0e7dd12a0871 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -420,6 +420,7 @@ void intel_context_fini(struct intel_context *ce) mutex_destroy(&ce->pin_mutex); i915_active_fini(&ce->active); + i915_sw_fence_fini(&ce->guc_blocked); } void i915_context_module_exit(void) From patchwork Mon Sep 13 18:06:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8F99CC433EF for ; Mon, 13 Sep 2021 18:08:36 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5E93D60F23 for ; Mon, 13 Sep 2021 18:08:36 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 5E93D60F23 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D508B6ECD9; Mon, 13 Sep 2021 18:08:22 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id A6D106E221; Mon, 13 Sep 2021 18:08:20 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792532" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792532" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:20 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796990" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:19 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 6/8] drm/i915: try to simplify make_{un}shrinkable Date: Mon, 13 Sep 2021 19:06:03 +0100 Message-Id: <20210913180605.2778493-6-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable update the shrinker visible lists immediately. This at least simplifies the next patch, and does make the behaviour more obvious. The potential downside is that make_unshrinkable now grabs a global lock even when the object itself is no longer shrinkable(transitioning from purgeable <-> shrinkable doesn't seem to be a thing), for example in the ppGTT insertion paths we should now be careful not to needlessly call make_unshrinkable multiple times. Outside of that there is some fallout in intel_context which relies on nesting calls to shrink_pin. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 9 ---- .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 16 +----- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 52 +++++++++++++------ drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 9 +--- 7 files changed, 41 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 6fb9afb65034..e8265a432fcb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -305,15 +305,6 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj) */ atomic_inc(&i915->mm.free_count); - /* - * This serializes freeing with the shrinker. Since the free - * is delayed, first by RCU then by the workqueue, we want the - * shrinker to be able to free pages of unreferenced objects, - * or else we may oom whilst there are plenty of deferred - * freed objects. - */ - i915_gem_object_make_unshrinkable(obj); - /* * Since we require blocking on struct_mutex to unbind the freed * object from the GPU before releasing resources back to the diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index 2471f36aaff3..a035ac26a090 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -453,7 +453,6 @@ struct drm_i915_gem_object { * instead go through the pin/unpin interfaces. */ atomic_t pages_pin_count; - atomic_t shrink_pin; /** * Priority list of potential placements for this object. @@ -514,7 +513,7 @@ struct drm_i915_gem_object { struct i915_gem_object_page_iter get_dma_page; /** - * Element within i915->mm.unbound_list or i915->mm.bound_list, + * Element within i915->mm.shrink_list or i915->mm.purge_list, * locked by i915->mm.obj_lock. */ struct list_head link; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 8eb1c3a6fc9c..f0df1394d7f6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -64,28 +64,16 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj)); i915_gem_object_set_tiling_quirk(obj); GEM_BUG_ON(!list_empty(&obj->mm.link)); - atomic_inc(&obj->mm.shrink_pin); shrinkable = false; } if (shrinkable) { - struct list_head *list; - unsigned long flags; - assert_object_held(obj); - spin_lock_irqsave(&i915->mm.obj_lock, flags); - - i915->mm.shrink_count++; - i915->mm.shrink_memory += obj->base.size; if (obj->mm.madv != I915_MADV_WILLNEED) - list = &i915->mm.purge_list; + i915_gem_object_make_purgeable(obj); else - list = &i915->mm.shrink_list; - list_add_tail(&obj->mm.link, list); - - atomic_set(&obj->mm.shrink_pin, 0); - spin_unlock_irqrestore(&i915->mm.obj_lock, flags); + i915_gem_object_make_shrinkable(obj); } } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index e382b7f2353b..6b38e4414c5a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -455,23 +455,26 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, #define obj_to_i915(obj__) to_i915((obj__)->base.dev) +/** + * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By + * default all object types that support shrinking(see IS_SHRINKABLE), will also + * make the object visible to the shrinker after allocating the system memory + * pages. + * @obj: The GEM object. + * + * This is typically used for special kernel internal objects that can't be + * easily processed by the shrinker, like if they are perma-pinned. + */ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags; - /* - * We can only be called while the pages are pinned or when - * the pages are released. If pinned, we should only be called - * from a single caller under controlled conditions; and on release - * only one caller may release us. Neither the two may cross. - */ - if (atomic_add_unless(&obj->mm.shrink_pin, 1, 0)) + if (!i915_gem_object_is_shrinkable(obj)) return; spin_lock_irqsave(&i915->mm.obj_lock, flags); - if (!atomic_fetch_inc(&obj->mm.shrink_pin) && - !list_empty(&obj->mm.link)) { + if (!list_empty(&obj->mm.link)) { list_del_init(&obj->mm.link); i915->mm.shrink_count--; i915->mm.shrink_memory -= obj->base.size; @@ -489,28 +492,45 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, if (!i915_gem_object_is_shrinkable(obj)) return; - if (atomic_add_unless(&obj->mm.shrink_pin, -1, 1)) - return; - spin_lock_irqsave(&i915->mm.obj_lock, flags); + GEM_BUG_ON(!kref_read(&obj->base.refcount)); - if (atomic_dec_and_test(&obj->mm.shrink_pin)) { - GEM_BUG_ON(!list_empty(&obj->mm.link)); - list_add_tail(&obj->mm.link, head); + if (list_empty(&obj->mm.link)) { i915->mm.shrink_count++; i915->mm.shrink_memory += obj->base.size; - + list_add_tail(&obj->mm.link, head); + } else { + list_move_tail(&obj->mm.link, head); } + spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } + +/** + * i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, &obj_to_i915(obj)->mm.shrink_list); } +/** + * i915_gem_object_make_purgeable - Move the object to the tail of the purgeable + * list. Used with DONTNEED objects. Unlike with shrinkable objects, the + * shrinker will attempt to discard the backing pages, instead of trying to swap + * them out. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 1aee5e6b1b23..dd5ebc1659cf 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -185,7 +185,6 @@ static void gen6_alloc_va_range(struct i915_address_space *vm, pt = stash->pt[0]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base); fill32_px(pt, vm->scratch[0]->encode); diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 6a5af995f5b1..1b946978e377 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -301,7 +301,6 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm, pt = stash->pt[!!lvl]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base); fill_px(pt, vm->scratch[lvl]->encode); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 0e7dd12a0871..2eed28dfff9d 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -111,11 +111,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) if (err) goto err_unpin; - /* - * And mark it as a globally pinned object to let the shrinker know - * it cannot reclaim the object until we release it. - */ - i915_vma_make_unshrinkable(vma); vma->obj->mm.dirty = true; return 0; @@ -127,7 +122,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) static void __context_unpin_state(struct i915_vma *vma) { - i915_vma_make_shrinkable(vma); i915_active_release(&vma->active); __i915_vma_unpin(vma); } @@ -180,7 +174,6 @@ static int intel_context_pre_pin(struct intel_context *ce, if (err) goto err_timeline; - return 0; err_timeline: @@ -338,6 +331,8 @@ static void __intel_context_retire(struct i915_active *active) set_bit(CONTEXT_VALID_BIT, &ce->flags); intel_context_post_unpin(ce); + if (ce->state) + i915_vma_make_shrinkable(ce->state); intel_context_put(ce); } From patchwork Mon Sep 13 18:06:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F05C9C433F5 for ; Mon, 13 Sep 2021 18:08:46 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id C214260F23 for ; Mon, 13 Sep 2021 18:08:46 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org C214260F23 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C065A6ECDE; Mon, 13 Sep 2021 18:08:44 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4E7A46E216; Mon, 13 Sep 2021 18:08:43 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792536" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792536" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:21 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796994" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:20 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 7/8] drm/i915/ttm: make evicted shmem pages visible to the shrinker Date: Mon, 13 Sep 2021 19:06:04 +0100 Message-Id: <20210913180605.2778493-7-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We currently just evict lmem objects to system memory when under memory pressure. For this case we lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again. For such cases we can make the object visible as soon as we populate the TT with shmem pages, and then hide it again when doing the unpopulate. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++----- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 11 ++++++++ 3 files changed, 34 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 561d6bd0a5c9..28b831c78c47 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -540,6 +540,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj); static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 6b38e4414c5a..02175e8ad069 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -482,13 +482,12 @@ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } -static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, - struct list_head *head) +static void ___i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, + struct list_head *head) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags; - GEM_BUG_ON(!i915_gem_object_has_pages(obj)); if (!i915_gem_object_is_shrinkable(obj)) return; @@ -507,6 +506,21 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } +/** + * __i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * DO NOT USE. This is intended to be called on very special objects that don't + * yet have mm.pages, but are guaranteed to have potentially reclaimable pages + * underneath. + */ +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) +{ + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.shrink_list); +} /** * i915_gem_object_make_shrinkable - Move the object to the tail of the @@ -518,8 +532,8 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.shrink_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + __i915_gem_object_make_shrinkable(obj); } /** @@ -533,6 +547,7 @@ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.purge_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.purge_list); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 08894f6a296b..dbf06a2af8bf 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -231,6 +231,15 @@ static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev, if (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) ttm->page_flags &= ~TTM_PAGE_FLAG_SWAPPED; + /* + * Even if we lack mm.pages for this object(which will be the case when + * something is evicted to system memory by TTM), we still want to make + * this object visible to the shrinker, since the underlying ttm_tt + * still has the real shmem pages. When unpopulating the tt(possibly due + * to shrinking) we hide it again from the shrinker. + */ + __i915_gem_object_make_shrinkable(obj); + i915_tt->cached_st = st; return 0; @@ -245,6 +254,8 @@ static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) struct drm_i915_gem_object *obj = i915_tt->obj; bool backup = i915_tt->backup; + i915_gem_object_make_unshrinkable(obj); + if (obj->mm.madv == I915_MADV_DONTNEED) { obj->mm.dirty = false; GEM_BUG_ON(backup); From patchwork Mon Sep 13 18:06:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12490475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F6A5C433FE for ; Mon, 13 Sep 2021 18:08:48 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2DF52610E7 for ; Mon, 13 Sep 2021 18:08:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2DF52610E7 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 867C16E216; Mon, 13 Sep 2021 18:08:44 +0000 (UTC) Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by gabe.freedesktop.org (Postfix) with ESMTPS id A3EB86E216; Mon, 13 Sep 2021 18:08:43 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10106"; a="221792539" X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="221792539" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:22 -0700 X-IronPort-AV: E=Sophos;i="5.85,290,1624345200"; d="scan'208";a="551796996" Received: from jpcooney-mobl1.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.5.201]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 Sep 2021 11:08:21 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH 8/8] drm/i915/ttm: enable shmem tt backend Date: Mon, 13 Sep 2021 19:06:05 +0100 Message-Id: <20210913180605.2778493-8-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210913180605.2778493-1-matthew.auld@intel.com> References: <20210913180605.2778493-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Enable shmem tt backend, and enable shrinking. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index dbf06a2af8bf..e888111c68c4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -1048,6 +1048,7 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj) static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .name = "i915_gem_object_ttm", + .flags = I915_GEM_OBJECT_IS_SHRINKABLE, .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages,