From patchwork Mon Sep 27 11:41:02 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519717 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A0A2DC433F5 for ; Mon, 27 Sep 2021 11:42:16 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 69B3D60F4B for ; Mon, 27 Sep 2021 11:42:16 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 69B3D60F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5B0F389EAE; Mon, 27 Sep 2021 11:42:07 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 31B1089EAC; Mon, 27 Sep 2021 11:42:05 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610588" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610588" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:04 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553158952" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:01 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 01/13] drm/ttm: stop calling tt_swapin in vm_access Date: Mon, 27 Sep 2021 12:41:02 +0100 Message-Id: <20210927114114.152310-1-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" In commit: commit 09ac4fcb3f255e9225967c75f5893325c116cdbe Author: Felix Kuehling Date: Thu Jul 13 17:01:16 2017 -0400 drm/ttm: Implement vm_operations_struct.access v2 we added the vm_access hook, where we also directly call tt_swapin for some reason. If something is swapped-out then the ttm_tt must also be unpopulated, and since access_kmap should also call tt_populate, if needed, then swapping-in will already be handled there. If anything, calling tt_swapin directly here would likely always fail since the tt->pages won't yet be populated, or worse since the tt->pages array is never actually cleared in unpopulate this might lead to a nasty uaf. Fixes: 09ac4fcb3f25 ("drm/ttm: Implement vm_operations_struct.access v2") Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Thomas Hellström Reviewed-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo_vm.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index f56be5bc0861..5b9b7fd01a69 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -519,11 +519,6 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr, switch (bo->resource->mem_type) { case TTM_PL_SYSTEM: - if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { - ret = ttm_tt_swapin(bo->ttm); - if (unlikely(ret != 0)) - return ret; - } fallthrough; case TTM_PL_TT: ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write); From patchwork Mon Sep 27 11:41:03 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519715 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4F2BC433F5 for ; Mon, 27 Sep 2021 11:42:08 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 967BB60F4B for ; Mon, 27 Sep 2021 11:42:08 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 967BB60F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DAFFD89EAC; Mon, 27 Sep 2021 11:42:06 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 65DF889EAE; Mon, 27 Sep 2021 11:42:05 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610587" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610587" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:04 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553158973" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:03 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 02/13] drm/ttm: stop setting page->index for the ttm_tt Date: Mon, 27 Sep 2021 12:41:03 +0100 Message-Id: <20210927114114.152310-2-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" In commit: commit 58aa6622d32af7d2c08d45085f44c54554a16ed7 Author: Thomas Hellstrom Date: Fri Jan 3 11:47:23 2014 +0100 drm/ttm: Correctly set page mapping and -index members we started setting the page->mapping and page->index to point to the virtual address space, if the pages were faulted with TTM. Apparently this was needed for core-mm to able to reverse lookup the virtual address given the struct page, and potentially unmap it from the page tables. However as pointed out by Thomas, since we are now using PFN_MAP, instead of say PFN_MIXED, this should no longer be the case. There was also apparently some usecase in vmwgfx which needed this for dirty tracking, but that also doesn't appear to be the case anymore, as pointed out by Thomas. We still need keep the page->mapping for now, since that is still needed for different reasons, but we try to address that in the next patch. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 -- drivers/gpu/drm/ttm/ttm_tt.c | 4 +--- 2 files changed, 1 insertion(+), 5 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index 5b9b7fd01a69..9a2119fe4bdd 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -346,8 +346,6 @@ vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf, } else if (unlikely(!page)) { break; } - page->index = drm_vma_node_start(&bo->base.vma_node) + - page_offset; pfn = page_to_pfn(page); } diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index dae52433beeb..1cc04c224988 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -367,10 +367,8 @@ static void ttm_tt_clear_mapping(struct ttm_tt *ttm) if (ttm->page_flags & TTM_PAGE_FLAG_SG) return; - for (i = 0; i < ttm->num_pages; ++i) { + for (i = 0; i < ttm->num_pages; ++i) (*page)->mapping = NULL; - (*page++)->index = 0; - } } void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) From patchwork Mon Sep 27 11:41:04 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519721 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B3271C433FE for ; Mon, 27 Sep 2021 11:42:48 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 84C9F60F4B for ; Mon, 27 Sep 2021 11:42:48 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 84C9F60F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 03DA189FCA; Mon, 27 Sep 2021 11:42:12 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5201A89EAC; Mon, 27 Sep 2021 11:42:06 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610593" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610593" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:06 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553158991" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:04 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 03/13] drm/ttm: move ttm_tt_{add, clear}_mapping into amdgpu Date: Mon, 27 Sep 2021 12:41:04 +0100 Message-Id: <20210927114114.152310-3-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Now that setting page->index shouldn't be needed anymore, we are just left with setting page->mapping, and here it looks like amdgpu is the only user, where pointing the page->mapping at the dev_mapping is used to verify that the pages do indeed belong to the device, if userspace later tries to touch them. v2(Christian): - Drop the functions altogether and just inline modifying the page->mapping Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 15 ++++++++++++++- drivers/gpu/drm/ttm/ttm_tt.c | 25 ------------------------- 2 files changed, 14 insertions(+), 26 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 820fcb24231f..438377a89aa3 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -1119,6 +1119,8 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev, { struct amdgpu_device *adev = amdgpu_ttm_adev(bdev); struct amdgpu_ttm_tt *gtt = (void *)ttm; + pgoff_t i; + int ret; /* user pages are bound by amdgpu_ttm_tt_pin_userptr() */ if (gtt->userptr) { @@ -1131,7 +1133,14 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev, if (ttm->page_flags & TTM_PAGE_FLAG_SG) return 0; - return ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx); + ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx); + if (ret) + return ret; + + for (i = 0; i < ttm->num_pages; ++i) + ttm->pages[i]->mapping = bdev->dev_mapping; + + return 0; } /* @@ -1145,6 +1154,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev, { struct amdgpu_ttm_tt *gtt = (void *)ttm; struct amdgpu_device *adev; + pgoff_t i; amdgpu_ttm_backend_unbind(bdev, ttm); @@ -1158,6 +1168,9 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev, if (ttm->page_flags & TTM_PAGE_FLAG_SG) return; + for (i = 0; i < ttm->num_pages; ++i) + ttm->pages[i]->mapping = NULL; + adev = amdgpu_ttm_adev(bdev); return ttm_pool_free(&adev->mman.bdev.pool, ttm); } diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 1cc04c224988..980ecb079b2c 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -289,17 +289,6 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm, return ret; } -static void ttm_tt_add_mapping(struct ttm_device *bdev, struct ttm_tt *ttm) -{ - pgoff_t i; - - if (ttm->page_flags & TTM_PAGE_FLAG_SG) - return; - - for (i = 0; i < ttm->num_pages; ++i) - ttm->pages[i]->mapping = bdev->dev_mapping; -} - int ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm, struct ttm_operation_ctx *ctx) { @@ -336,7 +325,6 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ret) goto error; - ttm_tt_add_mapping(bdev, ttm); ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED; if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { ret = ttm_tt_swapin(ttm); @@ -359,24 +347,11 @@ int ttm_tt_populate(struct ttm_device *bdev, } EXPORT_SYMBOL(ttm_tt_populate); -static void ttm_tt_clear_mapping(struct ttm_tt *ttm) -{ - pgoff_t i; - struct page **page = ttm->pages; - - if (ttm->page_flags & TTM_PAGE_FLAG_SG) - return; - - for (i = 0; i < ttm->num_pages; ++i) - (*page)->mapping = NULL; -} - void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { if (!ttm_tt_is_populated(ttm)) return; - ttm_tt_clear_mapping(ttm); if (bdev->funcs->ttm_tt_unpopulate) bdev->funcs->ttm_tt_unpopulate(bdev, ttm); else From patchwork Mon Sep 27 11:41:05 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519725 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1C66C433F5 for ; Mon, 27 Sep 2021 11:42:56 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 75E5E60F4B for ; Mon, 27 Sep 2021 11:42:56 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 75E5E60F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 30C5689FD9; Mon, 27 Sep 2021 11:42:15 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id BA85289EB4; Mon, 27 Sep 2021 11:42:07 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610597" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610597" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:07 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159004" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:06 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 04/13] drm/ttm: remove TTM_PAGE_FLAG_NO_RETRY Date: Mon, 27 Sep 2021 12:41:05 +0100 Message-Id: <20210927114114.152310-4-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" No longer used it seems. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Christian König --- include/drm/ttm/ttm_tt.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index 89b15d673b22..842ce756213c 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -41,7 +41,6 @@ struct ttm_operation_ctx; #define TTM_PAGE_FLAG_SWAPPED (1 << 4) #define TTM_PAGE_FLAG_ZERO_ALLOC (1 << 6) #define TTM_PAGE_FLAG_SG (1 << 8) -#define TTM_PAGE_FLAG_NO_RETRY (1 << 9) #define TTM_PAGE_FLAG_PRIV_POPULATED (1 << 31) From patchwork Mon Sep 27 11:41:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519723 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 94AE4C433F5 for ; Mon, 27 Sep 2021 11:42:52 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6473160F91 for ; Mon, 27 Sep 2021 11:42:52 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6473160F91 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1BBD389FCE; Mon, 27 Sep 2021 11:42:12 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 68F5489EBD; Mon, 27 Sep 2021 11:42:09 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610601" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610601" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:09 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159018" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:07 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Christian_K=C3=B6nig?= , =?utf-8?q?Thom?= =?utf-8?q?as_Hellstr=C3=B6m?= Subject: [PATCH v5 05/13] drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/ Date: Mon, 27 Sep 2021 12:41:06 +0100 Message-Id: <20210927114114.152310-5-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" It covers more than just ttm_bo_type_sg usage, like with say dma-buf, since one other user is userptr in amdgpu, and in the future we might have some more. Hence EXTERNAL is likely a more suitable name. v2(Christian): - Rename these to TTM_TT_FLAGS_* - Fix up all the holes in the flag values Suggested-by: Christian König Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Acked-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +++++----- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 6 +++--- drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++-- drivers/gpu/drm/radeon/radeon_ttm.c | 8 ++++---- drivers/gpu/drm/ttm/ttm_bo.c | 4 ++-- drivers/gpu/drm/ttm/ttm_bo_util.c | 4 ++-- drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +- drivers/gpu/drm/ttm/ttm_pool.c | 2 +- drivers/gpu/drm/ttm/ttm_tt.c | 24 ++++++++++++------------ include/drm/ttm/ttm_device.h | 2 +- include/drm/ttm/ttm_tt.h | 18 +++++++++--------- 11 files changed, 42 insertions(+), 42 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 438377a89aa3..0cf94421665f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -894,7 +894,7 @@ static int amdgpu_ttm_backend_bind(struct ttm_device *bdev, DRM_ERROR("failed to pin userptr\n"); return r; } - } else if (ttm->page_flags & TTM_PAGE_FLAG_SG) { + } else if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) { if (!ttm->sg) { struct dma_buf_attachment *attach; struct sg_table *sgt; @@ -1130,7 +1130,7 @@ static int amdgpu_ttm_tt_populate(struct ttm_device *bdev, return 0; } - if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) return 0; ret = ttm_pool_alloc(&adev->mman.bdev.pool, ttm, ctx); @@ -1165,7 +1165,7 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_device *bdev, return; } - if (ttm->page_flags & TTM_PAGE_FLAG_SG) + if (ttm->page_flags & TTM_TT_FLAG_EXTERNAL) return; for (i = 0; i < ttm->num_pages; ++i) @@ -1198,8 +1198,8 @@ int amdgpu_ttm_tt_set_userptr(struct ttm_buffer_object *bo, return -ENOMEM; } - /* Set TTM_PAGE_FLAG_SG before populate but after create. */ - bo->ttm->page_flags |= TTM_PAGE_FLAG_SG; + /* Set TTM_TT_FLAG_EXTERNAL before populate but after create. */ + bo->ttm->page_flags |= TTM_TT_FLAG_EXTERNAL; gtt = (void *)bo->ttm; gtt->userptr = addr; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index b94497989995..a77e90f300fe 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -194,7 +194,7 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, if (obj->flags & I915_BO_ALLOC_CPU_CLEAR && man->use_tt) - page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + page_flags |= TTM_TT_FLAG_ZERO_ALLOC; ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, i915_ttm_select_tt_caching(obj)); @@ -562,7 +562,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, } /* Populate ttm with pages if needed. Typically system memory. */ - if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_PAGE_FLAG_SWAPPED))) { + if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_TT_FLAG_SWAPPED))) { ret = ttm_tt_populate(bo->bdev, ttm, ctx); if (ret) return ret; @@ -573,7 +573,7 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, return PTR_ERR(dst_st); clear = !cpu_maps_iomem(bo->resource) && (!ttm || !ttm_tt_is_populated(ttm)); - if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))) + if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))) __i915_ttm_move(bo, clear, dst_mem, bo->ttm, dst_st, true); ttm_bo_move_sync_cleanup(bo, dst_mem); diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c index d3b21d318b42..12b107acb6ee 100644 --- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -1250,7 +1250,7 @@ nouveau_ttm_tt_populate(struct ttm_device *bdev, struct ttm_tt *ttm_dma = (void *)ttm; struct nouveau_drm *drm; struct device *dev; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL); if (ttm_tt_is_populated(ttm)) return 0; @@ -1273,7 +1273,7 @@ nouveau_ttm_tt_unpopulate(struct ttm_device *bdev, { struct nouveau_drm *drm; struct device *dev; - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL); if (slave) return; diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c index 7793249bc549..11b21d605584 100644 --- a/drivers/gpu/drm/radeon/radeon_ttm.c +++ b/drivers/gpu/drm/radeon/radeon_ttm.c @@ -545,14 +545,14 @@ static int radeon_ttm_tt_populate(struct ttm_device *bdev, { struct radeon_device *rdev = radeon_get_rdev(bdev); struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm); - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL); if (gtt && gtt->userptr) { ttm->sg = kzalloc(sizeof(struct sg_table), GFP_KERNEL); if (!ttm->sg) return -ENOMEM; - ttm->page_flags |= TTM_PAGE_FLAG_SG; + ttm->page_flags |= TTM_TT_FLAG_EXTERNAL; return 0; } @@ -569,13 +569,13 @@ static void radeon_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm { struct radeon_device *rdev = radeon_get_rdev(bdev); struct radeon_ttm_tt *gtt = radeon_ttm_tt_to_gtt(rdev, ttm); - bool slave = !!(ttm->page_flags & TTM_PAGE_FLAG_SG); + bool slave = !!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL); radeon_ttm_tt_unbind(bdev, ttm); if (gtt && gtt->userptr) { kfree(ttm->sg); - ttm->page_flags &= ~TTM_PAGE_FLAG_SG; + ttm->page_flags &= ~TTM_TT_FLAG_EXTERNAL; return; } diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index 3b22c0013dbf..d62b2013c367 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1115,8 +1115,8 @@ int ttm_bo_swapout(struct ttm_buffer_object *bo, struct ttm_operation_ctx *ctx, return -EBUSY; if (!bo->ttm || !ttm_tt_is_populated(bo->ttm) || - bo->ttm->page_flags & TTM_PAGE_FLAG_SG || - bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED || + bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL || + bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED || !ttm_bo_get_unless_zero(bo)) { if (locked) dma_resv_unlock(bo->base.resv); diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c index c893c3db2623..a342d701c91c 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_util.c +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c @@ -147,7 +147,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, bool clear; int ret = 0; - if (ttm && ((ttm->page_flags & TTM_PAGE_FLAG_SWAPPED) || + if (ttm && ((ttm->page_flags & TTM_TT_FLAG_SWAPPED) || dst_man->use_tt)) { ret = ttm_tt_populate(bdev, ttm, ctx); if (ret) @@ -169,7 +169,7 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, } clear = src_iter->ops->maps_tt && (!ttm || !ttm_tt_is_populated(ttm)); - if (!(clear && ttm && !(ttm->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC))) + if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))) ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter); if (!src_iter->ops->maps_tt) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index 9a2119fe4bdd..950f4f132802 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -162,7 +162,7 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo, * Refuse to fault imported pages. This should be handled * (if at all) by redirecting mmap to the exporter. */ - if (bo->ttm && (bo->ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { dma_resv_unlock(bo->base.resv); return VM_FAULT_SIGBUS; } diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c index c961a788b519..1bba0a0ed3f9 100644 --- a/drivers/gpu/drm/ttm/ttm_pool.c +++ b/drivers/gpu/drm/ttm/ttm_pool.c @@ -371,7 +371,7 @@ int ttm_pool_alloc(struct ttm_pool *pool, struct ttm_tt *tt, WARN_ON(!num_pages || ttm_tt_is_populated(tt)); WARN_ON(dma_addr && !pool->dev); - if (tt->page_flags & TTM_PAGE_FLAG_ZERO_ALLOC) + if (tt->page_flags & TTM_TT_FLAG_ZERO_ALLOC) gfp_flags |= __GFP_ZERO; if (ctx->gfp_retry_mayfail) diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 980ecb079b2c..86f31fde6e35 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -68,12 +68,12 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc) switch (bo->type) { case ttm_bo_type_device: if (zero_alloc) - page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + page_flags |= TTM_TT_FLAG_ZERO_ALLOC; break; case ttm_bo_type_kernel: break; case ttm_bo_type_sg: - page_flags |= TTM_PAGE_FLAG_SG; + page_flags |= TTM_TT_FLAG_EXTERNAL; break; default: pr_err("Illegal buffer object type\n"); @@ -156,7 +156,7 @@ EXPORT_SYMBOL(ttm_tt_init); void ttm_tt_fini(struct ttm_tt *ttm) { - WARN_ON(ttm->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED); + WARN_ON(ttm->page_flags & TTM_TT_FLAG_PRIV_POPULATED); if (ttm->swap_storage) fput(ttm->swap_storage); @@ -178,7 +178,7 @@ int ttm_sg_tt_init(struct ttm_tt *ttm, struct ttm_buffer_object *bo, ttm_tt_init_fields(ttm, bo, page_flags, caching); - if (page_flags & TTM_PAGE_FLAG_SG) + if (page_flags & TTM_TT_FLAG_EXTERNAL) ret = ttm_sg_tt_alloc_page_directory(ttm); else ret = ttm_dma_tt_alloc_page_directory(ttm); @@ -224,7 +224,7 @@ int ttm_tt_swapin(struct ttm_tt *ttm) fput(swap_storage); ttm->swap_storage = NULL; - ttm->page_flags &= ~TTM_PAGE_FLAG_SWAPPED; + ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; return 0; @@ -279,7 +279,7 @@ int ttm_tt_swapout(struct ttm_device *bdev, struct ttm_tt *ttm, ttm_tt_unpopulate(bdev, ttm); ttm->swap_storage = swap_storage; - ttm->page_flags |= TTM_PAGE_FLAG_SWAPPED; + ttm->page_flags |= TTM_TT_FLAG_SWAPPED; return ttm->num_pages; @@ -300,7 +300,7 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ttm_tt_is_populated(ttm)) return 0; - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_add(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_add(ttm->num_pages, @@ -325,8 +325,8 @@ int ttm_tt_populate(struct ttm_device *bdev, if (ret) goto error; - ttm->page_flags |= TTM_PAGE_FLAG_PRIV_POPULATED; - if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { + ttm->page_flags |= TTM_TT_FLAG_PRIV_POPULATED; + if (unlikely(ttm->page_flags & TTM_TT_FLAG_SWAPPED)) { ret = ttm_tt_swapin(ttm); if (unlikely(ret != 0)) { ttm_tt_unpopulate(bdev, ttm); @@ -337,7 +337,7 @@ int ttm_tt_populate(struct ttm_device *bdev, return 0; error: - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, @@ -357,14 +357,14 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) else ttm_pool_free(&bdev->pool, ttm); - if (!(ttm->page_flags & TTM_PAGE_FLAG_SG)) { + if (!(ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { atomic_long_sub(ttm->num_pages, &ttm_pages_allocated); if (bdev->pool.use_dma32) atomic_long_sub(ttm->num_pages, &ttm_dma32_pages_allocated); } - ttm->page_flags &= ~TTM_PAGE_FLAG_PRIV_POPULATED; + ttm->page_flags &= ~TTM_TT_FLAG_PRIV_POPULATED; } #ifdef CONFIG_DEBUG_FS diff --git a/include/drm/ttm/ttm_device.h b/include/drm/ttm/ttm_device.h index cbe03d45e883..0a4ddec78d8f 100644 --- a/include/drm/ttm/ttm_device.h +++ b/include/drm/ttm/ttm_device.h @@ -65,7 +65,7 @@ struct ttm_device_funcs { * ttm_tt_create * * @bo: The buffer object to create the ttm for. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * * Create a struct ttm_tt to back data with system memory pages. * No pages are actually allocated. diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index 842ce756213c..b023cd58ff38 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -38,17 +38,17 @@ struct ttm_resource; struct ttm_buffer_object; struct ttm_operation_ctx; -#define TTM_PAGE_FLAG_SWAPPED (1 << 4) -#define TTM_PAGE_FLAG_ZERO_ALLOC (1 << 6) -#define TTM_PAGE_FLAG_SG (1 << 8) +#define TTM_TT_FLAG_SWAPPED (1 << 0) +#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) +#define TTM_TT_FLAG_EXTERNAL (1 << 2) -#define TTM_PAGE_FLAG_PRIV_POPULATED (1 << 31) +#define TTM_TT_FLAG_PRIV_POPULATED (1 << 31) /** * struct ttm_tt * * @pages: Array of pages backing the data. - * @page_flags: see TTM_PAGE_FLAG_* + * @page_flags: see TTM_TT_FLAG_* * @num_pages: Number of pages in the page array. * @sg: for SG objects via dma-buf * @dma_address: The DMA (bus) addresses of the pages @@ -84,7 +84,7 @@ struct ttm_kmap_iter_tt { static inline bool ttm_tt_is_populated(struct ttm_tt *tt) { - return tt->page_flags & TTM_PAGE_FLAG_PRIV_POPULATED; + return tt->page_flags & TTM_TT_FLAG_PRIV_POPULATED; } /** @@ -103,7 +103,7 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc); * * @ttm: The struct ttm_tt. * @bo: The buffer object we create the ttm for. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * @caching: the desired caching state of the pages * * Create a struct ttm_tt to back data with system memory pages. @@ -178,7 +178,7 @@ void ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm); */ static inline void ttm_tt_mark_for_clear(struct ttm_tt *ttm) { - ttm->page_flags |= TTM_PAGE_FLAG_ZERO_ALLOC; + ttm->page_flags |= TTM_TT_FLAG_ZERO_ALLOC; } void ttm_tt_mgr_init(unsigned long num_pages, unsigned long num_dma32_pages); @@ -194,7 +194,7 @@ struct ttm_kmap_iter *ttm_kmap_iter_tt_init(struct ttm_kmap_iter_tt *iter_tt, * * @bo: Buffer object we allocate the ttm for. * @bridge: The agp bridge this device is sitting on. - * @page_flags: Page flags as identified by TTM_PAGE_FLAG_XX flags. + * @page_flags: Page flags as identified by TTM_TT_FLAG_XX flags. * * * Create a TTM backend that uses the indicated AGP bridge as an aperture From patchwork Mon Sep 27 11:41:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519719 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8EBEEC433EF for ; Mon, 27 Sep 2021 11:42:43 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 642A560F6C for ; Mon, 27 Sep 2021 11:42:43 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 642A560F6C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F3E5689FD1; Mon, 27 Sep 2021 11:42:13 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id DD25789FC8; Mon, 27 Sep 2021 11:42:10 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610606" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610606" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:10 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159031" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:09 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 06/13] drm/ttm: add some kernel-doc for TTM_TT_FLAG_* Date: Mon, 27 Sep 2021 12:41:07 +0100 Message-Id: <20210927114114.152310-6-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Move it to inline kernel-doc, otherwise we can't add empty lines it seems. Also drop the kernel-doc for pages_list, which doesn't seem to exist. v2(Christian): - Add a note that FLAG_SWAPPED shouldn't need to be touched by drivers. - Mention what FLAG_POPULATED does. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Christian König --- include/drm/ttm/ttm_tt.h | 60 +++++++++++++++++++++++++++------------- 1 file changed, 41 insertions(+), 19 deletions(-) diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index b023cd58ff38..86d74069be3e 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -38,35 +38,57 @@ struct ttm_resource; struct ttm_buffer_object; struct ttm_operation_ctx; -#define TTM_TT_FLAG_SWAPPED (1 << 0) -#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) -#define TTM_TT_FLAG_EXTERNAL (1 << 2) - -#define TTM_TT_FLAG_PRIV_POPULATED (1 << 31) - /** - * struct ttm_tt - * - * @pages: Array of pages backing the data. - * @page_flags: see TTM_TT_FLAG_* - * @num_pages: Number of pages in the page array. - * @sg: for SG objects via dma-buf - * @dma_address: The DMA (bus) addresses of the pages - * @swap_storage: Pointer to shmem struct file for swap storage. - * @pages_list: used by some page allocation backend - * @caching: The current caching state of the pages, see enum ttm_caching. - * - * This is a structure holding the pages, caching- and aperture binding - * status for a buffer object that isn't backed by fixed (VRAM / AGP) + * struct ttm_tt - This is a structure holding the pages, caching- and aperture + * binding status for a buffer object that isn't backed by fixed (VRAM / AGP) * memory. */ struct ttm_tt { + /** @pages: Array of pages backing the data. */ struct page **pages; + /** + * @page_flags: The page flags. + * + * Supported values: + * + * TTM_TT_FLAG_SWAPPED: Set by TTM when the pages have been unpopulated + * and swapped out by TTM. Calling ttm_tt_populate() will then swap the + * pages back in, and unset the flag. Drivers should in general never + * need to touch this. + * + * TTM_TT_FLAG_ZERO_ALLOC: Set if the pages will be zeroed on + * allocation. + * + * TTM_TT_FLAG_EXTERNAL: Set if the underlying pages were allocated + * externally, like with dma-buf or userptr. This effectively disables + * TTM swapping out such pages. Also important is to prevent TTM from + * ever directly mapping these pages. + * + * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable + * this flag. + * + * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is + * set by TTM after ttm_tt_populate() has successfully returned, and is + * then unset when TTM calls ttm_tt_unpopulate(). + */ +#define TTM_TT_FLAG_SWAPPED (1 << 0) +#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) +#define TTM_TT_FLAG_EXTERNAL (1 << 2) + +#define TTM_TT_FLAG_PRIV_POPULATED (1 << 31) uint32_t page_flags; + /** @num_pages: Number of pages in the page array. */ uint32_t num_pages; + /** @sg: for SG objects via dma-buf. */ struct sg_table *sg; + /** @dma_address: The DMA (bus) addresses of the pages. */ dma_addr_t *dma_address; + /** @swap_storage: Pointer to shmem struct file for swap storage. */ struct file *swap_storage; + /** + * @caching: The current caching state of the pages, see enum + * ttm_caching. + */ enum ttm_caching caching; }; From patchwork Mon Sep 27 11:41:08 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519731 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40B84C433EF for ; Mon, 27 Sep 2021 11:43:15 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F3F3860F4B for ; Mon, 27 Sep 2021 11:43:14 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org F3F3860F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0574F6E846; Mon, 27 Sep 2021 11:42:21 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id AB27289FD3; Mon, 27 Sep 2021 11:42:12 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610612" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610612" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:12 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159040" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:10 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 07/13] drm/ttm: add TTM_TT_FLAG_EXTERNAL_MAPPABLE Date: Mon, 27 Sep 2021 12:41:08 +0100 Message-Id: <20210927114114.152310-7-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" In commit: commit 667a50db0477d47fdff01c666f5ee1ce26b5264c Author: Thomas Hellstrom Date: Fri Jan 3 11:17:18 2014 +0100 drm/ttm: Refuse to fault (prime-) imported pages we introduced the restriction that imported pages should not be directly mappable through TTM(this also extends to userptr). In the next patch we want to introduce a shmem_tt backend, which should follow all the existing rules with TTM_PAGE_FLAG_EXTERNAL, since it will need to handle swapping itself, but with the above mapping restriction lifted. v2(Christian): - Don't OR together EXTERNAL and EXTERNAL_MAPPABLE in the definition of EXTERNAL_MAPPABLE, just leave it the caller to handle this correctly, otherwise we might encounter subtle issues. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König --- drivers/gpu/drm/ttm/ttm_bo_vm.c | 6 ++++-- drivers/gpu/drm/ttm/ttm_tt.c | 3 +++ include/drm/ttm/ttm_tt.h | 19 ++++++++++++++++--- 3 files changed, 23 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c index 950f4f132802..33680c94127c 100644 --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c @@ -163,8 +163,10 @@ vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo, * (if at all) by redirecting mmap to the exporter. */ if (bo->ttm && (bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)) { - dma_resv_unlock(bo->base.resv); - return VM_FAULT_SIGBUS; + if (!(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE)) { + dma_resv_unlock(bo->base.resv); + return VM_FAULT_SIGBUS; + } } return 0; diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c index 86f31fde6e35..7e83c00a3f48 100644 --- a/drivers/gpu/drm/ttm/ttm_tt.c +++ b/drivers/gpu/drm/ttm/ttm_tt.c @@ -84,6 +84,9 @@ int ttm_tt_create(struct ttm_buffer_object *bo, bool zero_alloc) if (unlikely(bo->ttm == NULL)) return -ENOMEM; + WARN_ON(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL_MAPPABLE && + !(bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL)); + return 0; } diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h index 86d74069be3e..f20832139815 100644 --- a/include/drm/ttm/ttm_tt.h +++ b/include/drm/ttm/ttm_tt.h @@ -67,13 +67,26 @@ struct ttm_tt { * Note that enum ttm_bo_type.ttm_bo_type_sg objects will always enable * this flag. * + * TTM_TT_FLAG_EXTERNAL_MAPPABLE: Same behaviour as + * TTM_TT_FLAG_EXTERNAL, but with the reduced restriction that it is + * still valid to use TTM to map the pages directly. This is useful when + * implementing a ttm_tt backend which still allocates driver owned + * pages underneath(say with shmem). + * + * Note that since this also implies TTM_TT_FLAG_EXTERNAL, the usage + * here should always be: + * + * page_flags = TTM_TT_FLAG_EXTERNAL | + * TTM_TT_FLAG_EXTERNAL_MAPPABLE; + * * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is * set by TTM after ttm_tt_populate() has successfully returned, and is * then unset when TTM calls ttm_tt_unpopulate(). */ -#define TTM_TT_FLAG_SWAPPED (1 << 0) -#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) -#define TTM_TT_FLAG_EXTERNAL (1 << 2) +#define TTM_TT_FLAG_SWAPPED (1 << 0) +#define TTM_TT_FLAG_ZERO_ALLOC (1 << 1) +#define TTM_TT_FLAG_EXTERNAL (1 << 2) +#define TTM_TT_FLAG_EXTERNAL_MAPPABLE (1 << 3) #define TTM_TT_FLAG_PRIV_POPULATED (1 << 31) uint32_t page_flags; From patchwork Mon Sep 27 11:41:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519727 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B79ABC433EF for ; Mon, 27 Sep 2021 11:43:07 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 83D0E60F4B for ; Mon, 27 Sep 2021 11:43:07 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 83D0E60F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7BFE56E844; Mon, 27 Sep 2021 11:42:20 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id AB2C789FC8; Mon, 27 Sep 2021 11:42:13 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610614" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610614" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:13 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159055" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:12 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH v5 08/13] drm/i915/gem: Break out some shmem backend utils Date: Mon, 27 Sep 2021 12:41:09 +0100 Message-Id: <20210927114114.152310-8-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Thomas Hellström Break out some shmem backend utils for future reuse by the TTM backend: shmem_alloc_st(), shmem_free_st() and __shmem_writeback() which we can use to provide a shmem-backed TTM page pool for cached-only TTM buffer objects. Main functional change here is that we now compute the page sizes using the dma segments rather than using the physical page address segments. v2(Reported-by: kernel test robot ) - Make sure we initialise the mapping on the error path in shmem_get_pages() Signed-off-by: Thomas Hellström Reviewed-by: Matthew Auld Signed-off-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 181 +++++++++++++--------- 1 file changed, 106 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 11f072193f3b..36b711ae9e28 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,46 +25,61 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); } -static int shmem_get_pages(struct drm_i915_gem_object *obj) +static void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { - struct drm_i915_private *i915 = to_i915(obj->base.dev); - struct intel_memory_region *mem = obj->mm.region; - const unsigned long page_count = obj->base.size / PAGE_SIZE; + struct sgt_iter sgt_iter; + struct pagevec pvec; + struct page *page; + + mapping_clear_unevictable(mapping); + + pagevec_init(&pvec); + for_each_sgt_page(page, sgt_iter, st) { + if (dirty) + set_page_dirty(page); + + if (backup) + mark_page_accessed(page); + + if (!pagevec_add(&pvec, page)) + check_release_pagevec(&pvec); + } + if (pagevec_count(&pvec)) + check_release_pagevec(&pvec); + + sg_free_table(st); + kfree(st); +} + +static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) +{ + const unsigned long page_count = size / PAGE_SIZE; unsigned long i; - struct address_space *mapping; struct sg_table *st; struct scatterlist *sg; - struct sgt_iter sgt_iter; struct page *page; unsigned long last_pfn = 0; /* suppress gcc warning */ - unsigned int max_segment = i915_sg_segment_size(); - unsigned int sg_page_sizes; gfp_t noreclaim; int ret; - /* - * Assert that the object is not currently in any GPU domain. As it - * wasn't in the GTT, there shouldn't be any way it could have been in - * a GPU cache - */ - GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); - GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); - /* * If there's no chance of allocating enough pages for the whole * object, bail early. */ - if (obj->base.size > resource_size(&mem->region)) - return -ENOMEM; + if (size > resource_size(&mr->region)) + return ERR_PTR(-ENOMEM); st = kmalloc(sizeof(*st), GFP_KERNEL); if (!st) - return -ENOMEM; + return ERR_PTR(-ENOMEM); -rebuild_st: if (sg_alloc_table(st, page_count, GFP_KERNEL)) { kfree(st); - return -ENOMEM; + return ERR_PTR(-ENOMEM); } /* @@ -73,14 +88,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * * Fail silently without starting the shrinker */ - mapping = obj->base.filp->f_mapping; mapping_set_unevictable(mapping); noreclaim = mapping_gfp_constraint(mapping, ~__GFP_RECLAIM); noreclaim |= __GFP_NORETRY | __GFP_NOWARN; sg = st->sgl; st->nents = 0; - sg_page_sizes = 0; for (i = 0; i < page_count; i++) { const unsigned int shrink[] = { I915_SHRINK_BOUND | I915_SHRINK_UNBOUND, @@ -135,10 +148,9 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (!i || sg->length >= max_segment || page_to_pfn(page) != last_pfn + 1) { - if (i) { - sg_page_sizes |= sg->length; + if (i) sg = sg_next(sg); - } + st->nents++; sg_set_page(sg, page, PAGE_SIZE, 0); } else { @@ -149,14 +161,65 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) /* Check that the i965g/gm workaround works. */ GEM_BUG_ON(gfp & __GFP_DMA32 && last_pfn >= 0x00100000UL); } - if (sg) { /* loop terminated early; short sg table */ - sg_page_sizes |= sg->length; + if (sg) /* loop terminated early; short sg table */ sg_mark_end(sg); - } /* Trim unused sg entries to avoid wasting memory. */ i915_sg_trim(st); + return st; +err_sg: + sg_mark_end(sg); + if (sg != st->sgl) { + shmem_free_st(st, mapping, false, false); + } else { + mapping_clear_unevictable(mapping); + sg_free_table(st); + kfree(st); + } + + /* + * shmemfs first checks if there is enough memory to allocate the page + * and reports ENOSPC should there be insufficient, along with the usual + * ENOMEM for a genuine allocation failure. + * + * We use ENOSPC in our driver to mean that we have run out of aperture + * space and so want to translate the error from shmemfs back to our + * usual understanding of ENOMEM. + */ + if (ret == -ENOSPC) + ret = -ENOMEM; + + return ERR_PTR(ret); +} + +static int shmem_get_pages(struct drm_i915_gem_object *obj) +{ + struct drm_i915_private *i915 = to_i915(obj->base.dev); + struct intel_memory_region *mem = obj->mm.region; + struct address_space *mapping = obj->base.filp->f_mapping; + const unsigned long page_count = obj->base.size / PAGE_SIZE; + unsigned int max_segment = i915_sg_segment_size(); + struct sg_table *st; + struct sgt_iter sgt_iter; + struct page *page; + int ret; + + /* + * Assert that the object is not currently in any GPU domain. As it + * wasn't in the GTT, there shouldn't be any way it could have been in + * a GPU cache + */ + GEM_BUG_ON(obj->read_domains & I915_GEM_GPU_DOMAINS); + GEM_BUG_ON(obj->write_domain & I915_GEM_GPU_DOMAINS); + +rebuild_st: + st = shmem_alloc_st(i915, obj->base.size, mem, mapping, max_segment); + if (IS_ERR(st)) { + ret = PTR_ERR(st); + goto err_st; + } + ret = i915_gem_gtt_prepare_pages(obj, st); if (ret) { /* @@ -168,6 +231,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) for_each_sgt_page(page, sgt_iter, st) put_page(page); sg_free_table(st); + kfree(st); max_segment = PAGE_SIZE; goto rebuild_st; @@ -200,28 +264,12 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) if (IS_JSL_EHL(i915) && obj->flags & I915_BO_ALLOC_USER) obj->cache_dirty = true; - __i915_gem_object_set_pages(obj, st, sg_page_sizes); + __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); return 0; -err_sg: - sg_mark_end(sg); err_pages: - mapping_clear_unevictable(mapping); - if (sg != st->sgl) { - struct pagevec pvec; - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, st) { - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); - } - sg_free_table(st); - kfree(st); - + shmem_free_st(st, mapping, false, false); /* * shmemfs first checks if there is enough memory to allocate the page * and reports ENOSPC should there be insufficient, along with the usual @@ -231,6 +279,7 @@ static int shmem_get_pages(struct drm_i915_gem_object *obj) * space and so want to translate the error from shmemfs back to our * usual understanding of ENOMEM. */ +err_st: if (ret == -ENOSPC) ret = -ENOMEM; @@ -251,10 +300,8 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); } -static void -shmem_writeback(struct drm_i915_gem_object *obj) +static void __shmem_writeback(size_t size, struct address_space *mapping) { - struct address_space *mapping; struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, .nr_to_write = SWAP_CLUSTER_MAX, @@ -270,10 +317,9 @@ shmem_writeback(struct drm_i915_gem_object *obj) * instead of invoking writeback so they are aged and paged out * as normal. */ - mapping = obj->base.filp->f_mapping; /* Begin writeback on each dirty page */ - for (i = 0; i < obj->base.size >> PAGE_SHIFT; i++) { + for (i = 0; i < size >> PAGE_SHIFT; i++) { struct page *page; page = find_lock_page(mapping, i); @@ -296,6 +342,12 @@ shmem_writeback(struct drm_i915_gem_object *obj) } } +static void +shmem_writeback(struct drm_i915_gem_object *obj) +{ + __shmem_writeback(obj->base.size, obj->base.filp->f_mapping); +} + void __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages, @@ -316,11 +368,6 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj, void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_table *pages) { - struct sgt_iter sgt_iter; - struct pagevec pvec; - struct page *page; - - GEM_WARN_ON(IS_DGFX(to_i915(obj->base.dev))); __i915_gem_object_release_shmem(obj, pages, true); i915_gem_gtt_finish_pages(obj, pages); @@ -328,25 +375,9 @@ void i915_gem_object_put_pages_shmem(struct drm_i915_gem_object *obj, struct sg_ if (i915_gem_object_needs_bit17_swizzle(obj)) i915_gem_object_save_bit_17_swizzle(obj, pages); - mapping_clear_unevictable(file_inode(obj->base.filp)->i_mapping); - - pagevec_init(&pvec); - for_each_sgt_page(page, sgt_iter, pages) { - if (obj->mm.dirty) - set_page_dirty(page); - - if (obj->mm.madv == I915_MADV_WILLNEED) - mark_page_accessed(page); - - if (!pagevec_add(&pvec, page)) - check_release_pagevec(&pvec); - } - if (pagevec_count(&pvec)) - check_release_pagevec(&pvec); + shmem_free_st(pages, file_inode(obj->base.filp)->i_mapping, + obj->mm.dirty, obj->mm.madv == I915_MADV_WILLNEED); obj->mm.dirty = false; - - sg_free_table(pages); - kfree(pages); } static void From patchwork Mon Sep 27 11:41:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519729 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 594ECC43217 for ; Mon, 27 Sep 2021 11:43:11 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2ABFC60F6C for ; Mon, 27 Sep 2021 11:43:11 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2ABFC60F6C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D38136E845; Mon, 27 Sep 2021 11:42:20 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4971389FEC; Mon, 27 Sep 2021 11:42:15 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610622" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610622" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:15 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159059" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:13 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= , =?utf-8?q?Christian_K=C3=B6nig?= Subject: [PATCH v5 09/13] drm/i915/ttm: add tt shmem backend Date: Mon, 27 Sep 2021 12:41:10 +0100 Message-Id: <20210927114114.152310-9-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" For cached objects we can allocate our pages directly in shmem. This should make it possible(in a later patch) to utilise the existing i915-gem shrinker code for such objects. For now this is still disabled. v2(Thomas): - Add optional try_to_writeback hook for objects. Importantly we need to check if the object is even still shrinkable; in between us dropping the shrinker LRU lock and acquiring the object lock it could for example have been moved. Also we need to differentiate between "lazy" shrinking and the immediate writeback mode. Also later we need to handle objects which don't even have mm.pages, so bundling this into put_pages() would require somehow handling that edge case, hence just letting the ttm backend handle everything in try_to_writeback doesn't seem too bad. v3(Thomas): - Likely a bad idea to touch the object from the unpopulate hook, since it's not possible to hold a reference, without also creating circular dependency, so likely this is too fragile. For now just ensure we at least mark the pages as dirty/accessed when called from the shrinker on WILLNEED objects. - s/try_to_writeback/shrinker_release_pages, since this can do more than just writeback. - Get rid of do_backup boolean and just set the SWAPPED flag prior to calling unpopulate. - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since these just get skipped anyway. We can try to come up with something better later. Signed-off-by: Matthew Auld Cc: Thomas Hellström Cc: Christian König Reviewed-by: Thomas Hellström Signed-off-by: Matthew Auld --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 8 + .../gpu/drm/i915/gem/i915_gem_object_types.h | 2 + drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 14 +- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 17 +- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 240 ++++++++++++++++-- 5 files changed, 245 insertions(+), 36 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 3043fcbd31bd..1c9a1d8d3434 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -601,6 +601,14 @@ int i915_gem_object_wait_migration(struct drm_i915_gem_object *obj, bool i915_gem_object_placement_possible(struct drm_i915_gem_object *obj, enum intel_memory_type type); +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment); +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup); +void __shmem_writeback(size_t size, struct address_space *mapping); + #ifdef CONFIG_MMU_NOTIFIER static inline bool i915_gem_object_is_userptr(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index fa2ba9e2a4d0..f0fb17be2f7a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -56,6 +56,8 @@ struct drm_i915_gem_object_ops { struct sg_table *pages); void (*truncate)(struct drm_i915_gem_object *obj); void (*writeback)(struct drm_i915_gem_object *obj); + int (*shrinker_release_pages)(struct drm_i915_gem_object *obj, + bool should_writeback); int (*pread)(struct drm_i915_gem_object *obj, const struct drm_i915_gem_pread *arg); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c index 36b711ae9e28..19e55cc29a15 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shmem.c @@ -25,8 +25,8 @@ static void check_release_pagevec(struct pagevec *pvec) cond_resched(); } -static void shmem_free_st(struct sg_table *st, struct address_space *mapping, - bool dirty, bool backup) +void shmem_free_st(struct sg_table *st, struct address_space *mapping, + bool dirty, bool backup) { struct sgt_iter sgt_iter; struct pagevec pvec; @@ -52,10 +52,10 @@ static void shmem_free_st(struct sg_table *st, struct address_space *mapping, kfree(st); } -static struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, - size_t size, struct intel_memory_region *mr, - struct address_space *mapping, - unsigned int max_segment) +struct sg_table *shmem_alloc_st(struct drm_i915_private *i915, + size_t size, struct intel_memory_region *mr, + struct address_space *mapping, + unsigned int max_segment) { const unsigned long page_count = size / PAGE_SIZE; unsigned long i; @@ -300,7 +300,7 @@ shmem_truncate(struct drm_i915_gem_object *obj) obj->mm.pages = ERR_PTR(-EFAULT); } -static void __shmem_writeback(size_t size, struct address_space *mapping) +void __shmem_writeback(size_t size, struct address_space *mapping) { struct writeback_control wbc = { .sync_mode = WB_SYNC_NONE, diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index e382b7f2353b..cc80bd23d323 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -56,19 +56,24 @@ static bool unsafe_drop_pages(struct drm_i915_gem_object *obj, return false; } -static void try_to_writeback(struct drm_i915_gem_object *obj, - unsigned int flags) +static int try_to_writeback(struct drm_i915_gem_object *obj, unsigned int flags) { + if (obj->ops->shrinker_release_pages) + return obj->ops->shrinker_release_pages(obj, + flags & I915_SHRINK_WRITEBACK); + switch (obj->mm.madv) { case I915_MADV_DONTNEED: i915_gem_object_truncate(obj); - return; + return 0; case __I915_MADV_PURGED: - return; + return 0; } if (flags & I915_SHRINK_WRITEBACK) i915_gem_object_writeback(obj); + + return 0; } /** @@ -222,8 +227,8 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww, } if (!__i915_gem_object_put_pages(obj)) { - try_to_writeback(obj, shrink); - count += obj->base.size >> PAGE_SHIFT; + if (!try_to_writeback(obj, shrink)) + count += obj->base.size >> PAGE_SHIFT; } if (!ww) i915_gem_object_unlock(obj); diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index a77e90f300fe..c7402995a8f9 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -35,6 +35,8 @@ * @ttm: The base TTM page vector. * @dev: The struct device used for dma mapping and unmapping. * @cached_st: The cached scatter-gather table. + * @is_shmem: Set if using shmem. + * @filp: The shmem file, if using shmem backend. * * Note that DMA may be going on right up to the point where the page- * vector is unpopulated in delayed destroy. Hence keep the @@ -46,6 +48,9 @@ struct i915_ttm_tt { struct ttm_tt ttm; struct device *dev; struct sg_table *cached_st; + + bool is_shmem; + struct file *filp; }; static const struct ttm_place sys_placement_flags = { @@ -179,12 +184,90 @@ i915_ttm_placement_from_obj(const struct drm_i915_gem_object *obj, placement->busy_placement = busy; } +static int i915_ttm_tt_shmem_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct drm_i915_private *i915 = container_of(bdev, typeof(*i915), bdev); + struct intel_memory_region *mr = i915->mm.regions[INTEL_MEMORY_SYSTEM]; + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + const unsigned int max_segment = i915_sg_segment_size(); + const size_t size = ttm->num_pages << PAGE_SHIFT; + struct file *filp = i915_tt->filp; + struct sgt_iter sgt_iter; + struct sg_table *st; + struct page *page; + unsigned long i; + int err; + + if (!filp) { + struct address_space *mapping; + gfp_t mask; + + filp = shmem_file_setup("i915-shmem-tt", size, VM_NORESERVE); + if (IS_ERR(filp)) + return PTR_ERR(filp); + + mask = GFP_HIGHUSER | __GFP_RECLAIMABLE; + + mapping = filp->f_mapping; + mapping_set_gfp_mask(mapping, mask); + GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM)); + + i915_tt->filp = filp; + } + + st = shmem_alloc_st(i915, size, mr, filp->f_mapping, max_segment); + if (IS_ERR(st)) + return PTR_ERR(st); + + err = dma_map_sg_attrs(i915_tt->dev, + st->sgl, st->nents, + PCI_DMA_BIDIRECTIONAL, + DMA_ATTR_SKIP_CPU_SYNC | + DMA_ATTR_NO_KERNEL_MAPPING | + DMA_ATTR_NO_WARN); + if (err <= 0) { + err = -EINVAL; + goto err_free_st; + } + + i = 0; + for_each_sgt_page(page, sgt_iter, st) + ttm->pages[i++] = page; + + if (ttm->page_flags & TTM_TT_FLAG_SWAPPED) + ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + + i915_tt->cached_st = st; + return 0; + +err_free_st: + shmem_free_st(st, filp->f_mapping, false, false); + return err; +} + +static void i915_ttm_tt_shmem_unpopulate(struct ttm_tt *ttm) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + bool backup = ttm->page_flags & TTM_TT_FLAG_SWAPPED; + + dma_unmap_sg(i915_tt->dev, i915_tt->cached_st->sgl, + i915_tt->cached_st->nents, + PCI_DMA_BIDIRECTIONAL); + + shmem_free_st(fetch_and_zero(&i915_tt->cached_st), + file_inode(i915_tt->filp)->i_mapping, + backup, backup); +} + static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, uint32_t page_flags) { struct ttm_resource_manager *man = ttm_manager_type(bo->bdev, bo->resource->mem_type); struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + enum ttm_caching caching = i915_ttm_select_tt_caching(obj); struct i915_ttm_tt *i915_tt; int ret; @@ -196,36 +279,62 @@ static struct ttm_tt *i915_ttm_tt_create(struct ttm_buffer_object *bo, man->use_tt) page_flags |= TTM_TT_FLAG_ZERO_ALLOC; - ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, - i915_ttm_select_tt_caching(obj)); - if (ret) { - kfree(i915_tt); - return NULL; + if (i915_gem_object_is_shrinkable(obj) && caching == ttm_cached) { + page_flags |= TTM_TT_FLAG_EXTERNAL | + TTM_TT_FLAG_EXTERNAL_MAPPABLE; + i915_tt->is_shmem = true; } + ret = ttm_tt_init(&i915_tt->ttm, bo, page_flags, caching); + if (ret) + goto err_free; + i915_tt->dev = obj->base.dev->dev; return &i915_tt->ttm; + +err_free: + kfree(i915_tt); + return NULL; +} + +static int i915_ttm_tt_populate(struct ttm_device *bdev, + struct ttm_tt *ttm, + struct ttm_operation_ctx *ctx) +{ + struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + + if (i915_tt->is_shmem) + return i915_ttm_tt_shmem_populate(bdev, ttm, ctx); + + return ttm_pool_alloc(&bdev->pool, ttm, ctx); } static void i915_ttm_tt_unpopulate(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); - if (i915_tt->cached_st) { - dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, - DMA_BIDIRECTIONAL, 0); - sg_free_table(i915_tt->cached_st); - kfree(i915_tt->cached_st); - i915_tt->cached_st = NULL; + if (i915_tt->is_shmem) { + i915_ttm_tt_shmem_unpopulate(ttm); + } else { + if (i915_tt->cached_st) { + dma_unmap_sgtable(i915_tt->dev, i915_tt->cached_st, + DMA_BIDIRECTIONAL, 0); + sg_free_table(i915_tt->cached_st); + kfree(i915_tt->cached_st); + i915_tt->cached_st = NULL; + } + ttm_pool_free(&bdev->pool, ttm); } - ttm_pool_free(&bdev->pool, ttm); } static void i915_ttm_tt_destroy(struct ttm_device *bdev, struct ttm_tt *ttm) { struct i915_ttm_tt *i915_tt = container_of(ttm, typeof(*i915_tt), ttm); + if (i915_tt->filp) + fput(i915_tt->filp); + ttm_tt_fini(ttm); kfree(i915_tt); } @@ -235,6 +344,14 @@ static bool i915_ttm_eviction_valuable(struct ttm_buffer_object *bo, { struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo); + /* + * EXTERNAL objects should never be swapped out by TTM, instead we need + * to handle that ourselves. TTM will already skip such objects for us, + * but we would like to avoid grabbing locks for no good reason. + */ + if (bo->ttm && bo->ttm->page_flags & TTM_TT_FLAG_EXTERNAL) + return -EBUSY; + /* Will do for now. Our pinned objects are still on TTM's LRU lists */ return i915_gem_object_evictable(obj); } @@ -328,9 +445,11 @@ static void i915_ttm_adjust_gem_after_move(struct drm_i915_gem_object *obj) i915_gem_object_set_cache_coherency(obj, cache_level); } -static void i915_ttm_purge(struct drm_i915_gem_object *obj) +static int __i915_ttm_purge(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); struct ttm_operation_ctx ctx = { .interruptible = true, .no_wait_gpu = false, @@ -339,17 +458,79 @@ static void i915_ttm_purge(struct drm_i915_gem_object *obj) int ret; if (obj->mm.madv == __I915_MADV_PURGED) - return; + return 0; - /* TTM's purge interface. Note that we might be reentering. */ ret = ttm_bo_validate(bo, &place, &ctx); - if (!ret) { - obj->write_domain = 0; - obj->read_domains = 0; - i915_ttm_adjust_gem_after_move(obj); - i915_ttm_free_cached_io_st(obj); - obj->mm.madv = __I915_MADV_PURGED; + if (ret) + return ret; + + if (bo->ttm && i915_tt->filp) { + /* + * The below fput(which eventually calls shmem_truncate) might + * be delayed by worker, so when directly called to purge the + * pages(like by the shrinker) we should try to be more + * aggressive and release the pages immediately. + */ + shmem_truncate_range(file_inode(i915_tt->filp), + 0, (loff_t)-1); + fput(fetch_and_zero(&i915_tt->filp)); + } + + obj->write_domain = 0; + obj->read_domains = 0; + i915_ttm_adjust_gem_after_move(obj); + i915_ttm_free_cached_io_st(obj); + obj->mm.madv = __I915_MADV_PURGED; + return 0; +} + +static void i915_ttm_purge(struct drm_i915_gem_object *obj) +{ + __i915_ttm_purge(obj); +} + +static int i915_ttm_shrinker_release_pages(struct drm_i915_gem_object *obj, + bool should_writeback) +{ + struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + struct ttm_operation_ctx ctx = { + .interruptible = true, + .no_wait_gpu = false, + }; + struct ttm_placement place = {}; + int ret; + + if (!bo->ttm || bo->resource->mem_type != TTM_PL_SYSTEM) + return 0; + + GEM_BUG_ON(!i915_tt->is_shmem); + + if (!i915_tt->filp) + return 0; + + switch (obj->mm.madv) { + case I915_MADV_DONTNEED: + return __i915_ttm_purge(obj); + case __I915_MADV_PURGED: + return 0; + } + + if (bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED) + return 0; + + bo->ttm->page_flags |= TTM_TT_FLAG_SWAPPED; + ret = ttm_bo_validate(bo, &place, &ctx); + if (ret) { + bo->ttm->page_flags &= ~TTM_TT_FLAG_SWAPPED; + return ret; } + + if (should_writeback) + __shmem_writeback(obj->base.size, i915_tt->filp->f_mapping); + + return 0; } static void i915_ttm_swap_notify(struct ttm_buffer_object *bo) @@ -618,6 +799,7 @@ static unsigned long i915_ttm_io_mem_pfn(struct ttm_buffer_object *bo, static struct ttm_device_funcs i915_ttm_bo_driver = { .ttm_tt_create = i915_ttm_tt_create, + .ttm_tt_populate = i915_ttm_tt_populate, .ttm_tt_unpopulate = i915_ttm_tt_unpopulate, .ttm_tt_destroy = i915_ttm_tt_destroy, .eviction_valuable = i915_ttm_eviction_valuable, @@ -685,12 +867,17 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, } if (!i915_gem_object_has_pages(obj)) { + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); + /* Object either has a page vector or is an iomem object */ st = bo->ttm ? i915_ttm_tt_get_st(bo->ttm) : obj->ttm.cached_io_st; if (IS_ERR(st)) return PTR_ERR(st); __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); + if (!bo->ttm || !i915_tt->is_shmem) + i915_gem_object_make_unshrinkable(obj); } return ret; @@ -770,6 +957,8 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) { struct ttm_buffer_object *bo = i915_gem_to_ttm(obj); + struct i915_ttm_tt *i915_tt = + container_of(bo->ttm, typeof(*i915_tt), ttm); /* * Don't manipulate the TTM LRUs while in TTM bo destruction. @@ -782,7 +971,10 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) * Put on the correct LRU list depending on the MADV status */ spin_lock(&bo->bdev->lru_lock); - if (obj->mm.madv != I915_MADV_WILLNEED) { + if (bo->ttm && i915_tt->filp) { + /* Try to keep shmem_tt from being considered for shrinking. */ + bo->priority = TTM_MAX_BO_PRIORITY - 1; + } else if (obj->mm.madv != I915_MADV_WILLNEED) { bo->priority = I915_TTM_PRIO_PURGE; } else if (!i915_gem_object_has_pages(obj)) { if (bo->priority < I915_TTM_PRIO_HAS_PAGES) @@ -887,9 +1079,12 @@ static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages, .truncate = i915_ttm_purge, + .shrinker_release_pages = i915_ttm_shrinker_release_pages, + .adjust_lru = i915_ttm_adjust_lru, .delayed_free = i915_ttm_delayed_free, .migrate = i915_ttm_migrate, + .mmap_offset = i915_ttm_mmap_offset, .mmap_ops = &vm_ops_ttm, }; @@ -937,7 +1132,6 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem, drm_gem_private_object_init(&i915->drm, &obj->base, size); i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags); i915_gem_object_init_memory_region(obj, mem); - i915_gem_object_make_unshrinkable(obj); INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN); mutex_init(&obj->ttm.get_io_page.lock); bo_type = (obj->flags & I915_BO_ALLOC_USER) ? ttm_bo_type_device : From patchwork Mon Sep 27 11:41:11 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519735 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F00D8C433FE for ; Mon, 27 Sep 2021 11:43:21 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id BAB7060F4B for ; Mon, 27 Sep 2021 11:43:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org BAB7060F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3260D6E83D; Mon, 27 Sep 2021 11:42:27 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id E156689FF7; Mon, 27 Sep 2021 11:42:16 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610629" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610629" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:16 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159069" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:15 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH v5 10/13] drm/i915: try to simplify make_{un}shrinkable Date: Mon, 27 Sep 2021 12:41:11 +0100 Message-Id: <20210927114114.152310-10-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Drop the atomic shrink_pin stuff, and just have make_{un}shrinkable update the shrinker visible lists immediately. This at least simplifies the next patch, and does make the behaviour more obvious. The potential downside is that make_unshrinkable now grabs a global lock even when the object itself is no longer shrinkable(transitioning from purgeable <-> shrinkable doesn't seem to be a thing), for example in the ppGTT insertion paths we should now be careful not to needlessly call make_unshrinkable multiple times. Outside of that there is some fallout in intel_context which relies on nesting calls to shrink_pin. Signed-off-by: Matthew Auld Cc: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.c | 9 ---- .../gpu/drm/i915/gem/i915_gem_object_types.h | 3 +- drivers/gpu/drm/i915/gem/i915_gem_pages.c | 16 +----- drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 52 +++++++++++++------ drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 9 +--- 7 files changed, 41 insertions(+), 50 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.c b/drivers/gpu/drm/i915/gem/i915_gem_object.c index 6fb9afb65034..e8265a432fcb 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.c @@ -305,15 +305,6 @@ static void i915_gem_free_object(struct drm_gem_object *gem_obj) */ atomic_inc(&i915->mm.free_count); - /* - * This serializes freeing with the shrinker. Since the free - * is delayed, first by RCU then by the workqueue, we want the - * shrinker to be able to free pages of unreferenced objects, - * or else we may oom whilst there are plenty of deferred - * freed objects. - */ - i915_gem_object_make_unshrinkable(obj); - /* * Since we require blocking on struct_mutex to unbind the freed * object from the GPU before releasing resources back to the diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h index f0fb17be2f7a..e4f8a6774da8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -461,7 +461,6 @@ struct drm_i915_gem_object { * instead go through the pin/unpin interfaces. */ atomic_t pages_pin_count; - atomic_t shrink_pin; /** * Priority list of potential placements for this object. @@ -522,7 +521,7 @@ struct drm_i915_gem_object { struct i915_gem_object_page_iter get_dma_page; /** - * Element within i915->mm.unbound_list or i915->mm.bound_list, + * Element within i915->mm.shrink_list or i915->mm.purge_list, * locked by i915->mm.obj_lock. */ struct list_head link; diff --git a/drivers/gpu/drm/i915/gem/i915_gem_pages.c b/drivers/gpu/drm/i915/gem/i915_gem_pages.c index 8eb1c3a6fc9c..f0df1394d7f6 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_pages.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_pages.c @@ -64,28 +64,16 @@ void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj, GEM_BUG_ON(i915_gem_object_has_tiling_quirk(obj)); i915_gem_object_set_tiling_quirk(obj); GEM_BUG_ON(!list_empty(&obj->mm.link)); - atomic_inc(&obj->mm.shrink_pin); shrinkable = false; } if (shrinkable) { - struct list_head *list; - unsigned long flags; - assert_object_held(obj); - spin_lock_irqsave(&i915->mm.obj_lock, flags); - - i915->mm.shrink_count++; - i915->mm.shrink_memory += obj->base.size; if (obj->mm.madv != I915_MADV_WILLNEED) - list = &i915->mm.purge_list; + i915_gem_object_make_purgeable(obj); else - list = &i915->mm.shrink_list; - list_add_tail(&obj->mm.link, list); - - atomic_set(&obj->mm.shrink_pin, 0); - spin_unlock_irqrestore(&i915->mm.obj_lock, flags); + i915_gem_object_make_shrinkable(obj); } } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index cc80bd23d323..0440696f786a 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -460,23 +460,26 @@ void i915_gem_shrinker_taints_mutex(struct drm_i915_private *i915, #define obj_to_i915(obj__) to_i915((obj__)->base.dev) +/** + * i915_gem_object_make_unshrinkable - Hide the object from the shrinker. By + * default all object types that support shrinking(see IS_SHRINKABLE), will also + * make the object visible to the shrinker after allocating the system memory + * pages. + * @obj: The GEM object. + * + * This is typically used for special kernel internal objects that can't be + * easily processed by the shrinker, like if they are perma-pinned. + */ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags; - /* - * We can only be called while the pages are pinned or when - * the pages are released. If pinned, we should only be called - * from a single caller under controlled conditions; and on release - * only one caller may release us. Neither the two may cross. - */ - if (atomic_add_unless(&obj->mm.shrink_pin, 1, 0)) + if (!i915_gem_object_is_shrinkable(obj)) return; spin_lock_irqsave(&i915->mm.obj_lock, flags); - if (!atomic_fetch_inc(&obj->mm.shrink_pin) && - !list_empty(&obj->mm.link)) { + if (!list_empty(&obj->mm.link)) { list_del_init(&obj->mm.link); i915->mm.shrink_count--; i915->mm.shrink_memory -= obj->base.size; @@ -494,28 +497,45 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, if (!i915_gem_object_is_shrinkable(obj)) return; - if (atomic_add_unless(&obj->mm.shrink_pin, -1, 1)) - return; - spin_lock_irqsave(&i915->mm.obj_lock, flags); + GEM_BUG_ON(!kref_read(&obj->base.refcount)); - if (atomic_dec_and_test(&obj->mm.shrink_pin)) { - GEM_BUG_ON(!list_empty(&obj->mm.link)); - list_add_tail(&obj->mm.link, head); + if (list_empty(&obj->mm.link)) { i915->mm.shrink_count++; i915->mm.shrink_memory += obj->base.size; - + list_add_tail(&obj->mm.link, head); + } else { + list_move_tail(&obj->mm.link, head); } + spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } + +/** + * i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, &obj_to_i915(obj)->mm.shrink_list); } +/** + * i915_gem_object_make_purgeable - Move the object to the tail of the purgeable + * list. Used with DONTNEED objects. Unlike with shrinkable objects, the + * shrinker will attempt to discard the backing pages, instead of trying to swap + * them out. + * @obj: The GEM object. + * + * Should only be called on objects which have backing pages. + */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { __i915_gem_object_make_shrinkable(obj, diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 890191f286e3..baea9770200a 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -185,7 +185,6 @@ static void gen6_alloc_va_range(struct i915_address_space *vm, pt = stash->pt[0]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base); fill32_px(pt, vm->scratch[0]->encode); diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 037a9a6e4889..8af2f709571c 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -301,7 +301,6 @@ static void __gen8_ppgtt_alloc(struct i915_address_space * const vm, pt = stash->pt[!!lvl]; __i915_gem_object_pin_pages(pt->base); - i915_gem_object_make_unshrinkable(pt->base); fill_px(pt, vm->scratch[lvl]->encode); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index ff637147b1a9..1b7dc57e6ec1 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -111,11 +111,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) if (err) goto err_unpin; - /* - * And mark it as a globally pinned object to let the shrinker know - * it cannot reclaim the object until we release it. - */ - i915_vma_make_unshrinkable(vma); vma->obj->mm.dirty = true; return 0; @@ -127,7 +122,6 @@ static int __context_pin_state(struct i915_vma *vma, struct i915_gem_ww_ctx *ww) static void __context_unpin_state(struct i915_vma *vma) { - i915_vma_make_shrinkable(vma); i915_active_release(&vma->active); __i915_vma_unpin(vma); } @@ -180,7 +174,6 @@ static int intel_context_pre_pin(struct intel_context *ce, if (err) goto err_timeline; - return 0; err_timeline: @@ -338,6 +331,8 @@ static void __intel_context_retire(struct i915_active *active) set_bit(CONTEXT_VALID_BIT, &ce->flags); intel_context_post_unpin(ce); + if (ce->state) + i915_vma_make_shrinkable(ce->state); intel_context_put(ce); } From patchwork Mon Sep 27 11:41:12 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519737 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BBB0FC433F5 for ; Mon, 27 Sep 2021 11:43:30 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8ED3360F58 for ; Mon, 27 Sep 2021 11:43:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 8ED3360F58 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B2A6D6E84E; Mon, 27 Sep 2021 11:42:33 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id C1ABA89FEC; Mon, 27 Sep 2021 11:42:18 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610636" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610636" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:18 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159076" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:16 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH v5 11/13] drm/i915/ttm: make evicted shmem pages visible to the shrinker Date: Mon, 27 Sep 2021 12:41:12 +0100 Message-Id: <20210927114114.152310-11-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" We currently just evict lmem objects to system memory when under memory pressure, and in the next patch we want to use the shmem backend even for this case. For this case we lack the usual object mm.pages, which effectively hides the pages from the i915-gem shrinker, until we actually "attach" the TT to the object, or in the case of lmem-only objects it just gets migrated back to lmem when touched again. For all cases we can just adjust the i915 shrinker LRU each time we also adjust the TTM LRU. The two cases we care about are: 1) When something is moved by TTM, including when initially populating an object. Importantly this covers the case where TTM moves something from lmem <-> smem, outside of the normal get_pages() interface, which should still ensure the shmem pages underneath are reclaimable. 2) When calling into i915_gem_object_unlock(). The unlock should ensure the object is removed from the shinker LRU, if it was indeed swapped out, or just purged, when the shrinker drops the object lock. We can optimise this(if needed) by tracking if the object is already visible to the shrinker(protected by the object lock), so we don't touch the shrinker LRU more than needed. v2(Thomas) - Handle managing the shrinker LRU in adjust_lru, where it is always safe to touch the object. Signed-off-by: Matthew Auld Cc: Thomas Hellström Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_object.h | 1 + drivers/gpu/drm/i915/gem/i915_gem_shrinker.c | 29 +++++++++++++++----- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 28 +++++++++++++++---- 3 files changed, 46 insertions(+), 12 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h index 1c9a1d8d3434..640dfbf1f01e 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -523,6 +523,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj, void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj); void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj); static inline bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c index 0440696f786a..4b6b2bb6f180 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_shrinker.c @@ -487,13 +487,12 @@ void i915_gem_object_make_unshrinkable(struct drm_i915_gem_object *obj) spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } -static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, - struct list_head *head) +static void ___i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, + struct list_head *head) { struct drm_i915_private *i915 = obj_to_i915(obj); unsigned long flags; - GEM_BUG_ON(!i915_gem_object_has_pages(obj)); if (!i915_gem_object_is_shrinkable(obj)) return; @@ -512,6 +511,21 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, spin_unlock_irqrestore(&i915->mm.obj_lock, flags); } +/** + * __i915_gem_object_make_shrinkable - Move the object to the tail of the + * shrinkable list. Objects on this list might be swapped out. Used with + * WILLNEED objects. + * @obj: The GEM object. + * + * DO NOT USE. This is intended to be called on very special objects that don't + * yet have mm.pages, but are guaranteed to have potentially reclaimable pages + * underneath. + */ +void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) +{ + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.shrink_list); +} /** * i915_gem_object_make_shrinkable - Move the object to the tail of the @@ -523,8 +537,8 @@ static void __i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj, */ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.shrink_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + __i915_gem_object_make_shrinkable(obj); } /** @@ -538,6 +552,7 @@ void i915_gem_object_make_shrinkable(struct drm_i915_gem_object *obj) */ void i915_gem_object_make_purgeable(struct drm_i915_gem_object *obj) { - __i915_gem_object_make_shrinkable(obj, - &obj_to_i915(obj)->mm.purge_list); + GEM_BUG_ON(!i915_gem_object_has_pages(obj)); + ___i915_gem_object_make_shrinkable(obj, + &obj_to_i915(obj)->mm.purge_list); } diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index c7402995a8f9..194e5f1deda8 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -749,6 +749,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict, return ret; } + i915_ttm_adjust_lru(obj); + dst_st = i915_ttm_resource_get_st(obj, dst_mem); if (IS_ERR(dst_st)) return PTR_ERR(dst_st); @@ -856,7 +858,6 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, return i915_ttm_err_to_gem(ret); } - i915_ttm_adjust_lru(obj); if (bo->ttm && !ttm_tt_is_populated(bo->ttm)) { ret = ttm_tt_populate(bo->bdev, bo->ttm, &ctx); if (ret) @@ -876,10 +877,10 @@ static int __i915_ttm_get_pages(struct drm_i915_gem_object *obj, return PTR_ERR(st); __i915_gem_object_set_pages(obj, st, i915_sg_dma_sizes(st->sgl)); - if (!bo->ttm || !i915_tt->is_shmem) - i915_gem_object_make_unshrinkable(obj); } + i915_ttm_adjust_lru(obj); + return ret; } @@ -950,8 +951,6 @@ static void i915_ttm_put_pages(struct drm_i915_gem_object *obj, * If the object is not destroyed next, The TTM eviction logic * and shrinkers will move it out if needed. */ - - i915_ttm_adjust_lru(obj); } static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) @@ -967,6 +966,17 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) if (!kref_read(&bo->kref)) return; + /* + * Even if we lack mm.pages for this object(which will be the case when + * something is evicted to system memory by TTM), we still want to make + * this object visible to the shrinker, since the underlying ttm_tt + * still has the real shmem pages. + */ + if (bo->ttm && i915_tt->filp && ttm_tt_is_populated(bo->ttm)) + __i915_gem_object_make_shrinkable(obj); + else + i915_gem_object_make_unshrinkable(obj); + /* * Put on the correct LRU list depending on the MADV status */ @@ -1006,6 +1016,14 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj) static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj) { if (obj->ttm.created) { + /* + * We freely manage the shrinker LRU outide of the mm.pages life + * cycle. As a result when destroying the object it's up to us + * to ensure we remove it from the LRU, before we free the + * object. + */ + i915_gem_object_make_unshrinkable(obj); + ttm_bo_put(i915_gem_to_ttm(obj)); } else { __i915_gem_free_object(obj); From patchwork Mon Sep 27 11:41:13 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519733 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BD6FAC433EF for ; Mon, 27 Sep 2021 11:43:18 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 9002960F4B for ; Mon, 27 Sep 2021 11:43:18 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 9002960F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E5D1689FEC; Mon, 27 Sep 2021 11:42:22 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 13C166E841; Mon, 27 Sep 2021 11:42:20 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610642" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610642" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:19 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159092" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:18 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH v5 12/13] drm/i915/ttm: use cached system pages when evicting lmem Date: Mon, 27 Sep 2021 12:41:13 +0100 Message-Id: <20210927114114.152310-12-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" This should let us do an accelerated copy directly to the shmem pages when temporarily moving lmem-only objects, where the i915-gem shrinker can later kick in to swap out the pages, if needed. Signed-off-by: Matthew Auld Cc: Thomas Hellström Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 194e5f1deda8..46d57541c0b2 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -134,11 +134,11 @@ static enum ttm_caching i915_ttm_select_tt_caching(const struct drm_i915_gem_object *obj) { /* - * Objects only allowed in system get cached cpu-mappings. - * Other objects get WC mapping for now. Even if in system. + * Objects only allowed in system get cached cpu-mappings, or when + * evicting lmem-only buffers to system for swapping. Other objects get + * WC mapping for now. Even if in system. */ - if (obj->mm.region->type == INTEL_MEMORY_SYSTEM && - obj->mm.n_placements <= 1) + if (obj->mm.n_placements <= 1) return ttm_cached; return ttm_write_combined; From patchwork Mon Sep 27 11:41:14 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matthew Auld X-Patchwork-Id: 12519739 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A7C8C433EF for ; Mon, 27 Sep 2021 11:43:35 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 27D9860F4B for ; Mon, 27 Sep 2021 11:43:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 27D9860F4B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0DBB36E852; Mon, 27 Sep 2021 11:42:34 +0000 (UTC) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6D5C66E848; Mon, 27 Sep 2021 11:42:21 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10119"; a="204610647" X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="204610647" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:21 -0700 X-IronPort-AV: E=Sophos;i="5.85,326,1624345200"; d="scan'208";a="553159105" Received: from ajhome-mobl.ger.corp.intel.com (HELO mwauld-desk1.intel.com) ([10.252.19.222]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Sep 2021 04:42:20 -0700 From: Matthew Auld To: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org, =?utf-8?q?Thomas_Hellstr=C3=B6m?= Subject: [PATCH v5 13/13] drm/i915/ttm: enable shmem tt backend Date: Mon, 27 Sep 2021 12:41:14 +0100 Message-Id: <20210927114114.152310-13-matthew.auld@intel.com> X-Mailer: git-send-email 2.26.3 In-Reply-To: <20210927114114.152310-1-matthew.auld@intel.com> References: <20210927114114.152310-1-matthew.auld@intel.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Turn on the shmem tt backend, and enable shrinking. Signed-off-by: Matthew Auld Cc: Thomas Hellström Reviewed-by: Thomas Hellström --- drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c index 46d57541c0b2..4ae630fbc5cd 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_ttm.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_ttm.c @@ -1093,6 +1093,7 @@ static u64 i915_ttm_mmap_offset(struct drm_i915_gem_object *obj) static const struct drm_i915_gem_object_ops i915_gem_ttm_obj_ops = { .name = "i915_gem_object_ttm", + .flags = I915_GEM_OBJECT_IS_SHRINKABLE, .get_pages = i915_ttm_get_pages, .put_pages = i915_ttm_put_pages,