From patchwork Mon Oct 11 16:11:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66AEDC433FE for ; Mon, 11 Oct 2021 16:09:23 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2A8D360EB1 for ; Mon, 11 Oct 2021 16:09:23 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2A8D360EB1 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 41AAF6E58A; Mon, 11 Oct 2021 16:09:19 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 313526E58A; Mon, 11 Oct 2021 16:09:18 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056677" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056677" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:18 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477876" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:15 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Stuart Summers , Ramalingam C Date: Mon, 
11 Oct 2021 21:41:42 +0530 Message-Id: <20211011161155.6397-2-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 01/14] drm/i915: Add has_64k_pages flag X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Stuart Summers Add a new platform flag, has_64k_pages, for platforms supporting base page sizes of 64k. Signed-off-by: Stuart Summers Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_pci.c | 2 ++ drivers/gpu/drm/i915/intel_device_info.h | 1 + 3 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 12256218634f..a16fde38a252 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1714,6 +1714,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_MSLICES(dev_priv) \ (INTEL_INFO(dev_priv)->has_mslices) +#define HAS_64K_PAGES(dev_priv) (INTEL_INFO(dev_priv)->has_64k_pages) + #define HAS_IPC(dev_priv) (INTEL_INFO(dev_priv)->display.has_ipc) #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i)) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 169837de395d..8ef484a23652 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++ b/drivers/gpu/drm/i915/i915_pci.c @@ -1015,6 +1015,7 @@ static const struct intel_device_info xehpsdv_info = { DGFX_FEATURES, PLATFORM(INTEL_XEHPSDV), .display = { }, + .has_64k_pages = 1, .pipe_mask = 0, .platform_engine_mask = BIT(RCS0) | BIT(BCS0) | @@ -1033,6 +1034,7 @@ static const struct intel_device_info dg2_info = { .graphics_rel = 
55, .media_rel = 55, PLATFORM(INTEL_DG2), + .has_64k_pages = 1, .platform_engine_mask = BIT(RCS0) | BIT(BCS0) | BIT(VECS0) | BIT(VECS1) | diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index 8e6f48d1eb7b..dd453b96af19 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -123,6 +123,7 @@ enum intel_ppgtt_type { func(is_dgfx); \ /* Keep has_* in alphabetical order */ \ func(has_64bit_reloc); \ + func(has_64k_pages); \ func(gpu_reset_clobbers_display); \ func(has_reset_engine); \ func(has_global_mocs); \ From patchwork Mon Oct 11 16:11:43 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7A9E3C433F5 for ; Mon, 11 Oct 2021 16:09:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3A4E760EB6 for ; Mon, 11 Oct 2021 16:09:28 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 3A4E760EB6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 9BECC6E5A3; Mon, 11 Oct 2021 16:09:22 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id ABDAE6E5A3; Mon, 11 Oct 2021 16:09:21 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056693" 
X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056693" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:21 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477887" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:18 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C , Joonas Lahtinen , Rodrigo Vivi Date: Mon, 11 Oct 2021 21:41:43 +0530 Message-Id: <20211011161155.6397-3-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 02/14] drm/i915/xehpsdv: set min page-size to 64K X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld LMEM should be allocated at 64K granularity, since 4K page support will eventually be dropped for LMEM when using the PPGTT. 
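As a rough illustration of the granularity rule described above, the sketch below picks a region's minimum page size from a has_64k_pages-style flag and rounds an allocation size up to it. The names pick_min_page_size(), round_up_to_page() and the SKETCH_* constants are hypothetical stand-ins, not i915 symbols; the actual change is the diff in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for I915_GTT_PAGE_SIZE_4K / I915_GTT_PAGE_SIZE_64K. */
#define SKETCH_PAGE_SIZE_4K  (UINT64_C(1) << 12)
#define SKETCH_PAGE_SIZE_64K (UINT64_C(1) << 16)

/* Choose the minimum page size for a region from the platform flag. */
static inline uint64_t pick_min_page_size(bool has_64k_pages)
{
	return has_64k_pages ? SKETCH_PAGE_SIZE_64K : SKETCH_PAGE_SIZE_4K;
}

/* Round size up to the next multiple of page_size (must be a power of two). */
static inline uint64_t round_up_to_page(uint64_t size, uint64_t page_size)
{
	assert(page_size && !(page_size & (page_size - 1)));
	return (size + page_size - 1) & ~(page_size - 1);
}
```

Under these assumptions, a one-byte request on a 64K platform occupies a full 64K page, which is the cost of dropping 4K granularity for LMEM.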
Signed-off-by: Matthew Auld Signed-off-by: Stuart Summers Signed-off-by: Ramalingam C Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/gem/i915_gem_stolen.c | 6 +++++- drivers/gpu/drm/i915/gt/intel_region_lmem.c | 5 ++++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c index ddd37ccb1362..f52a06f05fc7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_stolen.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_stolen.c @@ -778,6 +778,7 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type, struct intel_uncore *uncore = &i915->uncore; struct pci_dev *pdev = to_pci_dev(i915->drm.dev); struct intel_memory_region *mem; + resource_size_t min_page_size; resource_size_t io_start; resource_size_t lmem_size; u64 lmem_base; @@ -789,8 +790,11 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type, lmem_size = pci_resource_len(pdev, 2) - lmem_base; io_start = pci_resource_start(pdev, 2) + lmem_base; + min_page_size = HAS_64K_PAGES(i915) ? 
I915_GTT_PAGE_SIZE_64K : + I915_GTT_PAGE_SIZE_4K; + mem = intel_memory_region_create(i915, lmem_base, lmem_size, - I915_GTT_PAGE_SIZE_4K, io_start, + min_page_size, io_start, type, instance, &i915_region_stolen_lmem_ops); if (IS_ERR(mem)) diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c index afb35d2e5c73..073d28d96669 100644 --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c @@ -193,6 +193,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt) struct intel_uncore *uncore = gt->uncore; struct pci_dev *pdev = to_pci_dev(i915->drm.dev); struct intel_memory_region *mem; + resource_size_t min_page_size; resource_size_t io_start; resource_size_t lmem_size; int err; @@ -207,10 +208,12 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt) if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2))) return ERR_PTR(-ENODEV); + min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K : + I915_GTT_PAGE_SIZE_4K; mem = intel_memory_region_create(i915, 0, lmem_size, - I915_GTT_PAGE_SIZE_4K, + min_page_size, io_start, INTEL_MEMORY_LOCAL, 0, From patchwork Mon Oct 11 16:11:44 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEE4DC433F5 for ; Mon, 11 Oct 2021 16:09:32 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id BDD6160551 for ; Mon, 11 Oct 2021 16:09:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 
BDD6160551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DBB886E849; Mon, 11 Oct 2021 16:09:26 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id F01776E88A; Mon, 11 Oct 2021 16:09:24 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056711" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056711" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:24 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477893" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:21 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C , Joonas Lahtinen , Rodrigo Vivi Date: Mon, 11 Oct 2021 21:41:44 +0530 Message-Id: <20211011161155.6397-4-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 03/14] drm/i915/xehpsdv: enforce min GTT alignment X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld For local-memory objects we need to align the GTT addresses to 64K, both for the ppgtt and ggtt. 
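The alignment rule above can be sketched as follows: the requested alignment is raised to 64K only when the platform needs 64K pages and the object lives in local memory. effective_gtt_alignment(), offset_is_aligned() and SKETCH_PAGE_SIZE_64K are hypothetical names used for illustration, and the requested alignment is assumed to be a nonzero power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for I915_GTT_PAGE_SIZE_64K. */
#define SKETCH_PAGE_SIZE_64K (UINT64_C(1) << 16)

/* Raise the alignment to 64K for local-memory objects on 64K platforms. */
static inline uint64_t effective_gtt_alignment(uint64_t requested,
					       bool has_64k_pages,
					       bool is_lmem)
{
	if (has_64k_pages && is_lmem && requested < SKETCH_PAGE_SIZE_64K)
		return SKETCH_PAGE_SIZE_64K;
	return requested;
}

/* Check a GTT offset against a power-of-two alignment. */
static inline bool offset_is_aligned(uint64_t offset, uint64_t alignment)
{
	assert(alignment && !(alignment & (alignment - 1)));
	return (offset & (alignment - 1)) == 0;
}
```

A larger caller-supplied alignment, such as 2M, is left untouched, which mirrors the max() in the diff of this patch.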
Signed-off-by: Matthew Auld Signed-off-by: Stuart Summers Signed-off-by: Ramalingam C Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/i915_vma.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 4b7fc4647e46..1ea1fa08efdf 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -670,8 +670,13 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) } color = 0; - if (vma->obj && i915_vm_has_cache_coloring(vma->vm)) - color = vma->obj->cache_level; + if (vma->obj) { + if (HAS_64K_PAGES(vma->vm->i915) && i915_gem_object_is_lmem(vma->obj)) + alignment = max(alignment, I915_GTT_PAGE_SIZE_64K); + + if (i915_vm_has_cache_coloring(vma->vm)) + color = vma->obj->cache_level; + } if (flags & PIN_OFFSET_FIXED) { u64 offset = flags & PIN_OFFSET_MASK; From patchwork Mon Oct 11 16:11:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 61367C433EF for ; Mon, 11 Oct 2021 16:09:39 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 26D0F60551 for ; Mon, 11 Oct 2021 16:09:39 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 26D0F60551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org 
(Postfix) with ESMTP id 905EF6E8E2; Mon, 11 Oct 2021 16:09:31 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2ACFC6E8DD; Mon, 11 Oct 2021 16:09:28 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056722" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056722" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:28 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477906" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:24 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C Date: Mon, 11 Oct 2021 21:41:45 +0530 Message-Id: <20211011161155.6397-5-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 04/14] drm/i915: enforce min page size for scratch X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld If the device needs 64K minimum GTT pages for device local-memory, like on XEHPSDV, then we need to fail the allocation if we can't meet it, instead of falling back to 4K pages, otherwise we can't safely support the insertion of device local-memory pages for this vm, since the HW expects the correct physical alignment and size for every PTE, if we mark the page-table as 64K GTT mode. 
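A minimal sketch of the fallback policy described above, assuming the address space first tries a 64K scratch page: when the platform requires 64K GTT pages, a failed 64K allocation is reported as -ENOMEM instead of retrying with 4K. sketch_setup_scratch(), the sketch_alloc_fn hook and the three example allocators are hypothetical and exist only to exercise the control flow; they are not i915 functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for I915_GTT_PAGE_SIZE_4K / I915_GTT_PAGE_SIZE_64K. */
#define SKETCH_PAGE_SIZE_4K  ((size_t)1 << 12)
#define SKETCH_PAGE_SIZE_64K ((size_t)1 << 16)

/* Hypothetical allocator hook: true if a page of this size was obtained. */
typedef bool (*sketch_alloc_fn)(size_t size);

/* Example allocators used to exercise the three interesting outcomes. */
static bool alloc_any(size_t size) { (void)size; return true; }
static bool alloc_only_4k(size_t size) { return size == SKETCH_PAGE_SIZE_4K; }
static bool alloc_none(size_t size) { (void)size; return false; }

/* Returns the scratch page size on success, or -ENOMEM on failure. */
static inline long sketch_setup_scratch(sketch_alloc_fn alloc, bool needs_64k)
{
	size_t size = SKETCH_PAGE_SIZE_64K;

	assert(alloc != NULL);

	for (;;) {
		if (alloc(size))
			return (long)size;

		/* Already at the smallest size: nothing left to try. */
		if (size == SKETCH_PAGE_SIZE_4K)
			return -ENOMEM;

		/* 64K is mandatory here, so do not fall back to 4K. */
		if (needs_64k)
			return -ENOMEM;

		size = SKETCH_PAGE_SIZE_4K;
	}
}
```

With alloc_only_4k, the call succeeds with a 4K scratch page when needs_64k is false and fails when it is true, which is the behavioural difference this patch introduces.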
Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/gt/intel_gtt.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 67d14afa6623..2a6eec5f0d58 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -334,6 +334,18 @@ int setup_scratch_page(struct i915_address_space *vm) if (size == I915_GTT_PAGE_SIZE_4K) return -ENOMEM; + /* + * If we need 64K minimum GTT pages for device local-memory, + * like on XEHPSDV, then we need to fail the allocation here, + * otherwise we can't safely support the insertion of + * local-memory pages for this vm, since the HW expects the + * correct physical alignment and size when the page-table is + * operating in 64K GTT mode, which includes any scratch PTEs, + * since userspace can still touch them. + */ + if (HAS_64K_PAGES(vm->i915)) + return -ENOMEM; + size = I915_GTT_PAGE_SIZE_4K; } while (1); } From patchwork Mon Oct 11 16:11:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB221C433F5 for ; Mon, 11 Oct 2021 16:09:42 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 711F660EBB for ; Mon, 11 Oct 2021 16:09:42 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 711F660EBB Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none 
smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CEFB36E8E4; Mon, 11 Oct 2021 16:09:33 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 07B576E8E3; Mon, 11 Oct 2021 16:09:29 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056729" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056729" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:29 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477923" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:27 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C Date: Mon, 11 Oct 2021 21:41:46 +0530 Message-Id: <20211011161155.6397-6-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 05/14] drm/i915/gtt/xehpsdv: move scratch page to system memory X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld On some platforms the hw has dropped support for 4K GTT pages when dealing with LMEM, and due to the design of 64K GTT pages in the hw, we can only mark the *entire* page-table as operating in 64K GTT mode, since the enable bit is still on the pde, and not the pte. 
And since we still need to allow 4K GTT pages for SMEM objects, we can't have a "normal" 4K page-table with scratch pointing to LMEM, since that's undefined from the hw pov. The simplest solution is to just move the 64K scratch page to SMEM on such platforms and call it a day, since that should work for all configurations. Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/gt/gen6_ppgtt.c | 1 + drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 23 +++++++++++++++++++++-- drivers/gpu/drm/i915/gt/intel_ggtt.c | 2 ++ drivers/gpu/drm/i915/gt/intel_gtt.c | 2 +- drivers/gpu/drm/i915/gt/intel_gtt.h | 2 ++ drivers/gpu/drm/i915/selftests/mock_gtt.c | 2 ++ 6 files changed, 29 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c index 890191f286e3..49e7651d764a 100644 --- a/drivers/gpu/drm/i915/gt/gen6_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen6_ppgtt.c @@ -440,6 +440,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt) ppgtt->base.vm.cleanup = gen6_ppgtt_cleanup; ppgtt->base.vm.alloc_pt_dma = alloc_pt_dma; + ppgtt->base.vm.alloc_scratch_dma = alloc_pt_dma; ppgtt->base.vm.pte_encode = ggtt->vm.pte_encode; ppgtt->base.pd = __alloc_pd(I915_PDES); diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 037a9a6e4889..6bff6bf1a450 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -777,10 +777,29 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt, */ ppgtt->vm.has_read_only = !IS_GRAPHICS_VER(gt->i915, 11, 12); - if (HAS_LMEM(gt->i915)) + if (HAS_LMEM(gt->i915)) { ppgtt->vm.alloc_pt_dma = alloc_pt_lmem; - else + + /* + * On some platforms the hw has dropped support for 4K GTT pages + * when dealing with LMEM, and due to the design of 64K GTT + * pages in the hw, we can only mark the *entire* page-table as + * operating in 64K GTT mode, since the enable bit is still on + * the pde, and not the 
pte. And since we still need to allow + * 4K GTT pages for SMEM objects, we can't have a "normal" 4K + * page-table with scratch pointing to LMEM, since that's + * undefined from the hw pov. The simplest solution is to just + * move the 64K scratch page to SMEM on such platforms and call + * it a day, since that should work for all configurations. + */ + if (HAS_64K_PAGES(gt->i915)) + ppgtt->vm.alloc_scratch_dma = alloc_pt_dma; + else + ppgtt->vm.alloc_scratch_dma = alloc_pt_lmem; + } else { ppgtt->vm.alloc_pt_dma = alloc_pt_dma; + ppgtt->vm.alloc_scratch_dma = alloc_pt_dma; + } err = gen8_init_scratch(&ppgtt->vm); if (err) diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt.c b/drivers/gpu/drm/i915/gt/intel_ggtt.c index f17383e76eb7..289316007029 100644 --- a/drivers/gpu/drm/i915/gt/intel_ggtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ggtt.c @@ -1077,6 +1077,7 @@ static int gen6_gmch_probe(struct i915_ggtt *ggtt) ggtt->vm.total = (size / sizeof(gen6_pte_t)) * I915_GTT_PAGE_SIZE; ggtt->vm.alloc_pt_dma = alloc_pt_dma; + ggtt->vm.alloc_scratch_dma = alloc_pt_dma; ggtt->vm.clear_range = nop_clear_range; if (!HAS_FULL_PPGTT(i915) || intel_scanout_needs_vtd_wa(i915)) @@ -1129,6 +1130,7 @@ static int i915_gmch_probe(struct i915_ggtt *ggtt) (struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end); ggtt->vm.alloc_pt_dma = alloc_pt_dma; + ggtt->vm.alloc_scratch_dma = alloc_pt_dma; if (needs_idle_maps(i915)) { drm_notice(&i915->drm, diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 2a6eec5f0d58..56fbd37a6b54 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -298,7 +298,7 @@ int setup_scratch_page(struct i915_address_space *vm) do { struct drm_i915_gem_object *obj; - obj = vm->alloc_pt_dma(vm, size); + obj = vm->alloc_scratch_dma(vm, size); if (IS_ERR(obj)) goto skip; diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index bc6750263359..6d13f4ab4d4a 
100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -265,6 +265,8 @@ struct i915_address_space { struct drm_i915_gem_object * (*alloc_pt_dma)(struct i915_address_space *vm, int sz); + struct drm_i915_gem_object * + (*alloc_scratch_dma)(struct i915_address_space *vm, int sz); u64 (*pte_encode)(dma_addr_t addr, enum i915_cache_level level, diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c index cc047ec594f9..32ca8962d0ab 100644 --- a/drivers/gpu/drm/i915/selftests/mock_gtt.c +++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c @@ -78,6 +78,7 @@ struct i915_ppgtt *mock_ppgtt(struct drm_i915_private *i915, const char *name) i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT); ppgtt->vm.alloc_pt_dma = alloc_pt_dma; + ppgtt->vm.alloc_scratch_dma = alloc_pt_dma; ppgtt->vm.clear_range = mock_clear_range; ppgtt->vm.insert_page = mock_insert_page; @@ -118,6 +119,7 @@ void mock_init_ggtt(struct drm_i915_private *i915, struct i915_ggtt *ggtt) ggtt->vm.total = 4096 * PAGE_SIZE; ggtt->vm.alloc_pt_dma = alloc_pt_dma; + ggtt->vm.alloc_scratch_dma = alloc_pt_dma; ggtt->vm.clear_range = mock_clear_range; ggtt->vm.insert_page = mock_insert_page; From patchwork Mon Oct 11 16:11:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AD3C8C433FE for ; Mon, 11 Oct 2021 16:09:43 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 7990A60551 for ; Mon, 11 Oct 2021 16:09:43 +0000 (UTC) 
DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 7990A60551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3FA296E8E5; Mon, 11 Oct 2021 16:09:35 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 51FBE6E8E5; Mon, 11 Oct 2021 16:09:33 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056740" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056740" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:33 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477934" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:29 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C , Joonas Lahtinen , Rodrigo Vivi Date: Mon, 11 Oct 2021 21:41:47 +0530 Message-Id: <20211011161155.6397-7-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 06/14] drm/i915/xehpsdv: support 64K GTT pages X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld XEHPSDV optimises 64K GTT pages for local-memory, since everything should be allocated at 64K 
granularity. We say goodbye to sparse entries, and instead get a compact 256B page-table for 64K pages, which should be more cache friendly. 4K pages for local-memory are no longer supported by the HW. Signed-off-by: Matthew Auld Signed-off-by: Stuart Summers Signed-off-by: Ramalingam C Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- .../gpu/drm/i915/gem/selftests/huge_pages.c | 61 ++++++++++ drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 106 +++++++++++++++++- drivers/gpu/drm/i915/gt/intel_gtt.h | 3 + drivers/gpu/drm/i915/gt/intel_ppgtt.c | 1 + 4 files changed, 168 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c index 41d0680f3bd7..9c2ffa4090f1 100644 --- a/drivers/gpu/drm/i915/gem/selftests/huge_pages.c +++ b/drivers/gpu/drm/i915/gem/selftests/huge_pages.c @@ -1451,6 +1451,66 @@ static int igt_ppgtt_sanity_check(void *arg) return err; } +static int igt_ppgtt_compact(void *arg) +{ + struct i915_gem_context *ctx = arg; + struct drm_i915_private *i915 = ctx->i915; + struct drm_i915_gem_object *obj; + int err; + + /* + * Simple test to catch issues with compact 64K pages -- since the pt is + * compacted to 256B that gives us 32 entries per pt, however since the + * backing page for the pt is 4K, any extra entries we might incorrectly + * write out should be ignored by the HW. If we ever hit such a case this + * test should catch it since some of our writes would land in scratch. + */ + + if (!HAS_64K_PAGES(i915)) { + pr_info("device lacks compact 64K page support, skipping\n"); + return 0; + } + + if (!HAS_LMEM(i915)) { + pr_info("device lacks LMEM support, skipping\n"); + return 0; + } + + /* We want the range to cover multiple page-table boundaries. 
*/ + obj = i915_gem_object_create_lmem(i915, SZ_4M, 0); + if (IS_ERR(obj)) + return PTR_ERR(obj); + + err = i915_gem_object_pin_pages_unlocked(obj); + if (err) + goto out_put; + + if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_64K) { + pr_info("LMEM compact unable to allocate huge-page(s)\n"); + goto out_unpin; + } + + /* + * Disable 2M GTT pages by forcing the page-size to 64K for the GTT + * insertion. + */ + obj->mm.page_sizes.sg = I915_GTT_PAGE_SIZE_64K; + + err = igt_write_huge(ctx, obj); + if (err) + pr_err("LMEM compact write-huge failed\n"); + +out_unpin: + i915_gem_object_unpin_pages(obj); +out_put: + i915_gem_object_put(obj); + + if (err == -ENOMEM) + err = 0; + + return err; +} + static int igt_tmpfs_fallback(void *arg) { struct i915_gem_context *ctx = arg; @@ -1681,6 +1741,7 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *i915) SUBTEST(igt_tmpfs_fallback), SUBTEST(igt_ppgtt_smoke_huge), SUBTEST(igt_ppgtt_sanity_check), + SUBTEST(igt_ppgtt_compact), }; struct i915_gem_context *ctx; struct i915_address_space *vm; diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index 6bff6bf1a450..fec0f20f1b93 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -233,6 +233,8 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm, start, end, lvl); } else { unsigned int count; + unsigned int pte = gen8_pd_index(start, 0); + unsigned int num_ptes; u64 *vaddr; count = gen8_pt_count(start, end); @@ -242,10 +244,18 @@ static u64 __gen8_ppgtt_clear(struct i915_address_space * const vm, atomic_read(&pt->used)); GEM_BUG_ON(!count || count >= atomic_read(&pt->used)); + num_ptes = count; + if (pt->is_compact) { + GEM_BUG_ON(num_ptes % 16); + GEM_BUG_ON(pte % 16); + num_ptes /= 16; + pte /= 16; + } + vaddr = px_vaddr(pt); - memset64(vaddr + gen8_pd_index(start, 0), + memset64(vaddr + pte, vm->scratch[0]->encode, - count); + num_ptes); atomic_sub(count, &pt->used); start += 
count; @@ -454,6 +464,93 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt, return idx; } +static void +xehpsdv_ppgtt_insert_huge(struct i915_vma *vma, + struct sgt_dma *iter, + enum i915_cache_level cache_level, + u32 flags) +{ + const gen8_pte_t pte_encode = vma->vm->pte_encode(0, cache_level, flags); + unsigned int rem = sg_dma_len(iter->sg); + u64 start = vma->node.start; + + GEM_BUG_ON(!i915_vm_is_4lvl(vma->vm)); + + do { + struct i915_page_directory * const pdp = + gen8_pdp_for_page_address(vma->vm, start); + struct i915_page_directory * const pd = + i915_pd_entry(pdp, __gen8_pte_index(start, 2)); + struct i915_page_table *pt = + i915_pt_entry(pd, __gen8_pte_index(start, 1)); + gen8_pte_t encode = pte_encode; + unsigned int page_size; + gen8_pte_t *vaddr; + u16 index, max; + + max = I915_PDES; + + if (vma->page_sizes.sg & I915_GTT_PAGE_SIZE_2M && + IS_ALIGNED(iter->dma, I915_GTT_PAGE_SIZE_2M) && + rem >= I915_GTT_PAGE_SIZE_2M && + !__gen8_pte_index(start, 0)) { + index = __gen8_pte_index(start, 1); + encode |= GEN8_PDE_PS_2M; + page_size = I915_GTT_PAGE_SIZE_2M; + + vaddr = px_vaddr(pd); + } else { + if (encode & GEN12_PPGTT_PTE_LM) { + GEM_BUG_ON(!i915_gem_object_is_lmem(vma->obj)); + GEM_BUG_ON(__gen8_pte_index(start, 0) % 16); + GEM_BUG_ON(rem < I915_GTT_PAGE_SIZE_64K); + GEM_BUG_ON(!IS_ALIGNED(iter->dma, + I915_GTT_PAGE_SIZE_64K)); + + index = __gen8_pte_index(start, 0) / 16; + page_size = I915_GTT_PAGE_SIZE_64K; + + max /= 16; + + vaddr = px_vaddr(pd); + vaddr[__gen8_pte_index(start, 1)] |= GEN12_PDE_64K; + + pt->is_compact = true; + } else { + GEM_BUG_ON(i915_gem_object_is_lmem(vma->obj)); + GEM_BUG_ON(pt->is_compact); + index = __gen8_pte_index(start, 0); + page_size = I915_GTT_PAGE_SIZE; + } + + vaddr = px_vaddr(pt); + } + + do { + GEM_BUG_ON(rem < page_size); + vaddr[index++] = encode | iter->dma; + + start += page_size; + iter->dma += page_size; + rem -= page_size; + if (iter->dma >= iter->max) { + iter->sg = __sg_next(iter->sg); + 
GEM_BUG_ON(!iter->sg); + + rem = sg_dma_len(iter->sg); + GEM_BUG_ON(!rem); + iter->dma = sg_dma_address(iter->sg); + iter->max = iter->dma + rem; + + if (unlikely(!IS_ALIGNED(iter->dma, page_size))) + break; + } + } while (rem >= page_size && index < max); + + vma->page_sizes.gtt |= page_size; + } while (iter->sg && sg_dma_len(iter->sg)); +} + static void gen8_ppgtt_insert_huge(struct i915_vma *vma, struct sgt_dma *iter, enum i915_cache_level cache_level, @@ -586,7 +683,10 @@ static void gen8_ppgtt_insert(struct i915_address_space *vm, struct sgt_dma iter = sgt_dma(vma); if (vma->page_sizes.sg > I915_GTT_PAGE_SIZE) { - gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags); + if (HAS_64K_PAGES(vm->i915)) + xehpsdv_ppgtt_insert_huge(vma, &iter, cache_level, flags); + else + gen8_ppgtt_insert_huge(vma, &iter, cache_level, flags); } else { u64 idx = vma->node.start >> GEN8_PTE_SHIFT; diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 6d13f4ab4d4a..6d0233ffae17 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -89,6 +89,8 @@ typedef u64 gen8_pte_t; #define GEN12_GGTT_PTE_LM BIT_ULL(1) +#define GEN12_PDE_64K BIT(6) + /* * Cacheability Control is a 4-bit value. The low three bits are stored in bits * 3:1 of the PTE, while the fourth bit is stored in bit 11 of the PTE. 
@@ -154,6 +156,7 @@ struct i915_page_table { atomic_t used; struct i915_page_table *stash; }; + bool is_compact; }; struct i915_page_directory { diff --git a/drivers/gpu/drm/i915/gt/intel_ppgtt.c b/drivers/gpu/drm/i915/gt/intel_ppgtt.c index 4396bfd630d8..b8238f5bc8b1 100644 --- a/drivers/gpu/drm/i915/gt/intel_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/intel_ppgtt.c @@ -26,6 +26,7 @@ struct i915_page_table *alloc_pt(struct i915_address_space *vm) return ERR_PTR(-ENOMEM); } + pt->is_compact = false; atomic_set(&pt->used, 0); return pt; } From patchwork Mon Oct 11 16:11:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62373C433F5 for ; Mon, 11 Oct 2021 16:09:49 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2A57D60F6E for ; Mon, 11 Oct 2021 16:09:49 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2A57D60F6E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E2ACF6E8E9; Mon, 11 Oct 2021 16:09:39 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3E96D6E8E6; Mon, 11 Oct 2021 16:09:36 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056750" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056750" 
Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:36 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477958" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:33 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Bommu Krishnaiah , Wilson Chris P , Ramalingam C Date: Mon, 11 Oct 2021 21:41:48 +0530 Message-Id: <20211011161155.6397-8-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 07/14] drm/i915: Add vm min alignment support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Bommu Krishnaiah Replace the hard coded 4K alignment value with vm->min_alignment. 
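For illustration only (this is not part of the patch), the per-region lookup described above can be sketched in standalone C. The `sketch_*` names, the enum, and the struct are hypothetical stand-ins for the i915 types; the 4K default and the 64K override for local memory mirror what the diff below does in i915_address_space_init(), under the assumption that alignments are always powers of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the i915 memory types; not the kernel enum. */
enum sketch_memory_type {
	SKETCH_MEMORY_SYSTEM = 0,
	SKETCH_MEMORY_LOCAL,
	SKETCH_MEMORY_STOLEN_LOCAL,
	SKETCH_MEMORY_TYPE_COUNT
};

#define SKETCH_PAGE_SIZE_4K  (UINT64_C(1) << 12)
#define SKETCH_PAGE_SIZE_64K (UINT64_C(1) << 16)

/* Simplified address space: one minimum alignment per memory type. */
struct sketch_vm {
	uint64_t min_alignment[SKETCH_MEMORY_TYPE_COUNT];
};

/* Default every region to 4K, then raise local memory to 64K if needed. */
static inline void sketch_vm_init(struct sketch_vm *vm, bool has_64k_pages)
{
	size_t i;

	for (i = 0; i < SKETCH_MEMORY_TYPE_COUNT; i++)
		vm->min_alignment[i] = SKETCH_PAGE_SIZE_4K;

	if (has_64k_pages) {
		vm->min_alignment[SKETCH_MEMORY_LOCAL] = SKETCH_PAGE_SIZE_64K;
		vm->min_alignment[SKETCH_MEMORY_STOLEN_LOCAL] = SKETCH_PAGE_SIZE_64K;
	}
}

/* Look up the minimum GTT alignment for objects of the given type. */
static inline uint64_t sketch_vm_min_alignment(const struct sketch_vm *vm,
					       enum sketch_memory_type type)
{
	return vm->min_alignment[type];
}

/* Round x up to a multiple of align; align must be a power of two. */
static inline uint64_t sketch_round_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}
```

A caller would size and place buffers with sketch_round_up(size, sketch_vm_min_alignment(&vm, type)) rather than assuming 4K, which is the substitution the selftest changes make.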
Cc: Wilson Chris P Signed-off-by: Bommu Krishnaiah Signed-off-by: Ramalingam C --- .../i915/gem/selftests/i915_gem_client_blt.c | 23 ++++++++++++------- drivers/gpu/drm/i915/gt/intel_gtt.c | 9 ++++++++ drivers/gpu/drm/i915/gt/intel_gtt.h | 9 ++++++++ 3 files changed, 33 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c index ecbcbb86ae1e..30c8d64df3b8 100644 --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_client_blt.c @@ -32,6 +32,7 @@ struct tiled_blits { struct blit_buffer scratch; struct i915_vma *batch; u64 hole; + u64 align; u32 width; u32 height; }; @@ -403,14 +404,21 @@ tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng) goto err_free; } - hole_size = 2 * PAGE_ALIGN(WIDTH * HEIGHT * 4); + t->align = I915_GTT_PAGE_SIZE_2M; /* XXX worst case, derive from vm! */ + t->align = max(t->align, + i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_LOCAL)); + t->align = max(t->align, + i915_vm_min_alignment(t->ce->vm, INTEL_MEMORY_SYSTEM)); + + hole_size = 2 * round_up(WIDTH * HEIGHT * 4, t->align); hole_size *= 2; /* room to maneuver */ - hole_size += 2 * I915_GTT_MIN_ALIGNMENT; + hole_size += 2 * t->align; /* padding on either side */ mutex_lock(&t->ce->vm->mutex); memset(&hole, 0, sizeof(hole)); err = drm_mm_insert_node_in_range(&t->ce->vm->mm, &hole, - hole_size, 0, I915_COLOR_UNEVICTABLE, + hole_size, t->align, + I915_COLOR_UNEVICTABLE, 0, U64_MAX, DRM_MM_INSERT_BEST); if (!err) @@ -421,7 +429,7 @@ tiled_blits_create(struct intel_engine_cs *engine, struct rnd_state *prng) goto err_put; } - t->hole = hole.start + I915_GTT_MIN_ALIGNMENT; + t->hole = hole.start + t->align; pr_info("Using hole at %llx\n", t->hole); err = tiled_blits_create_buffers(t, WIDTH, HEIGHT, prng); @@ -448,7 +456,7 @@ static void tiled_blits_destroy(struct tiled_blits *t) static int 
tiled_blits_prepare(struct tiled_blits *t, struct rnd_state *prng) { - u64 offset = PAGE_ALIGN(t->width * t->height * 4); + u64 offset = round_up(t->width * t->height * 4, t->align); u32 *map; int err; int i; @@ -479,8 +487,7 @@ static int tiled_blits_prepare(struct tiled_blits *t, static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng) { - u64 offset = - round_up(t->width * t->height * 4, 2 * I915_GTT_MIN_ALIGNMENT); + u64 offset = round_up(t->width * t->height * 4, 2 * t->align); int err; /* We want to check position invariant tiling across GTT eviction */ @@ -493,7 +500,7 @@ static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng) /* Reposition so that we overlap the old addresses, and slightly off */ err = tiled_blit(t, - &t->buffers[2], t->hole + I915_GTT_MIN_ALIGNMENT, + &t->buffers[2], t->hole + t->align, &t->buffers[1], t->hole + 3 * offset / 2); if (err) return err; diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.c b/drivers/gpu/drm/i915/gt/intel_gtt.c index 56fbd37a6b54..4743921b7638 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.c +++ b/drivers/gpu/drm/i915/gt/intel_gtt.c @@ -216,6 +216,15 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass) GEM_BUG_ON(!vm->total); drm_mm_init(&vm->mm, 0, vm->total); + + memset64(vm->min_alignment, I915_GTT_MIN_ALIGNMENT, + ARRAY_SIZE(vm->min_alignment)); + + if (HAS_64K_PAGES(vm->i915)) { + vm->min_alignment[INTEL_MEMORY_LOCAL] = I915_GTT_PAGE_SIZE_64K; + vm->min_alignment[INTEL_MEMORY_STOLEN_LOCAL] = I915_GTT_PAGE_SIZE_64K; + } + vm->mm.head_node.color = I915_COLOR_UNEVICTABLE; INIT_LIST_HEAD(&vm->bound_list); diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 6d0233ffae17..20101eef4c95 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -28,6 +28,8 @@ #include "gt/intel_reset.h" #include "i915_selftest.h" #include "i915_vma_types.h" +#include "i915_params.h" +#include 
"intel_memory_region.h" #define I915_GFP_ALLOW_FAIL (GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_NOWARN) @@ -224,6 +226,7 @@ struct i915_address_space { struct device *dma; u64 total; /* size addr space maps (ex. 2GB for ggtt) */ u64 reserved; /* size addr space reserved */ + u64 min_alignment[INTEL_MEMORY_STOLEN_LOCAL + 1]; unsigned int bind_async_flags; @@ -382,6 +385,12 @@ i915_vm_has_scratch_64K(struct i915_address_space *vm) return vm->scratch_order == get_order(I915_GTT_PAGE_SIZE_64K); } +static inline u64 i915_vm_min_alignment(struct i915_address_space *vm, + enum intel_memory_type type) +{ + return vm->min_alignment[type]; +} + static inline bool i915_vm_has_cache_coloring(struct i915_address_space *vm) { From patchwork Mon Oct 11 16:11:49 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550457 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58C2FC433FE for ; Mon, 11 Oct 2021 16:09:50 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 2850260551 for ; Mon, 11 Oct 2021 16:09:50 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 2850260551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id BD7BD6E8EB; Mon, 11 Oct 2021 16:09:40 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 
C9D6E6E8E9; Mon, 11 Oct 2021 16:09:38 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056758" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056758" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:38 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477961" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:36 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C Date: Mon, 11 Oct 2021 21:41:49 +0530 Message-Id: <20211011161155.6397-9-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 08/14] drm/i915/selftests: account for min_alignment in GTT selftests X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld We need to support vm->min_alignment > 4K, depending on the vm itself and the type of object we are inserting. With this in mind update the GTT selftests to take this into account. 
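As a rough sketch of the stride calculation the selftests switch to (not part of the patch), the standalone C below picks the larger of the object-size shift and log2 of the minimum alignment, then derives the address of slot n within the hole. All `sketch_*` names are hypothetical, and the code assumes min_alignment is a non-zero power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Integer log2 for a non-zero value (floor); hypothetical helper. */
static inline unsigned int sketch_ilog2(uint64_t v)
{
	unsigned int log = 0;

	while ((v >>= 1) != 0)
		log++;
	return log;
}

/*
 * Shift used to step through the hole: never smaller than the
 * vm's minimum alignment, never smaller than the object itself.
 */
static inline unsigned int sketch_slot_shift(uint64_t min_alignment,
					     unsigned int size_shift)
{
	unsigned int align_shift = sketch_ilog2(min_alignment);

	return align_shift > size_shift ? align_shift : size_shift;
}

/* Address of the n-th slot inside a hole starting at hole_start. */
static inline uint64_t sketch_slot_addr(uint64_t hole_start, unsigned int n,
					unsigned int shift)
{
	return hole_start + (uint64_t)n * (UINT64_C(1) << shift);
}
```

With a 64K minimum alignment, a 4K object (shift 12) is therefore placed on a 64K stride, while objects of 64K and above keep their natural stride.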
Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 96 ++++++++++++------- 1 file changed, 63 insertions(+), 33 deletions(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c index 46f4236039a9..fdb4bf88293b 100644 --- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c @@ -237,6 +237,8 @@ static int lowlevel_hole(struct i915_address_space *vm, u64 hole_start, u64 hole_end, unsigned long end_time) { + const unsigned int min_alignment = + i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM); I915_RND_STATE(seed_prng); struct i915_vma *mock_vma; unsigned int size; @@ -250,9 +252,10 @@ static int lowlevel_hole(struct i915_address_space *vm, I915_RND_SUBSTATE(prng, seed_prng); struct drm_i915_gem_object *obj; unsigned int *order, count, n; - u64 hole_size; + u64 hole_size, aligned_size; - hole_size = (hole_end - hole_start) >> size; + aligned_size = max_t(u32, ilog2(min_alignment), size); + hole_size = (hole_end - hole_start) >> aligned_size; if (hole_size > KMALLOC_MAX_SIZE / sizeof(u32)) hole_size = KMALLOC_MAX_SIZE / sizeof(u32); count = hole_size >> 1; @@ -273,8 +276,8 @@ static int lowlevel_hole(struct i915_address_space *vm, } GEM_BUG_ON(!order); - GEM_BUG_ON(count * BIT_ULL(size) > vm->total); - GEM_BUG_ON(hole_start + count * BIT_ULL(size) > hole_end); + GEM_BUG_ON(count * BIT_ULL(aligned_size) > vm->total); + GEM_BUG_ON(hole_start + count * BIT_ULL(aligned_size) > hole_end); /* Ignore allocation failures (i.e. 
don't report them as * a test failure) as we are purposefully allocating very @@ -297,10 +300,10 @@ static int lowlevel_hole(struct i915_address_space *vm, } for (n = 0; n < count; n++) { - u64 addr = hole_start + order[n] * BIT_ULL(size); + u64 addr = hole_start + order[n] * BIT_ULL(aligned_size); intel_wakeref_t wakeref; - GEM_BUG_ON(addr + BIT_ULL(size) > vm->total); + GEM_BUG_ON(addr + BIT_ULL(aligned_size) > vm->total); if (igt_timeout(end_time, "%s timed out before %d/%d\n", @@ -343,7 +346,7 @@ static int lowlevel_hole(struct i915_address_space *vm, } mock_vma->pages = obj->mm.pages; - mock_vma->node.size = BIT_ULL(size); + mock_vma->node.size = BIT_ULL(aligned_size); mock_vma->node.start = addr; with_intel_runtime_pm(vm->gt->uncore->rpm, wakeref) @@ -354,7 +357,7 @@ static int lowlevel_hole(struct i915_address_space *vm, i915_random_reorder(order, count, &prng); for (n = 0; n < count; n++) { - u64 addr = hole_start + order[n] * BIT_ULL(size); + u64 addr = hole_start + order[n] * BIT_ULL(aligned_size); intel_wakeref_t wakeref; GEM_BUG_ON(addr + BIT_ULL(size) > vm->total); @@ -398,8 +401,10 @@ static int fill_hole(struct i915_address_space *vm, { const u64 hole_size = hole_end - hole_start; struct drm_i915_gem_object *obj; + const unsigned int min_alignment = + i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM); const unsigned long max_pages = - min_t(u64, ULONG_MAX - 1, hole_size/2 >> PAGE_SHIFT); + min_t(u64, ULONG_MAX - 1, (hole_size / 2) >> ilog2(min_alignment)); const unsigned long max_step = max(int_sqrt(max_pages), 2UL); unsigned long npages, prime, flags; struct i915_vma *vma; @@ -440,14 +445,17 @@ static int fill_hole(struct i915_address_space *vm, offset = p->offset; list_for_each_entry(obj, &objects, st_link) { + u64 aligned_size = round_up(obj->base.size, + min_alignment); + vma = i915_vma_instance(obj, vm, NULL); if (IS_ERR(vma)) continue; if (p->step < 0) { - if (offset < hole_start + obj->base.size) + if (offset < hole_start + aligned_size) break; - 
offset -= obj->base.size; + offset -= aligned_size; } err = i915_vma_pin(vma, 0, 0, offset | flags); @@ -469,22 +477,25 @@ static int fill_hole(struct i915_address_space *vm, i915_vma_unpin(vma); if (p->step > 0) { - if (offset + obj->base.size > hole_end) + if (offset + aligned_size > hole_end) break; - offset += obj->base.size; + offset += aligned_size; } } offset = p->offset; list_for_each_entry(obj, &objects, st_link) { + u64 aligned_size = round_up(obj->base.size, + min_alignment); + vma = i915_vma_instance(obj, vm, NULL); if (IS_ERR(vma)) continue; if (p->step < 0) { - if (offset < hole_start + obj->base.size) + if (offset < hole_start + aligned_size) break; - offset -= obj->base.size; + offset -= aligned_size; } if (!drm_mm_node_allocated(&vma->node) || @@ -505,22 +516,25 @@ static int fill_hole(struct i915_address_space *vm, } if (p->step > 0) { - if (offset + obj->base.size > hole_end) + if (offset + aligned_size > hole_end) break; - offset += obj->base.size; + offset += aligned_size; } } offset = p->offset; list_for_each_entry_reverse(obj, &objects, st_link) { + u64 aligned_size = round_up(obj->base.size, + min_alignment); + vma = i915_vma_instance(obj, vm, NULL); if (IS_ERR(vma)) continue; if (p->step < 0) { - if (offset < hole_start + obj->base.size) + if (offset < hole_start + aligned_size) break; - offset -= obj->base.size; + offset -= aligned_size; } err = i915_vma_pin(vma, 0, 0, offset | flags); @@ -542,22 +556,25 @@ static int fill_hole(struct i915_address_space *vm, i915_vma_unpin(vma); if (p->step > 0) { - if (offset + obj->base.size > hole_end) + if (offset + aligned_size > hole_end) break; - offset += obj->base.size; + offset += aligned_size; } } offset = p->offset; list_for_each_entry_reverse(obj, &objects, st_link) { + u64 aligned_size = round_up(obj->base.size, + min_alignment); + vma = i915_vma_instance(obj, vm, NULL); if (IS_ERR(vma)) continue; if (p->step < 0) { - if (offset < hole_start + obj->base.size) + if (offset < hole_start + 
aligned_size) break; - offset -= obj->base.size; + offset -= aligned_size; } if (!drm_mm_node_allocated(&vma->node) || @@ -578,9 +595,9 @@ static int fill_hole(struct i915_address_space *vm, } if (p->step > 0) { - if (offset + obj->base.size > hole_end) + if (offset + aligned_size > hole_end) break; - offset += obj->base.size; + offset += aligned_size; } } } @@ -610,6 +627,7 @@ static int walk_hole(struct i915_address_space *vm, const u64 hole_size = hole_end - hole_start; const unsigned long max_pages = min_t(u64, ULONG_MAX - 1, hole_size >> PAGE_SHIFT); + unsigned long min_alignment; unsigned long flags; u64 size; @@ -619,6 +637,8 @@ static int walk_hole(struct i915_address_space *vm, if (i915_is_ggtt(vm)) flags |= PIN_GLOBAL; + min_alignment = i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM); + for_each_prime_number_from(size, 1, max_pages) { struct drm_i915_gem_object *obj; struct i915_vma *vma; @@ -637,7 +657,7 @@ static int walk_hole(struct i915_address_space *vm, for (addr = hole_start; addr + obj->base.size < hole_end; - addr += obj->base.size) { + addr += round_up(obj->base.size, min_alignment)) { err = i915_vma_pin(vma, 0, 0, addr | flags); if (err) { pr_err("%s bind failed at %llx + %llx [hole %llx- %llx] with err=%d\n", @@ -689,6 +709,7 @@ static int pot_hole(struct i915_address_space *vm, { struct drm_i915_gem_object *obj; struct i915_vma *vma; + unsigned int min_alignment; unsigned long flags; unsigned int pot; int err = 0; @@ -697,6 +718,8 @@ static int pot_hole(struct i915_address_space *vm, if (i915_is_ggtt(vm)) flags |= PIN_GLOBAL; + min_alignment = i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM); + obj = i915_gem_object_create_internal(vm->i915, 2 * I915_GTT_PAGE_SIZE); if (IS_ERR(obj)) return PTR_ERR(obj); @@ -709,13 +732,13 @@ static int pot_hole(struct i915_address_space *vm, /* Insert a pair of pages across every pot boundary within the hole */ for (pot = fls64(hole_end - 1) - 1; - pot > ilog2(2 * I915_GTT_PAGE_SIZE); + pot > ilog2(2 * 
min_alignment); pot--) { u64 step = BIT_ULL(pot); u64 addr; - for (addr = round_up(hole_start + I915_GTT_PAGE_SIZE, step) - I915_GTT_PAGE_SIZE; - addr <= round_down(hole_end - 2*I915_GTT_PAGE_SIZE, step) - I915_GTT_PAGE_SIZE; + for (addr = round_up(hole_start + min_alignment, step) - min_alignment; + addr <= round_down(hole_end - (2 * min_alignment), step) - min_alignment; addr += step) { err = i915_vma_pin(vma, 0, 0, addr | flags); if (err) { @@ -760,6 +783,7 @@ static int drunk_hole(struct i915_address_space *vm, unsigned long end_time) { I915_RND_STATE(prng); + unsigned int min_alignment; unsigned int size; unsigned long flags; @@ -767,15 +791,18 @@ static int drunk_hole(struct i915_address_space *vm, if (i915_is_ggtt(vm)) flags |= PIN_GLOBAL; + min_alignment = i915_vm_min_alignment(vm, INTEL_MEMORY_SYSTEM); + /* Keep creating larger objects until one cannot fit into the hole */ for (size = 12; (hole_end - hole_start) >> size; size++) { struct drm_i915_gem_object *obj; unsigned int *order, count, n; struct i915_vma *vma; - u64 hole_size; + u64 hole_size, aligned_size; int err = -ENODEV; - hole_size = (hole_end - hole_start) >> size; + aligned_size = max_t(u32, ilog2(min_alignment), size); + hole_size = (hole_end - hole_start) >> aligned_size; if (hole_size > KMALLOC_MAX_SIZE / sizeof(u32)) hole_size = KMALLOC_MAX_SIZE / sizeof(u32); count = hole_size >> 1; @@ -815,7 +842,7 @@ static int drunk_hole(struct i915_address_space *vm, GEM_BUG_ON(vma->size != BIT_ULL(size)); for (n = 0; n < count; n++) { - u64 addr = hole_start + order[n] * BIT_ULL(size); + u64 addr = hole_start + order[n] * BIT_ULL(aligned_size); err = i915_vma_pin(vma, 0, 0, addr | flags); if (err) { @@ -867,11 +894,14 @@ static int __shrink_hole(struct i915_address_space *vm, { struct drm_i915_gem_object *obj; unsigned long flags = PIN_OFFSET_FIXED | PIN_USER; + unsigned int min_alignment; unsigned int order = 12; LIST_HEAD(objects); int err = 0; u64 addr; + min_alignment = i915_vm_min_alignment(vm, 
INTEL_MEMORY_SYSTEM); + /* Keep creating larger objects until one cannot fit into the hole */ for (addr = hole_start; addr < hole_end; ) { struct i915_vma *vma; @@ -912,7 +942,7 @@ static int __shrink_hole(struct i915_address_space *vm, } i915_vma_unpin(vma); - addr += size; + addr += round_up(size, min_alignment); /* * Since we are injecting allocation faults at random intervals, From patchwork Mon Oct 11 16:11:50 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550459 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9231CC433EF for ; Mon, 11 Oct 2021 16:09:54 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6072860551 for ; Mon, 11 Oct 2021 16:09:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6072860551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CCCC76E88A; Mon, 11 Oct 2021 16:09:43 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id DFF1E6E8ED; Mon, 11 Oct 2021 16:09:41 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056776" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056776" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:41 -0700 
X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477972" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:38 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C , Joonas Lahtinen , Rodrigo Vivi Date: Mon, 11 Oct 2021 21:41:50 +0530 Message-Id: <20211011161155.6397-10-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 09/14] drm/i915/xehpsdv: implement memory coloring X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld The basic idea is that each 2M block (one page-table) has a color that depends on whether the page-table is occupied by LMEM objects (64K GTT pages) or SMEM objects (4K GTT pages). The goal is to prevent mixing 64K and 4K GTT pages in the same page-table, which the HW does not support. 
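To make the coloring rule concrete, here is a minimal standalone C sketch (not part of the patch) of how a candidate range can be trimmed so it never shares a 2M page-table with a neighbour of a different color. The `sketch_*` names and the neighbour struct are hypothetical simplifications of the drm_mm node handling, and the "allocated and color differs" check is an assumption about what the kernel's color comparison helper tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SZ_2M (UINT64_C(1) << 21)

/* Hypothetical, simplified view of a neighbouring node in the address space. */
struct sketch_neighbour {
	bool allocated;      /* false if there is no real node on that side */
	unsigned long color; /* e.g. 1 for LMEM (64K pages), 0 for SMEM (4K pages) */
};

/* A neighbour only constrains us if it exists and has a different color. */
static inline bool sketch_color_differs(const struct sketch_neighbour *n,
					unsigned long color)
{
	return n->allocated && n->color != color;
}

/*
 * Shrink [*start, *end) so that it does not share a 2M page-table with a
 * differently colored neighbour on either side.
 */
static inline void sketch_color_adjust(const struct sketch_neighbour *prev,
				       const struct sketch_neighbour *next,
				       unsigned long color,
				       uint64_t *start, uint64_t *end)
{
	if (sketch_color_differs(prev, color))
		*start = (*start + SKETCH_SZ_2M - 1) & ~(SKETCH_SZ_2M - 1);

	if (sketch_color_differs(next, color))
		*end = *end & ~(SKETCH_SZ_2M - 1);
}
```

When both neighbours have the same color as the object being inserted, the range is left untouched, so same-type objects can still pack densely within one page-table.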
Signed-off-by: Matthew Auld Signed-off-by: Stuart Summers Signed-off-by: Ramalingam C Cc: Joonas Lahtinen Cc: Rodrigo Vivi --- drivers/gpu/drm/i915/gt/gen8_ppgtt.c | 16 ++++++++++ drivers/gpu/drm/i915/gt/intel_gtt.h | 6 ++++ drivers/gpu/drm/i915/i915_gem_evict.c | 17 ++++++++++ drivers/gpu/drm/i915/i915_vma.c | 46 +++++++++++++++++++-------- 4 files changed, 71 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c index fec0f20f1b93..666745adbe93 100644 --- a/drivers/gpu/drm/i915/gt/gen8_ppgtt.c +++ b/drivers/gpu/drm/i915/gt/gen8_ppgtt.c @@ -464,6 +464,19 @@ gen8_ppgtt_insert_pte(struct i915_ppgtt *ppgtt, return idx; } +static void xehpsdv_ppgtt_color_adjust(const struct drm_mm_node *node, + unsigned long color, + u64 *start, + u64 *end) +{ + if (i915_node_color_differs(node, color)) + *start = round_up(*start, SZ_2M); + + node = list_next_entry(node, node_list); + if (i915_node_color_differs(node, color)) + *end = round_down(*end, SZ_2M); +} + static void xehpsdv_ppgtt_insert_huge(struct i915_vma *vma, struct sgt_dma *iter, @@ -901,6 +914,9 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt, ppgtt->vm.alloc_scratch_dma = alloc_pt_dma; } + if (HAS_64K_PAGES(gt->i915)) + ppgtt->vm.mm.color_adjust = xehpsdv_ppgtt_color_adjust; + err = gen8_init_scratch(&ppgtt->vm); if (err) goto err_free; diff --git a/drivers/gpu/drm/i915/gt/intel_gtt.h b/drivers/gpu/drm/i915/gt/intel_gtt.h index 20101eef4c95..34696acde342 100644 --- a/drivers/gpu/drm/i915/gt/intel_gtt.h +++ b/drivers/gpu/drm/i915/gt/intel_gtt.h @@ -397,6 +397,12 @@ i915_vm_has_cache_coloring(struct i915_address_space *vm) return i915_is_ggtt(vm) && vm->mm.color_adjust; } +static inline bool +i915_vm_has_memory_coloring(struct i915_address_space *vm) +{ + return !i915_is_ggtt(vm) && vm->mm.color_adjust; +} + static inline struct i915_ggtt * i915_vm_to_ggtt(struct i915_address_space *vm) { diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c 
b/drivers/gpu/drm/i915/i915_gem_evict.c index 2b73ddb11c66..006bf4924c24 100644 --- a/drivers/gpu/drm/i915/i915_gem_evict.c +++ b/drivers/gpu/drm/i915/i915_gem_evict.c @@ -292,6 +292,13 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, /* Always look at the page afterwards to avoid the end-of-GTT */ end += I915_GTT_PAGE_SIZE; + } else if (i915_vm_has_memory_coloring(vm)) { + /* + * Expand the search to cover the page-table boundaries, in + * case we need to flip the color of the page-table(s). + */ + start = round_down(start, SZ_2M); + end = round_up(end, SZ_2M); } GEM_BUG_ON(start >= end); @@ -321,6 +328,16 @@ int i915_gem_evict_for_node(struct i915_address_space *vm, if (node->color == target->color) continue; } + } else if (i915_vm_has_memory_coloring(vm)) { + if (node->start + node->size <= target->start) { + if (node->color == target->color) + continue; + } + + if (node->start >= target->start + target->size) { + if (node->color == target->color) + continue; + } } if (i915_vma_is_pinned(vma)) { diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 1ea1fa08efdf..2664d3ab49b9 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -585,6 +585,10 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color) struct drm_mm_node *node = &vma->node; struct drm_mm_node *other; + /* Only valid to be called on an already inserted vma */ + GEM_BUG_ON(!drm_mm_node_allocated(node)); + GEM_BUG_ON(list_empty(&node->node_list)); + /* * On some machines we have to be careful when putting differing types * of snoopable memory together to avoid the prefetcher crossing memory @@ -592,22 +596,34 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long color) * these constraints apply and set the drm_mm.color_adjust * appropriately. 
*/ - if (!i915_vm_has_cache_coloring(vma->vm)) - return true; - - /* Only valid to be called on an already inserted vma */ - GEM_BUG_ON(!drm_mm_node_allocated(node)); - GEM_BUG_ON(list_empty(&node->node_list)); + if (i915_vm_has_cache_coloring(vma->vm)) { + other = list_prev_entry(node, node_list); + if (i915_node_color_differs(other, color) && + !drm_mm_hole_follows(other)) + return false; - other = list_prev_entry(node, node_list); - if (i915_node_color_differs(other, color) && - !drm_mm_hole_follows(other)) - return false; + other = list_next_entry(node, node_list); + if (i915_node_color_differs(other, color) && + !drm_mm_hole_follows(node)) + return false; + /* + * On XEHPSDV we need to make sure we are not mixing LMEM and SMEM objects + * in the same page-table, i.e. mixing 64K and 4K GTT pages in the same + * page-table. + */ + } else if (i915_vm_has_memory_coloring(vma->vm)) { + other = list_prev_entry(node, node_list); + if (i915_node_color_differs(other, color) && + !drm_mm_hole_follows(other) && + !IS_ALIGNED(other->start + other->size, SZ_2M)) + return false; - other = list_next_entry(node, node_list); - if (i915_node_color_differs(other, color) && - !drm_mm_hole_follows(node)) - return false; + other = list_next_entry(node, node_list); + if (i915_node_color_differs(other, color) && + !drm_mm_hole_follows(node) && + !IS_ALIGNED(other->start, SZ_2M)) + return false; + } return true; } @@ -676,6 +692,8 @@ i915_vma_insert(struct i915_vma *vma, u64 size, u64 alignment, u64 flags) if (i915_vm_has_cache_coloring(vma->vm)) color = vma->obj->cache_level; + else if (i915_vm_has_memory_coloring(vma->vm)) + color = i915_gem_object_is_lmem(vma->obj); } if (flags & PIN_OFFSET_FIXED) { From patchwork Mon Oct 11 16:11:51 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28105C433F5 for ; Mon, 11 Oct 2021 16:09:58 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E912660551 for ; Mon, 11 Oct 2021 16:09:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org E912660551 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5AE5E6E8EF; Mon, 11 Oct 2021 16:09:46 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id A0F3D6E8ED; Mon, 11 Oct 2021 16:09:44 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056785" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056785" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:44 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477977" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:41 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Joonas Lahtinen , Ramalingam C Date: Mon, 11 Oct 2021 21:41:51 +0530 Message-Id: <20211011161155.6397-11-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 
Subject: [Intel-gfx] [PATCH 10/14] drm/i915/xehpsdv: Add has_flat_ccs to device info X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: CQ Tang Gen12+ devices support 3D surface (buffer) compression and various compression formats. This is accomplished by an additional compression control state (CCS) stored for each surface. Gen12 devices (TGL family and DG1) store compression state in a separate region of memory. It is managed by user-space and has an associated set of user-space managed page tables used by hardware for address translation. Gen12.5 devices (XEHPSDV, DG2, etc.) introduce a new feature, Flat CCS, which replaces the AUX page tables with a flat indexed region of device memory for storing compression state. Cc: Joonas Lahtinen Cc: Matthew Auld Signed-off-by: CQ Tang Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/i915_drv.h | 2 ++ drivers/gpu/drm/i915/i915_pci.c | 1 + drivers/gpu/drm/i915/intel_device_info.h | 1 + 3 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index a16fde38a252..57948e0ee48b 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1721,6 +1721,8 @@ IS_SUBPLATFORM(const struct drm_i915_private *i915, #define HAS_REGION(i915, i) (INTEL_INFO(i915)->memory_regions & (i)) #define HAS_LMEM(i915) HAS_REGION(i915, REGION_LMEM) +#define HAS_FLAT_CCS(dev_priv) (INTEL_INFO(dev_priv)->has_flat_ccs) + #define HAS_GT_UC(dev_priv) (INTEL_INFO(dev_priv)->has_gt_uc) #define HAS_POOLED_EU(dev_priv) (INTEL_INFO(dev_priv)->has_pooled_eu) diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c index 8ef484a23652..68367b505dc4 100644 --- a/drivers/gpu/drm/i915/i915_pci.c +++
b/drivers/gpu/drm/i915/i915_pci.c @@ -991,6 +991,7 @@ static const struct intel_device_info adl_p_info = { XE_HP_PAGE_SIZES, \ .dma_mask_size = 46, \ .has_64bit_reloc = 1, \ + .has_flat_ccs = 1, \ .has_global_mocs = 1, \ .has_gt_uc = 1, \ .has_llc = 1, \ diff --git a/drivers/gpu/drm/i915/intel_device_info.h b/drivers/gpu/drm/i915/intel_device_info.h index dd453b96af19..87ee1d86d2ac 100644 --- a/drivers/gpu/drm/i915/intel_device_info.h +++ b/drivers/gpu/drm/i915/intel_device_info.h @@ -126,6 +126,7 @@ enum intel_ppgtt_type { func(has_64k_pages); \ func(gpu_reset_clobbers_display); \ func(has_reset_engine); \ + func(has_flat_ccs); \ func(has_global_mocs); \ func(has_gt_uc); \ func(has_l3_dpf); \ From patchwork Mon Oct 11 16:11:52 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00A26C433EF for ; Mon, 11 Oct 2021 16:10:01 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id CFC4060EB6 for ; Mon, 11 Oct 2021 16:10:00 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org CFC4060EB6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 25E246E8EA; Mon, 11 Oct 2021 16:09:50 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 70A166E8F4; Mon, 11 
Oct 2021 16:09:47 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056793" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056793" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:47 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477984" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:44 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Abdiel Janulgue , Ramalingam C Date: Mon, 11 Oct 2021 21:41:52 +0530 Message-Id: <20211011161155.6397-12-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 11/14] drm/i915/lmem: Enable lmem for platforms with Flat CCS X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Abdiel Janulgue A portion of device memory is reserved for Flat CCS, so the usable device memory is reduced by the size of the Flat CCS region. The location of that region is given by the XEHPSDV_FLAT_CCS_BASE_ADDR register. To get the effective device memory, we therefore subtract the Flat CCS memory size from the total device memory.
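The arithmetic described above can be sketched roughly as follows. This is an illustration only, not the driver code: the helper names flat_ccs_base_from_reg and usable_lmem_size are hypothetical, the shift of 8 and the 64K granularity mirror XEHPSDV_CCS_BASE_SHIFT and SZ_64K as used in this patch, and the sketch assumes everything from the decoded base to the end of the BAR is reserved for Flat CCS.

```c
#include <assert.h>
#include <stdint.h>

#define CCS_BASE_SHIFT 8u                  /* mirrors XEHPSDV_CCS_BASE_SHIFT */
#define SZ_64K         (64ull * 1024ull)

/* Hypothetical helper: decode the Flat CCS base offset (in bytes)
 * from a raw register value whose field counts 64K units. */
static uint64_t flat_ccs_base_from_reg(uint32_t reg)
{
	return (uint64_t)(reg >> CCS_BASE_SHIFT) * SZ_64K;
}

/* Hypothetical helper: local memory left for the driver once the
 * Flat CCS region at the top of the BAR has been carved out. */
static uint64_t usable_lmem_size(uint64_t bar_size, uint32_t reg)
{
	uint64_t ccs_base = flat_ccs_base_from_reg(reg);
	uint64_t reserved;

	/* In this sketch a base beyond the BAR is treated as a caller bug;
	 * a real driver would report an error instead of asserting. */
	assert(ccs_base <= bar_size);

	reserved = bar_size - ccs_base;    /* region stolen for Flat CCS */
	return bar_size - reserved;        /* equals ccs_base */
}
```

With a 16 GiB BAR and a register value of 0x3FC0000 (illustrative numbers, not taken from hardware), the decoded base is 0x3FC000000, so 64 MiB is reserved and the remainder is usable. A register value of 0 decodes to a base of 0, which corresponds to the "register not populated" case that the patch reports as an error.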
Cc: Matthew Auld Signed-off-by: Abdiel Janulgue Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/gt/intel_gt.c | 19 ++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_gt.h | 1 + drivers/gpu/drm/i915/gt/intel_region_lmem.c | 22 +++++++++++++++++++-- drivers/gpu/drm/i915/i915_reg.h | 3 +++ 4 files changed, 43 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index 1cb1948ac959..fd82ebee8724 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -900,6 +900,25 @@ u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg) return intel_uncore_read_fw(gt->uncore, reg); } +u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg) +{ + int type; + u8 sliceid, subsliceid; + + for (type = 0; type < NUM_STEERING_TYPES; type++) { + if (intel_gt_reg_needs_read_steering(gt, reg, type)) { + intel_gt_get_valid_steering(gt, type, &sliceid, + &subsliceid); + return intel_uncore_read_with_mcr_steering(gt->uncore, + reg, + sliceid, + subsliceid); + } + } + + return intel_uncore_read(gt->uncore, reg); +} + void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p) { diff --git a/drivers/gpu/drm/i915/gt/intel_gt.h b/drivers/gpu/drm/i915/gt/intel_gt.h index 74e771871a9b..24b78398a587 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.h +++ b/drivers/gpu/drm/i915/gt/intel_gt.h @@ -84,6 +84,7 @@ static inline bool intel_gt_needs_read_steering(struct intel_gt *gt, } u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg); +u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg); void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p); diff --git a/drivers/gpu/drm/i915/gt/intel_region_lmem.c b/drivers/gpu/drm/i915/gt/intel_region_lmem.c index 073d28d96669..d1f88beb26fe 100644 --- a/drivers/gpu/drm/i915/gt/intel_region_lmem.c +++ b/drivers/gpu/drm/i915/gt/intel_region_lmem.c @@ -201,8 +201,26 @@ static struct 
intel_memory_region *setup_lmem(struct intel_gt *gt) if (!IS_DGFX(i915)) return ERR_PTR(-ENODEV); - /* Stolen starts from GSMBASE on DG1 */ - lmem_size = intel_uncore_read64(uncore, GEN12_GSMBASE); + if (HAS_FLAT_CCS(i915)) { + u64 tile_stolen, flat_ccs_base_addr_reg, flat_ccs_base; + + lmem_size = pci_resource_len(pdev, 2); + flat_ccs_base_addr_reg = intel_gt_read_register(gt, XEHPSDV_FLAT_CCS_BASE_ADDR); + flat_ccs_base = (flat_ccs_base_addr_reg >> XEHPSDV_CCS_BASE_SHIFT) * SZ_64K; + tile_stolen = lmem_size - flat_ccs_base; + + /* If the FLAT_CCS_BASE_ADDR register is not populated, flag an error */ + if (tile_stolen == lmem_size) + DRM_ERROR("CCS_BASE_ADDR register did not have expected value\n"); + + lmem_size -= tile_stolen; + } else { + /* Stolen starts from GSMBASE without CCS */ + lmem_size = intel_uncore_read64(&i915->uncore, GEN12_GSMBASE); + if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2))) + return ERR_PTR(-ENODEV); + } + io_start = pci_resource_start(pdev, 2); if (GEM_WARN_ON(lmem_size > pci_resource_len(pdev, 2))) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index a897f4abea0c..5a14e0ca9d4f 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -12480,6 +12480,9 @@ enum skl_power_gate { #define GEN12_GSMBASE _MMIO(0x108100) #define GEN12_DSMBASE _MMIO(0x1080C0) +#define XEHPSDV_FLAT_CCS_BASE_ADDR _MMIO(0x4910) +#define XEHPSDV_CCS_BASE_SHIFT 8 + /* gamt regs */ #define GEN8_L3_LRA_1_GPGPU _MMIO(0x4dd4) #define GEN8_L3_LRA_1_GPGPU_DEFAULT_VALUE_BDW 0x67F1427F /* max/min for LRA1/2 */ From patchwork Mon Oct 11 16:11:53 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550465 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org 
(Postfix) with ESMTP id 52189C433F5 for ; Mon, 11 Oct 2021 16:10:05 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 228BF60ED4 for ; Mon, 11 Oct 2021 16:10:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 228BF60ED4 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8A1E46E8F9; Mon, 11 Oct 2021 16:09:55 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3E3646E8EE; Mon, 11 Oct 2021 16:09:50 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056805" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056805" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:50 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441477997" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:47 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ayaz A Siddiqui , Ramalingam C Date: Mon, 11 Oct 2021 21:41:53 +0530 Message-Id: <20211011161155.6397-13-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 12/14] drm/i915/gt: Clear compress metadata for Gen12.5 >= platforms X-BeenThere: intel-gfx@lists.freedesktop.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Ayaz A Siddiqui Gen12.5+ devices support Flat CCS, which reserves a portion of device memory to store compression metadata. When clearing a device memory buffer object, we also need to clear the associated CCS buffer. Flat CCS memory cannot be accessed directly by software. The address of the CCS buffer associated with the main BO is calculated automatically by the device itself. KMD/UMD can only access this buffer indirectly, using the XY_CTRL_SURF_COPY_BLT command via the address of the device memory buffer. Cc: CQ Tang Signed-off-by: Ayaz A Siddiqui Signed-off-by: Ramalingam C --- drivers/gpu/drm/i915/gt/intel_gpu_commands.h | 14 +++ drivers/gpu/drm/i915/gt/intel_migrate.c | 120 ++++++++++++++++++- 2 files changed, 131 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h index f8253012d166..07bf5a1753bd 100644 --- a/drivers/gpu/drm/i915/gt/intel_gpu_commands.h +++ b/drivers/gpu/drm/i915/gt/intel_gpu_commands.h @@ -203,6 +203,20 @@ #define GFX_OP_DRAWRECT_INFO ((0x3<<29)|(0x1d<<24)|(0x80<<16)|(0x3)) #define GFX_OP_DRAWRECT_INFO_I965 ((0x7900<<16)|0x2) +#define XY_CTRL_SURF_INSTR_SIZE 5 +#define MI_FLUSH_DW_SIZE 3 +#define XY_CTRL_SURF_COPY_BLT ((2 << 29) | (0x48 << 22) | 3) +#define SRC_ACCESS_TYPE_SHIFT 21 +#define DST_ACCESS_TYPE_SHIFT 20 +#define CCS_SIZE_SHIFT 8 +#define XY_CTRL_SURF_MOCS_SHIFT 25 +#define NUM_CCS_BYTES_PER_BLOCK 256 +#define NUM_CCS_BLKS_PER_XFER 1024 +#define INDIRECT_ACCESS 0 +#define DIRECT_ACCESS 1 +#define MI_FLUSH_LLC BIT(9) +#define MI_FLUSH_CCS BIT(16) + #define COLOR_BLT_CMD (2 << 29 | 0x40 << 22 | (5 - 2)) #define XY_COLOR_BLT_CMD (2 << 29 | 0x50 << 22) #define SRC_COPY_BLT_CMD (2 << 29 | 0x43 << 22) diff --git
a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index afb1cce9a352..0bed01750884 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -17,6 +17,7 @@ struct insert_pte_data { }; #define CHUNK_SZ SZ_8M /* ~1ms at 8GiB/s preemption delay */ +#define GET_CCS_SIZE(i915, size) (HAS_FLAT_CCS(i915) ? (size) >> 8 : 0) static bool engine_supports_migration(struct intel_engine_cs *engine) { @@ -490,15 +491,104 @@ intel_context_migrate_copy(struct intel_context *ce, return err; } -static int emit_clear(struct i915_request *rq, int size, u32 value) +static inline u32 *i915_flush_dw(u32 *cmd, u64 dst, u32 flags) +{ + /* Mask the 3 LSB to use the PPGTT address space */ + *cmd++ = MI_FLUSH_DW | flags; + *cmd++ = lower_32_bits(dst); + *cmd++ = upper_32_bits(dst); + + return cmd; +} + +static u32 calc_ctrl_surf_instr_size(struct drm_i915_private *i915, int size) +{ + u32 num_cmds, num_blks, total_size; + + if (!GET_CCS_SIZE(i915, size)) + return 0; + + /* + * XY_CTRL_SURF_COPY_BLT transfers CCS in 256 byte + * blocks. One XY_CTRL_SURF_COPY_BLT command can + * transfer up to 1024 blocks. + */ + num_blks = (GET_CCS_SIZE(i915, size) + + (NUM_CCS_BYTES_PER_BLOCK - 1)) >> 8; + num_cmds = (num_blks + (NUM_CCS_BLKS_PER_XFER - 1)) >> 10; + total_size = (XY_CTRL_SURF_INSTR_SIZE) * num_cmds; + + /* + * We need to add a flush before and after + * XY_CTRL_SURF_COPY_BLT + */ + total_size += 2 * MI_FLUSH_DW_SIZE; + return total_size; +} + +static u32 *_i915_ctrl_surf_copy_blt(u32 *cmd, u64 src_addr, u64 dst_addr, + u8 src_mem_access, u8 dst_mem_access, + int src_mocs, int dst_mocs, + u16 num_ccs_blocks) +{ + int i = num_ccs_blocks; + + /* + * The XY_CTRL_SURF_COPY_BLT instruction is used to copy the CCS + * data in and out of the CCS region. + * + * We can copy at most 1024 blocks of 256 bytes using one + * XY_CTRL_SURF_COPY_BLT instruction.
+ * + * In case we need to copy more than 1024 blocks, we need to add + * another instruction to the same batch buffer. + * + * 1024 blocks of 256 bytes of CCS represent a total of 256 KB of CCS. + * + * 256 KB of CCS represents 256 * 256 KB = 64 MB of LMEM. + */ + do { + /* + * We use a bitwise AND with 1023 since the size field + * takes values in the range 0 - 1023 + */ + *cmd++ = ((XY_CTRL_SURF_COPY_BLT) | + (src_mem_access << SRC_ACCESS_TYPE_SHIFT) | + (dst_mem_access << DST_ACCESS_TYPE_SHIFT) | + (((i - 1) & 1023) << CCS_SIZE_SHIFT)); + *cmd++ = lower_32_bits(src_addr); + *cmd++ = ((upper_32_bits(src_addr) & 0xFFFF) | + (src_mocs << XY_CTRL_SURF_MOCS_SHIFT)); + *cmd++ = lower_32_bits(dst_addr); + *cmd++ = ((upper_32_bits(dst_addr) & 0xFFFF) | + (dst_mocs << XY_CTRL_SURF_MOCS_SHIFT)); + src_addr += SZ_64M; + dst_addr += SZ_64M; + i -= NUM_CCS_BLKS_PER_XFER; + } while (i > 0); + + return cmd; +} + +static int emit_clear(struct i915_request *rq, + int size, + u32 value, + bool is_lmem) { const int ver = GRAPHICS_VER(rq->engine->i915); u32 instance = rq->engine->instance; u32 *cs; + struct drm_i915_private *i915 = rq->engine->i915; + u32 num_ccs_blks, ccs_ring_size; GEM_BUG_ON(size >> PAGE_SHIFT > S16_MAX); - cs = intel_ring_begin(rq, ver >= 8 ? 8 : 6); + /* Clear flat CCS only when value is 0 */ + ccs_ring_size = (is_lmem && !value) ? + calc_ctrl_surf_instr_size(i915, size) + : 0; + + cs = intel_ring_begin(rq, ver >= 8 ? 8 + ccs_ring_size : 6); if (IS_ERR(cs)) return PTR_ERR(cs); @@ -521,6 +611,30 @@ static int emit_clear(struct i915_request *rq, int size, u32 value) *cs++ = value; } + if (is_lmem && HAS_FLAT_CCS(i915) && !value) { + num_ccs_blks = (GET_CCS_SIZE(i915, size) + + NUM_CCS_BYTES_PER_BLOCK - 1) >> 8; + /* + * Flat CCS surface can only be accessed via + * XY_CTRL_SURF_COPY_BLT CMD and using indirect + * mapping of associated LMEM.
+ * We can clear ccs surface by writing all 0s, + * so we will flush the previously cleared buffer + * and use it as a source. + */ + + cs = i915_flush_dw(cs, (u64)instance << 32, + MI_FLUSH_LLC | MI_FLUSH_CCS); + cs = _i915_ctrl_surf_copy_blt(cs, + (u64)instance << 32, + (u64)instance << 32, + DIRECT_ACCESS, + INDIRECT_ACCESS, + 1, 1, + num_ccs_blks); + cs = i915_flush_dw(cs, (u64)instance << 32, + MI_FLUSH_LLC | MI_FLUSH_CCS); + } intel_ring_advance(rq, cs); return 0; } @@ -581,7 +695,7 @@ intel_context_migrate_clear(struct intel_context *ce, if (err) goto out_rq; - err = emit_clear(rq, len, value); + err = emit_clear(rq, len, value, is_lmem); /* Arbitration is re-enabled between requests. */ out_rq: From patchwork Mon Oct 11 16:11:54 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550467 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13F76C433FE for ; Mon, 11 Oct 2021 16:10:06 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id DA9F460EB6 for ; Mon, 11 Oct 2021 16:10:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org DA9F460EB6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2C1A86E8EE; Mon, 11 Oct 2021 16:09:56 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id CB5796E8EE; 
Mon, 11 Oct 2021 16:09:52 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056818" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056818" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:52 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441478010" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:50 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C Date: Mon, 11 Oct 2021 21:41:54 +0530 Message-Id: <20211011161155.6397-14-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 13/14] drm/i915/uapi: document behaviour for DG2 64K support X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Matthew Auld On discrete platforms like DG2, we need to support a minimum page size of 64K when dealing with device local-memory. This is quite tricky for various reasons, so try to document the new implicit uapi for this. 
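As a rough illustration of what the documented rules mean for userspace address management, here is a minimal sketch. It is not part of the uapi: the helper names round_up_pow2 and ranges_share_pde are hypothetical, and the only facts assumed are the ones documented in this patch, namely the 64K minimum page size for device local-memory and the 2M virtual address range covered by one PDE.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_64K (64ull * 1024ull)
#define SZ_2M  (2ull * 1024ull * 1024ull)

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Hypothetical check: do two non-empty GTT ranges touch a common
 * 2M-aligned window, i.e. would they share a PDE? */
static int ranges_share_pde(uint64_t a_start, uint64_t a_size,
			    uint64_t b_start, uint64_t b_size)
{
	uint64_t a_first, a_last, b_first, b_last;

	assert(a_size != 0 && b_size != 0);

	a_first = a_start / SZ_2M;
	a_last  = (a_start + a_size - 1) / SZ_2M;
	b_first = b_start / SZ_2M;
	b_last  = (b_start + b_size - 1) / SZ_2M;

	return a_first <= b_last && b_first <= a_last;
}
```

A 4K request for a device local-memory object would round up to 64K with round_up_pow2(4096, SZ_64K), and the conservative scheme of padding such objects to 2M corresponds to round_up_pow2(size, SZ_2M). An allocator could use a check like ranges_share_pde() to avoid placing a system-memory object in the same 2M window as a local-memory one.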
Signed-off-by: Matthew Auld Signed-off-by: Ramalingam C --- include/uapi/drm/i915_drm.h | 61 ++++++++++++++++++++++++++++++++++--- 1 file changed, 56 insertions(+), 5 deletions(-) diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index aa2a7eccfb94..d62e8b7ed8b6 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -1118,10 +1118,16 @@ struct drm_i915_gem_exec_object2 { /** * When the EXEC_OBJECT_PINNED flag is specified this is populated by * the user with the GTT offset at which this object will be pinned. + * * When the I915_EXEC_NO_RELOC flag is specified this must contain the * presumed_offset of the object. + * * During execbuffer2 the kernel populates it with the value of the * current GTT offset of the object, for future presumed_offset writes. + * + * See struct drm_i915_gem_create_ext for the rules when dealing with + * alignment restrictions with I915_MEMORY_CLASS_DEVICE, on devices with + * minimum page sizes, like DG2. */ __u64 offset; @@ -3001,11 +3007,56 @@ struct drm_i915_gem_create_ext { * * The (page-aligned) allocated size for the object will be returned. * - * Note that for some devices we have might have further minimum - * page-size restrictions(larger than 4K), like for device local-memory. - * However in general the final size here should always reflect any - * rounding up, if for example using the I915_GEM_CREATE_EXT_MEMORY_REGIONS - * extension to place the object in device local-memory. + * On discrete platforms, starting from DG2, we have to contend with GTT + * page size restrictions when dealing with I915_MEMORY_CLASS_DEVICE + * objects. Specifically the hardware only supports 64K or larger GTT + * page sizes for such memory. The kernel will already ensure that all + * I915_MEMORY_CLASS_DEVICE memory is allocated using 64K or larger page + * sizes underneath. 
+ * + * Note that the returned size here will always reflect any required + * rounding up done by the kernel, i.e 4K will now become 64K on devices + * such as DG2. The GTT alignment will also need be at least 64K for + * such objects. + * + * Note that due to how the hardware implements 64K GTT page support, we + * have some further complications: + * + * 1.) The entire PDE(which covers a 2M virtual address range), must + * contain only 64K PTEs, i.e mixing 4K and 64K PTEs in the same + * PDE is forbidden by the hardware. + * + * 2.) We still need to support 4K PTEs for I915_MEMORY_CLASS_SYSTEM + * objects. + * + * To handle the above the kernel implements a memory coloring scheme to + * prevent userspace from mixing I915_MEMORY_CLASS_DEVICE and + * I915_MEMORY_CLASS_SYSTEM objects in the same PDE. If the kernel is + * ever unable to evict the required pages for the given PDE(different + * color) when inserting the object into the GTT then it will simply + * fail the request. + * + * Since userspace needs to manage the GTT address space themselves, + * special care is needed to ensure this doesn't happen. The simplest + * scheme is to simply align and round up all I915_MEMORY_CLASS_DEVICE + * objects to 2M, which avoids any issues here. At the very least this + * is likely needed for objects that can be placed in both + * I915_MEMORY_CLASS_DEVICE and I915_MEMORY_CLASS_SYSTEM, to avoid + * potential issues when the kernel needs to migrate the object behind + * the scenes, since that might also involve evicting other objects. + * + * To summarise the GTT rules, on platforms like DG2: + * + * 1.) All objects that can be placed in I915_MEMORY_CLASS_DEVICE must + * have 64K alignment. The kernel will reject this otherwise. + * + * 2.) All I915_MEMORY_CLASS_DEVICE objects must never be placed in + * the same PDE with other I915_MEMORY_CLASS_SYSTEM objects. The + * kernel will reject this otherwise. + * + * 3.) 
Objects that can be placed in both I915_MEMORY_CLASS_DEVICE and + * I915_MEMORY_CLASS_SYSTEM should probably be aligned and padded out + * to 2M. */ __u64 size; /** From patchwork Mon Oct 11 16:11:55 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ramalingam C X-Patchwork-Id: 12550469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9FE04C433FE for ; Mon, 11 Oct 2021 16:10:21 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 64EDB60EB6 for ; Mon, 11 Oct 2021 16:10:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 64EDB60EB6 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4A0126E8F3; Mon, 11 Oct 2021 16:10:20 +0000 (UTC) Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by gabe.freedesktop.org (Postfix) with ESMTPS id A13A46E8F0; Mon, 11 Oct 2021 16:10:18 +0000 (UTC) X-IronPort-AV: E=McAfee;i="6200,9189,10134"; a="214056829" X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="214056829" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 09:09:55 -0700 X-IronPort-AV: E=Sophos;i="5.85,364,1624345200"; d="scan'208";a="441478029" Received: from ramaling-i9x.iind.intel.com ([10.99.66.205]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Oct 2021 
09:09:52 -0700 From: Ramalingam C To: dri-devel , intel-gfx Cc: Daniel Vetter , Matthew Auld , CQ Tang , Hellstrom Thomas , Ramalingam C , Daniel Vetter Date: Mon, 11 Oct 2021 21:41:55 +0530 Message-Id: <20211011161155.6397-15-ramalingam.c@intel.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20211011161155.6397-1-ramalingam.c@intel.com> References: <20211011161155.6397-1-ramalingam.c@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH 14/14] Doc/gpu/rfc/i915: i915 DG2 uAPI X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Details of the new features being added as part of DG2 enabling and their implicit impact on the uAPI. Signed-off-by: Ramalingam C cc: Daniel Vetter cc: Matthew Auld --- Documentation/gpu/rfc/i915_dg2.rst | 47 ++++++++++++++++++++++++++++++ Documentation/gpu/rfc/index.rst | 3 ++ 2 files changed, 50 insertions(+) create mode 100644 Documentation/gpu/rfc/i915_dg2.rst diff --git a/Documentation/gpu/rfc/i915_dg2.rst b/Documentation/gpu/rfc/i915_dg2.rst new file mode 100644 index 000000000000..a83ca26cd758 --- /dev/null +++ b/Documentation/gpu/rfc/i915_dg2.rst @@ -0,0 +1,47 @@ +==================== +I915 DG2 RFC Section +==================== + +Upstream plan +============= +The plan to upstream the DG2 enabling is: + +* Merge basic HW enabling for DG2 (still without PCI ID) +* Merge the 64K support for lmem +* Merge the flat CCS enabling patches +* Add the PCI ID for DG2 and enable DG2 in CI + + +64K page support for lmem +========================= +On DG2 hardware, local memory supports only a minimum GTT page size of 64K; 4K is no longer supported. + +DG2 hardware does not support mixing 64K (lmem) and 4K (smem) pages in the same PPGTT page table.
Refer to +struct drm_i915_gem_create_ext for the implications of handling the 64K page size. + +.. kernel-doc:: include/uapi/drm/i915_drm.h + :functions: drm_i915_gem_create_ext + + +flat CCS support for lmem +========================= +Gen 12+ devices support 3D surface compression and various compression formats. This is +accomplished by an additional compression control state (CCS) stored for each surface. + +Gen 12 devices (TGL and DG1) store compression state in a separate region of memory. +It is managed by userspace and has an associated set of userspace managed page tables +used by hardware for address translation. + +In Gen 12.5 devices (XEHPSDV and DG2), Flat CCS is introduced to replace the userspace +managed AUX page table with a flat indexed region of device memory for storing the +compression state. + +The GOP driver steals a chunk of memory for the CCS surface corresponding to the entire +range of local memory. The memory required for the CCS of the entire local memory is +1/256 of the main local memory. The GOP driver also programs a secure register +(XEHPSDV_FLAT_CCS_BASE_ADDR, 0x4910) with this address value. + +So the total local memory available for driver allocation is the total lmem size minus the CCS data size. + +Flat CCS data needs to be cleared when an lmem object is allocated. CCS data can +be copied in and out of the CCS region through XY_CTRL_SURF_COPY_BLT. diff --git a/Documentation/gpu/rfc/index.rst b/Documentation/gpu/rfc/index.rst index 91e93a705230..afb320ed4028 100644 --- a/Documentation/gpu/rfc/index.rst +++ b/Documentation/gpu/rfc/index.rst @@ -20,6 +20,9 @@ host such documentation: i915_gem_lmem.rst +.. toctree:: + i915_dg2.rst + .. toctree:: i915_scheduler.rst
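For readers following the Flat CCS numbers in this series, the sizing arithmetic can be sketched as below. This is a standalone illustration, not the driver code: the function names are hypothetical, the constants mirror the defines added in patch 12 (256-byte CCS blocks, up to 1024 blocks per XY_CTRL_SURF_COPY_BLT, 5 dwords per copy command, 3 dwords per MI_FLUSH_DW), and the sketch assumes a platform with Flat CCS, so it omits the HAS_FLAT_CCS() check.

```c
#include <assert.h>
#include <stdint.h>

#define CCS_BYTES_PER_BLOCK  256u   /* mirrors NUM_CCS_BYTES_PER_BLOCK */
#define CCS_BLOCKS_PER_XFER  1024u  /* mirrors NUM_CCS_BLKS_PER_XFER */
#define CTRL_SURF_CMD_DWORDS 5u     /* mirrors XY_CTRL_SURF_INSTR_SIZE */
#define FLUSH_DW_DWORDS      3u     /* mirrors MI_FLUSH_DW_SIZE */

/* CCS data is 1/256 of the main surface size. */
static uint32_t ccs_bytes(uint32_t size)
{
	return size >> 8;
}

/* Number of 256-byte CCS blocks, rounded up. */
static uint32_t ccs_blocks(uint32_t size)
{
	return (ccs_bytes(size) + CCS_BYTES_PER_BLOCK - 1) / CCS_BYTES_PER_BLOCK;
}

/* Number of copy commands needed, at most 1024 blocks each. */
static uint32_t ccs_copy_cmds(uint32_t size)
{
	uint32_t blocks = ccs_blocks(size);
	uint32_t cmds = (blocks + CCS_BLOCKS_PER_XFER - 1) / CCS_BLOCKS_PER_XFER;

	/* Sanity check: the commands must cover every block. */
	assert(blocks <= cmds * CCS_BLOCKS_PER_XFER);
	return cmds;
}

/* Ring space in dwords: the copy commands plus a flush before and after. */
static uint32_t ccs_clear_ring_dwords(uint32_t size)
{
	if (ccs_bytes(size) == 0)
		return 0;

	return CTRL_SURF_CMD_DWORDS * ccs_copy_cmds(size) + 2 * FLUSH_DW_DWORDS;
}
```

For the 8 MiB chunk size used by the migration code, this gives 32 KiB of CCS data, 128 blocks, one copy command and 11 dwords of ring space. A 128 MiB range would need two copy commands, since one command covers at most 64 MiB of local memory.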