From patchwork Wed Aug 28 08:36:34 2024
X-Patchwork-Submitter: Nirmoy Das
X-Patchwork-Id: 13780932
From: Nirmoy Das
To: dri-devel@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org, Nirmoy Das, Christian König,
 Himal Prasad Ghimiray, Lucas De Marchi, Matthew Auld, Matthew Brost,
 Thomas Hellström
Subject: [PATCH 1/2] Revert "drm/xe/lnl: Offload system clear page activity to GPU"
Date: Wed, 28 Aug 2024 10:36:34 +0200
Message-ID: <20240828083635.23601-1-nirmoy.das@intel.com>
Organization: Intel Deutschland GmbH, Registered Address: Am Campeon 10,
 85579 Neubiberg, Germany, Commercial Register: Amtsgericht Muenchen HRB 186928

This optimization relied on having to clear CCS on allocation. If there
is no need to clear CCS on allocation, it would mostly help only by
reducing CPU utilization.

Revert this patch for now because:

1. Xe currently cannot clear on free and is using an invalid TTM flag,
   TTM_TT_FLAG_CLEARED_ON_FREE, which could poison the global TTM pool
   in a multi-device setup.

2. On LNL, CPU:WB does not require clearing CCS, as such BOs are not
   allowed to bind with a compression PTE. A subsequent patch will
   disable clearing CCS for CPU:WB BOs on LNL.

This reverts commit 23683061805be368c8d1c7e7ff52abc470cac275.

Cc: Christian König
Cc: Himal Prasad Ghimiray
Cc: Lucas De Marchi
Cc: Matthew Auld
Cc: Matthew Brost
Cc: Thomas Hellström
Reviewed-by: Thomas Hellström
Signed-off-by: Nirmoy Das
---
 drivers/gpu/drm/xe/xe_bo.c           | 26 ++------------------------
 drivers/gpu/drm/xe/xe_device_types.h |  2 --
 drivers/gpu/drm/xe/xe_ttm_sys_mgr.c  | 12 ------------
 3 files changed, 2 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 9d6632f92fa9..25d0c939ba31 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -396,14 +396,6 @@ static struct ttm_tt *xe_ttm_tt_create(struct ttm_buffer_object *ttm_bo,
 		caching = ttm_uncached;
 	}
 
-	/*
-	 * If the device can support gpu clear system pages then set proper ttm
-	 * flag. Zeroed pages are only required for ttm_bo_type_device so
-	 * unwanted data is not leaked to userspace.
-	 */
-	if (ttm_bo->type == ttm_bo_type_device && xe->mem.gpu_page_clear_sys)
-		page_flags |= TTM_TT_FLAG_CLEARED_ON_FREE;
-
 	err = ttm_tt_init(&tt->ttm, &bo->ttm, page_flags, caching, extra_pages);
 	if (err) {
 		kfree(tt);
@@ -425,10 +417,6 @@ static int xe_ttm_tt_populate(struct ttm_device *ttm_dev, struct ttm_tt *tt,
 	if (tt->page_flags & TTM_TT_FLAG_EXTERNAL)
 		return 0;
 
-	/* Clear TTM_TT_FLAG_ZERO_ALLOC when GPU is set to clear system pages */
-	if (tt->page_flags & TTM_TT_FLAG_CLEARED_ON_FREE)
-		tt->page_flags &= ~TTM_TT_FLAG_ZERO_ALLOC;
-
 	err = ttm_pool_alloc(&ttm_dev->pool, tt, ctx);
 	if (err)
 		return err;
@@ -671,16 +659,8 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	bool needs_clear;
 	bool handle_system_ccs = (!IS_DGFX(xe) && xe_bo_needs_ccs_pages(bo) &&
 				  ttm && ttm_tt_is_populated(ttm)) ? true : false;
-	bool clear_system_pages;
 	int ret = 0;
 
-	/*
-	 * Clear TTM_TT_FLAG_CLEARED_ON_FREE on bo creation path when
-	 * moving to system as the bo doesn't have dma_mapping.
-	 */
-	if (!old_mem && ttm && !ttm_tt_is_populated(ttm))
-		ttm->page_flags &= ~TTM_TT_FLAG_CLEARED_ON_FREE;
-
 	/* Bo creation path, moving to system or TT. */
 	if ((!old_mem && ttm) && !handle_system_ccs) {
 		if (new_mem->mem_type == XE_PL_TT)
@@ -703,10 +683,8 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	move_lacks_source = handle_system_ccs ? (!bo->ccs_cleared) :
 						(!mem_type_is_vram(old_mem_type) && !tt_has_data);
 
-	clear_system_pages = ttm && (ttm->page_flags & TTM_TT_FLAG_CLEARED_ON_FREE);
 	needs_clear = (ttm && ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC) ||
-		(!ttm && ttm_bo->type == ttm_bo_type_device) ||
-		clear_system_pages;
+		(!ttm && ttm_bo->type == ttm_bo_type_device);
 
 	if (new_mem->mem_type == XE_PL_TT) {
 		ret = xe_tt_map_sg(ttm);
@@ -818,7 +796,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict,
 	if (move_lacks_source) {
 		u32 flags = 0;
 
-		if (mem_type_is_vram(new_mem->mem_type) || clear_system_pages)
+		if (mem_type_is_vram(new_mem->mem_type))
 			flags |= XE_MIGRATE_CLEAR_FLAG_FULL;
 		else if (handle_system_ccs)
 			flags |= XE_MIGRATE_CLEAR_FLAG_CCS_DATA;
diff --git a/drivers/gpu/drm/xe/xe_device_types.h b/drivers/gpu/drm/xe/xe_device_types.h
index 4ecd620921a3..e73fb0c23932 100644
--- a/drivers/gpu/drm/xe/xe_device_types.h
+++ b/drivers/gpu/drm/xe/xe_device_types.h
@@ -333,8 +333,6 @@ struct xe_device {
 		struct xe_mem_region vram;
 		/** @mem.sys_mgr: system TTM manager */
 		struct ttm_resource_manager sys_mgr;
-		/** @mem.gpu_page_clear_sys: clear system pages offloaded to GPU */
-		bool gpu_page_clear_sys;
 	} mem;
 
 	/** @sriov: device level virtualization data */
diff --git a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
index e0ac20f20758..9844a8edbfe1 100644
--- a/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
+++ b/drivers/gpu/drm/xe/xe_ttm_sys_mgr.c
@@ -117,17 +117,5 @@ int xe_ttm_sys_mgr_init(struct xe_device *xe)
 	ttm_resource_manager_init(man, &xe->ttm, gtt_size >> PAGE_SHIFT);
 	ttm_set_driver_manager(&xe->ttm, XE_PL_TT, man);
 	ttm_resource_manager_set_used(man, true);
-
-	/*
-	 * On iGFX device with flat CCS, we clear CCS metadata, let's extend that
-	 * and use GPU to clear pages as well.
-	 *
-	 * Disable this when init_on_free and/or init_on_alloc is on to avoid double
-	 * zeroing pages with CPU and GPU.
-	 */
-	if (xe_device_has_flat_ccs(xe) && !IS_DGFX(xe) &&
-	    !want_init_on_alloc(GFP_USER) && !want_init_on_free())
-		xe->mem.gpu_page_clear_sys = true;
-
 	return drmm_add_action_or_reset(&xe->drm, ttm_sys_mgr_fini, xe);
 }

From patchwork Wed Aug 28 08:36:35 2024
X-Patchwork-Submitter: Nirmoy Das
X-Patchwork-Id: 13780933
From: Nirmoy Das
To: dri-devel@lists.freedesktop.org
Cc: intel-xe@lists.freedesktop.org, Nirmoy Das, Christian König,
 Himal Prasad Ghimiray, Lucas De Marchi, Matthew Auld, Matthew Brost,
 Thomas Hellström
Subject: [PATCH 2/2] Revert "drm/ttm: Add a flag to allow drivers to skip clear-on-free"
Date: Wed, 28 Aug 2024 10:36:35 +0200
Message-ID: <20240828083635.23601-2-nirmoy.das@intel.com>
In-Reply-To: <20240828083635.23601-1-nirmoy.das@intel.com>
References: <20240828083635.23601-1-nirmoy.das@intel.com>

Remove
TTM_TT_FLAG_CLEARED_ON_FREE now that XE stopped using this flag.

This reverts commit decbfaf06db05fa1f9b33149ebb3c145b44e878f.

Cc: Christian König
Cc: Himal Prasad Ghimiray
Cc: Lucas De Marchi
Cc: Matthew Auld
Cc: Matthew Brost
Cc: Thomas Hellström
Signed-off-by: Nirmoy Das
Reviewed-by: Thomas Hellström
---
 drivers/gpu/drm/ttm/ttm_pool.c | 18 +++++++-----------
 include/drm/ttm/ttm_tt.h       |  6 +-----
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
index 935ab3cfd046..8504dbe19c1a 100644
--- a/drivers/gpu/drm/ttm/ttm_pool.c
+++ b/drivers/gpu/drm/ttm/ttm_pool.c
@@ -222,18 +222,15 @@ static void ttm_pool_unmap(struct ttm_pool *pool, dma_addr_t dma_addr,
 }
 
 /* Give pages into a specific pool_type */
-static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p,
-			       bool cleared)
+static void ttm_pool_type_give(struct ttm_pool_type *pt, struct page *p)
 {
 	unsigned int i, num_pages = 1 << pt->order;
 
-	if (!cleared) {
-		for (i = 0; i < num_pages; ++i) {
-			if (PageHighMem(p))
-				clear_highpage(p + i);
-			else
-				clear_page(page_address(p + i));
-		}
+	for (i = 0; i < num_pages; ++i) {
+		if (PageHighMem(p))
+			clear_highpage(p + i);
+		else
+			clear_page(page_address(p + i));
 	}
 
 	spin_lock(&pt->lock);
@@ -397,7 +394,6 @@ static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt,
 				pgoff_t start_page, pgoff_t end_page)
 {
 	struct page **pages = &tt->pages[start_page];
-	bool cleared = tt->page_flags & TTM_TT_FLAG_CLEARED_ON_FREE;
 	unsigned int order;
 	pgoff_t i, nr;
 
@@ -411,7 +407,7 @@ static void ttm_pool_free_range(struct ttm_pool *pool, struct ttm_tt *tt,
 
 		pt = ttm_pool_select_type(pool, caching, order);
 		if (pt)
-			ttm_pool_type_give(pt, *pages, cleared);
+			ttm_pool_type_give(pt, *pages);
 		else
 			ttm_pool_free_page(pool, caching, order, *pages);
 	}
diff --git a/include/drm/ttm/ttm_tt.h b/include/drm/ttm/ttm_tt.h
index cfaf49de2419..2b9d856ff388 100644
--- a/include/drm/ttm/ttm_tt.h
+++ b/include/drm/ttm/ttm_tt.h
@@ -85,9 +85,6 @@ struct ttm_tt {
 	 * fault handling abuses the DMA api a bit and dma_map_attrs can't be
 	 * used to assure pgprot always matches.
 	 *
-	 * TTM_TT_FLAG_CLEARED_ON_FREE: Set this if a drm driver handles
-	 * clearing backing store
-	 *
 	 * TTM_TT_FLAG_PRIV_POPULATED: TTM internal only. DO NOT USE. This is
 	 * set by TTM after ttm_tt_populate() has successfully returned, and is
 	 * then unset when TTM calls ttm_tt_unpopulate().
@@ -97,9 +94,8 @@ struct ttm_tt {
 #define TTM_TT_FLAG_EXTERNAL		BIT(2)
 #define TTM_TT_FLAG_EXTERNAL_MAPPABLE	BIT(3)
 #define TTM_TT_FLAG_DECRYPTED		BIT(4)
-#define TTM_TT_FLAG_CLEARED_ON_FREE	BIT(5)
 
-#define TTM_TT_FLAG_PRIV_POPULATED	BIT(6)
+#define TTM_TT_FLAG_PRIV_POPULATED	BIT(5)
 	uint32_t page_flags;
 	/** @num_pages: Number of pages in the page array. */
 	uint32_t num_pages;