From patchwork Mon Apr 8 15:16:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Paneer Selvam, Arunpravin" X-Patchwork-Id: 13621273 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id ACCDCCD129D for ; Mon, 8 Apr 2024 15:17:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A86B41126E8; Mon, 8 Apr 2024 15:16:54 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (1024-bit key; unprotected) header.d=amd.com header.i=@amd.com header.b="QVsaTY/a"; dkim-atps=neutral Received: from NAM11-BN8-obe.outbound.protection.outlook.com (mail-bn8nam11on2079.outbound.protection.outlook.com [40.107.236.79]) by gabe.freedesktop.org (Postfix) with ESMTPS id DD3221126EC; Mon, 8 Apr 2024 15:16:50 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=j23WyI9jnYRF/W+Pz5DEJzqVAmgGPnpSF9erBHXMOfV0RYwgxJpvhFsVUy8Kiop2TCTbzNCoWR0MNuNQLYB5chd6/IARXTLBGcsCj+3ZYNHznhEdFUYS/tiNeJuvnYBxJC2iYdraSq5Nrjg3N0nhHbOjARMxXljQ1Z51rG5FOaghhV51FS6S0twoPDk9BQevcR3t26oCSBngef6xSOwVzMDN7g1JWElP79McBeBGlOZnJyB8YGOGaaKK1YPIfxca0sUznk4DkOVjI7tQ/r8DtkxBH0hkEcOZ9wFK+A4zOPFQPvsICt0Nx7VG6gz+ieKzf5c7uyaDK1e/7Or1Q1J1IA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GcWRTU8r+amEwZJCZiLvtoVQdLOD8lWOu9DJuOyluHI=; b=DM51u8QOXdTL3NYKyqbk/rvVhoJhmP69yoLAOUfpCZOG+YcRrdRcISjSydOgtQPC+ACS9cKjZp3Z1pnhpZ3aBe0LY1iLwCCVWp/5QXLzXq0lA/8ZuAKS4nvbNlUJwYI2CG8oKVkcpXVsa1aQ/umhOqGoOC5l4Yo4tVn9qH/2foPM1i2/4Zrm1LSLntAlGuwaYJS+rHX718vI+VpyJiHp/qdhlowCQQoaeJAFx9Fwv1M12QzJCw+66HENNA6YoLD5+TXx2fTA+N3MBR92UeVgurJoybCK8qJaYXhxH0GmIpc83gAQ7v01+L9/D0XMNzF+tZjOuAXNye/QOkegKBe4PQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=lists.freedesktop.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GcWRTU8r+amEwZJCZiLvtoVQdLOD8lWOu9DJuOyluHI=; b=QVsaTY/aM0VJCGwOVMUKs7qb9BVt+g9NqVpLJCuUOj6iCRomWzt2fcHkJukqcIu7de4663XO4/rp+S9Jd10xt5H/8/gDL3/KpN/pOIgVRP2PaJBemnJtyFhohMQHOedLTT9b/FY1YwKhAHNVNmiFEVeLCzLWYWceAvxUNdlIfKs= Received: from BY3PR04CA0016.namprd04.prod.outlook.com (2603:10b6:a03:217::21) by CH2PR12MB4167.namprd12.prod.outlook.com (2603:10b6:610:7a::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.47; Mon, 8 Apr 2024 15:16:47 +0000 Received: from CO1PEPF000042AC.namprd03.prod.outlook.com (2603:10b6:a03:217:cafe::8c) by BY3PR04CA0016.outlook.office365.com (2603:10b6:a03:217::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7452.36 via Frontend Transport; Mon, 8 Apr 2024 15:16:46 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000042AC.mail.protection.outlook.com (10.167.243.41) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Mon, 8 Apr 2024 15:16:46 +0000 Received: from amd-X570-AORUS-ELITE.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 8 Apr 2024 10:16:42 -0500 From: Arunpravin Paneer Selvam To: , CC: , , , , , Arunpravin Paneer Selvam Subject: [PATCH v10 1/3] drm/buddy: Implement tracking clear page feature Date: Mon, 8 Apr 2024 20:46:18 +0530 Message-ID: <20240408151620.528163-1-Arunpravin.PaneerSelvam@amd.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000042AC:EE_|CH2PR12MB4167:EE_ X-MS-Office365-Filtering-Correlation-Id: 1ddc92e4-911b-454b-bbde-08dc57dee7f7 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: I2TKjEpIrDLqf53by9svKoaqCKOvsMKgKVF+cUnXJXPYTYvafR1UUoFLRAUBjqTq72JF/mrPEgTGuBSputV2zWgDmbYzh27NTl5Mja/dTvEVB5LvvPQ2zabFRQt/TYhadVGu8no7axhcpO8Tc3EqZXE+T/Dx6tjs9eTi/DcId0CfDViYNNmeHRmoPmhd/e0E6qEZ61G+04eTNCh7HLttg8gv+nJHfiAVh05Io7yEhOwZkJVwzjNd6FTG5YBwBMAL2NX7KRQshX98izJnGMAr4w+93dTD2y/L9UFmcOKKxRigVfW6Spm4vnDqwJpnOQWIjNGdtgnLr/gm7KVGgdXlsxJswQ0soXDESwpeKWb8Tp4WPy/po98uNrKKQy5UDjgAkGGT6QKe5p4U53r8bQzVjLqE9/RUHmy9OxNlas5F8sMeWP2svOlYaPnycXvNmnOoRWd0dCtXwJSXrUdj9rzLWnVrgdlG2Db3xfCK+5bfBrbN/NB4hfQpj4Lt9vnvptvIun7ECUUvbMkaTTC3JnRDBDjFQCu2cjXBVi2Yms4DJO3ol5cF9VIpXnGr6vkcKMiMhLv+atPAqk8unFoRy4e3NssrGZPHOPVsJG2ahIASgAHWsRDUVQl/FjdyK+FO5eAMFKR3XmDbwHv2kzZ/XOYKKgy1nWCdHS1t8p3seMzln9Jz954binjmxjmhkl7SdDYmmHxWg6W/ksUvok/4nWW1vPf9geNQ7btHXU0VXvW7+EA4R9anvh4krDr3lRqa4w3B X-Forefront-Antispam-Report: CIP:165.204.84.17; CTRY:US; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:SATLEXMB04.amd.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230031)(1800799015)(376005)(82310400014)(36860700004); DIR:OUT; SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Apr 2024 15:16:46.3083 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 1ddc92e4-911b-454b-bbde-08dc57dee7f7 X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d; Ip=[165.204.84.17]; Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000042AC.namprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CH2PR12MB4167 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" - Add tracking clear page feature. - Driver should enable the DRM_BUDDY_CLEARED flag if it successfully clears the blocks in the free path. On the otherhand, DRM buddy marks each block as cleared. - Track the available cleared pages size - If driver requests cleared memory we prefer cleared memory but fallback to uncleared if we can't find the cleared blocks. when driver requests uncleared memory we try to use uncleared but fallback to cleared memory if necessary. - When a block gets freed we clear it and mark the freed block as cleared, when there are buddies which are cleared as well we can merge them. Otherwise, we prefer to keep the blocks as separated. - Add a function to support defragmentation. v1: - Depends on the flag check DRM_BUDDY_CLEARED, enable the block as cleared. Else, reset the clear flag for each block in the list(Christian) - For merging the 2 cleared blocks compare as below, drm_buddy_is_clear(block) != drm_buddy_is_clear(buddy)(Christian) - Defragment the memory beginning from min_order till the required memory space is available. v2: (Matthew) - Add a wrapper drm_buddy_free_list_internal for the freeing of blocks operation within drm buddy. - Write a macro block_incompatible() to allocate the required blocks. - Update the xe driver for the drm_buddy_free_list change in arguments. - add a warning if the two blocks are incompatible on defragmentation - call full defragmentation in the fini() function - place a condition to test if min_order is equal to 0 - replace the list with safe_reverse() variant as we might remove the block from the list. v3: - fix Gitlab user reported lockup issue. - Keep DRM_BUDDY_HEADER_CLEAR define sorted(Matthew) - modify to pass the root order instead max_order in fini() function(Matthew) - change bool 1 to true(Matthew) - add check if min_block_size is power of 2(Matthew) - modify the min_block_size datatype to u64(Matthew) v4: - rename the function drm_buddy_defrag with __force_merge. - Include __force_merge directly in drm buddy file and remove the defrag use in amdgpu driver. - Remove list_empty() check(Matthew) - Remove unnecessary space, headers and placement of new variables(Matthew) - Add a unit test case(Matthew) v5: - remove force merge support to actual range allocation and not to bail out when contains && split(Matthew) - add range support to force merge function. Signed-off-by: Arunpravin Paneer Selvam Signed-off-by: Matthew Auld Suggested-by: Christian König Suggested-by: Matthew Auld Reviewed-by: Matthew Auld --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +- drivers/gpu/drm/drm_buddy.c | 430 ++++++++++++++---- drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 6 +- drivers/gpu/drm/tests/drm_buddy_test.c | 28 +- drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 4 +- include/drm/drm_buddy.h | 16 +- 6 files changed, 368 insertions(+), 122 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index 8db880244324..c0c851409241 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -571,7 +571,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, return 0; error_free_blocks: - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mutex_unlock(&mgr->lock); error_fini: ttm_resource_fini(man, &vres->base); @@ -604,7 +604,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man, amdgpu_vram_mgr_do_reserve(man); - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mutex_unlock(&mgr->lock); atomic64_sub(vis_usage, &mgr->vis_usage); @@ -912,7 +912,7 @@ void amdgpu_vram_mgr_fini(struct amdgpu_device *adev) kfree(rsv); list_for_each_entry_safe(rsv, temp, &mgr->reserved_pages, blocks) { - drm_buddy_free_list(&mgr->mm, &rsv->allocated); + drm_buddy_free_list(&mgr->mm, &rsv->allocated, 0); kfree(rsv); } if (!adev->gmc.is_app_apu) diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c index 5ebdd6f8f36e..83dbe252f727 100644 --- a/drivers/gpu/drm/drm_buddy.c +++ b/drivers/gpu/drm/drm_buddy.c @@ -38,8 +38,8 @@ static void drm_block_free(struct drm_buddy *mm, kmem_cache_free(slab_blocks, block); } -static void list_insert_sorted(struct drm_buddy *mm, - struct drm_buddy_block *block) +static void list_insert(struct drm_buddy *mm, + struct drm_buddy_block *block) { struct drm_buddy_block *node; struct list_head *head; @@ -57,6 +57,16 @@ static void list_insert_sorted(struct drm_buddy *mm, __list_add(&block->link, node->link.prev, &node->link); } +static void clear_reset(struct drm_buddy_block *block) +{ + block->header &= ~DRM_BUDDY_HEADER_CLEAR; +} + +static void mark_cleared(struct drm_buddy_block *block) +{ + block->header |= DRM_BUDDY_HEADER_CLEAR; +} + static void mark_allocated(struct drm_buddy_block *block) { block->header &= ~DRM_BUDDY_HEADER_STATE; @@ -71,7 +81,7 @@ static void mark_free(struct drm_buddy *mm, block->header &= ~DRM_BUDDY_HEADER_STATE; block->header |= DRM_BUDDY_FREE; - list_insert_sorted(mm, block); + list_insert(mm, block); } static void mark_split(struct drm_buddy_block *block) @@ -82,6 +92,133 @@ static void mark_split(struct drm_buddy_block *block) list_del(&block->link); } +static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= e2 && e1 >= s2; +} + +static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) +{ + return s1 <= s2 && e1 >= e2; +} + +static struct drm_buddy_block * +__get_buddy(struct drm_buddy_block *block) +{ + struct drm_buddy_block *parent; + + parent = block->parent; + if (!parent) + return NULL; + + if (parent->left == block) + return parent->right; + + return parent->left; +} + +static unsigned int __drm_buddy_free(struct drm_buddy *mm, + struct drm_buddy_block *block, + bool force_merge) +{ + struct drm_buddy_block *parent; + unsigned int order; + + while ((parent = block->parent)) { + struct drm_buddy_block *buddy; + + buddy = __get_buddy(block); + + if (!drm_buddy_block_is_free(buddy)) + break; + + if (!force_merge) { + /* + * Check the block and its buddy clear state and exit + * the loop if they both have the dissimilar state. + */ + if (drm_buddy_block_is_clear(block) != + drm_buddy_block_is_clear(buddy)) + break; + + if (drm_buddy_block_is_clear(block)) + mark_cleared(parent); + } + + list_del(&buddy->link); + if (force_merge && drm_buddy_block_is_clear(buddy)) + mm->clear_avail -= drm_buddy_block_size(mm, buddy); + + drm_block_free(mm, block); + drm_block_free(mm, buddy); + + block = parent; + } + + order = drm_buddy_block_order(block); + mark_free(mm, block); + + return order; +} + +static int __force_merge(struct drm_buddy *mm, + u64 start, + u64 end, + unsigned int min_order) +{ + unsigned int order; + int i; + + if (!min_order) + return -ENOMEM; + + if (min_order > mm->max_order) + return -EINVAL; + + for (i = min_order - 1; i >= 0; i--) { + struct drm_buddy_block *block, *prev; + + list_for_each_entry_safe_reverse(block, prev, &mm->free_list[i], link) { + struct drm_buddy_block *buddy; + u64 block_start, block_end; + + if (!block->parent) + continue; + + block_start = drm_buddy_block_offset(block); + block_end = block_start + drm_buddy_block_size(mm, block) - 1; + + if (!contains(start, end, block_start, block_end)) + continue; + + buddy = __get_buddy(block); + if (!drm_buddy_block_is_free(buddy)) + continue; + + WARN_ON(drm_buddy_block_is_clear(block) == + drm_buddy_block_is_clear(buddy)); + + /* + * If the prev block is same as buddy, don't access the + * block in the next iteration as we would free the + * buddy block as part of the free function. + */ + if (prev == buddy) + prev = list_prev_entry(prev, link); + + list_del(&block->link); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail -= drm_buddy_block_size(mm, block); + + order = __drm_buddy_free(mm, block, true); + if (order >= min_order) + return 0; + } + } + + return -ENOMEM; +} + /** * drm_buddy_init - init memory manager * @@ -186,11 +323,21 @@ EXPORT_SYMBOL(drm_buddy_init); */ void drm_buddy_fini(struct drm_buddy *mm) { + u64 root_size, size; + unsigned int order; int i; + size = mm->size; + for (i = 0; i < mm->n_roots; ++i) { + order = ilog2(size) - ilog2(mm->chunk_size); + __force_merge(mm, 0, size, order); + WARN_ON(!drm_buddy_block_is_free(mm->roots[i])); drm_block_free(mm, mm->roots[i]); + + root_size = mm->chunk_size << order; + size -= root_size; } WARN_ON(mm->avail != mm->size); @@ -223,26 +370,17 @@ static int split_block(struct drm_buddy *mm, mark_free(mm, block->left); mark_free(mm, block->right); + if (drm_buddy_block_is_clear(block)) { + mark_cleared(block->left); + mark_cleared(block->right); + clear_reset(block); + } + mark_split(block); return 0; } -static struct drm_buddy_block * -__get_buddy(struct drm_buddy_block *block) -{ - struct drm_buddy_block *parent; - - parent = block->parent; - if (!parent) - return NULL; - - if (parent->left == block) - return parent->right; - - return parent->left; -} - /** * drm_get_buddy - get buddy address * @@ -260,30 +398,6 @@ drm_get_buddy(struct drm_buddy_block *block) } EXPORT_SYMBOL(drm_get_buddy); -static void __drm_buddy_free(struct drm_buddy *mm, - struct drm_buddy_block *block) -{ - struct drm_buddy_block *parent; - - while ((parent = block->parent)) { - struct drm_buddy_block *buddy; - - buddy = __get_buddy(block); - - if (!drm_buddy_block_is_free(buddy)) - break; - - list_del(&buddy->link); - - drm_block_free(mm, block); - drm_block_free(mm, buddy); - - block = parent; - } - - mark_free(mm, block); -} - /** * drm_buddy_free_block - free a block * @@ -295,42 +409,74 @@ void drm_buddy_free_block(struct drm_buddy *mm, { BUG_ON(!drm_buddy_block_is_allocated(block)); mm->avail += drm_buddy_block_size(mm, block); - __drm_buddy_free(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail += drm_buddy_block_size(mm, block); + + __drm_buddy_free(mm, block, false); } EXPORT_SYMBOL(drm_buddy_free_block); -/** - * drm_buddy_free_list - free blocks - * - * @mm: DRM buddy manager - * @objects: input list head to free blocks - */ -void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects) +static void __drm_buddy_free_list(struct drm_buddy *mm, + struct list_head *objects, + bool mark_clear, + bool mark_dirty) { struct drm_buddy_block *block, *on; + WARN_ON(mark_dirty && mark_clear); + list_for_each_entry_safe(block, on, objects, link) { + if (mark_clear) + mark_cleared(block); + else if (mark_dirty) + clear_reset(block); drm_buddy_free_block(mm, block); cond_resched(); } INIT_LIST_HEAD(objects); } -EXPORT_SYMBOL(drm_buddy_free_list); -static inline bool overlaps(u64 s1, u64 e1, u64 s2, u64 e2) +static void drm_buddy_free_list_internal(struct drm_buddy *mm, + struct list_head *objects) { - return s1 <= e2 && e1 >= s2; + /* + * Don't touch the clear/dirty bit, since allocation is still internal + * at this point. For example we might have just failed part of the + * allocation. + */ + __drm_buddy_free_list(mm, objects, false, false); } -static inline bool contains(u64 s1, u64 e1, u64 s2, u64 e2) +/** + * drm_buddy_free_list - free blocks + * + * @mm: DRM buddy manager + * @objects: input list head to free blocks + * @flags: optional flags like DRM_BUDDY_CLEARED + */ +void drm_buddy_free_list(struct drm_buddy *mm, + struct list_head *objects, + unsigned int flags) { - return s1 <= s2 && e1 >= e2; + bool mark_clear = flags & DRM_BUDDY_CLEARED; + + __drm_buddy_free_list(mm, objects, mark_clear, !mark_clear); +} +EXPORT_SYMBOL(drm_buddy_free_list); + +static bool block_incompatible(struct drm_buddy_block *block, unsigned int flags) +{ + bool needs_clear = flags & DRM_BUDDY_CLEAR_ALLOCATION; + + return needs_clear != drm_buddy_block_is_clear(block); } static struct drm_buddy_block * -alloc_range_bias(struct drm_buddy *mm, - u64 start, u64 end, - unsigned int order) +__alloc_range_bias(struct drm_buddy *mm, + u64 start, u64 end, + unsigned int order, + unsigned long flags, + bool fallback) { u64 req_size = mm->chunk_size << order; struct drm_buddy_block *block; @@ -379,6 +525,9 @@ alloc_range_bias(struct drm_buddy *mm, if (contains(start, end, block_start, block_end) && order == drm_buddy_block_order(block)) { + if (!fallback && block_incompatible(block, flags)) + continue; + /* * Find the free block within the range. */ @@ -410,30 +559,57 @@ alloc_range_bias(struct drm_buddy *mm, if (buddy && (drm_buddy_block_is_free(block) && drm_buddy_block_is_free(buddy))) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, false); return ERR_PTR(err); } static struct drm_buddy_block * -get_maxblock(struct drm_buddy *mm, unsigned int order) +__drm_buddy_alloc_range_bias(struct drm_buddy *mm, + u64 start, u64 end, + unsigned int order, + unsigned long flags) { - struct drm_buddy_block *max_block = NULL, *node; + struct drm_buddy_block *block; + bool fallback = false; + + block = __alloc_range_bias(mm, start, end, order, + flags, fallback); + if (IS_ERR(block) && mm->clear_avail) + return __alloc_range_bias(mm, start, end, order, + flags, !fallback); + + return block; +} + +static struct drm_buddy_block * +get_maxblock(struct drm_buddy *mm, unsigned int order, + unsigned long flags) +{ + struct drm_buddy_block *max_block = NULL, *block = NULL; unsigned int i; for (i = order; i <= mm->max_order; ++i) { - if (!list_empty(&mm->free_list[i])) { - node = list_last_entry(&mm->free_list[i], - struct drm_buddy_block, - link); - if (!max_block) { - max_block = node; + struct drm_buddy_block *tmp_block; + + list_for_each_entry_reverse(tmp_block, &mm->free_list[i], link) { + if (block_incompatible(tmp_block, flags)) continue; - } - if (drm_buddy_block_offset(node) > - drm_buddy_block_offset(max_block)) { - max_block = node; - } + block = tmp_block; + break; + } + + if (!block) + continue; + + if (!max_block) { + max_block = block; + continue; + } + + if (drm_buddy_block_offset(block) > + drm_buddy_block_offset(max_block)) { + max_block = block; } } @@ -450,11 +626,29 @@ alloc_from_freelist(struct drm_buddy *mm, int err; if (flags & DRM_BUDDY_TOPDOWN_ALLOCATION) { - block = get_maxblock(mm, order); + block = get_maxblock(mm, order, flags); if (block) /* Store the obtained block order */ tmp = drm_buddy_block_order(block); } else { + for (tmp = order; tmp <= mm->max_order; ++tmp) { + struct drm_buddy_block *tmp_block; + + list_for_each_entry_reverse(tmp_block, &mm->free_list[tmp], link) { + if (block_incompatible(tmp_block, flags)) + continue; + + block = tmp_block; + break; + } + + if (block) + break; + } + } + + if (!block) { + /* Fallback method */ for (tmp = order; tmp <= mm->max_order; ++tmp) { if (!list_empty(&mm->free_list[tmp])) { block = list_last_entry(&mm->free_list[tmp], @@ -464,10 +658,10 @@ alloc_from_freelist(struct drm_buddy *mm, break; } } - } - if (!block) - return ERR_PTR(-ENOSPC); + if (!block) + return ERR_PTR(-ENOSPC); + } BUG_ON(!drm_buddy_block_is_free(block)); @@ -483,7 +677,7 @@ alloc_from_freelist(struct drm_buddy *mm, err_undo: if (tmp != order) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, false); return ERR_PTR(err); } @@ -527,13 +721,22 @@ static int __alloc_range(struct drm_buddy *mm, if (contains(start, end, block_start, block_end)) { if (!drm_buddy_block_is_free(block)) { - err = -ENOSPC; - goto err_free; + if (drm_buddy_block_is_split(block)) { + list_add(&block->right->tmp_link, dfs); + list_add(&block->left->tmp_link, dfs); + + continue; + } else { + err = -ENOSPC; + goto err_free; + } } mark_allocated(block); total_allocated += drm_buddy_block_size(mm, block); mm->avail -= drm_buddy_block_size(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail -= drm_buddy_block_size(mm, block); list_add_tail(&block->link, &allocated); continue; } @@ -567,14 +770,14 @@ static int __alloc_range(struct drm_buddy *mm, if (buddy && (drm_buddy_block_is_free(block) && drm_buddy_block_is_free(buddy))) - __drm_buddy_free(mm, block); + __drm_buddy_free(mm, block, false); err_free: if (err == -ENOSPC && total_allocated_on_err) { list_splice_tail(&allocated, blocks); *total_allocated_on_err = total_allocated; } else { - drm_buddy_free_list(mm, &allocated); + drm_buddy_free_list_internal(mm, &allocated); } return err; @@ -640,11 +843,11 @@ static int __alloc_contig_try_harder(struct drm_buddy *mm, list_splice(&blocks_lhs, blocks); return 0; } else if (err != -ENOSPC) { - drm_buddy_free_list(mm, blocks); + drm_buddy_free_list_internal(mm, blocks); return err; } /* Free blocks for the next iteration */ - drm_buddy_free_list(mm, blocks); + drm_buddy_free_list_internal(mm, blocks); } return -ENOSPC; @@ -700,6 +903,8 @@ int drm_buddy_block_trim(struct drm_buddy *mm, list_del(&block->link); mark_free(mm, block); mm->avail += drm_buddy_block_size(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail += drm_buddy_block_size(mm, block); /* Prevent recursively freeing this node */ parent = block->parent; @@ -711,6 +916,8 @@ int drm_buddy_block_trim(struct drm_buddy *mm, if (err) { mark_allocated(block); mm->avail -= drm_buddy_block_size(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail -= drm_buddy_block_size(mm, block); list_add(&block->link, blocks); } @@ -719,13 +926,28 @@ int drm_buddy_block_trim(struct drm_buddy *mm, } EXPORT_SYMBOL(drm_buddy_block_trim); +static struct drm_buddy_block * +__drm_buddy_alloc_blocks(struct drm_buddy *mm, + u64 start, u64 end, + unsigned int order, + unsigned long flags) +{ + if (flags & DRM_BUDDY_RANGE_ALLOCATION) + /* Allocate traversing within the range */ + return __drm_buddy_alloc_range_bias(mm, start, end, + order, flags); + else + /* Allocate from freelist */ + return alloc_from_freelist(mm, order, flags); +} + /** * drm_buddy_alloc_blocks - allocate power-of-two blocks * * @mm: DRM buddy manager to allocate from * @start: start of the allowed range for this block * @end: end of the allowed range for this block - * @size: size of the allocation + * @size: size of the allocation in bytes * @min_block_size: alignment of the allocation * @blocks: output list head to add allocated blocks * @flags: DRM_BUDDY_*_ALLOCATION flags @@ -800,23 +1022,33 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, BUG_ON(order < min_order); do { - if (flags & DRM_BUDDY_RANGE_ALLOCATION) - /* Allocate traversing within the range */ - block = alloc_range_bias(mm, start, end, order); - else - /* Allocate from freelist */ - block = alloc_from_freelist(mm, order, flags); - + block = __drm_buddy_alloc_blocks(mm, start, + end, + order, + flags); if (!IS_ERR(block)) break; if (order-- == min_order) { + /* Try allocation through force merge method */ + if (mm->clear_avail && + !__force_merge(mm, start, end, min_order)) { + block = __drm_buddy_alloc_blocks(mm, start, + end, + min_order, + flags); + if (!IS_ERR(block)) { + order = min_order; + break; + } + } + + /* + * Try contiguous block allocation through + * try harder method. + */ if (flags & DRM_BUDDY_CONTIGUOUS_ALLOCATION && !(flags & DRM_BUDDY_RANGE_ALLOCATION)) - /* - * Try contiguous block allocation through - * try harder method - */ return __alloc_contig_try_harder(mm, original_size, original_min_size, @@ -828,6 +1060,8 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, mark_allocated(block); mm->avail -= drm_buddy_block_size(mm, block); + if (drm_buddy_block_is_clear(block)) + mm->clear_avail -= drm_buddy_block_size(mm, block); kmemleak_update_trace(block); list_add_tail(&block->link, &allocated); @@ -866,7 +1100,7 @@ int drm_buddy_alloc_blocks(struct drm_buddy *mm, return 0; err_free: - drm_buddy_free_list(mm, &allocated); + drm_buddy_free_list_internal(mm, &allocated); return err; } EXPORT_SYMBOL(drm_buddy_alloc_blocks); @@ -899,8 +1133,8 @@ void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p) { int order; - drm_printf(p, "chunk_size: %lluKiB, total: %lluMiB, free: %lluMiB\n", - mm->chunk_size >> 10, mm->size >> 20, mm->avail >> 20); + drm_printf(p, "chunk_size: %lluKiB, total: %lluMiB, free: %lluMiB, clear_free: %lluMiB\n", + mm->chunk_size >> 10, mm->size >> 20, mm->avail >> 20, mm->clear_avail >> 20); for (order = mm->max_order; order >= 0; order--) { struct drm_buddy_block *block; diff --git a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c index 0d735d5c2b35..942345548bc3 100644 --- a/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c +++ b/drivers/gpu/drm/i915/i915_ttm_buddy_manager.c @@ -126,7 +126,7 @@ static int i915_ttm_buddy_man_alloc(struct ttm_resource_manager *man, return 0; err_free_blocks: - drm_buddy_free_list(mm, &bman_res->blocks); + drm_buddy_free_list(mm, &bman_res->blocks, 0); mutex_unlock(&bman->lock); err_free_res: ttm_resource_fini(man, &bman_res->base); @@ -141,7 +141,7 @@ static void i915_ttm_buddy_man_free(struct ttm_resource_manager *man, struct i915_ttm_buddy_manager *bman = to_buddy_manager(man); mutex_lock(&bman->lock); - drm_buddy_free_list(&bman->mm, &bman_res->blocks); + drm_buddy_free_list(&bman->mm, &bman_res->blocks, 0); bman->visible_avail += bman_res->used_visible_size; mutex_unlock(&bman->lock); @@ -345,7 +345,7 @@ int i915_ttm_buddy_man_fini(struct ttm_device *bdev, unsigned int type) ttm_set_driver_manager(bdev, type, NULL); mutex_lock(&bman->lock); - drm_buddy_free_list(mm, &bman->reserved); + drm_buddy_free_list(mm, &bman->reserved, 0); drm_buddy_fini(mm); bman->visible_avail += bman->visible_reserved; WARN_ON_ONCE(bman->visible_avail != bman->visible_size); diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c index e48863a44556..4621a860cb05 100644 --- a/drivers/gpu/drm/tests/drm_buddy_test.c +++ b/drivers/gpu/drm/tests/drm_buddy_test.c @@ -103,7 +103,7 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test) DRM_BUDDY_RANGE_ALLOCATION), "buddy_alloc i failed with bias(%x-%x), size=%u, ps=%u\n", bias_start, bias_end, bias_size, bias_size); - drm_buddy_free_list(&mm, &tmp); + drm_buddy_free_list(&mm, &tmp, 0); /* single page with internal round_up */ KUNIT_ASSERT_FALSE_MSG(test, @@ -113,7 +113,7 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test) DRM_BUDDY_RANGE_ALLOCATION), "buddy_alloc failed with bias(%x-%x), size=%u, ps=%u\n", bias_start, bias_end, ps, bias_size); - drm_buddy_free_list(&mm, &tmp); + drm_buddy_free_list(&mm, &tmp, 0); /* random size within */ size = max(round_up(prandom_u32_state(&prng) % bias_rem, ps), ps); @@ -153,14 +153,14 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test) * unallocated, and ideally not always on the bias * boundaries. */ - drm_buddy_free_list(&mm, &tmp); + drm_buddy_free_list(&mm, &tmp, 0); } else { list_splice_tail(&tmp, &allocated); } } kfree(order); - drm_buddy_free_list(&mm, &allocated); + drm_buddy_free_list(&mm, &allocated, 0); drm_buddy_fini(&mm); /* @@ -220,7 +220,7 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test) "buddy_alloc passed with bias(%x-%x), size=%u\n", bias_start, bias_end, ps); - drm_buddy_free_list(&mm, &allocated); + drm_buddy_free_list(&mm, &allocated, 0); drm_buddy_fini(&mm); } @@ -269,7 +269,7 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test) DRM_BUDDY_CONTIGUOUS_ALLOCATION), "buddy_alloc didn't error size=%lu\n", 3 * ps); - drm_buddy_free_list(&mm, &middle); + drm_buddy_free_list(&mm, &middle, 0); KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, 3 * ps, ps, &allocated, DRM_BUDDY_CONTIGUOUS_ALLOCATION), @@ -279,7 +279,7 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test) DRM_BUDDY_CONTIGUOUS_ALLOCATION), "buddy_alloc didn't error size=%lu\n", 2 * ps); - drm_buddy_free_list(&mm, &right); + drm_buddy_free_list(&mm, &right, 0); KUNIT_ASSERT_TRUE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, 3 * ps, ps, &allocated, DRM_BUDDY_CONTIGUOUS_ALLOCATION), @@ -294,7 +294,7 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test) DRM_BUDDY_CONTIGUOUS_ALLOCATION), "buddy_alloc hit an error size=%lu\n", 2 * ps); - drm_buddy_free_list(&mm, &left); + drm_buddy_free_list(&mm, &left, 0); KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, 3 * ps, ps, &allocated, DRM_BUDDY_CONTIGUOUS_ALLOCATION), @@ -306,7 +306,7 @@ static void drm_test_buddy_alloc_contiguous(struct kunit *test) KUNIT_ASSERT_EQ(test, total, ps * 2 + ps * 3); - drm_buddy_free_list(&mm, &allocated); + drm_buddy_free_list(&mm, &allocated, 0); drm_buddy_fini(&mm); } @@ -375,7 +375,7 @@ static void drm_test_buddy_alloc_pathological(struct kunit *test) top, max_order); } - drm_buddy_free_list(&mm, &holes); + drm_buddy_free_list(&mm, &holes, 0); /* Nothing larger than blocks of chunk_size now available */ for (order = 1; order <= max_order; order++) { @@ -387,7 +387,7 @@ static void drm_test_buddy_alloc_pathological(struct kunit *test) } list_splice_tail(&holes, &blocks); - drm_buddy_free_list(&mm, &blocks); + drm_buddy_free_list(&mm, &blocks, 0); drm_buddy_fini(&mm); } @@ -482,7 +482,7 @@ static void drm_test_buddy_alloc_pessimistic(struct kunit *test) list_del(&block->link); drm_buddy_free_block(&mm, block); - drm_buddy_free_list(&mm, &blocks); + drm_buddy_free_list(&mm, &blocks, 0); drm_buddy_fini(&mm); } @@ -528,7 +528,7 @@ static void drm_test_buddy_alloc_optimistic(struct kunit *test) size, size, &tmp, flags), "buddy_alloc unexpectedly succeeded, it should be full!"); - drm_buddy_free_list(&mm, &blocks); + drm_buddy_free_list(&mm, &blocks, 0); drm_buddy_fini(&mm); } @@ -563,7 +563,7 @@ static void drm_test_buddy_alloc_limit(struct kunit *test) drm_buddy_block_size(&mm, block), BIT_ULL(mm.max_order) * PAGE_SIZE); - drm_buddy_free_list(&mm, &allocated); + drm_buddy_free_list(&mm, &allocated, 0); drm_buddy_fini(&mm); } diff --git a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c index 115ec745e502..1ad678b62c4a 100644 --- a/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c +++ b/drivers/gpu/drm/xe/xe_ttm_vram_mgr.c @@ -196,7 +196,7 @@ static int xe_ttm_vram_mgr_new(struct ttm_resource_manager *man, return 0; error_free_blocks: - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mutex_unlock(&mgr->lock); error_fini: ttm_resource_fini(man, &vres->base); @@ -214,7 +214,7 @@ static void xe_ttm_vram_mgr_del(struct ttm_resource_manager *man, struct drm_buddy *mm = &mgr->mm; mutex_lock(&mgr->lock); - drm_buddy_free_list(mm, &vres->blocks); + drm_buddy_free_list(mm, &vres->blocks, 0); mgr->visible_avail += vres->used_visible_size; mutex_unlock(&mgr->lock); diff --git a/include/drm/drm_buddy.h b/include/drm/drm_buddy.h index a5b39fc01003..82570f77e817 100644 --- a/include/drm/drm_buddy.h +++ b/include/drm/drm_buddy.h @@ -25,6 +25,8 @@ #define DRM_BUDDY_RANGE_ALLOCATION BIT(0) #define DRM_BUDDY_TOPDOWN_ALLOCATION BIT(1) #define DRM_BUDDY_CONTIGUOUS_ALLOCATION BIT(2) +#define DRM_BUDDY_CLEAR_ALLOCATION BIT(3) +#define DRM_BUDDY_CLEARED BIT(4) struct drm_buddy_block { #define DRM_BUDDY_HEADER_OFFSET GENMASK_ULL(63, 12) @@ -32,8 +34,9 @@ struct drm_buddy_block { #define DRM_BUDDY_ALLOCATED (1 << 10) #define DRM_BUDDY_FREE (2 << 10) #define DRM_BUDDY_SPLIT (3 << 10) +#define DRM_BUDDY_HEADER_CLEAR GENMASK_ULL(9, 9) /* Free to be used, if needed in the future */ -#define DRM_BUDDY_HEADER_UNUSED GENMASK_ULL(9, 6) +#define DRM_BUDDY_HEADER_UNUSED GENMASK_ULL(8, 6) #define DRM_BUDDY_HEADER_ORDER GENMASK_ULL(5, 0) u64 header; @@ -86,6 +89,7 @@ struct drm_buddy { u64 chunk_size; u64 size; u64 avail; + u64 clear_avail; }; static inline u64 @@ -112,6 +116,12 @@ drm_buddy_block_is_allocated(struct drm_buddy_block *block) return drm_buddy_block_state(block) == DRM_BUDDY_ALLOCATED; } +static inline bool +drm_buddy_block_is_clear(struct drm_buddy_block *block) +{ + return block->header & DRM_BUDDY_HEADER_CLEAR; +} + static inline bool drm_buddy_block_is_free(struct drm_buddy_block *block) { @@ -150,7 +160,9 @@ int drm_buddy_block_trim(struct drm_buddy *mm, void drm_buddy_free_block(struct drm_buddy *mm, struct drm_buddy_block *block); -void drm_buddy_free_list(struct drm_buddy *mm, struct list_head *objects); +void drm_buddy_free_list(struct drm_buddy *mm, + struct list_head *objects, + unsigned int flags); void drm_buddy_print(struct drm_buddy *mm, struct drm_printer *p); void drm_buddy_block_print(struct drm_buddy *mm, From patchwork Mon Apr 8 15:16:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Paneer Selvam, Arunpravin" X-Patchwork-Id: 13621271 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E6D2FC67861 for ; Mon, 8 Apr 2024 15:16:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2A9691126E7; Mon, 8 Apr 2024 15:16:53 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (1024-bit key; unprotected) header.d=amd.com header.i=@amd.com header.b="FJG/va3c"; dkim-atps=neutral Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2053.outbound.protection.outlook.com [40.107.93.53]) by gabe.freedesktop.org (Postfix) with ESMTPS id B23591126E7; Mon, 8 Apr 2024 15:16:51 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=dge+xqO1+gqmjxu54foe2+h+OoSreZxMaZa2pTFET2FZShRMzfAh58fp/eQaj9gZ/IVjC5vdOuSY9ehAIZFAW2nfmfBKX10o2uqoPx7wyJ5JAf6XtFyd6j6DgfeQ1b6TiQ85qh2H9w8kOgJGJadNF+T9xPPY4upylLDCG65dwXBe9N+dwBIpDteyjDMQySebdzq7lL1qFo5DqNXIoBooe2965aGIDtaM9M/0Bm+pJLDssRTKplwD1G26hKffnmQXi3U4I/1zt5wg7RlzdAkSlgWfU8a+yztHRN1gFhsvl0+1nxMx+omsXDmeahp7P9QqNKF5YSWI292odxQE03TnXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=+Ll4NKklJRnAh2UdMtervmPZFZbgVYSPymVb6Q+T9u0=; b=IKW7hh3hqUcR7+1L8JlbQT2qPTUdt3UyIHMa6GCFYzQ1VTH18JSvyVWPKVRMo1gwgM+ZiTWowMvtwwJr1Wk3doGmZ3x+qIuF9mOWOLxnHCoAh6bThDPNiAxjortWHdcd4ndc0c86NIdHVc8cCkgI8/aHo3WN5qF7agmrbpQCiQfKb242pIb3HgegHzHZ+on6bkx4AqXPCReE6b8/Ioz66Rc4m6jO31USdi2Z6CkrwZ/5RyJDweuDHEAPKeQhTwgjJcnkdH3cO2mLfL/+e3X6SBz7gpJA6P2IxZlbi7EGYZe8lr9ZuWWKTjtQ8kv/YUwzJxWsHxjz/hrSfkWsiztMmg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=lists.freedesktop.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=+Ll4NKklJRnAh2UdMtervmPZFZbgVYSPymVb6Q+T9u0=; b=FJG/va3cRT+XwVwTIXC62YI6k8SKwr49ddFEJ2kbxU3fV7WgpDn3utxwOoazkrG1e/HSnRENiK8e1UCcMTv2XOIhDESg2t2Td3BsSJJErbTsewAyGqcZFiRfEAHUGNbH9YLA7oQZOPRyuHoClE7MZgnK8zIbZ/O4lx0W+bLPlII= Received: from BY3PR04CA0022.namprd04.prod.outlook.com (2603:10b6:a03:217::27) by SA1PR12MB8947.namprd12.prod.outlook.com (2603:10b6:806:386::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.55; Mon, 8 Apr 2024 15:16:49 +0000 Received: from CO1PEPF000042AC.namprd03.prod.outlook.com (2603:10b6:a03:217:cafe::e6) by BY3PR04CA0022.outlook.office365.com (2603:10b6:a03:217::27) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7452.36 via Frontend Transport; Mon, 8 Apr 2024 15:16:49 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000042AC.mail.protection.outlook.com (10.167.243.41) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Mon, 8 Apr 2024 15:16:48 +0000 Received: from amd-X570-AORUS-ELITE.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 8 Apr 2024 10:16:45 -0500 From: Arunpravin Paneer Selvam To: , CC: , , , , , Arunpravin Paneer Selvam Subject: [PATCH v10 2/3] drm/amdgpu: Enable clear page functionality Date: Mon, 8 Apr 2024 20:46:19 +0530 Message-ID: <20240408151620.528163-2-Arunpravin.PaneerSelvam@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240408151620.528163-1-Arunpravin.PaneerSelvam@amd.com> References: <20240408151620.528163-1-Arunpravin.PaneerSelvam@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000042AC:EE_|SA1PR12MB8947:EE_ X-MS-Office365-Filtering-Correlation-Id: f5d874c2-e927-41e6-be4f-08dc57dee97e X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: +y4db8t+C3TMlzmhQnMEIMvxcv7H5QBIQrry3OwMzFLiaxluATq5fYn48Jws7FnmQZdL0oWZDajKOAhvxAS4NHFGfVUluLSSGeybbSPHSvg+oUglRyPzjvPYsebKSsiaMNppER0inNuDas7pej7awi4fpzIqQEqY6HCBUvmnSeuL+WaMt+eQQs+mFacGj3lcyqpbVTGCTwjZjtQXQJ891Tx3q6ow3DaY4Fbh9J9spcqMGUAJxXP8jiHPzBdRE9qDt1ZP0vtkF48wvfz0i2iwCH9D//7CvxeAesXyVzInNHdhZtEvaRvqRVhaAFRbtwBC2vYME/B5eGoklPhpEKyQSqAV44shoJt6oXHngxpl9b2JqwMx282m0GQj3pGDfBHfltpL/zP+hTm/7aAFv80k33UYNJejQZG+SL8lGF+3OlDZBhfNCs5WWwaQpdpO099T7XCJx7G4z74O5j8jlk0bw+FRuHtStjXa64+dLMC/MmXzJhPsUekypd4mRvrlb3S4oYYOazIIsNHGCnR+M6xTj2lfkOZCfVGgGuPgddTWn5esb4XrykzYhMuoNuHBPy0K5NXJ2m7BIaTPXCTn2MIK+D+dU5Knok65/e3FH2/1Sjj4uGCDDxdxyxdsWokY8oiYWFK1PYer8wvXK+R3/ghBGpniSC62pjxbJkJb+Nlm1s10IS7DMhkNaDAYxvsTR97bl81Ka7Ud4kNEAWqyd7Jbcf8Fk+P9FweonM5EnNlgyEDXEUEtvPHFpCOaVWimerpz X-Forefront-Antispam-Report: CIP:165.204.84.17; CTRY:US; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:SATLEXMB04.amd.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230031)(1800799015)(36860700004)(82310400014)(376005); DIR:OUT; SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Apr 2024 15:16:48.8709 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: f5d874c2-e927-41e6-be4f-08dc57dee97e X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d; Ip=[165.204.84.17]; Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000042AC.namprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA1PR12MB8947 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add clear page support in vram memory region. v1(Christian): - Dont handle clear page as TTM flag since when moving the BO back in from GTT again we don't need that. - Make a specialized version of amdgpu_fill_buffer() which only clears the VRAM areas which are not already cleared - Drop the TTM_PL_FLAG_WIPE_ON_RELEASE check in amdgpu_object.c v2: - Modify the function name amdgpu_ttm_* (Alex) - Drop the delayed parameter (Christian) - handle amdgpu_res_cleared(&cursor) just above the size calculation (Christian) - Use AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE for clearing the buffers in the free path to properly wait for fences etc.. (Christian) v3(Christian): - Remove buffer clear code in VRAM manager instead change the AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE handling to set the DRM_BUDDY_CLEARED flag. - Remove ! from amdgpu_res_cleared(&cursor) check. v4(Christian): - vres flag setting move to vram manager file - use dma_fence_get_stub in amdgpu_ttm_clear_buffer function - make fence a mandatory parameter and drop the if and the get/put dance Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Christian König Acked-by: Felix Kuehling --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 9 +-- .../gpu/drm/amd/amdgpu/amdgpu_res_cursor.h | 25 +++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 70 ++++++++++++++++++- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 5 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 6 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h | 10 +++ 6 files changed, 116 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c index 8bc79924d171..e011a326fe86 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c @@ -39,6 +39,7 @@ #include "amdgpu.h" #include "amdgpu_trace.h" #include "amdgpu_amdkfd.h" +#include "amdgpu_vram_mgr.h" /** * DOC: amdgpu_object @@ -601,8 +602,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev, if (!amdgpu_bo_support_uswc(bo->flags)) bo->flags &= ~AMDGPU_GEM_CREATE_CPU_GTT_USWC; - if (adev->ras_enabled) - bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; + bo->flags |= AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE; bo->tbo.bdev = &adev->mman.bdev; if (bp->domain & (AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA | @@ -634,7 +634,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev, bo->tbo.resource->mem_type == TTM_PL_VRAM) { struct dma_fence *fence; - r = amdgpu_fill_buffer(bo, 0, bo->tbo.base.resv, &fence, true); + r = amdgpu_ttm_clear_buffer(bo, bo->tbo.base.resv, &fence); if (unlikely(r)) goto fail_unreserve; @@ -1365,8 +1365,9 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo) if (WARN_ON_ONCE(!dma_resv_trylock(bo->base.resv))) return; - r = amdgpu_fill_buffer(abo, AMDGPU_POISON, bo->base.resv, &fence, true); + r = amdgpu_fill_buffer(abo, 0, bo->base.resv, &fence, true); if (!WARN_ON(r)) { + amdgpu_vram_mgr_set_cleared(bo->resource); amdgpu_bo_fence(abo, fence, false); dma_fence_put(fence); } diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h index 381101d2bf05..50fcd86e1033 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_res_cursor.h @@ -164,4 +164,29 @@ static inline void amdgpu_res_next(struct amdgpu_res_cursor *cur, uint64_t size) } } +/** + * amdgpu_res_cleared - check if blocks are cleared + * + * @cur: the cursor to extract the block + * + * Check if the @cur block is cleared + */ +static inline bool amdgpu_res_cleared(struct amdgpu_res_cursor *cur) +{ + struct drm_buddy_block *block; + + switch (cur->mem_type) { + case TTM_PL_VRAM: + block = cur->node; + + if (!amdgpu_vram_mgr_is_cleared(block)) + return false; + break; + default: + return false; + } + + return true; +} + #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index fc418e670fda..f0b42b567357 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c @@ -378,11 +378,12 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo, (abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) { struct dma_fence *wipe_fence = NULL; - r = amdgpu_fill_buffer(abo, AMDGPU_POISON, NULL, &wipe_fence, - false); + r = amdgpu_fill_buffer(abo, 0, NULL, &wipe_fence, + false); if (r) { goto error; } else if (wipe_fence) { + amdgpu_vram_mgr_set_cleared(bo->resource); dma_fence_put(fence); fence = wipe_fence; } @@ -2215,6 +2216,71 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_ring *ring, uint32_t src_data, return 0; } +/** + * amdgpu_ttm_clear_buffer - clear memory buffers + * @bo: amdgpu buffer object + * @resv: reservation object + * @fence: dma_fence associated with the operation + * + * Clear the memory buffer resource. + * + * Returns: + * 0 for success or a negative error code on failure. + */ +int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, + struct dma_resv *resv, + struct dma_fence **fence) +{ + struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); + struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring; + struct amdgpu_res_cursor cursor; + u64 addr; + int r; + + if (!adev->mman.buffer_funcs_enabled) + return -EINVAL; + + if (!fence) + return -EINVAL; + + *fence = dma_fence_get_stub(); + + amdgpu_res_first(bo->tbo.resource, 0, amdgpu_bo_size(bo), &cursor); + + mutex_lock(&adev->mman.gtt_window_lock); + while (cursor.remaining) { + struct dma_fence *next = NULL; + u64 size; + + if (amdgpu_res_cleared(&cursor)) { + amdgpu_res_next(&cursor, cursor.size); + continue; + } + + /* Never clear more than 256MiB at once to avoid timeouts */ + size = min(cursor.size, 256ULL << 20); + + r = amdgpu_ttm_map_buffer(&bo->tbo, bo->tbo.resource, &cursor, + 1, ring, false, &size, &addr); + if (r) + goto err; + + r = amdgpu_ttm_fill_mem(ring, 0, addr, size, resv, + &next, true, true); + if (r) + goto err; + + dma_fence_put(*fence); + *fence = next; + + amdgpu_res_next(&cursor, size); + } +err: + mutex_unlock(&adev->mman.gtt_window_lock); + + return r; +} + int amdgpu_fill_buffer(struct amdgpu_bo *bo, uint32_t src_data, struct dma_resv *resv, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h index 65ec82141a8e..b404d89d52e5 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h @@ -38,8 +38,6 @@ #define AMDGPU_GTT_MAX_TRANSFER_SIZE 512 #define AMDGPU_GTT_NUM_TRANSFER_WINDOWS 2 -#define AMDGPU_POISON 0xd0bed0be - extern const struct attribute_group amdgpu_vram_mgr_attr_group; extern const struct attribute_group amdgpu_gtt_mgr_attr_group; @@ -155,6 +153,9 @@ int amdgpu_ttm_copy_mem_to_mem(struct amdgpu_device *adev, uint64_t size, bool tmz, struct dma_resv *resv, struct dma_fence **f); +int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo, + struct dma_resv *resv, + struct dma_fence **fence); int amdgpu_fill_buffer(struct amdgpu_bo *bo, uint32_t src_data, struct dma_resv *resv, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c index c0c851409241..e494f5bf136a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c @@ -450,6 +450,7 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, { struct amdgpu_vram_mgr *mgr = to_vram_mgr(man); struct amdgpu_device *adev = to_amdgpu_device(mgr); + struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); u64 vis_usage = 0, max_bytes, min_block_size; struct amdgpu_vram_mgr_resource *vres; u64 size, remaining_size, lpfn, fpfn; @@ -501,6 +502,9 @@ static int amdgpu_vram_mgr_new(struct ttm_resource_manager *man, if (place->flags & TTM_PL_FLAG_CONTIGUOUS) vres->flags |= DRM_BUDDY_CONTIGUOUS_ALLOCATION; + if (bo->flags & AMDGPU_GEM_CREATE_VRAM_CLEARED) + vres->flags |= DRM_BUDDY_CLEAR_ALLOCATION; + if (fpfn || lpfn != mgr->mm.size) /* Allocate blocks in desired range */ vres->flags |= DRM_BUDDY_RANGE_ALLOCATION; @@ -604,7 +608,7 @@ static void amdgpu_vram_mgr_del(struct ttm_resource_manager *man, amdgpu_vram_mgr_do_reserve(man); - drm_buddy_free_list(mm, &vres->blocks, 0); + drm_buddy_free_list(mm, &vres->blocks, vres->flags); mutex_unlock(&mgr->lock); atomic64_sub(vis_usage, &mgr->vis_usage); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h index 0e04e42cf809..b256cbc2bc27 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h @@ -53,10 +53,20 @@ static inline u64 amdgpu_vram_mgr_block_size(struct drm_buddy_block *block) return (u64)PAGE_SIZE << drm_buddy_block_order(block); } +static inline bool amdgpu_vram_mgr_is_cleared(struct drm_buddy_block *block) +{ + return drm_buddy_block_is_clear(block); +} + static inline struct amdgpu_vram_mgr_resource * to_amdgpu_vram_mgr_resource(struct ttm_resource *res) { return container_of(res, struct amdgpu_vram_mgr_resource, base); } +static inline void amdgpu_vram_mgr_set_cleared(struct ttm_resource *res) +{ + to_amdgpu_vram_mgr_resource(res)->flags |= DRM_BUDDY_CLEARED; +} + #endif From patchwork Mon Apr 8 15:16:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Paneer Selvam, Arunpravin" X-Patchwork-Id: 13621272 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E65EBCD1299 for ; Mon, 8 Apr 2024 15:17:04 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D99AD1126F0; Mon, 8 Apr 2024 15:17:03 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=pass (1024-bit key; unprotected) header.d=amd.com header.i=@amd.com header.b="XJyvRlCU"; dkim-atps=neutral Received: from NAM12-MW2-obe.outbound.protection.outlook.com (mail-mw2nam12on2080.outbound.protection.outlook.com [40.107.244.80]) by gabe.freedesktop.org (Postfix) with ESMTPS id 73D521126EC; Mon, 8 Apr 2024 15:16:55 +0000 (UTC) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Pkedo7GpwGb99EGCLb0EjRLO/xFPVVTEgxONEZSQd0qD+DTCQDg2c7NRVr60A/YYEOxGSxbN8j6we89MeKx+De5MnnenEl7V+J+P5vD6H9NA5an6APVI4sIOIT79oQ681JdhYFdNO+AsmCQCgxkytnoRHQ1ZH/RwjszboDoWkl1TMZViu7Y1ruIlxi02caBLWKvVs/7nK6Wi0CvGzTiWdwCKq6BDB1CJarSFAZQhcDF+tFC4Y43i9VDl+qZ3NQF1jNPxdjf0FaK7ThBnswQaSk9gYe2eaz42f6BR5JzUd2Ce6Rg8SG6HKsrF3S+BRLx8nuuw+8M/jpb00mX5XpaPug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=bO5kYInkZhD05sBCD9HSo/RufgIFZG2/4fKzHrXj/Hs=; b=b5qmVPQi/m8aHaS5hTulsqwF52C70yXZH56qgEgXxPOzwAHKqh0YZ8P25pftEAaRVFCTs9MPodD9gJJHQ8UfaHW5M7fngQWpMu1O3RtOcrL3eH+KLbRbJD1OgWsc4p090wPnvTjlMxwumbweRCGi321j9dLKDXykhWtMHp3B9BHmVKe1FrGHhGSC2qrPPtiginQwrvXsX2ydxcLr4PXt07/nOpq9FxwdnKjjNs7r3GIQlIjOoyEincV3Cg3zwXNBVqLKMnkEjylE0M8/BQUeEMJ/KDofsUvosepPDlBZxVPZ+lPFoY+UQ/03qlNdf0UMoiEwRjcFbA8WP2hvMhGApA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=lists.freedesktop.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=bO5kYInkZhD05sBCD9HSo/RufgIFZG2/4fKzHrXj/Hs=; b=XJyvRlCUzhPhdHQBD3x6By5WXXT/+x4QrCgWkEgSgq0LTcT/ieVjsT1LcMTgonlLJRgVtkgVQEQOeW9SY4f0HlMPh99ZeklV0fVkUKKPNQUwU/3dqQWW7E8Peg1LrZizGZfHQN1NFPlpmFe8QwM899RSDSFiSyCujgldG+Thn6A= Received: from BY5PR04CA0003.namprd04.prod.outlook.com (2603:10b6:a03:1d0::13) by CY8PR12MB8195.namprd12.prod.outlook.com (2603:10b6:930:77::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7409.48; Mon, 8 Apr 2024 15:16:52 +0000 Received: from CO1PEPF000042A7.namprd03.prod.outlook.com (2603:10b6:a03:1d0:cafe::7a) by BY5PR04CA0003.outlook.office365.com (2603:10b6:a03:1d0::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.7452.36 via Frontend Transport; Mon, 8 Apr 2024 15:16:52 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by CO1PEPF000042A7.mail.protection.outlook.com (10.167.243.36) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.7452.22 via Frontend Transport; Mon, 8 Apr 2024 15:16:51 +0000 Received: from amd-X570-AORUS-ELITE.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.35; Mon, 8 Apr 2024 10:16:48 -0500 From: Arunpravin Paneer Selvam To: , CC: , , , , , Arunpravin Paneer Selvam Subject: [PATCH v10 3/3] drm/tests: Add a test case for drm buddy clear allocation Date: Mon, 8 Apr 2024 20:46:20 +0530 Message-ID: <20240408151620.528163-3-Arunpravin.PaneerSelvam@amd.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20240408151620.528163-1-Arunpravin.PaneerSelvam@amd.com> References: <20240408151620.528163-1-Arunpravin.PaneerSelvam@amd.com> MIME-Version: 1.0 X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: CO1PEPF000042A7:EE_|CY8PR12MB8195:EE_ X-MS-Office365-Filtering-Correlation-Id: a928c41c-a746-49d8-a1c0-08dc57deeafd X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: oSv22mIctTXxJS1bbzOvketbEeRgaZXvUBm1RNODbPKQxyv44uOwQYtiJqNNYkG3nQ7qdKTtsE7by95SxEw5i1ury9hadcODdcGrQYejZIZ9bntASUxdRdrdriRnb24EnkoREyzFxx3WyQJir38o4KMJNiphiGH+MO8PLvetWLpPG61cU3BlBi6KeUWafqP8+hQnovqqf7D+PmNWR8xIfUWgK+IAHaWAygX6Ktzus4nKY63sqt5D7rE9D8QtDMcxOxOGSiqIGYwhIkZp7Ke5bwFDEXYXfOTjJpoF3S5e7+YNrS0RSwbmqXZ0047kT+fCkzud4p1mYq29cow1D7K7NmlGST1tlaoxBQxBPLcW3aVjATP3P0XOUGTEYDwTS9aNt70VNuwl30M9HQNXEVzX7hFfeeUj0ux/jLDD5YBX5vLbZPkn8CnxJdpM+udub67GGKYs6H/T//4IIRpyWXE/RJ9B+QVrU71wAuDSumOIdfVuPphAwLewvdaY+GFmyuEWlot5MBF0d2UhPGMWRDuyaRpG+iA/D9EH8CEmYzBAooLRatmcquahJSRghazqvlmKTAExdWNs1Z9kiK+6XXM2RZARIlexf+/aMwyi2S95jMB0mwhYjF4i2oR3i5xd2W4XZK1rjE+ik4ilqOI4SahvGTvMkAQQsGjifafpvWzqoR5WO++6Ht+smNQqEt6psdAfNqSx0TOOw147vjT8JDKkaA6+HPk40fERgTUn9yQxlTWCBtP5dG1ZEtdU17sKZvTb X-Forefront-Antispam-Report: CIP:165.204.84.17; CTRY:US; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:SATLEXMB04.amd.com; PTR:InfoDomainNonexistent; CAT:NONE; SFS:(13230031)(376005)(36860700004)(82310400014)(1800799015); DIR:OUT; SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 08 Apr 2024 15:16:51.4739 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: a928c41c-a746-49d8-a1c0-08dc57deeafd X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d; Ip=[165.204.84.17]; Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: CO1PEPF000042A7.namprd03.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY8PR12MB8195 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Add a new test case for the drm buddy clear and dirty allocation. v2:(Matthew) - make size as u32 - rename PAGE_SIZE with SZ_4K - dont fragment the address space for all the order allocation iterations. we can do it once and just increment and allocate the size. - create new mm with non power-of-two size to ensure the multi-root force_merge during fini. Signed-off-by: Arunpravin Paneer Selvam Suggested-by: Matthew Auld Reviewed-by: Matthew Auld --- drivers/gpu/drm/tests/drm_buddy_test.c | 141 +++++++++++++++++++++++++ 1 file changed, 141 insertions(+) diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c index 4621a860cb05..b07f132f2835 100644 --- a/drivers/gpu/drm/tests/drm_buddy_test.c +++ b/drivers/gpu/drm/tests/drm_buddy_test.c @@ -224,6 +224,146 @@ static void drm_test_buddy_alloc_range_bias(struct kunit *test) drm_buddy_fini(&mm); } +static void drm_test_buddy_alloc_clear(struct kunit *test) +{ + unsigned long n_pages, total, i = 0; + const unsigned long ps = SZ_4K; + struct drm_buddy_block *block; + const int max_order = 12; + LIST_HEAD(allocated); + struct drm_buddy mm; + unsigned int order; + u32 mm_size, size; + LIST_HEAD(dirty); + LIST_HEAD(clean); + + mm_size = SZ_4K << max_order; + KUNIT_EXPECT_FALSE(test, drm_buddy_init(&mm, mm_size, ps)); + + KUNIT_EXPECT_EQ(test, mm.max_order, max_order); + + /* + * Idea is to allocate and free some random portion of the address space, + * returning those pages as non-dirty and randomly alternate between + * requesting dirty and non-dirty pages (not going over the limit + * we freed as non-dirty), putting that into two separate lists. + * Loop over both lists at the end checking that the dirty list + * is indeed all dirty pages and vice versa. Free it all again, + * keeping the dirty/clear status. + */ + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + 5 * ps, ps, &allocated, + DRM_BUDDY_TOPDOWN_ALLOCATION), + "buddy_alloc hit an error size=%lu\n", 5 * ps); + drm_buddy_free_list(&mm, &allocated, DRM_BUDDY_CLEARED); + + n_pages = 10; + do { + unsigned long flags; + struct list_head *list; + int slot = i % 2; + + if (slot == 0) { + list = &dirty; + flags = 0; + } else { + list = &clean; + flags = DRM_BUDDY_CLEAR_ALLOCATION; + } + + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + ps, ps, list, + flags), + "buddy_alloc hit an error size=%lu\n", ps); + } while (++i < n_pages); + + list_for_each_entry(block, &clean, link) + KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), true); + + list_for_each_entry(block, &dirty, link) + KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), false); + + drm_buddy_free_list(&mm, &clean, DRM_BUDDY_CLEARED); + + /* + * Trying to go over the clear limit for some allocation. + * The allocation should never fail with reasonable page-size. + */ + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + 10 * ps, ps, &clean, + DRM_BUDDY_CLEAR_ALLOCATION), + "buddy_alloc hit an error size=%lu\n", 10 * ps); + + drm_buddy_free_list(&mm, &clean, DRM_BUDDY_CLEARED); + drm_buddy_free_list(&mm, &dirty, 0); + drm_buddy_fini(&mm); + + KUNIT_EXPECT_FALSE(test, drm_buddy_init(&mm, mm_size, ps)); + + /* + * Create a new mm. Intentionally fragment the address space by creating + * two alternating lists. Free both lists, one as dirty the other as clean. + * Try to allocate double the previous size with matching min_page_size. The + * allocation should never fail as it calls the force_merge. Also check that + * the page is always dirty after force_merge. Free the page as dirty, then + * repeat the whole thing, increment the order until we hit the max_order. + */ + + i = 0; + n_pages = mm_size / ps; + do { + struct list_head *list; + int slot = i % 2; + + if (slot == 0) + list = &dirty; + else + list = &clean; + + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + ps, ps, list, 0), + "buddy_alloc hit an error size=%lu\n", ps); + } while (++i < n_pages); + + drm_buddy_free_list(&mm, &clean, DRM_BUDDY_CLEARED); + drm_buddy_free_list(&mm, &dirty, 0); + + order = 1; + do { + size = SZ_4K << order; + + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + size, size, &allocated, + DRM_BUDDY_CLEAR_ALLOCATION), + "buddy_alloc hit an error size=%u\n", size); + total = 0; + list_for_each_entry(block, &allocated, link) { + if (size != mm_size) + KUNIT_EXPECT_EQ(test, drm_buddy_block_is_clear(block), false); + total += drm_buddy_block_size(&mm, block); + } + KUNIT_EXPECT_EQ(test, total, size); + + drm_buddy_free_list(&mm, &allocated, 0); + } while (++order <= max_order); + + drm_buddy_fini(&mm); + + /* + * Create a new mm with a non power-of-two size. Allocate a random size, free as + * cleared and then call fini. This will ensure the multi-root force merge during + * fini. + */ + mm_size = 12 * SZ_4K; + KUNIT_EXPECT_FALSE(test, drm_buddy_init(&mm, mm_size, ps)); + KUNIT_ASSERT_FALSE_MSG(test, drm_buddy_alloc_blocks(&mm, 0, mm_size, + 4 * ps, ps, &allocated, + DRM_BUDDY_TOPDOWN_ALLOCATION), + "buddy_alloc hit an error size=%lu\n", 4 * ps); + drm_buddy_free_list(&mm, &allocated, DRM_BUDDY_CLEARED); + drm_buddy_fini(&mm); +} + static void drm_test_buddy_alloc_contiguous(struct kunit *test) { const unsigned long ps = SZ_4K, mm_size = 16 * 3 * SZ_4K; @@ -584,6 +724,7 @@ static struct kunit_case drm_buddy_tests[] = { KUNIT_CASE(drm_test_buddy_alloc_pessimistic), KUNIT_CASE(drm_test_buddy_alloc_pathological), KUNIT_CASE(drm_test_buddy_alloc_contiguous), + KUNIT_CASE(drm_test_buddy_alloc_clear), KUNIT_CASE(drm_test_buddy_alloc_range_bias), {} };