From patchwork Thu May 16 12:18:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13666044 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 454C9C25B77 for ; Thu, 16 May 2024 12:18:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A8EE210EC98; Thu, 16 May 2024 12:18:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="UcVHRhTX"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6CA1B10EC8E; Thu, 16 May 2024 12:18:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=gF5mI2d807cA6j5Qq7UIRyqf0ELyiAJcrHBqRj0nLr8=; b=UcVHRhTX3jaN2pa3UZWG9tbtjZ wzCKELOlEcUQbBsAgnnR1kKSKXWVz50J+H7dhhvBEcK1fgQ0cltQnyxTgy3BIhOxF5m3C+L34X3Gv w40cxv/7yDi387eM+lxjYdi8L9UnWFBV5IZDguGPmaQPX072TH5ZzhSBNa8mdFHy+86xq1/IpLtwr zsy/cLxnvJPSc62El4PRbcKbef0MLg3h66MWTnOESpwtl+IDsd14uinXF1tklRPFZIXTJ8CB31wNb keNLjL+Fa4YH4GwRZ8lAsTUmmMAkC0R5XroX+K/106h7gznZwIH3ZmK4wt09Bzm4mpGwAH/xzTPaq UWZAc/oA==; Received: from [84.69.19.168] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1s7a41-008odi-E2; Thu, 16 May 2024 14:18:25 +0200 From: Tvrtko Ursulin To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Friedrich Vock Subject: [RFC 1/2] drm/amdgpu: Re-validate evicted buffers Date: Thu, 16 May 2024 13:18:21 +0100 Message-ID: <20240516121822.19036-2-tursulin@igalia.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240516121822.19036-1-tursulin@igalia.com> References: <20240516121822.19036-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Currently the driver appears to be thinking that it will be attempting to re-validate the evicted buffers on the next submission if they are not in their preferred placement. That however appears not to be true for the very common case of buffers with allowed placements of VRAM+GTT. Simply because the check can only detect if the current placement is *none* of the preferred ones, happily leaving VRAM+GTT buffers in the GTT placement "forever". Fix it by extending the VRAM+GTT special case to the re-validation logic. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 21 ++++++++++++++++++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c index 6bddd43604bc..e53ff914b62e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c @@ -1248,10 +1248,25 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev, struct amdgpu_bo_va *bo_va, * next command submission. */ if (amdgpu_vm_is_bo_always_valid(vm, bo)) { - uint32_t mem_type = bo->tbo.resource->mem_type; + unsigned current_domain = + amdgpu_mem_type_to_domain(bo->tbo.resource->mem_type); + bool move_to_evict = false; - if (!(bo->preferred_domains & - amdgpu_mem_type_to_domain(mem_type))) + if (!(bo->preferred_domains & current_domain)) { + move_to_evict = true; + } else if ((bo->preferred_domains & AMDGPU_GEM_DOMAIN_MASK) == + (AMDGPU_GEM_DOMAIN_VRAM | AMDGPU_GEM_DOMAIN_GTT) && + current_domain != AMDGPU_GEM_DOMAIN_VRAM) { + /* + * If userspace has provided a list of possible + * placements equal to VRAM+GTT, we assume VRAM is *the* + * preferred placement and so try to move it back there + * on the next submission. + */ + move_to_evict = true; + } + + if (move_to_evict) amdgpu_vm_bo_evicted(&bo_va->base); else amdgpu_vm_bo_idle(&bo_va->base); From patchwork Thu May 16 12:18:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tvrtko Ursulin X-Patchwork-Id: 13666045 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3E173C25B77 for ; Thu, 16 May 2024 12:18:37 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id CD40B10ECA0; Thu, 16 May 2024 12:18:31 +0000 (UTC) Authentication-Results: gabe.freedesktop.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=igalia.com header.i=@igalia.com header.b="jo1pgb4H"; dkim-atps=neutral Received: from fanzine2.igalia.com (fanzine.igalia.com [178.60.130.6]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4DA9610EC98; Thu, 16 May 2024 12:18:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=3k2ElnK03S3i3Q6UkNBsqu5VyXmT1AMp8qwZV0UTPpk=; b=jo1pgb4H33HCOh8I02xaMIj+lf PurxYmHzNknLQTY3845N1qYJqqT0gFVORr+WuRUfPgVjinPJnnMpBFdP4Of5S2DiY6ImvQ4Yk7vJU YcXbPzMr+CJfJ7eIVwCGvShLdoL1w+sf2Ft0EsISbUivKoHgUGqLIlPYb59VjPDozww0TOAhgGfzS ZKVyfKEWMjeEePc8K8nc91ORS0I1HDVt4H9truETJnvAnY57upeQWGa9MOaeSkHXMlhgd+B82Yw5G RxJzDuSEe2xu01FtqzLr0RjarmoecfDX0pUW8jg/Px6MpA4dEaDpumRXHeqiTgkyFWKD4vUwCYJiO kjbXXs/w==; Received: from [84.69.19.168] (helo=localhost) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_SECP256R1__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1s7a42-008odo-3o; Thu, 16 May 2024 14:18:26 +0200 From: Tvrtko Ursulin To: amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Cc: kernel-dev@igalia.com, Tvrtko Ursulin , =?utf-8?q?Christian_K=C3=B6nig?= , Friedrich Vock Subject: [RFC 2/2] drm/amdgpu: Actually respect buffer migration budget Date: Thu, 16 May 2024 13:18:22 +0100 Message-ID: <20240516121822.19036-3-tursulin@igalia.com> X-Mailer: git-send-email 2.44.0 In-Reply-To: <20240516121822.19036-1-tursulin@igalia.com> References: <20240516121822.19036-1-tursulin@igalia.com> MIME-Version: 1.0 X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" From: Tvrtko Ursulin Current code appears to live in a misconception that playing with buffer allowed and preferred placements can always control the decision on whether backing store migration will be attempted or not. That is however not the case when userspace sets buffer placements of VRAM+GTT, which is what radv does since commit 862b6a9a ("radv: Improve spilling on discrete GPUs."), with the end result of completely ignoring the migration budget. Fix this by validating against a local singleton placement set to the current backing store location. This way, when migration budget has been depleted, we can prevent ttm_bo_validate from seeing any other than the current placement. For the case of implicit GTT allowed domain added in amdgpu_bo_create when userspace only sets VRAM the behaviour should be the same. On the first pass the re-validation will attempt to migrate away from the fallback GTT domain, and if that did not succeed the buffer will remain in the fallback placement. Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 112 +++++++++++++++++++------ 1 file changed, 85 insertions(+), 27 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index ec888fc6ead8..08e7631f3a2e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -32,6 +32,7 @@ #include #include +#include #include #include "amdgpu_cs.h" @@ -775,6 +776,56 @@ void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes, spin_unlock(&adev->mm_stats.lock); } +static bool +amdgpu_cs_bo_move_under_budget(struct amdgpu_cs_parser *p, + struct amdgpu_bo *abo) +{ + struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev); + + /* + * Don't move this buffer if we have depleted our allowance + * to move it. Don't move anything if the threshold is zero. + */ + if (p->bytes_moved >= p->bytes_moved_threshold) + return false; + + if ((!abo->tbo.base.dma_buf || + list_empty(&abo->tbo.base.dma_buf->attachments)) && + (!amdgpu_gmc_vram_full_visible(&adev->gmc) && + (abo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) && + p->bytes_moved_vis >= p->bytes_moved_vis_threshold) { + /* + * And don't move a CPU_ACCESS_REQUIRED BO to limited + * visible VRAM if we've depleted our allowance to do + * that. + */ + return false; + } + + return true; +} + +static bool +amdgpu_bo_fill_current_placement(struct amdgpu_bo *abo, + struct ttm_placement *placement, + struct ttm_place *place) +{ + struct ttm_placement *bo_placement = &abo->placement; + int i; + + for (i = 0; i < bo_placement->num_placement; i++) { + if (bo_placement->placement[i].mem_type == + abo->tbo.resource->mem_type) { + *place = bo_placement->placement[i]; + placement->num_placement = 1; + placement->placement = place; + return true; + } + } + + return false; +} + static int amdgpu_cs_bo_validate(void *param, struct amdgpu_bo *bo) { struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev); @@ -784,46 +835,53 @@ static int amdgpu_cs_bo_validate(void *param, struct amdgpu_bo *bo) .no_wait_gpu = false, .resv = bo->tbo.base.resv }; - uint32_t domain; + bool allow_move; int r; if (bo->tbo.pin_count) return 0; - /* Don't move this buffer if we have depleted our allowance - * to move it. Don't move anything if the threshold is zero. - */ - if (p->bytes_moved < p->bytes_moved_threshold && - (!bo->tbo.base.dma_buf || - list_empty(&bo->tbo.base.dma_buf->attachments))) { - if (!amdgpu_gmc_vram_full_visible(&adev->gmc) && - (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) { - /* And don't move a CPU_ACCESS_REQUIRED BO to limited - * visible VRAM if we've depleted our allowance to do - * that. - */ - if (p->bytes_moved_vis < p->bytes_moved_vis_threshold) - domain = bo->preferred_domains; - else - domain = bo->allowed_domains; - } else { - domain = bo->preferred_domains; - } - } else { - domain = bo->allowed_domains; - } + if (!bo->tbo.resource || amdgpu_cs_bo_move_under_budget(p, bo)) + allow_move = true; + else + allow_move = false; + amdgpu_bo_placement_from_domain(bo, bo->allowed_domains); retry: - amdgpu_bo_placement_from_domain(bo, domain); - r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); + if (allow_move) { + r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); + } else { + struct ttm_placement same_placement = { }; + struct ttm_place same_place; + bool placement_found; + + placement_found = + amdgpu_bo_fill_current_placement(bo, + &same_placement, + &same_place); + + if (drm_WARN_ON_ONCE(&adev->ddev, !placement_found)) { + /* + * QQQ + * Current placement not in the bo->allowed_domains set. + * Can this happen? + * QQQ + */ + allow_move = true; + goto retry; + } + + r = ttm_bo_validate(&bo->tbo, &same_placement, &ctx); + } p->bytes_moved += ctx.bytes_moved; if (!amdgpu_gmc_vram_full_visible(&adev->gmc) && amdgpu_res_cpu_visible(adev, bo->tbo.resource)) p->bytes_moved_vis += ctx.bytes_moved; - if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) { - domain = bo->allowed_domains; + if (unlikely(r == -ENOMEM) && !allow_move) { + ctx.bytes_moved = 0; + allow_move = true; goto retry; }