From patchwork Mon Mar 16 22:32:16 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Alex Deucher
X-Patchwork-Id: 6026201
In-Reply-To: <550251B2.5020000@daenzer.net>
References: <1426088652-32727-1-git-send-email-alexander.deucher@amd.com>
 <550087B6.3090307@vodafone.de> <55015640.4020008@daenzer.net>
 <55015B27.2030208@vodafone.de> <550251B2.5020000@daenzer.net>
Date: Mon, 16 Mar 2015 18:32:16 -0400
Subject: Re: [PATCH] drm/radeon: fix TOPDOWN handling for bo_create
From: Alex Deucher
To: Michel Dänzer
Cc: Maling list - DRI developers

On Thu, Mar 12, 2015 at 10:55 PM, Michel Dänzer wrote:
> On 12.03.2015 22:09, Alex Deucher wrote:
>> On Thu, Mar 12, 2015 at 5:23 AM, Christian König
>> wrote:
>>> On 12.03.2015 10:02, Michel Dänzer wrote:
>>>>
>>>> On 12.03.2015 06:14, Alex Deucher wrote:
>>>>>
>>>>> On Wed, Mar 11, 2015 at 4:51 PM, Alex Deucher
>>>>> wrote:
>>>>>>
>>>>>> On Wed, Mar 11, 2015 at 2:21 PM, Christian König
>>>>>> wrote:
>>>>>>>
>>>>>>> On 11.03.2015 16:44, Alex Deucher wrote:
>>>>>>>>
>>>>>>>> radeon_bo_create() calls radeon_ttm_placement_from_domain()
>>>>>>>> before ttm_bo_init() is called. radeon_ttm_placement_from_domain()
>>>>>>>> uses the ttm bo size to determine when to select top down
>>>>>>>> allocation but since the ttm bo is not initialized yet the
>>>>>>>> check is always false.
>>>>>>>>
>>>>>>>> Noticed-by: Oded Gabbay
>>>>>>>> Signed-off-by: Alex Deucher
>>>>>>>> Cc: stable@vger.kernel.org
>>>>>>>
>>>>>>>
>>>>>>> And I was already wondering why the heck the BOs always made this
>>>>>>> ping/pong
>>>>>>> in memory after creation.
>>>>>>>
>>>>>>> Patch is Reviewed-by: Christian König
>>>>>>
>>>>>> And fixing that promptly broke VCE due to vram location requirements.
>>>>>> Updated patch attached. Thoughts?
>>>>>
>>>>> And one more take to make things a bit more explicit for static kernel
>>>>> driver allocations.
>>>>
>>>> struct ttm_place::lpfn is honoured even with TTM_PL_FLAG_TOPDOWN, so
>>>> latter should work with RADEON_GEM_CPU_ACCESS. It sounds like the
>>>> problem is really that some BOs are expected to be within a certain
>>>> range from the beginning of VRAM, but lpfn isn't set accordingly. It
>>>> would be better to fix that by setting lpfn directly than indirectly via
>>>> RADEON_GEM_CPU_ACCESS.
>>>
>>>
>>> Yeah, agree. We should probably try to find the root cause of this instead.
>>>
>>> As far as I know VCE has no documented limitation on where buffers are
>>> placed (unlike UVD). So this is a bit strange. Are you sure that it isn't
>>> UVD which breaks here?
>>
>> It's definitely VCE, I don't know why UVD didn't have a problem. I
>> considered using pin_restricted to make sure it got pinned in the CPU
>> visible region, but that had two problems: 1. it would end up getting
>> migrated when pinned,
>
> Maybe something like radeon_uvd_force_into_uvd_segment() is needed for
> VCE as well?
>
>
>> 2. it would end up at the top of the restricted
>> region since the top down flag is set which would end up fragmenting
>> vram.
>
> If that's an issue (which outweighs the supposed benefit of
> TTM_PL_FLAG_TOPDOWN), then again the proper solution would be not to set
> TTM_PL_FLAG_TOPDOWN when rbo->placements[i].lpfn != 0 and smaller than
> the whole available region, instead of checking for VRAM and
> RADEON_GEM_CPU_ACCESS.
>

How about something like the attached patch? I'm not really sure
about the restrictions for the UVD and VCE fw and stack/heap buffers,
but this seems to work. It seems like the current UVD/VCE code works
by accident since the check for TOPDOWN fails.

Alex

From 304963717d0ad761fc860928c8b08df297635668 Mon Sep 17 00:00:00 2001
From: Alex Deucher
Date: Wed, 11 Mar 2015 11:27:26 -0400
Subject: [PATCH] drm/radeon: handle pfn restrictions and TOPDOWN in
 radeon_bo_create()

Explicitly set the pfn restrictions on bo_create. Previously we were
relying on bottom up behavior in a number of places. Make it explicit.
Also pass the size explicitly since radeon_bo_create() calls
radeon_ttm_placement_from_domain() before ttm_bo_init() is called.
radeon_ttm_placement_from_domain() uses the ttm bo size to determine
when to select top down allocation but since the ttm bo is not
initialized yet the check is always false.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/radeon/cik.c              |  4 +-
 drivers/gpu/drm/radeon/evergreen.c        |  6 +--
 drivers/gpu/drm/radeon/r600.c             |  4 +-
 drivers/gpu/drm/radeon/radeon.h           |  3 +-
 drivers/gpu/drm/radeon/radeon_benchmark.c |  6 ++-
 drivers/gpu/drm/radeon/radeon_device.c    |  2 +-
 drivers/gpu/drm/radeon/radeon_gart.c      |  1 +
 drivers/gpu/drm/radeon/radeon_gem.c       |  5 +-
 drivers/gpu/drm/radeon/radeon_kfd.c       |  2 +-
 drivers/gpu/drm/radeon/radeon_mn.c        |  5 +-
 drivers/gpu/drm/radeon/radeon_object.c    | 86 ++++++++++++++++---------------
 drivers/gpu/drm/radeon/radeon_object.h    |  3 +-
 drivers/gpu/drm/radeon/radeon_prime.c     |  2 +-
 drivers/gpu/drm/radeon/radeon_ring.c      |  2 +-
 drivers/gpu/drm/radeon/radeon_sa.c        |  2 +-
 drivers/gpu/drm/radeon/radeon_test.c      |  4 +-
 drivers/gpu/drm/radeon/radeon_ttm.c       | 14 +++--
 drivers/gpu/drm/radeon/radeon_uvd.c       |  9 +---
 drivers/gpu/drm/radeon/radeon_vce.c       |  4 +-
 drivers/gpu/drm/radeon/radeon_vm.c        |  4 +-
 20 files changed, 87 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/radeon/cik.c b/drivers/gpu/drm/radeon/cik.c
index 28faea9..8d70176 100644
--- a/drivers/gpu/drm/radeon/cik.c
+++ b/drivers/gpu/drm/radeon/cik.c
@@ -4756,7 +4756,7 @@ static int cik_mec_init(struct radeon_device *rdev)
 		r = radeon_bo_create(rdev,
 				     rdev->mec.num_mec *rdev->mec.num_pipe * MEC_HPD_SIZE * 2,
 				     PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     &rdev->mec.hpd_eop_obj);
 		if (r) {
 			dev_warn(rdev->dev, "(%d) create HDP EOP bo failed\n", r);
@@ -4922,7 +4922,7 @@ static int cik_cp_compute_resume(struct radeon_device *rdev)
 			r = radeon_bo_create(rdev,
 					     sizeof(struct bonaire_mqd),
 					     PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_GTT, 0, NULL,
+					     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL,
 					     NULL, &rdev->ring[idx].mqd_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create MQD bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index f848acf..02b5d2e 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -4054,7 +4054,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 		/* save restore block */
 		if (rdev->rlc.save_restore_obj == NULL) {
 			r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.save_restore_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC sr bo failed\n", r);
@@ -4133,7 +4133,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 
 		if (rdev->rlc.clear_state_obj == NULL) {
 			r = radeon_bo_create(rdev, dws * 4, PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.clear_state_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC c bo failed\n", r);
@@ -4210,7 +4210,7 @@ int sumo_rlc_init(struct radeon_device *rdev)
 		if (rdev->rlc.cp_table_obj == NULL) {
 			r = radeon_bo_create(rdev, rdev->rlc.cp_table_size,
 					     PAGE_SIZE, true,
-					     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+					     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 					     NULL, &rdev->rlc.cp_table_obj);
 			if (r) {
 				dev_warn(rdev->dev, "(%d) create RLC cp table bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c
index 8f6d862..7249620 100644
--- a/drivers/gpu/drm/radeon/r600.c
+++ b/drivers/gpu/drm/radeon/r600.c
@@ -1457,7 +1457,7 @@ int r600_vram_scratch_init(struct radeon_device *rdev)
 	if (rdev->vram_scratch.robj == NULL) {
 		r = radeon_bo_create(rdev, RADEON_GPU_PAGE_SIZE,
 				     PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
-				     0, NULL, NULL, &rdev->vram_scratch.robj);
+				     0, 0, 0, NULL, NULL, &rdev->vram_scratch.robj);
 		if (r) {
 			return r;
 		}
@@ -3390,7 +3390,7 @@ int r600_ih_ring_alloc(struct radeon_device *rdev)
 	if (rdev->ih.ring_obj == NULL) {
 		r = radeon_bo_create(rdev, rdev->ih.ring_size,
 				     PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0,
 				     NULL, NULL, &rdev->ih.ring_obj);
 		if (r) {
 			DRM_ERROR("radeon: failed to create ih ring buffer (%d).\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 35ab65d..809fc49 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2980,7 +2980,8 @@ extern void radeon_surface_init(struct radeon_device *rdev);
 extern int radeon_cs_parser_init(struct radeon_cs_parser *p, void *data);
 extern void radeon_legacy_set_clock_gating(struct radeon_device *rdev, int enable);
 extern void radeon_atom_set_clock_gating(struct radeon_device *rdev, int enable);
-extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain);
+extern void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain,
+					     u64 size, unsigned fpfn, unsigned lpfn);
 extern bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo);
 extern int radeon_ttm_tt_set_userptr(struct ttm_tt *ttm, uint64_t addr,
 				     uint32_t flags);
diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 87d5fb2..1caa887 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -94,7 +94,8 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
 	int time;
 
 	n = RADEON_BENCHMARK_ITERATIONS;
-	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, 0, NULL, NULL, &sobj);
+	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, 0, 0, 0,
+			     NULL, NULL, &sobj);
 	if (r) {
 		goto out_cleanup;
 	}
@@ -106,7 +107,8 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
 	if (r) {
 		goto out_cleanup;
 	}
-	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, ddomain, 0, NULL, NULL, &dobj);
+	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, ddomain, 0, 0, 0,
+			     NULL, NULL, &dobj);
 	if (r) {
 		goto out_cleanup;
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index b7ca4c5..d1cebfe 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -465,7 +465,7 @@ int radeon_wb_init(struct radeon_device *rdev)
 
 	if (rdev->wb.wb_obj == NULL) {
 		r = radeon_bo_create(rdev, RADEON_GPU_PAGE_SIZE, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     &rdev->wb.wb_obj);
 		if (r) {
 			dev_warn(rdev->dev, "(%d) create WB bo failed\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_gart.c b/drivers/gpu/drm/radeon/radeon_gart.c
index 5450fa9..1513c1c 100644
--- a/drivers/gpu/drm/radeon/radeon_gart.c
+++ b/drivers/gpu/drm/radeon/radeon_gart.c
@@ -128,6 +128,7 @@ int radeon_gart_table_vram_alloc(struct radeon_device *rdev)
 	if (rdev->gart.robj == NULL) {
 		r = radeon_bo_create(rdev, rdev->gart.table_size,
 				     PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
+				     0, (rdev->mc.visible_vram_size >> PAGE_SHIFT),
 				     0, NULL, NULL, &rdev->gart.robj);
 		if (r) {
 			return r;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index ac3c131..58f4556 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -67,7 +67,7 @@ int radeon_gem_object_create(struct radeon_device *rdev, unsigned long size,
 
 retry:
 	r = radeon_bo_create(rdev, size, alignment, kernel, initial_domain,
-			     flags, NULL, NULL, &robj);
+			     0, 0, flags, NULL, NULL, &robj);
 	if (r) {
 		if (r != -ERESTARTSYS) {
 			if (initial_domain == RADEON_GEM_DOMAIN_VRAM) {
@@ -337,7 +337,8 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 			goto release_object;
 		}
 
-		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT);
+		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_GTT, bo->tbo.mem.size,
+						 0, 0);
 		r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
 		radeon_bo_unreserve(bo);
 		up_read(&current->mm->mmap_sem);
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index 061eaa9..e0caa6c 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -212,7 +212,7 @@ static int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
 		return -ENOMEM;
 
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, RADEON_GEM_DOMAIN_GTT,
-				RADEON_GEM_GTT_WC, NULL, NULL, &(*mem)->bo);
+				0, 0, RADEON_GEM_GTT_WC, NULL, NULL, &(*mem)->bo);
 	if (r) {
 		dev_err(rdev->dev,
 			"failed to allocate BO for amdkfd (%d)\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index a69bd44..7a8b0e5 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -141,14 +141,15 @@ static void radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 			DRM_ERROR("(%d) failed to wait for user bo\n", r);
 		}
 
-		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_CPU);
+		radeon_ttm_placement_from_domain(bo, RADEON_GEM_DOMAIN_CPU, bo->tbo.mem.size,
+						 0, 0);
 		r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
 		if (r)
 			DRM_ERROR("(%d) failed to validate user bo\n", r);
 
 		radeon_bo_unreserve(bo);
 	}
-	
+
 	mutex_unlock(&rmn->lock);
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 43e0994..aa2f815 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -93,7 +93,8 @@ bool radeon_ttm_bo_is_radeon_bo(struct ttm_buffer_object *bo)
 	return false;
 }
 
-void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
+void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain,
+				      u64 size, unsigned fpfn, unsigned lpfn)
 {
 	u32 c = 0, i;
 
@@ -105,14 +106,17 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 		 */
 		if ((rbo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
 		    rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size) {
-			rbo->placements[c].fpfn =
-				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
+			if (fpfn > (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT))
+				rbo->placements[c].fpfn = fpfn;
+			else
+				rbo->placements[c].fpfn =
+					rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 						     TTM_PL_FLAG_UNCACHED |
 						     TTM_PL_FLAG_VRAM;
 		}
 
-		rbo->placements[c].fpfn = 0;
+		rbo->placements[c].fpfn = fpfn;
 		rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 					     TTM_PL_FLAG_UNCACHED |
 					     TTM_PL_FLAG_VRAM;
@@ -120,18 +124,17 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 
 	if (domain & RADEON_GEM_DOMAIN_GTT) {
 		if (rbo->flags & RADEON_GEM_GTT_UC) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_TT;
-
 		} else if ((rbo->flags & RADEON_GEM_GTT_WC) ||
 			   (rbo->rdev->flags & RADEON_IS_AGP)) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 				TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_TT;
 		} else {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_CACHED |
 						     TTM_PL_FLAG_TT;
 		}
@@ -139,24 +142,24 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 
 	if (domain & RADEON_GEM_DOMAIN_CPU) {
 		if (rbo->flags & RADEON_GEM_GTT_UC) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_SYSTEM;
 		} else if ((rbo->flags & RADEON_GEM_GTT_WC) ||
 		    rbo->rdev->flags & RADEON_IS_AGP) {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_WC |
 				TTM_PL_FLAG_UNCACHED |
 				TTM_PL_FLAG_SYSTEM;
 		} else {
-			rbo->placements[c].fpfn = 0;
+			rbo->placements[c].fpfn = fpfn;
 			rbo->placements[c++].flags = TTM_PL_FLAG_CACHED |
 						     TTM_PL_FLAG_SYSTEM;
 		}
 	}
 
 	if (!c) {
-		rbo->placements[c].fpfn = 0;
+		rbo->placements[c].fpfn = fpfn;
 		rbo->placements[c++].flags = TTM_PL_MASK_CACHING |
 					     TTM_PL_FLAG_SYSTEM;
 	}
@@ -171,7 +174,7 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 			rbo->placements[i].lpfn =
 				rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
 		else
-			rbo->placements[i].lpfn = 0;
+			rbo->placements[i].lpfn = lpfn;
 	}
 
 	/*
@@ -179,17 +182,18 @@ void radeon_ttm_placement_from_domain(struct radeon_bo *rbo, u32 domain)
 	 * improve fragmentation quality.
 	 * 512kb was measured as the most optimal number.
 	 */
-	if (rbo->tbo.mem.size > 512 * 1024) {
+	if (size > 512 * 1024) {
 		for (i = 0; i < c; i++) {
-			rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
+			if (rbo->placements[i].lpfn == 0)
+				rbo->placements[i].flags |= TTM_PL_FLAG_TOPDOWN;
 		}
 	}
 }
 
 int radeon_bo_create(struct radeon_device *rdev,
 		     unsigned long size, int byte_align, bool kernel,
-		     u32 domain, u32 flags, struct sg_table *sg,
-		     struct reservation_object *resv,
+		     u32 domain, unsigned fpfn, unsigned lpfn, u32 flags,
+		     struct sg_table *sg, struct reservation_object *resv,
 		     struct radeon_bo **bo_ptr)
 {
 	struct radeon_bo *bo;
@@ -252,7 +256,7 @@ int radeon_bo_create(struct radeon_device *rdev,
 	bo->flags &= ~RADEON_GEM_GTT_WC;
 #endif
 
-	radeon_ttm_placement_from_domain(bo, domain);
+	radeon_ttm_placement_from_domain(bo, domain, size, fpfn, lpfn);
 	/* Kernel allocation are uninterruptible */
 	down_read(&rdev->pm.mclk_lock);
 	r = ttm_bo_init(&rdev->mman.bdev, &bo->tbo, size, type,
@@ -328,6 +332,7 @@ int radeon_bo_pin_restricted(struct radeon_bo *bo, u32 domain, u64 max_offset,
 			     u64 *gpu_addr)
 {
 	int r, i;
+	unsigned lpfn;
 
 	if (radeon_ttm_tt_has_userptr(bo->tbo.ttm))
 		return -EPERM;
@@ -350,19 +355,16 @@ int radeon_bo_pin_restricted(struct radeon_bo *bo, u32 domain, u64 max_offset,
 
 		return 0;
 	}
-	radeon_ttm_placement_from_domain(bo, domain);
-	for (i = 0; i < bo->placement.num_placement; i++) {
-		/* force to pin into visible video ram */
-		if ((bo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    !(bo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
-		    (!max_offset || max_offset > bo->rdev->mc.visible_vram_size))
-			bo->placements[i].lpfn =
-				bo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
-		else
-			bo->placements[i].lpfn = max_offset >> PAGE_SHIFT;
-
+	/* force to pin into visible video ram */
+	if ((domain == RADEON_GEM_DOMAIN_VRAM) &&
+	    !(bo->flags & RADEON_GEM_NO_CPU_ACCESS) &&
+	    (!max_offset || max_offset > bo->rdev->mc.visible_vram_size))
+		lpfn = bo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
+	else
+		lpfn = max_offset >> PAGE_SHIFT;
+	radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size, 0, lpfn);
+	for (i = 0; i < bo->placement.num_placement; i++)
 		bo->placements[i].flags |= TTM_PL_FLAG_NO_EVICT;
-	}
 
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, false, false);
 	if (likely(r == 0)) {
@@ -557,9 +559,13 @@ int radeon_bo_list_validate(struct radeon_device *rdev,
 			}
 
 		retry:
-			radeon_ttm_placement_from_domain(bo, domain);
-			if (ring == R600_RING_TYPE_UVD_INDEX)
+			if (ring == R600_RING_TYPE_UVD_INDEX) {
+				radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size,
+								 0, (256 * 1024 * 1024) >> PAGE_SHIFT);
 				radeon_uvd_force_into_uvd_segment(bo, allowed);
+			} else {
+				radeon_ttm_placement_from_domain(bo, domain, bo->tbo.mem.size, 0, 0);
+			}
 
 			initial_bytes_moved = atomic64_read(&rdev->num_bytes_moved);
 			r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
@@ -784,7 +790,7 @@ int radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 	struct radeon_device *rdev;
 	struct radeon_bo *rbo;
 	unsigned long offset, size, lpfn;
-	int i, r;
+	int r;
 
 	if (!radeon_ttm_bo_is_radeon_bo(bo))
 		return 0;
@@ -800,17 +806,13 @@ int radeon_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 		return 0;
 
 	/* hurrah the memory is not visible ! */
-	radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM);
-	lpfn =	rdev->mc.visible_vram_size >> PAGE_SHIFT;
-	for (i = 0; i < rbo->placement.num_placement; i++) {
-		/* Force into visible VRAM */
-		if ((rbo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    (!rbo->placements[i].lpfn || rbo->placements[i].lpfn > lpfn))
-			rbo->placements[i].lpfn = lpfn;
-	}
+	lpfn = rdev->mc.visible_vram_size >> PAGE_SHIFT;
+	radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM,
+					 rbo->tbo.mem.size, 0, lpfn);
 	r = ttm_bo_validate(bo, &rbo->placement, false, false);
 	if (unlikely(r == -ENOMEM)) {
-		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT);
+		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT,
+						 rbo->tbo.mem.size, 0, 0);
 		return ttm_bo_validate(bo, &rbo->placement, false, false);
 	} else if (unlikely(r != 0)) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_object.h b/drivers/gpu/drm/radeon/radeon_object.h
index d8d295e..806dc5c 100644
--- a/drivers/gpu/drm/radeon/radeon_object.h
+++ b/drivers/gpu/drm/radeon/radeon_object.h
@@ -124,7 +124,8 @@ extern int radeon_bo_wait(struct radeon_bo *bo, u32 *mem_type,
 
 extern int radeon_bo_create(struct radeon_device *rdev,
 			    unsigned long size, int byte_align,
-			    bool kernel, u32 domain, u32 flags,
+			    bool kernel, u32 domain,
+			    unsigned fpfn, unsigned lpfn, u32 flags,
 			    struct sg_table *sg,
 			    struct reservation_object *resv,
 			    struct radeon_bo **bo_ptr);
diff --git a/drivers/gpu/drm/radeon/radeon_prime.c b/drivers/gpu/drm/radeon/radeon_prime.c
index f3609c9..d808d8f 100644
--- a/drivers/gpu/drm/radeon/radeon_prime.c
+++ b/drivers/gpu/drm/radeon/radeon_prime.c
@@ -68,7 +68,7 @@ struct drm_gem_object *radeon_gem_prime_import_sg_table(struct drm_device *dev,
 
 	ww_mutex_lock(&resv->lock, NULL);
 	ret = radeon_bo_create(rdev, attach->dmabuf->size, PAGE_SIZE, false,
-			       RADEON_GEM_DOMAIN_GTT, 0, sg, resv, &bo);
+			       RADEON_GEM_DOMAIN_GTT, 0, 0, 0, sg, resv, &bo);
 	ww_mutex_unlock(&resv->lock);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/radeon/radeon_ring.c b/drivers/gpu/drm/radeon/radeon_ring.c
index 2456f69..d9f0150 100644
--- a/drivers/gpu/drm/radeon/radeon_ring.c
+++ b/drivers/gpu/drm/radeon/radeon_ring.c
@@ -383,7 +383,7 @@ int radeon_ring_init(struct radeon_device *rdev, struct radeon_ring *ring, unsig
 	/* Allocate ring buffer */
 	if (ring->ring_obj == NULL) {
 		r = radeon_bo_create(rdev, ring->ring_size, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL,
 				     NULL, &ring->ring_obj);
 		if (r) {
 			dev_err(rdev->dev, "(%d) ring create failed\n", r);
diff --git a/drivers/gpu/drm/radeon/radeon_sa.c b/drivers/gpu/drm/radeon/radeon_sa.c
index c507896..3013fb18 100644
--- a/drivers/gpu/drm/radeon/radeon_sa.c
+++ b/drivers/gpu/drm/radeon/radeon_sa.c
@@ -65,7 +65,7 @@ int radeon_sa_bo_manager_init(struct radeon_device *rdev,
 	}
 
 	r = radeon_bo_create(rdev, size, align, true,
-			     domain, flags, NULL, NULL, &sa_manager->bo);
+			     domain, 0, 0, flags, NULL, NULL, &sa_manager->bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate bo for manager\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_test.c b/drivers/gpu/drm/radeon/radeon_test.c
index 79181816..073d6d2 100644
--- a/drivers/gpu/drm/radeon/radeon_test.c
+++ b/drivers/gpu/drm/radeon/radeon_test.c
@@ -67,7 +67,7 @@ static void radeon_do_test_moves(struct radeon_device *rdev, int flag)
 	}
 
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, RADEON_GEM_DOMAIN_VRAM,
-			     0, NULL, NULL, &vram_obj);
+			     0, 0, 0, NULL, NULL, &vram_obj);
 	if (r) {
 		DRM_ERROR("Failed to create VRAM object\n");
 		goto out_cleanup;
@@ -87,7 +87,7 @@ static void radeon_do_test_moves(struct radeon_device *rdev, int flag)
 		struct radeon_fence *fence = NULL;
 
 		r = radeon_bo_create(rdev, size, PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_GTT, 0, NULL, NULL,
+				     RADEON_GEM_DOMAIN_GTT, 0, 0, 0, NULL, NULL,
 				     gtt_obj + i);
 		if (r) {
 			DRM_ERROR("Failed to create GTT object %d\n", i);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index d02aa1d..befb590 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -197,7 +197,8 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 	switch (bo->mem.mem_type) {
 	case TTM_PL_VRAM:
 		if (rbo->rdev->ring[radeon_copy_ring_index(rbo->rdev)].ready == false)
-			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU);
+			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU,
+							 rbo->tbo.mem.size, 0, 0);
 		else if (rbo->rdev->mc.visible_vram_size < rbo->rdev->mc.real_vram_size &&
 			 bo->mem.start < (rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT)) {
 			unsigned fpfn = rbo->rdev->mc.visible_vram_size >> PAGE_SHIFT;
@@ -209,7 +210,8 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 			 * BOs to be evicted from VRAM
 			 */
 			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_VRAM |
-							 RADEON_GEM_DOMAIN_GTT);
+							 RADEON_GEM_DOMAIN_GTT,
+							 rbo->tbo.mem.size, 0, 0);
 			rbo->placement.num_busy_placement = 0;
 			for (i = 0; i < rbo->placement.num_placement; i++) {
 				if (rbo->placements[i].flags & TTM_PL_FLAG_VRAM) {
@@ -222,11 +224,13 @@ static void radeon_evict_flags(struct ttm_buffer_object *bo,
 				}
 			}
 		} else
-			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT);
+			radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_GTT,
+							 rbo->tbo.mem.size, 0, 0);
 		break;
 	case TTM_PL_TT:
 	default:
-		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU);
+		radeon_ttm_placement_from_domain(rbo, RADEON_GEM_DOMAIN_CPU,
+						 rbo->tbo.mem.size, 0, 0);
 	}
 	*placement = rbo->placement;
 }
@@ -888,7 +892,7 @@ int radeon_ttm_init(struct radeon_device *rdev)
 	radeon_ttm_set_active_vram_size(rdev, rdev->mc.visible_vram_size);
 
 	r = radeon_bo_create(rdev, 256 * 1024, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024) >> PAGE_SHIFT, 0, NULL,
 			     NULL, &rdev->stollen_vga_memory);
 	if (r) {
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c b/drivers/gpu/drm/radeon/radeon_uvd.c
index c10b2ae..6261463 100644
--- a/drivers/gpu/drm/radeon/radeon_uvd.c
+++ b/drivers/gpu/drm/radeon/radeon_uvd.c
@@ -141,7 +141,7 @@ int radeon_uvd_init(struct radeon_device *rdev)
 		  RADEON_UVD_STACK_SIZE + RADEON_UVD_HEAP_SIZE +
 		  RADEON_GPU_PAGE_SIZE;
 	r = radeon_bo_create(rdev, bo_size, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024 * 1024) >> PAGE_SHIFT, 0, NULL,
 			     NULL, &rdev->uvd.vcpu_bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate UVD bo\n", r);
@@ -259,13 +259,6 @@ int radeon_uvd_resume(struct radeon_device *rdev)
 void radeon_uvd_force_into_uvd_segment(struct radeon_bo *rbo,
 				       uint32_t allowed_domains)
 {
-	int i;
-
-	for (i = 0; i < rbo->placement.num_placement; ++i) {
-		rbo->placements[i].fpfn = 0 >> PAGE_SHIFT;
-		rbo->placements[i].lpfn = (256 * 1024 * 1024) >> PAGE_SHIFT;
-	}
-
 	/* If it must be in VRAM it must be in the first segment as well */
 	if (allowed_domains == RADEON_GEM_DOMAIN_VRAM)
 		return;
diff --git a/drivers/gpu/drm/radeon/radeon_vce.c b/drivers/gpu/drm/radeon/radeon_vce.c
index 976fe43..c35f0a9 100644
--- a/drivers/gpu/drm/radeon/radeon_vce.c
+++ b/drivers/gpu/drm/radeon/radeon_vce.c
@@ -126,8 +126,8 @@ int radeon_vce_init(struct radeon_device *rdev)
 	size = RADEON_GPU_PAGE_ALIGN(rdev->vce_fw->size) +
 	       RADEON_VCE_STACK_SIZE + RADEON_VCE_HEAP_SIZE;
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL, NULL,
-			     &rdev->vce.vcpu_bo);
+			     RADEON_GEM_DOMAIN_VRAM, 0, (256 * 1024 * 1024) >> PAGE_SHIFT, 0,
+			     NULL, NULL, &rdev->vce.vcpu_bo);
 	if (r) {
 		dev_err(rdev->dev, "(%d) failed to allocate VCE bo\n", r);
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index 2a5a4a9..e845d4c 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -542,7 +542,7 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
 
 		r = radeon_bo_create(rdev, RADEON_VM_PTE_COUNT * 8,
 				     RADEON_GPU_PAGE_SIZE, true,
-				     RADEON_GEM_DOMAIN_VRAM, 0,
+				     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0,
 				     NULL, NULL, &pt);
 		if (r)
 			return r;
@@ -1186,7 +1186,7 @@ int radeon_vm_init(struct radeon_device *rdev, struct radeon_vm *vm)
 	}
 
 	r = radeon_bo_create(rdev, pd_size, align, true,
-			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
+			     RADEON_GEM_DOMAIN_VRAM, 0, 0, 0, NULL,
 			     NULL, &vm->page_directory);
 	if (r)
 		return r;
-- 
1.8.3.1
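
For anyone reading along without the tree handy, the placement rule the
patch is going for can be sketched in a few lines of standalone userspace
C. This is only an illustration, not kernel code: every sketch_* name, the
4 KiB page shift and the flag value are stand-ins made up for the example,
not the TTM or radeon definitions. It models just two things from the
patch: the size is passed in by the caller rather than read from an
uninitialized bo, and top down is only requested when no upper pfn limit
(lpfn == 0) is set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the kernel's values. */
#define SKETCH_PAGE_SHIFT   12u            /* assumes 4 KiB pages */
#define SKETCH_FLAG_TOPDOWN (1u << 0)      /* models TTM_PL_FLAG_TOPDOWN */
#define SKETCH_TOPDOWN_MIN  (512u * 1024u) /* 512kb threshold from the patch */

struct sketch_place {
	unsigned fpfn;  /* first allowed page frame, 0 = no lower bound */
	unsigned lpfn;  /* last allowed page frame, 0 = no upper bound */
	uint32_t flags;
};

/* Convert a byte offset into a page frame number. */
static unsigned sketch_bytes_to_pfn(uint64_t bytes)
{
	return (unsigned)(bytes >> SKETCH_PAGE_SHIFT);
}

/*
 * Fill in one placement. The size comes from the caller, so the check
 * works even before the buffer object itself has been initialized.
 */
static void sketch_set_place(struct sketch_place *p, uint64_t size,
			     unsigned fpfn, unsigned lpfn)
{
	/* A non-zero upper limit must lie above the lower limit. */
	assert(lpfn == 0 || fpfn < lpfn);

	p->fpfn = fpfn;
	p->lpfn = lpfn;
	p->flags = 0;

	/* Only go top down for large buffers with no upper pfn limit. */
	if (size > SKETCH_TOPDOWN_MIN && p->lpfn == 0)
		p->flags |= SKETCH_FLAG_TOPDOWN;
}
```

With this model, a 1 MiB buffer with no limits gets the top down flag,
while the same buffer limited to the first 256 MiB (the limit the patch
passes for the UVD and VCE vcpu buffers) is allocated bottom up inside
that range.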