From patchwork Fri Jan 26 13:04:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christian_K=C3=B6nig?= X-Patchwork-Id: 10187127 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 8BD2A605FA for ; Fri, 26 Jan 2018 22:20:20 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 654C529634 for ; Fri, 26 Jan 2018 22:20:20 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 648D129135; Fri, 26 Jan 2018 13:05:02 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, HTML_MESSAGE, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 35DCD29175 for ; Fri, 26 Jan 2018 13:04:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 337286E226; Fri, 26 Jan 2018 13:04:58 +0000 (UTC) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-wm0-x231.google.com (mail-wm0-x231.google.com [IPv6:2a00:1450:400c:c09::231]) by gabe.freedesktop.org (Postfix) with ESMTPS id 289D06E135; Fri, 26 Jan 2018 13:04:56 +0000 (UTC) Received: by mail-wm0-x231.google.com with SMTP id f3so21105681wmc.1; Fri, 26 Jan 2018 05:04:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=reply-to:subject:from:to:cc:references:message-id:date:user-agent :mime-version:in-reply-to:content-language; bh=A7EbcCd1uuRH+KKv5puJgmvEBkMy5uvpwDkxjxJ2SV4=; b=csm6/O4vtJ2B75AVcE8JciuoWz5JQKq1qt3YjLCN3rjJcbRlSziqzRcazVIh87XKHJ COi9M13j2yrzk8ZzCq3y15YZySFc2v1FGZrelDx7iC8UPqB5/ugbFtPDCE7nYAnm1MTz ykVG3lNpuvnLGe8Y7YkjarI9bScG5BBVeJUx6mKbi3BItJP2AfjaqZzISD5DUG6eUnZ1 XeZXEz6CrJZ/H7ECfnkv0j25Kf91Sn27xzdTDrD6gD0egdDI/CuPXuLXwY2NDQhjWQpp +zPvG3ZyOuJd3YFi6W5b9gsrtrXhAXR9UoJpvSam4NwvefCHQbfLoO5TSNIJDwCa6Xpz /2PQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:reply-to:subject:from:to:cc:references :message-id:date:user-agent:mime-version:in-reply-to :content-language; bh=A7EbcCd1uuRH+KKv5puJgmvEBkMy5uvpwDkxjxJ2SV4=; b=uWXSejaLmDS1lqNA4bGV2uDjEJPmugEWeU96UUysu6eVTSvNM7IToV/LXCoA2TJ97e aeYigvcFdKidVaASIHj83SnMAqBG9eqjrK63nMy/wKhKNe8EYZLXbyA3L6WcLOUP1Bcp c1lzpyG/0xHR2cDcGBwq8tv4kX7PVsuega03rRtLDTyehIPeWOwOIvW7qVORbRI9D4xD ia+N+Cm+8bdSzGn8vgIyhcFIpWJOlH+LXcaO3Eizfjn3Oy2yrA7fyoQUi+1Hg3yoTpPI s+SJvw1EXXmEs00yCoD/KGtch7txE7qbuDVdt6ZLFl7kim1+XuRI62dA7wxFxFog/QJQ 3hVg== X-Gm-Message-State: AKwxyteCZAp5IGtXLaZrtUMbOCxO8eB1BrcesXonmBlndLOWVV9l1phl LB/wHmopgOOghHf56PKCM6Y9qHYO X-Google-Smtp-Source: AH8x2246Dq9eJ+2hoRYimg+g+I6sWEpSj+a5okefmZENQ1QciHKxbzIzd8OXyc8o8EbwE5MD0SUvLg== X-Received: by 10.28.55.81 with SMTP id e78mr11371682wma.50.1516971894475; Fri, 26 Jan 2018 05:04:54 -0800 (PST) Received: from ?IPv6:2a02:908:1251:8fc0:4c6d:7233:b7e1:3b88? ([2a02:908:1251:8fc0:4c6d:7233:b7e1:3b88]) by smtp.gmail.com with ESMTPSA id b35sm14345880wra.13.2018.01.26.05.04.53 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 26 Jan 2018 05:04:53 -0800 (PST) Subject: Re: [PATCH 1/2] [WIP]drm/ttm: add waiter list to prevent allocation not in order From: =?UTF-8?Q?Christian_K=c3=b6nig?= To: "Zhou, David(ChunMing)" , "Koenig, Christian" References: <20180126102247.17923-1-david1.zhou@amd.com> <2beeea26-9426-6956-fbc7-ada52088b6a6@gmail.com> <5a6b20fc.024e650a.6778d.4aafSMTPIN_ADDED_BROKEN@mx.google.com> Message-ID: <8ba40eb5-095b-2a95-d73a-16c141eb64a9@gmail.com> Date: Fri, 26 Jan 2018 14:04:52 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: christian.koenig@amd.com Cc: amd-gfx , "dri-devel@lists.freedesktop.org" Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" X-Virus-Scanned: ClamAV using ClamSMTP Attached is what you actually want to do cleanly implemented. But as I said this is a NO-GO. Regards, Christian. Am 26.01.2018 um 13:43 schrieb Christian König: >> After my investigation, this issue should be detect of TTM design >> self, which breaks scheduling balance. > Yeah, but again. This is indented design we can't change easily. > > Regards, > Christian. > > Am 26.01.2018 um 13:36 schrieb Zhou, David(ChunMing): >> I am off work, so reply mail by phone, the format could not be text. >> >> back to topic itself: >> the problem indeed happen on amdgpu driver, someone reports me that >> application runs with two instances, the performance are different. >> I also reproduced the issue with unit test(bo_eviction_test). They >> always think our scheduler isn't working as expected. >> >> After my investigation, this issue should be detect of TTM design >> self, which breaks scheduling balance. >> >> Further, if we run containers for our gpu, container A could run high >> score, container B runs low score with same benchmark. >> >> So this is bug that we need fix. >> >> Regards, >> David Zhou >> >> 发自坚果 Pro >> >> Christian K鰊ig 于 2018年1月26日 >> 下午6:31写道: >> >> Am 26.01.2018 um 11:22 schrieb Chunming Zhou: >> > there is a scheduling balance issue about get node like: >> > a. process A allocates full memory and use it for submission. >> > b. process B tries to allocates memory, will wait for process A BO >> idle in eviction. >> > c. process A completes the job, process B eviction will put process >> A BO node, >> > but in the meantime, process C is comming to allocate BO, whill >> directly get node successfully, and do submission, >> > process B will again wait for process C BO idle. >> > d. repeat the above setps, process B could be delayed much more. >> > >> > later allocation must not be ahead of front in same place. >> >> Again NAK to the whole approach. >> >> At least with amdgpu the problem you described above never occurs >> because evictions are pipelined operations. We could only block for >> deleted regions to become free. >> >> But independent of that incoming memory requests while we make room for >> eviction are intended to be served first. >> >> Changing that is certainly a no-go cause that would favor memory hungry >> applications over small clients. >> >> Regards, >> Christian. >> >> > >> > Change-Id: I3daa892e50f82226c552cc008a29e55894a98f18 >> > Signed-off-by: Chunming Zhou >> > --- >> >   drivers/gpu/drm/ttm/ttm_bo.c    | 69 >> +++++++++++++++++++++++++++++++++++++++-- >> >   include/drm/ttm/ttm_bo_api.h    |  7 +++++ >> >   include/drm/ttm/ttm_bo_driver.h |  7 +++++ >> >   3 files changed, 80 insertions(+), 3 deletions(-) >> > >> > diff --git a/drivers/gpu/drm/ttm/ttm_bo.c >> b/drivers/gpu/drm/ttm/ttm_bo.c >> > index d33a6bb742a1..558ec2cf465d 100644 >> > --- a/drivers/gpu/drm/ttm/ttm_bo.c >> > +++ b/drivers/gpu/drm/ttm/ttm_bo.c >> > @@ -841,6 +841,58 @@ static int ttm_bo_add_move_fence(struct >> ttm_buffer_object *bo, >> >        return 0; >> >   } >> > >> > +static void ttm_man_init_waiter(struct ttm_bo_waiter *waiter, >> > +                             struct ttm_buffer_object *bo, >> > +                             const struct ttm_place *place) >> > +{ >> > +     waiter->tbo = bo; >> > +     memcpy((void *)&waiter->place, (void *)place, sizeof(*place)); >> > +     INIT_LIST_HEAD(&waiter->list); >> > +} >> > + >> > +static void ttm_man_add_waiter(struct ttm_mem_type_manager *man, >> > +                            struct ttm_bo_waiter *waiter) >> > +{ >> > +     if (!waiter) >> > +             return; >> > +     spin_lock(&man->wait_lock); >> > +     list_add_tail(&waiter->list, &man->waiter_list); >> > +     spin_unlock(&man->wait_lock); >> > +} >> > + >> > +static void ttm_man_del_waiter(struct ttm_mem_type_manager *man, >> > +                            struct ttm_bo_waiter *waiter) >> > +{ >> > +     if (!waiter) >> > +             return; >> > +     spin_lock(&man->wait_lock); >> > +     if (!list_empty(&waiter->list)) >> > +             list_del(&waiter->list); >> > +     spin_unlock(&man->wait_lock); >> > +     kfree(waiter); >> > +} >> > + >> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man, >> > +                  struct ttm_buffer_object *bo, >> > +                  const struct ttm_place *place) >> > +{ >> > +     struct ttm_bo_waiter *waiter, *tmp; >> > + >> > +     spin_lock(&man->wait_lock); >> > +     list_for_each_entry_safe(waiter, tmp, &man->waiter_list, list) { >> > +             if ((bo != waiter->tbo) && >> > +                 ((place->fpfn >= waiter->place.fpfn && >> > +                   place->fpfn <= waiter->place.lpfn) || >> > +                  (place->lpfn <= waiter->place.lpfn && place->lpfn >= >> > +                   waiter->place.fpfn))) >> > +                 goto later_bo; >> > +     } >> > +     spin_unlock(&man->wait_lock); >> > +     return true; >> > +later_bo: >> > +     spin_unlock(&man->wait_lock); >> > +     return false; >> > +} >> >   /** >> >    * Repeatedly evict memory from the LRU for @mem_type until we >> create enough >> >    * space, or we've evicted everything and there isn't enough space. >> > @@ -853,17 +905,26 @@ static int ttm_bo_mem_force_space(struct >> ttm_buffer_object *bo, >> >   { >> >        struct ttm_bo_device *bdev = bo->bdev; >> >        struct ttm_mem_type_manager *man = &bdev->man[mem_type]; >> > +     struct ttm_bo_waiter waiter; >> >        int ret; >> > >> > +     ttm_man_init_waiter(&waiter, bo, place); >> > +     ttm_man_add_waiter(man, &waiter); >> >        do { >> >                ret = (*man->func->get_node)(man, bo, place, mem); >> > -             if (unlikely(ret != 0)) >> > +             if (unlikely(ret != 0)) { >> > +                     ttm_man_del_waiter(man, &waiter); >> >                        return ret; >> > -             if (mem->mm_node) >> > +             } >> > +             if (mem->mm_node) { >> > +                     ttm_man_del_waiter(man, &waiter); >> >                        break; >> > +             } >> >                ret = ttm_mem_evict_first(bdev, mem_type, place, ctx); >> > -             if (unlikely(ret != 0)) >> > +             if (unlikely(ret != 0)) { >> > +                     ttm_man_del_waiter(man, &waiter); >> >                        return ret; >> > +             } >> >        } while (1); >> >        mem->mem_type = mem_type; >> >        return ttm_bo_add_move_fence(bo, man, mem); >> > @@ -1450,6 +1511,8 @@ int ttm_bo_init_mm(struct ttm_bo_device >> *bdev, unsigned type, >> >        man->use_io_reserve_lru = false; >> >        mutex_init(&man->io_reserve_mutex); >> >        spin_lock_init(&man->move_lock); >> > +     spin_lock_init(&man->wait_lock); >> > +     INIT_LIST_HEAD(&man->waiter_list); >> >        INIT_LIST_HEAD(&man->io_reserve_lru); >> > >> >        ret = bdev->driver->init_mem_type(bdev, type, man); >> > diff --git a/include/drm/ttm/ttm_bo_api.h >> b/include/drm/ttm/ttm_bo_api.h >> > index 2cd025c2abe7..0fce4dbd02e7 100644 >> > --- a/include/drm/ttm/ttm_bo_api.h >> > +++ b/include/drm/ttm/ttm_bo_api.h >> > @@ -40,6 +40,7 @@ >> >   #include >> >   #include >> >   #include >> > +#include >> > >> >   struct ttm_bo_device; >> > >> > @@ -232,6 +233,12 @@ struct ttm_buffer_object { >> >        struct mutex wu_mutex; >> >   }; >> > >> > +struct ttm_bo_waiter { >> > +     struct ttm_buffer_object *tbo; >> > +     struct ttm_place place; >> > +     struct list_head list; >> > +}; >> > + >> >   /** >> >    * struct ttm_bo_kmap_obj >> >    * >> > diff --git a/include/drm/ttm/ttm_bo_driver.h >> b/include/drm/ttm/ttm_bo_driver.h >> > index 9b417eb2df20..dc6b8b4c9e06 100644 >> > --- a/include/drm/ttm/ttm_bo_driver.h >> > +++ b/include/drm/ttm/ttm_bo_driver.h >> > @@ -293,6 +293,10 @@ struct ttm_mem_type_manager { >> >        bool io_reserve_fastpath; >> >        spinlock_t move_lock; >> > >> > +     /* waiters in list */ >> > +     spinlock_t wait_lock; >> > +     struct list_head waiter_list; >> > + >> >        /* >> >         * Protected by @io_reserve_mutex: >> >         */ >> > @@ -748,6 +752,9 @@ int ttm_bo_mem_space(struct ttm_buffer_object *bo, >> >                     struct ttm_mem_reg *mem, >> >                     struct ttm_operation_ctx *ctx); >> > >> > +int ttm_man_check_bo(struct ttm_mem_type_manager *man, >> > +                  struct ttm_buffer_object *bo, >> > +                  const struct ttm_place *place); >> >   void ttm_bo_mem_put(struct ttm_buffer_object *bo, struct >> ttm_mem_reg *mem); >> >   void ttm_bo_mem_put_locked(struct ttm_buffer_object *bo, >> >                           struct ttm_mem_reg *mem); >> >> >> >> _______________________________________________ >> amd-gfx mailing list >> amd-gfx@lists.freedesktop.org >> https://lists.freedesktop.org/mailman/listinfo/amd-gfx > From 12295de757156469edf37f41e28da656d3e51844 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Christian=20K=C3=B6nig?= Date: Fri, 26 Jan 2018 14:01:51 +0100 Subject: [PATCH] drm/ttm: alloc only one memory allocation operation at a time MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Signed-off-by: Christian König --- drivers/gpu/drm/ttm/ttm_bo.c | 3 +++ include/drm/ttm/ttm_bo_driver.h | 2 ++ 2 files changed, 5 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c index d33a6bb742a1..b783884eca2b 100644 --- a/drivers/gpu/drm/ttm/ttm_bo.c +++ b/drivers/gpu/drm/ttm/ttm_bo.c @@ -1053,7 +1053,9 @@ static int ttm_bo_move_buffer(struct ttm_buffer_object *bo, /* * Determine where to move the buffer. */ + mutex_lock(&bo->bdev->mem_alloc); ret = ttm_bo_mem_space(bo, placement, &mem, ctx); + mutex_unlock(&bo->bdev->mem_alloc); if (ret) goto out_unlock; ret = ttm_bo_handle_move_mem(bo, &mem, false, ctx); @@ -1594,6 +1596,7 @@ int ttm_bo_device_init(struct ttm_bo_device *bdev, 0x10000000); INIT_DELAYED_WORK(&bdev->wq, ttm_bo_delayed_workqueue); INIT_LIST_HEAD(&bdev->ddestroy); + mutex_init(&bdev->mem_alloc); bdev->dev_mapping = mapping; bdev->glob = glob; bdev->need_dma32 = need_dma32; diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h index 9b417eb2df20..a23179d20c25 100644 --- a/include/drm/ttm/ttm_bo_driver.h +++ b/include/drm/ttm/ttm_bo_driver.h @@ -596,6 +596,8 @@ struct ttm_bo_device { bool need_dma32; bool no_retry; + + struct mutex mem_alloc; }; /** -- 2.14.1