From patchwork Tue May 7 11:45:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Chunming Zhou
X-Patchwork-Id: 10932839
X-Original-To: dri-devel@lists.freedesktop.org
From: Chunming Zhou
Subject: [PATCH 1/2] drm/ttm: fix busy memory to fail other user v7
Date: Tue, 7 May 2019 19:45:30 +0800
Message-ID: <20190507114531.26089-1-david1.zhou@amd.com>
List-Id: Direct Rendering Infrastructure - Development

A heavy GPU job can occupy memory for a long time, which causes other users
to fail to get memory.

This basically picks up Christian's idea:

1. Reserve the BO in DC using a ww_mutex ticket (trivial).
2. If we then run into this EBUSY condition in TTM, check if the BO we need
   memory for (or rather the ww_mutex of its reservation object) has a
   ticket assigned.
3. If we have a ticket, we grab a reference to the first BO on the LRU, drop
   the LRU lock and try to grab the reservation lock with the ticket.
4. If getting the reservation lock with the ticket succeeded, we check if
   the BO is still the first one on the LRU in question (the BO could have
   moved).
5. If the BO is still the first one on the LRU in question, we try to evict
   it as we would evict any other BO.
6. If any of the "If's" above fail, we just back off and return -EBUSY.

v2: fix some minor checks
v3: address Christian's v2 comments.
v4: fix some missing
v5: handle first_bo unlock and bo_get/put
v6: abstract a unified iterate function, and handle all possible use cases,
    not only pinned BOs.
v7: pass request bo->resv to ttm_mem_evict_first

Change-Id: I21423fb922f885465f13833c41df1e134364a8e7
Signed-off-by: Chunming Zhou
Reviewed-by: Christian König
---
 drivers/gpu/drm/ttm/ttm_bo.c | 111 +++++++++++++++++++++++++++++------
 1 file changed, 94 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 8502b3ed2d88..f5e6328e4a57 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -766,11 +766,13 @@ EXPORT_SYMBOL(ttm_bo_eviction_valuable);
  * b. Otherwise, trylock it.
  */
 static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
-			struct ttm_operation_ctx *ctx, bool *locked)
+			struct ttm_operation_ctx *ctx, bool *locked, bool *busy)
 {
 	bool ret = false;
 
 	*locked = false;
+	if (busy)
+		*busy = false;
 	if (bo->resv == ctx->resv) {
 		reservation_object_assert_held(bo->resv);
 		if (ctx->flags & TTM_OPT_FLAG_ALLOW_RES_EVICT
@@ -779,35 +781,46 @@ static bool ttm_bo_evict_swapout_allowable(struct ttm_buffer_object *bo,
 	} else {
 		*locked = reservation_object_trylock(bo->resv);
 		ret = *locked;
+		if (!ret && busy)
+			*busy = true;
 	}
 
 	return ret;
 }
 
-static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
-			       uint32_t mem_type,
-			       const struct ttm_place *place,
-			       struct ttm_operation_ctx *ctx)
+static struct ttm_buffer_object*
+ttm_mem_find_evitable_bo(struct ttm_bo_device *bdev,
+			 struct ttm_mem_type_manager *man,
+			 const struct ttm_place *place,
+			 struct ttm_operation_ctx *ctx,
+			 struct ttm_buffer_object **first_bo,
+			 bool *locked)
 {
-	struct ttm_bo_global *glob = bdev->glob;
-	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
 	struct ttm_buffer_object *bo = NULL;
-	bool locked = false;
-	unsigned i;
-	int ret;
+	int i;
 
-	spin_lock(&glob->lru_lock);
+	if (first_bo)
+		*first_bo = NULL;
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		list_for_each_entry(bo, &man->lru[i], lru) {
-			if (!ttm_bo_evict_swapout_allowable(bo, ctx, &locked))
+			bool busy = false;
+
+			if (!ttm_bo_evict_swapout_allowable(bo, ctx, locked,
+							    &busy)) {
+				if (first_bo && !(*first_bo) && busy) {
+					ttm_bo_get(bo);
+					*first_bo = bo;
+				}
 				continue;
+			}
 
 			if (place && !bdev->driver->eviction_valuable(bo,
 								      place)) {
-				if (locked)
+				if (*locked)
 					reservation_object_unlock(bo->resv);
 				continue;
 			}
+
 			break;
 		}
 
@@ -818,9 +831,67 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 		bo = NULL;
 	}
 
+	return bo;
+}
+
+static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
+			       uint32_t mem_type,
+			       const struct ttm_place *place,
+			       struct ttm_operation_ctx *ctx,
+			       struct reservation_object *request_resv)
+{
+	struct ttm_bo_global *glob = bdev->glob;
+	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+	struct ttm_buffer_object *bo = NULL, *first_bo = NULL;
+	bool locked = false;
+	int ret;
+
+	spin_lock(&glob->lru_lock);
+	bo = ttm_mem_find_evitable_bo(bdev, man, place, ctx, &first_bo,
+				      &locked);
 	if (!bo) {
+		struct ttm_operation_ctx busy_ctx;
+
 		spin_unlock(&glob->lru_lock);
-		return -EBUSY;
+		/* check if other user occupy memory too long time */
+		if (!first_bo || !request_resv || !request_resv->lock.ctx) {
+			if (first_bo)
+				ttm_bo_put(first_bo);
+			return -EBUSY;
+		}
+		if (first_bo->resv == request_resv) {
+			ttm_bo_put(first_bo);
+			return -EBUSY;
+		}
+		if (ctx->interruptible)
+			ret = ww_mutex_lock_interruptible(&first_bo->resv->lock,
+							  request_resv->lock.ctx);
+		else
+			ret = ww_mutex_lock(&first_bo->resv->lock, request_resv->lock.ctx);
+		if (ret) {
+			ttm_bo_put(first_bo);
+			return ret;
+		}
+		spin_lock(&glob->lru_lock);
+		/* previous busy resv lock is held by above, idle now,
+		 * so let them evictable.
+		 */
+		busy_ctx.interruptible = ctx->interruptible;
+		busy_ctx.no_wait_gpu = ctx->no_wait_gpu;
+		busy_ctx.resv = first_bo->resv;
+		busy_ctx.flags = TTM_OPT_FLAG_ALLOW_RES_EVICT;
+
+		bo = ttm_mem_find_evitable_bo(bdev, man, place, &busy_ctx, NULL,
+					      &locked);
+		if (bo && (bo->resv == first_bo->resv))
+			locked = true;
+		else if (bo)
+			ww_mutex_unlock(&first_bo->resv->lock);
+		if (!bo) {
+			spin_unlock(&glob->lru_lock);
+			ttm_bo_put(first_bo);
+			return -EBUSY;
+		}
 	}
 
 	kref_get(&bo->list_kref);
@@ -829,11 +900,15 @@ static int ttm_mem_evict_first(struct ttm_bo_device *bdev,
 		ret = ttm_bo_cleanup_refs(bo, ctx->interruptible,
 					  ctx->no_wait_gpu, locked);
 		kref_put(&bo->list_kref, ttm_bo_release_list);
+		if (first_bo)
+			ttm_bo_put(first_bo);
 		return ret;
 	}
 
 	ttm_bo_del_from_lru(bo);
 	spin_unlock(&glob->lru_lock);
+	if (first_bo)
+		ttm_bo_put(first_bo);
 
 	ret = ttm_bo_evict(bo, ctx);
 	if (locked) {
@@ -907,7 +982,7 @@ static int ttm_bo_mem_force_space(struct ttm_buffer_object *bo,
 			return ret;
 		if (mem->mm_node)
 			break;
-		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx);
+		ret = ttm_mem_evict_first(bdev, mem_type, place, ctx, bo->resv);
 		if (unlikely(ret != 0))
 			return ret;
 	} while (1);
@@ -1413,7 +1488,8 @@ static int ttm_bo_force_list_clean(struct ttm_bo_device *bdev,
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		while (!list_empty(&man->lru[i])) {
 			spin_unlock(&glob->lru_lock);
-			ret = ttm_mem_evict_first(bdev, mem_type, NULL, &ctx);
+			ret = ttm_mem_evict_first(bdev, mem_type, NULL, &ctx,
+						  NULL);
 			if (ret)
 				return ret;
 			spin_lock(&glob->lru_lock);
@@ -1784,7 +1860,8 @@ int ttm_bo_swapout(struct ttm_bo_global *glob, struct ttm_operation_ctx *ctx)
 	spin_lock(&glob->lru_lock);
 	for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {
 		list_for_each_entry(bo, &glob->swap_lru[i], swap) {
-			if (ttm_bo_evict_swapout_allowable(bo, ctx, &locked)) {
+			if (ttm_bo_evict_swapout_allowable(bo, ctx, &locked,
+							   NULL)) {
 				ret = 0;
 				break;
 			}
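
For readers who want to follow the control flow outside the kernel, steps 2 to 6 of the commit message can be sketched as a small userspace toy. This is only an illustration under loud assumptions: every `toy_*` name is hypothetical and not part of TTM, the model is single threaded, and "sleeping on a ww_mutex with a ticket" is reduced to a `holder_finishes` flag. It also follows the numbered steps in the commit message rather than the exact rescan that the patch performs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy reservation lock; all names here are hypothetical, not TTM API. */
struct toy_resv {
	int owner;		/* 0 = unlocked, otherwise id of the holder */
	bool holder_finishes;	/* would a blocking wait eventually succeed? */
};

struct toy_bo {
	struct toy_resv *resv;
	bool evicted;
};

/* Non-blocking attempt, standing in for reservation_object_trylock(). */
static bool toy_trylock(struct toy_resv *resv, int me)
{
	if (resv->owner != 0)
		return false;
	resv->owner = me;
	return true;
}

/*
 * Stand-in for a ticketed ww_mutex_lock(). The toy is single threaded, so
 * "sleeping until the holder is done" is modelled by holder_finishes.
 */
static int toy_lock_with_ticket(struct toy_resv *resv, int me)
{
	if (resv->owner != 0 && !resv->holder_finishes)
		return -EDEADLK;
	resv->owner = me;
	return 0;
}

static void toy_unlock(struct toy_resv *resv)
{
	resv->owner = 0;
}

/* First BO on the LRU that has not been evicted yet, or NULL. */
static struct toy_bo *toy_lru_first(struct toy_bo **lru, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!lru[i]->evicted)
			return lru[i];
	return NULL;
}

/* Evict one BO from the LRU; returns 0 on success or a negative errno. */
static int toy_evict_first(struct toy_bo **lru, size_t n, int me,
			   bool have_ticket)
{
	struct toy_bo *first_busy = NULL;
	int ret;

	/* Fast path: take the first BO whose lock is free right now. */
	for (size_t i = 0; i < n; i++) {
		struct toy_bo *bo = lru[i];

		if (bo->evicted)
			continue;
		if (toy_trylock(bo->resv, me)) {
			bo->evicted = true;
			toy_unlock(bo->resv);
			return 0;
		}
		if (!first_busy)
			first_busy = bo;
	}

	/* Step 2: without a ticket we may not sleep on a busy BO. */
	if (!first_busy || !have_ticket)
		return -EBUSY;

	/* Step 3: wait for the busy BO's lock using the ticket. */
	ret = toy_lock_with_ticket(first_busy->resv, me);
	if (ret)
		return ret;

	/* Step 4: the BO must still be at the head of the LRU. */
	if (toy_lru_first(lru, n) != first_busy) {
		toy_unlock(first_busy->resv);
		return -EBUSY;	/* step 6: back off */
	}

	/* Step 5: evict it like any other BO. */
	first_busy->evicted = true;
	toy_unlock(first_busy->resv);
	return 0;
}
```

In this toy, a caller without a ticket gets -EBUSY as soon as every BO on the LRU is busy, while a caller with a ticket waits for the first busy BO and evicts it if it is still at the head of the list. Note that the toy releases the lock on every exit path after the wait succeeds.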