From patchwork Thu Apr 18 15:00:23 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 10907495
X-Original-To: dri-devel@lists.freedesktop.org
From: Andrey Grodzovsky
Subject: [PATCH v5 5/6] drm/scheduler: Add flag to hint the release of guilty job.
Date: Thu, 18 Apr 2019 11:00:23 -0400
Message-ID: <1555599624-12285-5-git-send-email-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1555599624-12285-1-git-send-email-andrey.grodzovsky@amd.com>
References: <1555599624-12285-1-git-send-email-andrey.grodzovsky@amd.com>
List-Id: Direct Rendering Infrastructure - Development
Cc: Nicholas.Kazlauskas@amd.com

Problem:
The sched thread's cleanup function races against the timeout (TO)
handler and can remove the guilty job from the mirror list. The TO
handler then has no way of telling whether the job was removed from
within the TO handler itself or from the sched thread's cleanup
function.

Fix:
Add a flag to the scheduler that hints to the TO handler that the
guilty job needs to be explicitly released.

Signed-off-by: Andrey Grodzovsky
---
 drivers/gpu/drm/scheduler/sched_main.c | 9 +++++++--
 include/drm/gpu_scheduler.h            | 2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 03e6bd8..f8f0e1c 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -293,8 +293,10 @@ static void drm_sched_job_timedout(struct work_struct *work)
 	 * Guilty job did complete and hence needs to be manually removed
 	 * See drm_sched_stop doc.
 	 */
-	if (list_empty(&job->node))
+	if (sched->free_guilty) {
 		job->sched->ops->free_job(job);
+		sched->free_guilty = false;
+	}
 
 	spin_lock_irqsave(&sched->job_list_lock, flags);
 	drm_sched_start_timeout(sched);
@@ -395,10 +397,13 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad)
 
 			/*
 			 * We must keep bad job alive for later use during
-			 * recovery by some of the drivers
+			 * recovery by some of the drivers but leave a hint
+			 * that the guilty job must be released.
 			 */
 			if (bad != s_job)
 				sched->ops->free_job(s_job);
+			else
+				sched->free_guilty = true;
 		}
 	}
 
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9ee0f27..fc0b421 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -259,6 +259,7 @@ struct drm_sched_backend_ops {
  *              guilty and it will be considered for scheduling further.
  * @num_jobs: the number of jobs in queue in the scheduler
  * @ready: marks if the underlying HW is ready to work
+ * @free_guilty: A hint to the timeout handler to free the guilty job.
  *
  * One scheduler is implemented for each hardware ring.
  */
@@ -279,6 +280,7 @@ struct drm_gpu_scheduler {
 	int				hang_limit;
 	atomic_t			num_jobs;
 	bool				ready;
+	bool				free_guilty;
 };
 
 int drm_sched_init(struct drm_gpu_scheduler *sched,
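
For readers who want to see the handshake in isolation, here is a minimal userspace sketch of the flag protocol this patch introduces. It is not the scheduler code: the names model_job, model_sched, model_free_job, model_stop and model_timedout_tail are hypothetical stand-ins, the mirror list is reduced to a plain array, and locking and fences are left out entirely. It only illustrates that the stop path records the hint instead of freeing the guilty job, and that the timeout handler consumes the hint exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct drm_sched_job. */
struct model_job {
	bool completed;   /* models "HW fence already signaled" */
	int free_count;   /* number of times the job was released */
};

/* Hypothetical stand-in for struct drm_gpu_scheduler. */
struct model_sched {
	bool free_guilty; /* hint from the stop path to the timeout handler */
};

/* Models ops->free_job(); a second release of the same job is a bug. */
static void model_free_job(struct model_job *job)
{
	assert(job->free_count == 0);
	job->free_count++;
}

/*
 * Models the part of drm_sched_stop() touched by the patch: completed
 * jobs are released, except the guilty one, which is kept alive and
 * flagged for later release.
 */
static void model_stop(struct model_sched *sched, struct model_job *jobs,
		       size_t n, struct model_job *bad)
{
	for (size_t i = 0; i < n; i++) {
		struct model_job *job = &jobs[i];

		if (!job->completed)
			continue; /* still in flight, nothing to release */

		if (job != bad)
			model_free_job(job);
		else
			sched->free_guilty = true;
	}
}

/* Models the tail of drm_sched_job_timedout(): consume the hint once. */
static void model_timedout_tail(struct model_sched *sched,
				struct model_job *bad)
{
	if (sched->free_guilty) {
		model_free_job(bad);
		sched->free_guilty = false;
	}
}
```

In this model, calling model_stop() with a guilty job that has already completed sets free_guilty, and the following model_timedout_tail() call releases that job and clears the flag, so a repeated call does nothing. If the guilty job has not completed, the flag stays false and the job is not released by the handler.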