From patchwork Fri Jun 1 17:11:46 2018
X-Patchwork-Submitter: Andrey Grodzovsky
X-Patchwork-Id: 10444149
From: Andrey Grodzovsky
Subject: [PATCH v2 1/2] drm/scheduler: Avoid using wait_event_killable for dying process.
Date: Fri, 1 Jun 2018 13:11:46 -0400
Message-ID: <1527873107-32642-1-git-send-email-andrey.grodzovsky@amd.com>
X-Mailer: git-send-email 2.7.4
Cc: Christian.Koenig@amd.com

A dying process might be blocked from receiving any more signals, so avoid
using wait_event_killable for it.

Also retire entity->fini_status and just check the SW queue: if it is not
empty, do the fallback cleanup.

Also handle the entity->last_scheduled == NULL case, which happens when the
HW ring is already hung by the time a new entity tries to enqueue jobs.

v2:
Return the remaining timeout and use it as the timeout parameter for the
next call. This way, when we need to clean up multiple queues, we don't
wait for the entire TO period for each queue but rather in total (see the
usage sketch after the patch).
Styling comments.
Rebase.

Signed-off-by: Andrey Grodzovsky
---
 drivers/gpu/drm/scheduler/gpu_scheduler.c | 74 ++++++++++++++++++++++++-------
 include/drm/gpu_scheduler.h               |  7 +--
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
index 8c1e80c..c594d17 100644
--- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
@@ -181,7 +181,6 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 	entity->rq = rq;
 	entity->sched = sched;
 	entity->guilty = guilty;
-	entity->fini_status = 0;
 	entity->last_scheduled = NULL;
 
 	spin_lock_init(&entity->rq_lock);
@@ -219,7 +218,8 @@ static bool drm_sched_entity_is_initialized(struct drm_gpu_scheduler *sched,
 static bool drm_sched_entity_is_idle(struct drm_sched_entity *entity)
 {
 	rmb();
-	if (spsc_queue_peek(&entity->job_queue) == NULL)
+
+	if (!entity->rq || spsc_queue_peek(&entity->job_queue) == NULL)
 		return true;
 
 	return false;
@@ -260,25 +260,48 @@ static void drm_sched_entity_kill_jobs_cb(struct dma_fence *f,
  *
  * @sched: scheduler instance
  * @entity: scheduler entity
+ * @timeout: time to wait in ms for Q to become empty.
  *
  * Splitting drm_sched_entity_fini() into two functions, The first one does the waiting,
  * removes the entity from the runqueue and returns an error when the process was killed.
+ *
+ * Returns amount of time spent in waiting for TO.
+ * 0 if wait wasn't with time out.
+ * MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS if wait timed out with condition false
+ * Number of MS spent in waiting before condition became true
+ *
  */
-void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
-			   struct drm_sched_entity *entity)
+unsigned drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
+			   struct drm_sched_entity *entity, unsigned timeout)
 {
+	unsigned ret = 0;
+
 	if (!drm_sched_entity_is_initialized(sched, entity))
 		return;
 	/**
 	 * The client will not queue more IBs during this fini, consume existing
 	 * queued IBs or discard them on SIGKILL
 	*/
-	if ((current->flags & PF_SIGNALED) && current->exit_code == SIGKILL)
-		entity->fini_status = -ERESTARTSYS;
-	else
-		entity->fini_status = wait_event_killable(sched->job_scheduled,
-					drm_sched_entity_is_idle(entity));
-	drm_sched_entity_set_rq(entity, NULL);
+	if (current->flags & PF_EXITING) {
+		if (timeout) {
+			ret = jiffies_to_msecs(
+				wait_event_timeout(
+					sched->job_scheduled,
+					drm_sched_entity_is_idle(entity),
+					msecs_to_jiffies(timeout)));
+
+			if (!ret)
+				ret = MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS;
+		}
+	} else
+		wait_event_killable(sched->job_scheduled, drm_sched_entity_is_idle(entity));
+
+
+	/* For killed process disable any more IBs enqueue right now */
+	if ((current->flags & PF_EXITING) && (current->exit_code == SIGKILL))
+		drm_sched_entity_set_rq(entity, NULL);
+
+	return ret;
 }
 EXPORT_SYMBOL(drm_sched_entity_do_release);
 
@@ -290,11 +313,18 @@ EXPORT_SYMBOL(drm_sched_entity_do_release);
  *
  * This should be called after @drm_sched_entity_do_release. It goes over the
  * entity and signals all jobs with an error code if the process was killed.
+ *
  */
 void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			   struct drm_sched_entity *entity)
 {
-	if (entity->fini_status) {
+
+	drm_sched_entity_set_rq(entity, NULL);
+
+	/* Consumption of existing IBs wasn't completed. Forcefully
+	 * remove them here.
+	 */
+	if (spsc_queue_peek(&entity->job_queue)) {
 		struct drm_sched_job *job;
 		int r;
 
@@ -314,12 +344,22 @@ void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			struct drm_sched_fence *s_fence = job->s_fence;
 			drm_sched_fence_scheduled(s_fence);
 			dma_fence_set_error(&s_fence->finished, -ESRCH);
-			r = dma_fence_add_callback(entity->last_scheduled, &job->finish_cb,
-						   drm_sched_entity_kill_jobs_cb);
-			if (r == -ENOENT)
+
+			/*
+			 * When pipe is hanged by older entity, new entity might
+			 * not even have chance to submit it's first job to HW
+			 * and so entity->last_scheduled will remain NULL
+			 */
+			if (!entity->last_scheduled) {
 				drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
-			else if (r)
-				DRM_ERROR("fence add callback failed (%d)\n", r);
+			} else {
+				r = dma_fence_add_callback(entity->last_scheduled, &job->finish_cb,
+							   drm_sched_entity_kill_jobs_cb);
+				if (r == -ENOENT)
+					drm_sched_entity_kill_jobs_cb(NULL, &job->finish_cb);
+				else if (r)
+					DRM_ERROR("fence add callback failed (%d)\n", r);
+			}
 		}
 	}
 
@@ -339,7 +379,7 @@ EXPORT_SYMBOL(drm_sched_entity_cleanup);
 void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
 				struct drm_sched_entity *entity)
 {
-	drm_sched_entity_do_release(sched, entity);
+	drm_sched_entity_do_release(sched, entity, MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS);
 	drm_sched_entity_cleanup(sched, entity);
 }
 EXPORT_SYMBOL(drm_sched_entity_fini);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 496442f..af07875 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -27,6 +27,8 @@
 #include
 #include
 
+#define MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS 1000
+
 struct drm_gpu_scheduler;
 struct drm_sched_rq;
 
@@ -84,7 +86,6 @@ struct drm_sched_entity {
 	struct dma_fence		*dependency;
 	struct dma_fence_cb		cb;
 	atomic_t			*guilty;
-	int				fini_status;
 	struct dma_fence		*last_scheduled;
 };
 
@@ -283,8 +284,8 @@ int drm_sched_entity_init(struct drm_gpu_scheduler *sched,
 			  struct drm_sched_entity *entity,
 			  struct drm_sched_rq *rq,
 			  atomic_t *guilty);
-void drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
-			   struct drm_sched_entity *entity);
+unsigned drm_sched_entity_do_release(struct drm_gpu_scheduler *sched,
+			   struct drm_sched_entity *entity, unsigned timeout);
 void drm_sched_entity_cleanup(struct drm_gpu_scheduler *sched,
 			   struct drm_sched_entity *entity);
 void drm_sched_entity_fini(struct drm_gpu_scheduler *sched,
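
Editor's note, for illustration only (not part of this patch): a minimal,
hypothetical sketch of the chained-timeout usage described in the v2 note.
The helper name my_release_entities(), the entities array and num_entities
are made-up placeholders; the sketch assumes drm_sched_entity_do_release()
returns the milliseconds spent waiting, as documented in the hunk above,
and shrinks the remaining budget before releasing the next entity.

/* Hypothetical caller: spend at most MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS
 * across all entities in total rather than per entity.
 */
static void my_release_entities(struct drm_gpu_scheduler *sched,
				struct drm_sched_entity *entities,
				unsigned num_entities)
{
	unsigned budget = MAX_WAIT_SCHED_ENTITY_Q_EMPTY_MS;
	unsigned i;

	for (i = 0; i < num_entities; i++) {
		/* Wait only for whatever is left of the overall budget. */
		unsigned spent = drm_sched_entity_do_release(sched,
							     &entities[i],
							     budget);

		budget = (spent < budget) ? budget - spent : 0;
	}

	/* Second pass: forcefully clean up anything still queued. */
	for (i = 0; i < num_entities; i++)
		drm_sched_entity_cleanup(sched, &entities[i]);
}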