From patchwork Wed Apr 20 17:13:42 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: John Harrison <John.C.Harrison@Intel.com>
X-Patchwork-Id: 8893481
From: John.C.Harrison@Intel.com
To: Intel-GFX@Lists.FreeDesktop.Org
Date: Wed, 20 Apr 2016 18:13:42 +0100
Message-Id: <1461172435-4256-25-git-send-email-John.C.Harrison@Intel.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1461172435-4256-1-git-send-email-John.C.Harrison@Intel.com>
References: <1461172435-4256-1-git-send-email-John.C.Harrison@Intel.com>
Organization: Intel Corporation (UK) Ltd. - Co. Reg. #1134945 - Pipers Way, Swindon SN3 1RJ
Subject: [Intel-gfx] [PATCH v6 24/34] drm/i915: Added scheduler queue throttling by DRM file handle

From: John Harrison <John.C.Harrison@Intel.com>

The scheduler decouples the submission of batch buffers to the driver
from their subsequent submission to the hardware. This means that an
application which is continuously submitting buffers as fast as it can
could potentially flood the driver. To prevent this, the driver now
tracks how many buffers are in progress (queued in software or executing
in hardware) and limits this to a given (tunable) number. If that number
is exceeded then the execbuffer IOCTL will return -EAGAIN, preventing
the scheduler's queue from growing arbitrarily large.

v3: Added a missing decrement of the file queue counter.

v4: Updated a comment.

v5: Updated due to changes to earlier patches in series - removing
forward declarations and white space. Also added some documentation.
[Joonas Lahtinen]

v6: Updated to newer nightly (lots of ring -> engine renaming).

Replaced the simple 'return to userland when full' scheme with a
'sleep on request' scheme. The former could lead to busy polling,
wasting lots of CPU time as userland continuously retried the execbuf
IOCTL in a tight loop. Now the driver will sleep (without holding the
mutex lock) on the oldest request outstanding for that file and then
automatically retry. This is closer to the pre-scheduler behaviour of
stalling on a full ring buffer.

For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
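Reviewer aside, not part of the commit message: the fragment below is a
simplified sketch of the throttling flow this patch implements. All names
here - throttle_full(), find_oldest_incomplete(), struct file_state - are
illustrative stand-ins, not driver symbols; the real implementation is in
the i915_scheduler.c hunks below.

#include <stdbool.h>

struct request;                      /* opaque stand-in for a GPU request */

struct file_state {                  /* per-file book keeping */
        unsigned int queued;         /* outstanding requests for this file */
        unsigned int max_queued;     /* tunable limit (cf. file_queue_max) */
};

/* Stand-ins for driver internals: */
struct request *find_oldest_incomplete(struct file_state *fs);
void get_request(struct request *req);
void put_request(struct request *req);
int wait_for_request(struct request *req);      /* 0 on success */

bool throttle_full(struct file_state *fs)
{
        /* Sleep until the per-file count drops below the tunable limit. */
        while (fs->queued >= fs->max_queued) {
                struct request *oldest = find_oldest_incomplete(fs);

                /* Nothing left to wait on: report the queue as not full. */
                if (!oldest)
                        return false;

                /* Hold a reference and sleep until the request completes. */
                get_request(oldest);
                if (wait_for_request(oldest)) {
                        /* Cannot stall (wedged GPU, signal, ...): full. */
                        put_request(oldest);
                        return true;
                }
                put_request(oldest);
        }

        return false;           /* space available, accept the request */
}

The execbuf path then reduces to 'if (throttle_full(fs)) return -EAGAIN;',
so userland only ever sees -EAGAIN when the driver could not sleep on its
behalf.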
 drivers/gpu/drm/i915/i915_drv.h            |   2 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |   8 ++
 drivers/gpu/drm/i915/i915_scheduler.c      | 118 +++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h      |   2 +
 4 files changed, 130 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e9aaacc..25b8fd6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -376,6 +376,8 @@ struct drm_i915_file_private {
         } rps;

         unsigned int bsd_ring;
+
+        u32 scheduler_queue_length;
 };

 /* Used by dp and fdi links */
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index a08638a..1f8486e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1818,6 +1818,10 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
                 return -EINVAL;
         }

+        /* Throttle batch requests per device file */
+        if (i915_scheduler_file_queue_wait(file))
+                return -EAGAIN;
+
         /* Copy in the exec list from userland */
         exec_list = drm_malloc_ab(sizeof(*exec_list), args->buffer_count);
         exec2_list = drm_malloc_ab(sizeof(*exec2_list), args->buffer_count);
@@ -1908,6 +1912,10 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
                 return -EINVAL;
         }

+        /* Throttle batch requests per device file */
+        if (i915_scheduler_file_queue_wait(file))
+                return -EAGAIN;
+
         exec2_list = drm_malloc_gfp(args->buffer_count,
                                     sizeof(*exec2_list),
                                     GFP_TEMPORARY);
diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c
index a3a7a82..0908370 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.c
+++ b/drivers/gpu/drm/i915/i915_scheduler.c
@@ -80,6 +80,7 @@ int i915_scheduler_init(struct drm_device *dev)
         scheduler->priority_level_bump = 50;
         scheduler->priority_level_preempt = 900;
         scheduler->min_flying = 2;
+        scheduler->file_queue_max = 64;

         dev_priv->scheduler = scheduler;

@@ -496,6 +497,28 @@ static int i915_scheduler_submit_unlocked(struct intel_engine_cs *engine)
         return ret;
 }

+/**
+ * i915_scheduler_file_queue_inc - Increment the file's request queue count.
+ * @file: File object to process.
+ */
+static void i915_scheduler_file_queue_inc(struct drm_file *file)
+{
+        struct drm_i915_file_private *file_priv = file->driver_priv;
+
+        file_priv->scheduler_queue_length++;
+}
+
+/**
+ * i915_scheduler_file_queue_dec - Decrement the file's request queue count.
+ * @file: File object to process.
+ */
+static void i915_scheduler_file_queue_dec(struct drm_file *file)
+{
+        struct drm_i915_file_private *file_priv = file->driver_priv;
+
+        file_priv->scheduler_queue_length--;
+}
+
 static void i915_generate_dependencies(struct i915_scheduler *scheduler,
                                        struct i915_scheduler_queue_entry *node,
                                        uint32_t engine)
@@ -675,6 +698,8 @@ int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)

         list_add_tail(&node->link, &scheduler->node_queue[engine->id]);

+        i915_scheduler_file_queue_inc(node->params.file);
+
         not_flying = i915_scheduler_count_flying(scheduler, engine) <
                                                  scheduler->min_flying;
@@ -871,6 +896,12 @@ static bool i915_scheduler_remove(struct i915_scheduler *scheduler,
                         /* Strip the dependency info while the mutex is still locked */
                         i915_scheduler_remove_dependent(scheduler, node);

+                        /* Likewise clean up the file pointer. */
+                        if (node->params.file) {
+                                i915_scheduler_file_queue_dec(node->params.file);
+                                node->params.file = NULL;
+                        }
+
                         continue;
                 }

@@ -963,6 +994,92 @@ void i915_scheduler_work_handler(struct work_struct *work)

         i915_scheduler_process_work(engine);
 }

+/**
+ * i915_scheduler_file_queue_wait - Waits for space in the per file queue.
+ * @file: File object to process.
+ *
+ * This allows throttling of applications by limiting the total number of
+ * outstanding requests to a specified level. Once that limit is reached,
+ * this call will stall waiting on the oldest outstanding request. If it
+ * cannot stall for any reason it returns true to mean that the queue is
+ * full and no more requests should be accepted.
+ */
+bool i915_scheduler_file_queue_wait(struct drm_file *file)
+{
+        struct drm_i915_file_private *file_priv = file->driver_priv;
+        struct drm_i915_private *dev_priv = file_priv->dev_priv;
+        struct i915_scheduler *scheduler = dev_priv->scheduler;
+        struct drm_i915_gem_request *req;
+        struct i915_scheduler_queue_entry *node;
+        unsigned reset_counter;
+        int ret;
+        struct intel_engine_cs *engine;
+
+        if (file_priv->scheduler_queue_length < scheduler->file_queue_max)
+                return false;
+
+        do {
+                spin_lock_irq(&scheduler->lock);
+
+                /*
+                 * Find the first (i.e. oldest) request for this file. In the
+                 * case where an app is using multiple engines, this search
+                 * might be skewed by engine. However, worst case is an app
+                 * has queued ~60 requests to a high indexed engine and then
+                 * one request to a low indexed engine. In such a case, the
+                 * driver will wait for longer than necessary but operation
+                 * will still be correct and that case is not rare enough to
+                 * add jiffy based inter-engine checks.
+                 */
+                req = NULL;
+                for_each_engine(engine, dev_priv) {
+                        for_each_scheduler_node(node, engine->id) {
+                                if (I915_SQS_IS_COMPLETE(node))
+                                        continue;
+
+                                if (node->params.file != file)
+                                        continue;
+
+                                req = node->params.request;
+                                break;
+                        }
+
+                        if (req)
+                                break;
+                }
+
+                if (!req) {
+                        spin_unlock_irq(&scheduler->lock);
+                        return false;
+                }
+
+                i915_gem_request_reference(req);
+
+                spin_unlock_irq(&scheduler->lock);
+
+                ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
+                if (ret)
+                        goto err_unref;
+
+                reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+                ret = __i915_wait_request(req, reset_counter,
+                                          I915_WAIT_REQUEST_INTERRUPTIBLE,
+                                          NULL, NULL);
+                if (ret)
+                        goto err_unref;
+
+                /* Make sure the request's resources actually get cleared up */
+                i915_scheduler_process_work(req->engine);
+
+                i915_gem_request_unreference(req);
+        } while (file_priv->scheduler_queue_length >= scheduler->file_queue_max);
+
+        return false;
+
+err_unref:
+        i915_gem_request_unreference(req);
+        return true;
+}
+
 static int i915_scheduler_submit_max_priority(struct intel_engine_cs *engine,
                                               bool is_locked)
 {
@@ -1185,6 +1302,7 @@ void i915_scheduler_closefile(struct drm_device *dev, struct drm_file *file)
                                   node->status,
                                   engine->name);

+                        i915_scheduler_file_queue_dec(node->params.file);
                         node->params.file = NULL;
                 }
         }
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
index 4e7c0a7..5c33c83 100644
--- a/drivers/gpu/drm/i915/i915_scheduler.h
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -94,6 +94,7 @@ struct i915_scheduler {
         int32_t priority_level_bump;
         int32_t priority_level_preempt;
         uint32_t min_flying;
+        uint32_t file_queue_max;
 };

 /* Flag bits for i915_scheduler::flags */
@@ -116,5 +117,6 @@ int i915_scheduler_flush(struct intel_engine_cs *engine, bool is_locked);
 int i915_scheduler_flush_stamp(struct intel_engine_cs *engine,
                                unsigned long stamp, bool is_locked);
 bool i915_scheduler_is_mutex_required(struct drm_i915_gem_request *req);
+bool i915_scheduler_file_queue_wait(struct drm_file *file);

 #endif /* _I915_SCHEDULER_H_ */
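
Usage note, not part of the patch: with this change an application that
floods the queue now blocks inside the execbuf IOCTL instead of failing
immediately, but userland should still be prepared for -EAGAIN, which is
returned whenever the driver cannot stall (e.g. a wedged GPU or an
interrupted wait). A minimal retry loop - submit_execbuf() is a
hypothetical helper for illustration, not a libdrm API:

#include <errno.h>
#include <sys/ioctl.h>
#include <i915_drm.h>   /* DRM_IOCTL_I915_GEM_EXECBUFFER2, from libdrm */

/* Submit a batch, retrying while the scheduler's per-file queue is full. */
static int submit_execbuf(int fd, struct drm_i915_gem_execbuffer2 *execbuf)
{
        int ret;

        do {
                ret = ioctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, execbuf);
        } while (ret == -1 && (errno == EAGAIN || errno == EINTR));

        return ret == -1 ? -errno : 0;
}

Note that libdrm's drmIoctl() already performs exactly this retry, so
applications submitting through libdrm pick up the new behaviour for free.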