Message ID | 1461172435-4256-7-git-send-email-John.C.Harrison@Intel.com (mailing list archive)
---|---
State | New, archived
Hi, Just a few random comments/questions. (not a full review!) On 20/04/16 18:13, John.C.Harrison@Intel.com wrote: > From: John Harrison <John.C.Harrison@Intel.com> > > Initial creation of scheduler source files. Note that this patch > implements most of the scheduler functionality but does not hook it in > to the driver yet. It also leaves the scheduler code in 'pass through' > mode so that even when it is hooked in, it will not actually do very > much. This allows the hooks to be added one at a time in bite size > chunks and only when the scheduler is finally enabled at the end does > anything start happening. > > The general theory of operation is that when batch buffers are > submitted to the driver, the execbuffer() code packages up all the > information required to execute the batch buffer at a later time. This > package is given over to the scheduler which adds it to an internal > node list. The scheduler also scans the list of objects associated > with the batch buffer and compares them against the objects already in > use by other buffers in the node list. If matches are found then the > new batch buffer node is marked as being dependent upon the matching > node. The same is done for the context object. The scheduler also > bumps up the priority of such matching nodes on the grounds that the > more dependencies a given batch buffer has the more important it is > likely to be. > > The scheduler aims to have a given (tuneable) number of batch buffers > in flight on the hardware at any given time. If fewer than this are > currently executing when a new node is queued, then the node is passed > straight through to the submit function. Otherwise it is simply added > to the queue and the driver returns back to user land. > > The scheduler is notified when each batch buffer completes and updates > its internal tracking accordingly. At the end of the completion > interrupt processing, if any scheduler tracked batches were processed, > the scheduler's deferred worker thread is woken up. This can do more > involved processing such as actually removing completed nodes from the > queue and freeing up the resources associated with them (internal > memory allocations, DRM object references, context reference, etc.). > The work handler also checks the in flight count and calls the > submission code if a new slot has appeared. > > When the scheduler's submit code is called, it scans the queued node > list for the highest priority node that has no unmet dependencies. > Note that the dependency calculation is complex as it must take > inter-ring dependencies and potential preemptions into account. Note > also that in the future this will be extended to include external > dependencies such as the Android Native Sync file descriptors and/or > the linux dma-buff synchronisation scheme. > > If a suitable node is found then it is sent to execbuff_final() for > submission to the hardware. The in flight count is then re-checked and > a new node popped from the list if appropriate. All nodes that are not > submitted have their priority bumped. This ensures that low priority > tasks do not get starved out by busy higher priority ones - everything > will eventually get its turn to run. > > Note that this patch does not implement pre-emptive scheduling. Only > basic scheduling by re-ordering batch buffer submission is currently > implemented. Pre-emption of actively executing batch buffers comes in > the next patch series. > > v2: Changed priority levels to +/-1023 due to feedback from Chris > Wilson. 
> > Removed redundant index from scheduler node. > > Changed time stamps to use jiffies instead of raw monotonic. This > provides lower resolution but improved compatibility with other i915 > code. > > Major re-write of completion tracking code due to struct fence > conversion. The scheduler no longer has it's own private IRQ handler > but just lets the existing request code handle completion events. > Instead, the scheduler now hooks into the request notify code to be > told when a request has completed. > > Reduced driver mutex locking scope. Removal of scheduler nodes no > longer grabs the mutex lock. > > v3: Refactor of dependency generation to make the code more readable. > Also added in read-read optimisation support - i.e., don't treat a > shared read-only buffer as being a dependency. > > Allowed the killing of queued nodes rather than only flying ones. > > v4: Updated the commit message to better reflect the current state of > the code. Downgraded some BUG_ONs to WARN_ONs. Used the correct array > memory allocator function (kmalloc_array instead of kmalloc). > Corrected the format of some comments. Wrapped some lines differently > to keep the style checker happy. > > Fixed a WARN_ON when killing nodes. The dependency removal code checks > that nodes being destroyed do not have any oustanding dependencies > (which would imply they should not have been executed yet). In the > case of nodes being destroyed, e.g. due to context banning, then this > might well be the case - they have not been executed and do indeed > have outstanding dependencies. > > Re-instated the code to disble interrupts when not in use. The > underlying problem causing broken IRQ reference counts seems to have > been fixed now. > > v5: Shuffled various functions around to remove forward declarations > as apparently these are frowned upon. Removed lots of white space as > apparently having easy to read code is also frowned upon. Split the > direct submission scheduler bypass code out into a separate function. > Squashed down the i915_scheduler.c sections of various patches into > this patch. Thus the later patches simply hook in existing code into > various parts of the driver rather than adding the code as well. Added > documentation to various functions. Re-worked the submit function in > terms of mutex locking, error handling and exit paths. Split the > delayed work handler function in half. Made use of the kernel 'clamp' > macro. [Joonas Lahtinen] > > Added runtime PM calls as these must be done at the top level before > acquiring the driver mutex lock. [Chris Wilson] > > Removed some obsolete debug code that had been forgotten about. > > Moved more clean up code into the 'i915_gem_scheduler_clean_node()' > function rather than replicating it in mutliple places. > > Used lighter weight spinlocks. > > v6: Updated to newer nightly (lots of ring -> engine renaming). > > Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()' > helper macros. Renamed 'i915_gem_execbuff_release_batch_obj' to > 'i915_gem_execbuf_release_batch_obj'. Updated to use 'to_i915()' > instead of dev_private. Converted all enum labels to uppercase. > Removed various unnecessary WARNs. Renamed 'saved_objects' to just > 'objs'. Split code for counting incomplete nodes out into a separate > function. Removed even more white space. Added a destroy() function. > [review feedback from Joonas Lahtinen] > > Added running totals of 'flying' and 'queued' nodes rather than > re-calculating each time as a minor CPU performance optimisation. 
> > Removed support for out of order seqno completion. All the prep work > patch series (seqno to request conversion, late seqno assignment, > etc.) that has now been done means that the scheduler no longer > generates out of order seqno completions. Thus all the complex code > for coping with such is no longer required and can be removed. > > Fixed a bug in scheduler bypass mode introduced in the clean up code > refactoring of v5. The clean up function was seeing the node in the > wrong state and thus refusing to process it. > > For: VIZ-1587 > Signed-off-by: John Harrison <John.C.Harrison@Intel.com> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > --- > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/i915_dma.c | 3 + > drivers/gpu/drm/i915/i915_drv.h | 6 + > drivers/gpu/drm/i915/i915_gem.c | 5 + > drivers/gpu/drm/i915/i915_scheduler.c | 867 ++++++++++++++++++++++++++++++++++ > drivers/gpu/drm/i915/i915_scheduler.h | 113 +++++ > 6 files changed, 995 insertions(+) > create mode 100644 drivers/gpu/drm/i915/i915_scheduler.c > create mode 100644 drivers/gpu/drm/i915/i915_scheduler.h > > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e9cdeb5..289fa73 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -10,6 +10,7 @@ ccflags-y := -Werror > i915-y := i915_drv.o \ > i915_irq.o \ > i915_params.o \ > + i915_scheduler.o \ > i915_suspend.o \ > i915_sysfs.o \ > intel_csr.o \ > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index b377753..2ad4071 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -37,6 +37,7 @@ > #include "i915_drv.h" > #include "i915_vgpu.h" > #include "i915_trace.h" > +#include "i915_scheduler.h" > #include <linux/pci.h> > #include <linux/console.h> > #include <linux/vt.h> > @@ -1448,6 +1449,8 @@ int i915_driver_unload(struct drm_device *dev) > > intel_csr_ucode_fini(dev_priv); > > + i915_scheduler_destroy(dev_priv); > + > /* Free error state after interrupts are fully disabled. 
*/ > cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); > i915_destroy_error_state(dev); > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 7492ce7..7b62e2c 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1717,6 +1717,8 @@ struct i915_execbuffer_params { > struct drm_i915_gem_request *request; > }; > > +struct i915_scheduler; > + > /* used in computing the new watermarks state */ > struct intel_wm_config { > unsigned int num_pipes_active; > @@ -1994,6 +1996,8 @@ struct drm_i915_private { > > struct i915_runtime_pm pm; > > + struct i915_scheduler *scheduler; > + > /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ > struct { > int (*execbuf_submit)(struct i915_execbuffer_params *params, > @@ -2335,6 +2339,8 @@ struct drm_i915_gem_request { > /** process identifier submitting this request */ > struct pid *pid; > > + struct i915_scheduler_queue_entry *scheduler_qe; > + > /** > * The ELSP only accepts two elements at a time, so we queue > * context/tail pairs on a given queue (ring->execlist_queue) until the > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index a632276..b7466cb 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -32,6 +32,7 @@ > #include "i915_vgpu.h" > #include "i915_trace.h" > #include "intel_drv.h" > +#include "i915_scheduler.h" > #include <linux/shmem_fs.h> > #include <linux/slab.h> > #include <linux/swap.h> > @@ -5405,6 +5406,10 @@ int i915_gem_init(struct drm_device *dev) > */ > intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); > > + ret = i915_scheduler_init(dev); > + if (ret) > + goto out_unlock; > + > ret = i915_gem_init_userptr(dev); > if (ret) > goto out_unlock; > diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c > new file mode 100644 > index 0000000..9d628b9 > --- /dev/null > +++ b/drivers/gpu/drm/i915/i915_scheduler.c > @@ -0,0 +1,867 @@ > +/* > + * Copyright (c) 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > + * IN THE SOFTWARE. 
> + * > + */ > + > +#include "i915_drv.h" > +#include "intel_drv.h" > +#include "i915_scheduler.h" > + > +#define for_each_scheduler_node(node, id) \ > + list_for_each_entry((node), &scheduler->node_queue[(id)], link) > + > +#define assert_scheduler_lock_held(scheduler) \ > + do { \ > + WARN_ONCE(!spin_is_locked(&(scheduler)->lock), "Spinlock not locked!"); \ > + } while(0) > + > +/** > + * i915_scheduler_is_enabled - Returns true if the scheduler is enabled. > + * @dev: DRM device > + */ > +bool i915_scheduler_is_enabled(struct drm_device *dev) > +{ > + struct drm_i915_private *dev_priv = to_i915(dev); > + > + return dev_priv->scheduler != NULL; > +} > + > +/** > + * i915_scheduler_init - Initialise the scheduler. > + * @dev: DRM device > + * Returns zero on success or -ENOMEM if memory allocations fail. > + */ > +int i915_scheduler_init(struct drm_device *dev) > +{ > + struct drm_i915_private *dev_priv = to_i915(dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + int e; > + > + if (scheduler) > + return 0; Probably a GEM_BUG_ON or something. > + > + scheduler = kzalloc(sizeof(*scheduler), GFP_KERNEL); > + if (!scheduler) > + return -ENOMEM; > + > + spin_lock_init(&scheduler->lock); > + > + for (e = 0; e < I915_NUM_ENGINES; e++) { > + INIT_LIST_HEAD(&scheduler->node_queue[e]); > + scheduler->counts[e].flying = 0; > + scheduler->counts[e].queued = 0; > + } > + > + /* Default tuning values: */ > + scheduler->priority_level_min = -1023; > + scheduler->priority_level_max = 1023; > + scheduler->priority_level_bump = 50; > + scheduler->priority_level_preempt = 900; > + scheduler->min_flying = 2; > + > + dev_priv->scheduler = scheduler; > + > + return 0; > +} > + > +/** > + * i915_scheduler_destroy - Get rid of the scheduler. > + * @dev: DRM device > + */ > +void i915_scheduler_destroy(struct drm_i915_private *dev_priv) > +{ > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + int e; > + > + if (!scheduler) > + return; > + > + for (e = 0; e < I915_NUM_ENGINES; e++) > + WARN(!list_empty(&scheduler->node_queue[e]), "Destroying with list entries on engine %d!", e); > + > + kfree(scheduler); > + dev_priv->scheduler = NULL; > +} > + > +/* > + * Add a popped node back in to the queue. For example, because the engine > + * was hung when execfinal() was called and thus the engine submission needs > + * to be retried later. > + */ > +static void i915_scheduler_node_requeue(struct i915_scheduler *scheduler, > + struct i915_scheduler_queue_entry *node) > +{ > + assert_scheduler_lock_held(scheduler); > + > + WARN_ON(!I915_SQS_IS_FLYING(node)); > + > + /* Seqno will be reassigned on relaunch */ > + node->params.request->seqno = 0; > + node->status = I915_SQS_QUEUED; > + scheduler->counts[node->params.engine->id].flying--; > + scheduler->counts[node->params.engine->id].queued++; > +} > + > +/* > + * Give up on a node completely. For example, because it is causing the > + * engine to hang or is using some resource that no longer exists. > + */ > +static void i915_scheduler_node_kill(struct i915_scheduler *scheduler, > + struct i915_scheduler_queue_entry *node) > +{ > + assert_scheduler_lock_held(scheduler); > + > + WARN_ON(I915_SQS_IS_COMPLETE(node)); > + > + if (I915_SQS_IS_FLYING(node)) > + scheduler->counts[node->params.engine->id].flying--; > + else > + scheduler->counts[node->params.engine->id].queued--; > + > + node->status = I915_SQS_DEAD; > +} > + > +/* Mark a node as in flight on the hardware. 
*/ > +static void i915_scheduler_node_fly(struct i915_scheduler_queue_entry *node) > +{ > + struct drm_i915_private *dev_priv = to_i915(node->params.dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct intel_engine_cs *engine = node->params.engine; > + > + assert_scheduler_lock_held(scheduler); > + > + WARN_ON(node->status != I915_SQS_POPPED); > + > + /* > + * Add the node (which should currently be in state popped) to the > + * front of the queue. This ensure that flying nodes are always held > + * in hardware submission order. > + */ > + list_add(&node->link, &scheduler->node_queue[engine->id]); > + > + node->status = I915_SQS_FLYING; > + > + scheduler->counts[engine->id].flying++; > + > + if (!(scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) { > + bool success = true; > + > + success = engine->irq_get(engine); > + if (success) > + scheduler->flags[engine->id] |= I915_SF_INTERRUPTS_ENABLED; > + } > +} > + > +static inline uint32_t i915_scheduler_count_flying(struct i915_scheduler *scheduler, > + struct intel_engine_cs *engine) > +{ > + return scheduler->counts[engine->id].flying; > +} Maybe num_flying would be a more obvious name, or at least less worry inducing. > + > +static void i915_scheduler_priority_bump_clear(struct i915_scheduler *scheduler) > +{ > + struct i915_scheduler_queue_entry *node; > + int i; > + > + assert_scheduler_lock_held(scheduler); > + > + /* > + * Ensure circular dependencies don't cause problems and that a bump > + * by object usage only bumps each using buffer once: > + */ > + for (i = 0; i < I915_NUM_ENGINES; i++) { > + for_each_scheduler_node(node, i) > + node->bumped = false; > + } > +} > + > +static int i915_scheduler_priority_bump(struct i915_scheduler *scheduler, > + struct i915_scheduler_queue_entry *target, > + uint32_t bump) > +{ > + uint32_t new_priority; > + int i, count; > + > + if (target->priority >= scheduler->priority_level_max) > + return 1; > + > + if (target->bumped) > + return 0; > + > + new_priority = target->priority + bump; > + if ((new_priority <= target->priority) || > + (new_priority > scheduler->priority_level_max)) > + target->priority = scheduler->priority_level_max; > + else > + target->priority = new_priority; > + > + count = 1; > + target->bumped = true; > + > + for (i = 0; i < target->num_deps; i++) { > + if (!target->dep_list[i]) > + continue; > + > + if (target->dep_list[i]->bumped) > + continue; > + > + count += i915_scheduler_priority_bump(scheduler, > + target->dep_list[i], > + bump); > + } > + > + return count; > +} > + > +/* > + * Nodes are considered valid dependencies if they are queued on any engine > + * or if they are in flight on a different engine. In flight on the same > + * engine is no longer interesting for non-premptive nodes as the engine > + * serialises execution. For pre-empting nodes, all in flight dependencies > + * are valid as they must not be jumped by the act of pre-empting. > + * > + * Anything that is neither queued nor flying is uninteresting. 
> + */ > +static inline bool i915_scheduler_is_dependency_valid( > + struct i915_scheduler_queue_entry *node, uint32_t idx) > +{ > + struct i915_scheduler_queue_entry *dep; > + > + dep = node->dep_list[idx]; > + if (!dep) > + return false; > + > + if (I915_SQS_IS_QUEUED(dep)) > + return true; > + > + if (I915_SQS_IS_FLYING(dep)) { > + if (node->params.engine != dep->params.engine) > + return true; > + } > + > + return false; > +} > + > +static int i915_scheduler_pop_from_queue_locked(struct intel_engine_cs *engine, > + struct i915_scheduler_queue_entry **pop_node) > +{ > + struct drm_i915_private *dev_priv = to_i915(engine->dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct i915_scheduler_queue_entry *best = NULL; > + struct i915_scheduler_queue_entry *node; > + int ret; > + int i; > + bool any_queued = false; > + bool has_local, has_remote, only_remote = false; > + > + assert_scheduler_lock_held(scheduler); > + > + *pop_node = NULL; Looks to be not needed since there is *pop_node = best below. > + ret = -ENODATA; > + > + for_each_scheduler_node(node, engine->id) { > + if (!I915_SQS_IS_QUEUED(node)) > + continue; > + any_queued = true; > + > + has_local = false; > + has_remote = false; > + for (i = 0; i < node->num_deps; i++) { > + if (!i915_scheduler_is_dependency_valid(node, i)) > + continue; > + > + if (node->dep_list[i]->params.engine == node->params.engine) > + has_local = true; > + else > + has_remote = true; Would it be worth breaking early from this loop if has_local && has_remote? > + } > + > + if (has_remote && !has_local) > + only_remote = true; > + > + if (!has_local && !has_remote) { > + if (!best || > + (node->priority > best->priority)) > + best = node; > + } > + } > + > + if (best) { > + list_del(&best->link); > + > + INIT_LIST_HEAD(&best->link); list_del_init can replace the two lines above. > + best->status = I915_SQS_POPPED; > + > + scheduler->counts[engine->id].queued--; > + > + ret = 0; > + } else { > + /* Can only get here if: > + * (a) there are no buffers in the queue > + * (b) all queued buffers are dependent on other buffers > + * e.g. on a buffer that is in flight on a different engine > + */ > + if (only_remote) { > + /* The only dependent buffers are on another engine. */ > + ret = -EAGAIN; > + } else if (any_queued) { > + /* It seems that something has gone horribly wrong! */ > + WARN_ONCE(true, "Broken dependency tracking on engine %d!\n", > + (int) engine->id); > + } > + } > + > + *pop_node = best; > + return ret; > +} > + > +/* > + * NB: The driver mutex lock must be held before calling this function. It is > + * only really required during the actual back end submission call. However, > + * attempting to acquire a mutex while holding a spin lock is a Bad Idea. > + * And releasing the one before acquiring the other leads to other code > + * being run and interfering. 
> + */ > +static int i915_scheduler_submit(struct intel_engine_cs *engine) > +{ > + struct drm_i915_private *dev_priv = to_i915(engine->dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct i915_scheduler_queue_entry *node; > + int ret, count = 0, flying; > + > + WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex)); > + > + spin_lock_irq(&scheduler->lock); > + > + WARN_ON(scheduler->flags[engine->id] & I915_SF_SUBMITTING); > + scheduler->flags[engine->id] |= I915_SF_SUBMITTING; > + > + /* First time around, complain if anything unexpected occurs: */ > + ret = i915_scheduler_pop_from_queue_locked(engine, &node); > + if (ret) > + goto error; > + > + do { > + WARN_ON(node->params.engine != engine); > + WARN_ON(node->status != I915_SQS_POPPED); > + count++; > + > + /* > + * The call to pop above will have removed the node from the > + * list. So add it back in and mark it as in flight. > + */ > + i915_scheduler_node_fly(node); > + > + spin_unlock_irq(&scheduler->lock); > + ret = dev_priv->gt.execbuf_final(&node->params); > + spin_lock_irq(&scheduler->lock); > + > + /* > + * Handle failed submission but first check that the > + * watchdog/reset code has not nuked the node while we > + * weren't looking: > + */ > + if (ret && (node->status != I915_SQS_DEAD)) { > + bool requeue = true; > + > + /* > + * Oh dear! Either the node is broken or the engine is > + * busy. So need to kill the node or requeue it and try > + * again later as appropriate. > + */ > + > + switch (-ret) { > + case ENODEV: > + case ENOENT: > + /* Fatal errors. Kill the node. */ > + requeue = false; > + i915_scheduler_node_kill(scheduler, node); > + break; > + > + case EAGAIN: > + case EBUSY: > + case EIO: > + case ENOMEM: > + case ERESTARTSYS: > + case EINTR: > + /* Supposedly recoverable errors. */ > + break; > + > + default: > + /* > + * Assume the error is recoverable and hope > + * for the best. > + */ > + MISSING_CASE(-ret); > + break; > + } > + > + if (requeue) { > + i915_scheduler_node_requeue(scheduler, node); > + /* > + * No point spinning if the engine is currently > + * unavailable so just give up and come back > + * later. > + */ > + break; > + } > + } > + > + /* Keep launching until the sky is sufficiently full. */ > + flying = i915_scheduler_count_flying(scheduler, engine); > + if (flying >= scheduler->min_flying) > + break; > + > + /* Grab another node and go round again... */ > + ret = i915_scheduler_pop_from_queue_locked(engine, &node); > + } while (ret == 0); > + > + /* Don't complain about not being able to submit extra entries */ > + if (ret == -ENODATA) > + ret = 0; > + > + /* > + * Bump the priority of everything that was not submitted to prevent > + * starvation of low priority tasks by a spamming high priority task. > + */ > + i915_scheduler_priority_bump_clear(scheduler); > + for_each_scheduler_node(node, engine->id) { > + if (!I915_SQS_IS_QUEUED(node)) > + continue; > + > + i915_scheduler_priority_bump(scheduler, node, > + scheduler->priority_level_bump); > + } bump_clear will iterate queues for all engines and then you iterate it again. Would this be equivalent: for (i = 0; i < I915_NUM_ENGINES; i++) { for_each_scheduler_node(node, i) { node->bumped = false; if (i == engine->id && I915_SQS_IS_QUEUED(node)) i915_scheduler_priority_bump(scheduler, node, scheduler->priority_level_bump); } } Advantage is one loop fewer. > + > + /* On success, return the number of buffers submitted. 
*/ > + if (ret == 0) > + ret = count; > + > +error: > + scheduler->flags[engine->id] &= ~I915_SF_SUBMITTING; > + spin_unlock_irq(&scheduler->lock); > + return ret; > +} > + > +static void i915_generate_dependencies(struct i915_scheduler *scheduler, > + struct i915_scheduler_queue_entry *node, > + uint32_t engine) > +{ > + struct i915_scheduler_obj_entry *this, *that; > + struct i915_scheduler_queue_entry *test; > + int i, j; > + bool found; > + > + for_each_scheduler_node(test, engine) { > + if (I915_SQS_IS_COMPLETE(test)) > + continue; > + > + /* > + * Batches on the same engine for the same > + * context must be kept in order. > + */ > + found = (node->params.ctx == test->params.ctx) && > + (node->params.engine == test->params.engine); > + > + /* > + * Batches working on the same objects must > + * be kept in order. > + */ > + for (i = 0; (i < node->num_objs) && !found; i++) { > + this = node->objs + i; > + > + for (j = 0; j < test->num_objs; j++) { > + that = test->objs + j; > + > + if (this->obj != that->obj) > + continue; > + > + /* Only need to worry about writes */ > + if (this->read_only && that->read_only) > + continue; > + > + found = true; > + break; > + } > + } > + > + if (found) { > + node->dep_list[node->num_deps] = test; > + node->num_deps++; > + } > + } > +} > + > +static int i915_scheduler_queue_execbuffer_bypass(struct i915_scheduler_queue_entry *qe) > +{ > + struct drm_i915_private *dev_priv = to_i915(qe->params.dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + int ret; > + > + scheduler->flags[qe->params.engine->id] |= I915_SF_SUBMITTING; > + ret = dev_priv->gt.execbuf_final(&qe->params); > + scheduler->flags[qe->params.engine->id] &= ~I915_SF_SUBMITTING; > + > + /* > + * Don't do any clean up on failure because the caller will > + * do it all anyway. > + */ > + if (ret) > + return ret; > + > + /* Need to release any resources held by the node: */ > + qe->status = I915_SQS_COMPLETE; > + i915_scheduler_clean_node(qe); > + > + return 0; > +} > + > +static inline uint32_t i915_scheduler_count_incomplete(struct i915_scheduler *scheduler) > +{ > + int e, incomplete = 0; > + > + for (e = 0; e < I915_NUM_ENGINES; e++) > + incomplete += scheduler->counts[e].queued + scheduler->counts[e].flying; > + > + return incomplete; > +} > + > +/** > + * i915_scheduler_queue_execbuffer - Submit a batch buffer request to the > + * scheduler. > + * @qe: The batch buffer request to be queued. > + * The expectation is the qe passed in is a local stack variable. This > + * function will copy its contents into a freshly allocated list node. The > + * new node takes ownership of said contents so the original qe should simply > + * be discarded and not cleaned up (i.e. don't free memory it points to or > + * dereference objects it holds). The node is added to the scheduler's queue > + * and the batch buffer will be submitted to the hardware at some future > + * point in time (which may be immediately, before returning or may be quite > + * a lot later). > + */ > +int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe) > +{ > + struct drm_i915_private *dev_priv = to_i915(qe->params.dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct intel_engine_cs *engine = qe->params.engine; > + struct i915_scheduler_queue_entry *node; > + bool not_flying; > + int i, e; > + int incomplete; > + > + /* Bypass the scheduler and send the buffer immediately? 
*/ > + if (1/*!i915.enable_scheduler*/) > + return i915_scheduler_queue_execbuffer_bypass(qe); > + > + node = kmalloc(sizeof(*node), GFP_KERNEL); > + if (!node) > + return -ENOMEM; > + > + *node = *qe; > + INIT_LIST_HEAD(&node->link); > + node->status = I915_SQS_QUEUED; > + node->stamp = jiffies; > + i915_gem_request_reference(node->params.request); > + > + WARN_ON(node->params.request->scheduler_qe); > + node->params.request->scheduler_qe = node; > + > + /* > + * Need to determine the number of incomplete entries in the list as > + * that will be the maximum size of the dependency list. > + * > + * Note that the allocation must not be made with the spinlock acquired > + * as kmalloc can sleep. However, the unlock/relock is safe because no > + * new entries can be queued up during the unlock as the i915 driver > + * mutex is still held. Entries could be removed from the list but that > + * just means the dep_list will be over-allocated which is fine. > + */ > + spin_lock_irq(&scheduler->lock); > + incomplete = i915_scheduler_count_incomplete(scheduler); > + > + /* Temporarily unlock to allocate memory: */ > + spin_unlock_irq(&scheduler->lock); > + if (incomplete) { > + node->dep_list = kmalloc_array(incomplete, > + sizeof(*node->dep_list), > + GFP_KERNEL); > + if (!node->dep_list) { > + kfree(node); > + return -ENOMEM; > + } > + } else > + node->dep_list = NULL; > + > + spin_lock_irq(&scheduler->lock); > + node->num_deps = 0; > + > + if (node->dep_list) { > + for (e = 0; e < I915_NUM_ENGINES; e++) > + i915_generate_dependencies(scheduler, node, e); > + > + WARN_ON(node->num_deps > incomplete); > + } > + > + node->priority = clamp(node->priority, > + scheduler->priority_level_min, > + scheduler->priority_level_max); > + > + if ((node->priority > 0) && node->num_deps) { > + i915_scheduler_priority_bump_clear(scheduler); > + > + for (i = 0; i < node->num_deps; i++) > + i915_scheduler_priority_bump(scheduler, > + node->dep_list[i], node->priority); > + } > + > + list_add_tail(&node->link, &scheduler->node_queue[engine->id]); > + > + not_flying = i915_scheduler_count_flying(scheduler, engine) < > + scheduler->min_flying; > + > + scheduler->counts[engine->id].queued++; > + > + spin_unlock_irq(&scheduler->lock); > + > + if (not_flying) > + i915_scheduler_submit(engine); > + > + return 0; > +} > + > +/** > + * i915_scheduler_notify_request - Notify the scheduler that the given > + * request has completed on the hardware. > + * @req: Request structure which has completed > + * @preempt: Did it complete pre-emptively? > + * A sequence number has popped out of the hardware and the request handling > + * code has mapped it back to a request and will mark that request complete. > + * It also calls this function to notify the scheduler about the completion > + * so the scheduler's node can be updated appropriately. > + * Returns true if the request is scheduler managed, false if not. The return > + * value is combined for all freshly completed requests and if any were true > + * then i915_scheduler_wakeup() is called so the scheduler can do further > + * processing (submit more work) at the end. 
> + */ > +bool i915_scheduler_notify_request(struct drm_i915_gem_request *req) > +{ > + struct drm_i915_private *dev_priv = to_i915(req->engine->dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct i915_scheduler_queue_entry *node = req->scheduler_qe; > + unsigned long flags; > + > + if (!node) > + return false; > + > + spin_lock_irqsave(&scheduler->lock, flags); > + > + WARN_ON(!I915_SQS_IS_FLYING(node)); > + > + /* Node was in flight so mark it as complete. */ > + if (req->cancelled) > + node->status = I915_SQS_DEAD; > + else > + node->status = I915_SQS_COMPLETE; > + > + scheduler->counts[req->engine->id].flying--; > + > + spin_unlock_irqrestore(&scheduler->lock, flags); > + > + return true; > +} > + > +static int i915_scheduler_remove_dependent(struct i915_scheduler *scheduler, > + struct i915_scheduler_queue_entry *remove) > +{ > + struct i915_scheduler_queue_entry *node; > + int i, r; > + int count = 0; > + > + /* > + * Ensure that a node is not being removed which is still dependent > + * upon other (not completed) work. If that happens, it implies > + * something has gone very wrong with the dependency tracking! Note > + * that there is no need to worry if this node has been explicitly > + * killed for some reason - it might be being killed before it got > + * sent to the hardware. > + */ > + if (remove->status != I915_SQS_DEAD) { > + for (i = 0; i < remove->num_deps; i++) > + if ((remove->dep_list[i]) && > + (!I915_SQS_IS_COMPLETE(remove->dep_list[i]))) > + count++; > + WARN_ON(count); > + } > + > + /* > + * Remove this node from the dependency lists of any other node which > + * might be waiting on it. > + */ > + for (r = 0; r < I915_NUM_ENGINES; r++) { > + for_each_scheduler_node(node, r) { > + for (i = 0; i < node->num_deps; i++) { > + if (node->dep_list[i] != remove) > + continue; > + > + node->dep_list[i] = NULL; Can the same node be listed in other's node dependency list multiple times? If not you could break after clearing it and not iterate the rest of the list. > + } > + } > + } > + > + return 0; > +} > + > +/** > + * i915_scheduler_wakeup - wake the scheduler's worker thread > + * @dev: DRM device > + * Called at the end of seqno interrupt processing if any request has > + * completed that corresponds to a scheduler node. > + */ > +void i915_scheduler_wakeup(struct drm_device *dev) > +{ > + /* XXX: Need to call i915_scheduler_remove() via work handler. */ > +} > + > +/** > + * i915_scheduler_clean_node - free up any allocations/references > + * associated with the given scheduler queue entry. > + * @node: Queue entry structure which is complete > + * After a give batch buffer completes on the hardware, all the information > + * required to resubmit it is no longer required. However, the node entry > + * itself might still be required for tracking purposes for a while longer. > + * This function should be called as soon as the node is known to be complete > + * so that these resources may be freed even though the node itself might > + * hang around. > + */ > +void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node) > +{ > + if (!I915_SQS_IS_COMPLETE(node)) { > + WARN(!node->params.request->cancelled, > + "Cleaning active node: %d!\n", node->status); > + return; > + } > + > + if (node->params.batch_obj) { > + /* > + * The batch buffer must be unpinned before it is unreferenced > + * otherwise the unpin fails with a missing vma!? 
> + */ > + if (node->params.dispatch_flags & I915_DISPATCH_SECURE) > + i915_gem_execbuf_release_batch_obj(node->params.batch_obj); > + > + node->params.batch_obj = NULL; > + } > + > + /* And anything else owned by the node: */ > + if (node->params.cliprects) { > + kfree(node->params.cliprects); > + node->params.cliprects = NULL; > + } > +} > + > +static bool i915_scheduler_remove(struct i915_scheduler *scheduler, > + struct intel_engine_cs *engine, > + struct list_head *remove) > +{ > + struct i915_scheduler_queue_entry *node, *node_next; > + bool do_submit; > + > + spin_lock_irq(&scheduler->lock); > + > + INIT_LIST_HEAD(remove); > + list_for_each_entry_safe(node, node_next, &scheduler->node_queue[engine->id], link) { > + if (!I915_SQS_IS_COMPLETE(node)) > + break; > + > + list_del(&node->link); > + list_add(&node->link, remove); list_move > + > + /* Strip the dependency info while the mutex is still locked */ > + i915_scheduler_remove_dependent(scheduler, node); > + > + continue; Could kill the continue. :) > + } > + > + /* > + * Release the interrupt reference count if there are no longer any > + * nodes to worry about. > + */ > + if (list_empty(&scheduler->node_queue[engine->id]) && > + (scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) { > + engine->irq_put(engine); > + scheduler->flags[engine->id] &= ~I915_SF_INTERRUPTS_ENABLED; > + } > + > + /* Launch more packets now? */ > + do_submit = (scheduler->counts[engine->id].queued > 0) && > + (scheduler->counts[engine->id].flying < scheduler->min_flying); > + > + spin_unlock_irq(&scheduler->lock); > + > + return do_submit; > +} > + > +void i915_scheduler_process_work(struct intel_engine_cs *engine) > +{ > + struct drm_i915_private *dev_priv = to_i915(engine->dev); > + struct i915_scheduler *scheduler = dev_priv->scheduler; > + struct i915_scheduler_queue_entry *node; > + bool do_submit; > + struct list_head remove; Could do LIST_HEAD(remove); to declare an empty list and then wouldn't have to do INIT_LIST_HEAD in i915_scheduler_remove. Would probably be better to group the logic together like that. > + > + if (list_empty(&scheduler->node_queue[engine->id])) > + return; > + > + /* Remove completed nodes. 
*/ > + do_submit = i915_scheduler_remove(scheduler, engine, &remove); > + > + if (!do_submit && list_empty(&remove)) > + return; > + > + /* Need to grab the pm lock outside of the mutex lock */ > + if (do_submit) > + intel_runtime_pm_get(dev_priv); > + > + mutex_lock(&engine->dev->struct_mutex); > + > + if (do_submit) > + i915_scheduler_submit(engine); > + > + while (!list_empty(&remove)) { > + node = list_first_entry(&remove, typeof(*node), link); > + list_del(&node->link); > + > + /* Free up all the DRM references */ > + i915_scheduler_clean_node(node); > + > + /* And anything else owned by the node: */ > + node->params.request->scheduler_qe = NULL; > + i915_gem_request_unreference(node->params.request); > + kfree(node->dep_list); > + kfree(node); > + } > + > + mutex_unlock(&engine->dev->struct_mutex); > + > + if (do_submit) > + intel_runtime_pm_put(dev_priv); > +} > diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h > new file mode 100644 > index 0000000..c895c4c > --- /dev/null > +++ b/drivers/gpu/drm/i915/i915_scheduler.h > @@ -0,0 +1,113 @@ > +/* > + * Copyright (c) 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > + * IN THE SOFTWARE. 
> + * > + */ > + > +#ifndef _I915_SCHEDULER_H_ > +#define _I915_SCHEDULER_H_ > + > +enum i915_scheduler_queue_status { > + /* Limbo: */ > + I915_SQS_NONE = 0, > + /* Not yet submitted to hardware: */ > + I915_SQS_QUEUED, > + /* Popped from queue, ready to fly: */ > + I915_SQS_POPPED, > + /* Sent to hardware for processing: */ > + I915_SQS_FLYING, > + /* Finished processing on the hardware: */ > + I915_SQS_COMPLETE, > + /* Killed by watchdog or catastrophic submission failure: */ > + I915_SQS_DEAD, > + /* Limit value for use with arrays/loops */ > + I915_SQS_MAX > +}; > + > +#define I915_SQS_IS_QUEUED(node) (((node)->status == I915_SQS_QUEUED)) > +#define I915_SQS_IS_FLYING(node) (((node)->status == I915_SQS_FLYING)) > +#define I915_SQS_IS_COMPLETE(node) (((node)->status == I915_SQS_COMPLETE) || \ > + ((node)->status == I915_SQS_DEAD)) > + > +struct i915_scheduler_obj_entry { > + struct drm_i915_gem_object *obj; > + bool read_only; > +}; > + > +struct i915_scheduler_queue_entry { > + /* Any information required to submit this batch buffer to the hardware */ > + struct i915_execbuffer_params params; > + > + /* -1023 = lowest priority, 0 = default, 1023 = highest */ > + int32_t priority; > + bool bumped; > + > + /* Objects referenced by this batch buffer */ > + struct i915_scheduler_obj_entry *objs; > + int num_objs; > + > + /* Batch buffers this one is dependent upon */ > + struct i915_scheduler_queue_entry **dep_list; > + int num_deps; > + > + enum i915_scheduler_queue_status status; > + unsigned long stamp; > + > + /* List of all scheduler queue entry nodes */ > + struct list_head link; > +}; > + > +struct i915_scheduler_node_states { > + uint32_t flying; > + uint32_t queued; > +}; > + > +struct i915_scheduler { > + struct list_head node_queue[I915_NUM_ENGINES]; > + uint32_t flags[I915_NUM_ENGINES]; > + spinlock_t lock; > + > + /* Node counts: */ > + struct i915_scheduler_node_states counts[I915_NUM_ENGINES]; > + > + /* Tuning parameters: */ > + int32_t priority_level_min; > + int32_t priority_level_max; > + int32_t priority_level_bump; > + int32_t priority_level_preempt; > + uint32_t min_flying; > +}; > + > +/* Flag bits for i915_scheduler::flags */ > +enum { > + I915_SF_INTERRUPTS_ENABLED = (1 << 0), > + I915_SF_SUBMITTING = (1 << 1), > +}; > + > +bool i915_scheduler_is_enabled(struct drm_device *dev); > +int i915_scheduler_init(struct drm_device *dev); > +void i915_scheduler_destroy(struct drm_i915_private *dev_priv); > +void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node); > +int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe); > +bool i915_scheduler_notify_request(struct drm_i915_gem_request *req); > +void i915_scheduler_wakeup(struct drm_device *dev); > + > +#endif /* _I915_SCHEDULER_H_ */ > Regards, Tvrtko
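For illustration only: below is a minimal, self-contained sketch of the list-handling simplifications suggested in the review above (list_del_init() in the pop path, plus an on-stack LIST_HEAD and list_move() in the retire path). The demo_node type and the demo_* function names are placeholders rather than the real i915_scheduler_queue_entry code, so treat this as a sketch of the suggestion, not a drop-in patch.

#include <linux/list.h>
#include <linux/slab.h>

struct demo_node {
	bool complete;
	struct list_head link;
};

/* Pop path: list_del_init() collapses list_del() + INIT_LIST_HEAD(). */
static struct demo_node *demo_pop(struct list_head *queue)
{
	struct demo_node *node;

	if (list_empty(queue))
		return NULL;

	node = list_first_entry(queue, typeof(*node), link);
	list_del_init(&node->link);

	return node;
}

/*
 * Retire path: LIST_HEAD() declares an already-initialised empty list on
 * the stack, and list_move() collapses list_del() + list_add() into one
 * call.
 */
static void demo_retire(struct list_head *queue)
{
	struct demo_node *node, *next;
	LIST_HEAD(remove);

	list_for_each_entry_safe(node, next, queue, link) {
		if (!node->complete)
			break;

		list_move(&node->link, &remove);
	}

	while (!list_empty(&remove)) {
		node = list_first_entry(&remove, typeof(*node), link);
		list_del(&node->link);
		kfree(node);
	}
}

Declaring the removal list with LIST_HEAD() on the caller's stack would also make the INIT_LIST_HEAD(remove) call inside i915_scheduler_remove() unnecessary, which is the grouping of logic the review asks for.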
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e9cdeb5..289fa73 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -10,6 +10,7 @@ ccflags-y := -Werror i915-y := i915_drv.o \ i915_irq.o \ i915_params.o \ + i915_scheduler.o \ i915_suspend.o \ i915_sysfs.o \ intel_csr.o \ diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index b377753..2ad4071 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -37,6 +37,7 @@ #include "i915_drv.h" #include "i915_vgpu.h" #include "i915_trace.h" +#include "i915_scheduler.h" #include <linux/pci.h> #include <linux/console.h> #include <linux/vt.h> @@ -1448,6 +1449,8 @@ int i915_driver_unload(struct drm_device *dev) intel_csr_ucode_fini(dev_priv); + i915_scheduler_destroy(dev_priv); + /* Free error state after interrupts are fully disabled. */ cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work); i915_destroy_error_state(dev); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 7492ce7..7b62e2c 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1717,6 +1717,8 @@ struct i915_execbuffer_params { struct drm_i915_gem_request *request; }; +struct i915_scheduler; + /* used in computing the new watermarks state */ struct intel_wm_config { unsigned int num_pipes_active; @@ -1994,6 +1996,8 @@ struct drm_i915_private { struct i915_runtime_pm pm; + struct i915_scheduler *scheduler; + /* Abstract the submission mechanism (legacy ringbuffer or execlists) away */ struct { int (*execbuf_submit)(struct i915_execbuffer_params *params, @@ -2335,6 +2339,8 @@ struct drm_i915_gem_request { /** process identifier submitting this request */ struct pid *pid; + struct i915_scheduler_queue_entry *scheduler_qe; + /** * The ELSP only accepts two elements at a time, so we queue * context/tail pairs on a given queue (ring->execlist_queue) until the diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index a632276..b7466cb 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -32,6 +32,7 @@ #include "i915_vgpu.h" #include "i915_trace.h" #include "intel_drv.h" +#include "i915_scheduler.h" #include <linux/shmem_fs.h> #include <linux/slab.h> #include <linux/swap.h> @@ -5405,6 +5406,10 @@ int i915_gem_init(struct drm_device *dev) */ intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL); + ret = i915_scheduler_init(dev); + if (ret) + goto out_unlock; + ret = i915_gem_init_userptr(dev); if (ret) goto out_unlock; diff --git a/drivers/gpu/drm/i915/i915_scheduler.c b/drivers/gpu/drm/i915/i915_scheduler.c new file mode 100644 index 0000000..9d628b9 --- /dev/null +++ b/drivers/gpu/drm/i915/i915_scheduler.c @@ -0,0 +1,867 @@ +/* + * Copyright (c) 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. 
+ * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + */ + +#include "i915_drv.h" +#include "intel_drv.h" +#include "i915_scheduler.h" + +#define for_each_scheduler_node(node, id) \ + list_for_each_entry((node), &scheduler->node_queue[(id)], link) + +#define assert_scheduler_lock_held(scheduler) \ + do { \ + WARN_ONCE(!spin_is_locked(&(scheduler)->lock), "Spinlock not locked!"); \ + } while(0) + +/** + * i915_scheduler_is_enabled - Returns true if the scheduler is enabled. + * @dev: DRM device + */ +bool i915_scheduler_is_enabled(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + + return dev_priv->scheduler != NULL; +} + +/** + * i915_scheduler_init - Initialise the scheduler. + * @dev: DRM device + * Returns zero on success or -ENOMEM if memory allocations fail. + */ +int i915_scheduler_init(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = to_i915(dev); + struct i915_scheduler *scheduler = dev_priv->scheduler; + int e; + + if (scheduler) + return 0; + + scheduler = kzalloc(sizeof(*scheduler), GFP_KERNEL); + if (!scheduler) + return -ENOMEM; + + spin_lock_init(&scheduler->lock); + + for (e = 0; e < I915_NUM_ENGINES; e++) { + INIT_LIST_HEAD(&scheduler->node_queue[e]); + scheduler->counts[e].flying = 0; + scheduler->counts[e].queued = 0; + } + + /* Default tuning values: */ + scheduler->priority_level_min = -1023; + scheduler->priority_level_max = 1023; + scheduler->priority_level_bump = 50; + scheduler->priority_level_preempt = 900; + scheduler->min_flying = 2; + + dev_priv->scheduler = scheduler; + + return 0; +} + +/** + * i915_scheduler_destroy - Get rid of the scheduler. + * @dev: DRM device + */ +void i915_scheduler_destroy(struct drm_i915_private *dev_priv) +{ + struct i915_scheduler *scheduler = dev_priv->scheduler; + int e; + + if (!scheduler) + return; + + for (e = 0; e < I915_NUM_ENGINES; e++) + WARN(!list_empty(&scheduler->node_queue[e]), "Destroying with list entries on engine %d!", e); + + kfree(scheduler); + dev_priv->scheduler = NULL; +} + +/* + * Add a popped node back in to the queue. For example, because the engine + * was hung when execfinal() was called and thus the engine submission needs + * to be retried later. + */ +static void i915_scheduler_node_requeue(struct i915_scheduler *scheduler, + struct i915_scheduler_queue_entry *node) +{ + assert_scheduler_lock_held(scheduler); + + WARN_ON(!I915_SQS_IS_FLYING(node)); + + /* Seqno will be reassigned on relaunch */ + node->params.request->seqno = 0; + node->status = I915_SQS_QUEUED; + scheduler->counts[node->params.engine->id].flying--; + scheduler->counts[node->params.engine->id].queued++; +} + +/* + * Give up on a node completely. For example, because it is causing the + * engine to hang or is using some resource that no longer exists. 
+ */ +static void i915_scheduler_node_kill(struct i915_scheduler *scheduler, + struct i915_scheduler_queue_entry *node) +{ + assert_scheduler_lock_held(scheduler); + + WARN_ON(I915_SQS_IS_COMPLETE(node)); + + if (I915_SQS_IS_FLYING(node)) + scheduler->counts[node->params.engine->id].flying--; + else + scheduler->counts[node->params.engine->id].queued--; + + node->status = I915_SQS_DEAD; +} + +/* Mark a node as in flight on the hardware. */ +static void i915_scheduler_node_fly(struct i915_scheduler_queue_entry *node) +{ + struct drm_i915_private *dev_priv = to_i915(node->params.dev); + struct i915_scheduler *scheduler = dev_priv->scheduler; + struct intel_engine_cs *engine = node->params.engine; + + assert_scheduler_lock_held(scheduler); + + WARN_ON(node->status != I915_SQS_POPPED); + + /* + * Add the node (which should currently be in state popped) to the + * front of the queue. This ensure that flying nodes are always held + * in hardware submission order. + */ + list_add(&node->link, &scheduler->node_queue[engine->id]); + + node->status = I915_SQS_FLYING; + + scheduler->counts[engine->id].flying++; + + if (!(scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) { + bool success = true; + + success = engine->irq_get(engine); + if (success) + scheduler->flags[engine->id] |= I915_SF_INTERRUPTS_ENABLED; + } +} + +static inline uint32_t i915_scheduler_count_flying(struct i915_scheduler *scheduler, + struct intel_engine_cs *engine) +{ + return scheduler->counts[engine->id].flying; +} + +static void i915_scheduler_priority_bump_clear(struct i915_scheduler *scheduler) +{ + struct i915_scheduler_queue_entry *node; + int i; + + assert_scheduler_lock_held(scheduler); + + /* + * Ensure circular dependencies don't cause problems and that a bump + * by object usage only bumps each using buffer once: + */ + for (i = 0; i < I915_NUM_ENGINES; i++) { + for_each_scheduler_node(node, i) + node->bumped = false; + } +} + +static int i915_scheduler_priority_bump(struct i915_scheduler *scheduler, + struct i915_scheduler_queue_entry *target, + uint32_t bump) +{ + uint32_t new_priority; + int i, count; + + if (target->priority >= scheduler->priority_level_max) + return 1; + + if (target->bumped) + return 0; + + new_priority = target->priority + bump; + if ((new_priority <= target->priority) || + (new_priority > scheduler->priority_level_max)) + target->priority = scheduler->priority_level_max; + else + target->priority = new_priority; + + count = 1; + target->bumped = true; + + for (i = 0; i < target->num_deps; i++) { + if (!target->dep_list[i]) + continue; + + if (target->dep_list[i]->bumped) + continue; + + count += i915_scheduler_priority_bump(scheduler, + target->dep_list[i], + bump); + } + + return count; +} + +/* + * Nodes are considered valid dependencies if they are queued on any engine + * or if they are in flight on a different engine. In flight on the same + * engine is no longer interesting for non-premptive nodes as the engine + * serialises execution. For pre-empting nodes, all in flight dependencies + * are valid as they must not be jumped by the act of pre-empting. + * + * Anything that is neither queued nor flying is uninteresting. 
+ */ +static inline bool i915_scheduler_is_dependency_valid( + struct i915_scheduler_queue_entry *node, uint32_t idx) +{ + struct i915_scheduler_queue_entry *dep; + + dep = node->dep_list[idx]; + if (!dep) + return false; + + if (I915_SQS_IS_QUEUED(dep)) + return true; + + if (I915_SQS_IS_FLYING(dep)) { + if (node->params.engine != dep->params.engine) + return true; + } + + return false; +} + +static int i915_scheduler_pop_from_queue_locked(struct intel_engine_cs *engine, + struct i915_scheduler_queue_entry **pop_node) +{ + struct drm_i915_private *dev_priv = to_i915(engine->dev); + struct i915_scheduler *scheduler = dev_priv->scheduler; + struct i915_scheduler_queue_entry *best = NULL; + struct i915_scheduler_queue_entry *node; + int ret; + int i; + bool any_queued = false; + bool has_local, has_remote, only_remote = false; + + assert_scheduler_lock_held(scheduler); + + *pop_node = NULL; + ret = -ENODATA; + + for_each_scheduler_node(node, engine->id) { + if (!I915_SQS_IS_QUEUED(node)) + continue; + any_queued = true; + + has_local = false; + has_remote = false; + for (i = 0; i < node->num_deps; i++) { + if (!i915_scheduler_is_dependency_valid(node, i)) + continue; + + if (node->dep_list[i]->params.engine == node->params.engine) + has_local = true; + else + has_remote = true; + } + + if (has_remote && !has_local) + only_remote = true; + + if (!has_local && !has_remote) { + if (!best || + (node->priority > best->priority)) + best = node; + } + } + + if (best) { + list_del(&best->link); + + INIT_LIST_HEAD(&best->link); + best->status = I915_SQS_POPPED; + + scheduler->counts[engine->id].queued--; + + ret = 0; + } else { + /* Can only get here if: + * (a) there are no buffers in the queue + * (b) all queued buffers are dependent on other buffers + * e.g. on a buffer that is in flight on a different engine + */ + if (only_remote) { + /* The only dependent buffers are on another engine. */ + ret = -EAGAIN; + } else if (any_queued) { + /* It seems that something has gone horribly wrong! */ + WARN_ONCE(true, "Broken dependency tracking on engine %d!\n", + (int) engine->id); + } + } + + *pop_node = best; + return ret; +} + +/* + * NB: The driver mutex lock must be held before calling this function. It is + * only really required during the actual back end submission call. However, + * attempting to acquire a mutex while holding a spin lock is a Bad Idea. + * And releasing the one before acquiring the other leads to other code + * being run and interfering. + */ +static int i915_scheduler_submit(struct intel_engine_cs *engine) +{ + struct drm_i915_private *dev_priv = to_i915(engine->dev); + struct i915_scheduler *scheduler = dev_priv->scheduler; + struct i915_scheduler_queue_entry *node; + int ret, count = 0, flying; + + WARN_ON(!mutex_is_locked(&engine->dev->struct_mutex)); + + spin_lock_irq(&scheduler->lock); + + WARN_ON(scheduler->flags[engine->id] & I915_SF_SUBMITTING); + scheduler->flags[engine->id] |= I915_SF_SUBMITTING; + + /* First time around, complain if anything unexpected occurs: */ + ret = i915_scheduler_pop_from_queue_locked(engine, &node); + if (ret) + goto error; + + do { + WARN_ON(node->params.engine != engine); + WARN_ON(node->status != I915_SQS_POPPED); + count++; + + /* + * The call to pop above will have removed the node from the + * list. So add it back in and mark it as in flight. 
+ */ + i915_scheduler_node_fly(node); + + spin_unlock_irq(&scheduler->lock); + ret = dev_priv->gt.execbuf_final(&node->params); + spin_lock_irq(&scheduler->lock); + + /* + * Handle failed submission but first check that the + * watchdog/reset code has not nuked the node while we + * weren't looking: + */ + if (ret && (node->status != I915_SQS_DEAD)) { + bool requeue = true; + + /* + * Oh dear! Either the node is broken or the engine is + * busy. So need to kill the node or requeue it and try + * again later as appropriate. + */ + + switch (-ret) { + case ENODEV: + case ENOENT: + /* Fatal errors. Kill the node. */ + requeue = false; + i915_scheduler_node_kill(scheduler, node); + break; + + case EAGAIN: + case EBUSY: + case EIO: + case ENOMEM: + case ERESTARTSYS: + case EINTR: + /* Supposedly recoverable errors. */ + break; + + default: + /* + * Assume the error is recoverable and hope + * for the best. + */ + MISSING_CASE(-ret); + break; + } + + if (requeue) { + i915_scheduler_node_requeue(scheduler, node); + /* + * No point spinning if the engine is currently + * unavailable so just give up and come back + * later. + */ + break; + } + } + + /* Keep launching until the sky is sufficiently full. */ + flying = i915_scheduler_count_flying(scheduler, engine); + if (flying >= scheduler->min_flying) + break; + + /* Grab another node and go round again... */ + ret = i915_scheduler_pop_from_queue_locked(engine, &node); + } while (ret == 0); + + /* Don't complain about not being able to submit extra entries */ + if (ret == -ENODATA) + ret = 0; + + /* + * Bump the priority of everything that was not submitted to prevent + * starvation of low priority tasks by a spamming high priority task. + */ + i915_scheduler_priority_bump_clear(scheduler); + for_each_scheduler_node(node, engine->id) { + if (!I915_SQS_IS_QUEUED(node)) + continue; + + i915_scheduler_priority_bump(scheduler, node, + scheduler->priority_level_bump); + } + + /* On success, return the number of buffers submitted. */ + if (ret == 0) + ret = count; + +error: + scheduler->flags[engine->id] &= ~I915_SF_SUBMITTING; + spin_unlock_irq(&scheduler->lock); + return ret; +} + +static void i915_generate_dependencies(struct i915_scheduler *scheduler, + struct i915_scheduler_queue_entry *node, + uint32_t engine) +{ + struct i915_scheduler_obj_entry *this, *that; + struct i915_scheduler_queue_entry *test; + int i, j; + bool found; + + for_each_scheduler_node(test, engine) { + if (I915_SQS_IS_COMPLETE(test)) + continue; + + /* + * Batches on the same engine for the same + * context must be kept in order. + */ + found = (node->params.ctx == test->params.ctx) && + (node->params.engine == test->params.engine); + + /* + * Batches working on the same objects must + * be kept in order. 
+
+static void i915_generate_dependencies(struct i915_scheduler *scheduler,
+				       struct i915_scheduler_queue_entry *node,
+				       uint32_t engine)
+{
+	struct i915_scheduler_obj_entry *this, *that;
+	struct i915_scheduler_queue_entry *test;
+	int i, j;
+	bool found;
+
+	for_each_scheduler_node(test, engine) {
+		if (I915_SQS_IS_COMPLETE(test))
+			continue;
+
+		/*
+		 * Batches on the same engine for the same
+		 * context must be kept in order.
+		 */
+		found = (node->params.ctx == test->params.ctx) &&
+			(node->params.engine == test->params.engine);
+
+		/*
+		 * Batches working on the same objects must
+		 * be kept in order.
+		 */
+		for (i = 0; (i < node->num_objs) && !found; i++) {
+			this = node->objs + i;
+
+			for (j = 0; j < test->num_objs; j++) {
+				that = test->objs + j;
+
+				if (this->obj != that->obj)
+					continue;
+
+				/* Only need to worry about writes */
+				if (this->read_only && that->read_only)
+					continue;
+
+				found = true;
+				break;
+			}
+		}
+
+		if (found) {
+			node->dep_list[node->num_deps] = test;
+			node->num_deps++;
+		}
+	}
+}
+
+static int i915_scheduler_queue_execbuffer_bypass(struct i915_scheduler_queue_entry *qe)
+{
+	struct drm_i915_private *dev_priv = to_i915(qe->params.dev);
+	struct i915_scheduler *scheduler = dev_priv->scheduler;
+	int ret;
+
+	scheduler->flags[qe->params.engine->id] |= I915_SF_SUBMITTING;
+	ret = dev_priv->gt.execbuf_final(&qe->params);
+	scheduler->flags[qe->params.engine->id] &= ~I915_SF_SUBMITTING;
+
+	/*
+	 * Don't do any clean up on failure because the caller will
+	 * do it all anyway.
+	 */
+	if (ret)
+		return ret;
+
+	/* Need to release any resources held by the node: */
+	qe->status = I915_SQS_COMPLETE;
+	i915_scheduler_clean_node(qe);
+
+	return 0;
+}
+
+static inline uint32_t i915_scheduler_count_incomplete(struct i915_scheduler *scheduler)
+{
+	int e, incomplete = 0;
+
+	for (e = 0; e < I915_NUM_ENGINES; e++)
+		incomplete += scheduler->counts[e].queued + scheduler->counts[e].flying;
+
+	return incomplete;
+}
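The object-conflict rule might be worth spelling out somewhere: only writes order
batches against each other, concurrent readers stay independent. Illustrative
example (not in the patch, and assuming different contexts so the ctx/engine rule
does not kick in first):

	A: writes obj X   -> no dependencies
	B: reads  obj X   -> depends on A
	C: reads  obj X   -> depends on A only (B and C are both read_only)
	D: writes obj X   -> depends on A, B and C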
+
+/**
+ * i915_scheduler_queue_execbuffer - Submit a batch buffer request to the
+ * scheduler.
+ * @qe: The batch buffer request to be queued.
+ * The expectation is that the qe passed in is a local stack variable. This
+ * function will copy its contents into a freshly allocated list node. The
+ * new node takes ownership of said contents so the original qe should simply
+ * be discarded and not cleaned up (i.e. don't free memory it points to or
+ * dereference objects it holds). The node is added to the scheduler's queue
+ * and the batch buffer will be submitted to the hardware at some future
+ * point in time (which may be immediately, before returning or may be quite
+ * a lot later).
+ */
+int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe)
+{
+	struct drm_i915_private *dev_priv = to_i915(qe->params.dev);
+	struct i915_scheduler *scheduler = dev_priv->scheduler;
+	struct intel_engine_cs *engine = qe->params.engine;
+	struct i915_scheduler_queue_entry *node;
+	bool not_flying;
+	int i, e;
+	int incomplete;
+
+	/* Bypass the scheduler and send the buffer immediately? */
+	if (1/*!i915.enable_scheduler*/)
+		return i915_scheduler_queue_execbuffer_bypass(qe);
+
+	node = kmalloc(sizeof(*node), GFP_KERNEL);
+	if (!node)
+		return -ENOMEM;
+
+	*node = *qe;
+	INIT_LIST_HEAD(&node->link);
+	node->status = I915_SQS_QUEUED;
+	node->stamp = jiffies;
+	i915_gem_request_reference(node->params.request);
+
+	WARN_ON(node->params.request->scheduler_qe);
+	node->params.request->scheduler_qe = node;
+
+	/*
+	 * Need to determine the number of incomplete entries in the list as
+	 * that will be the maximum size of the dependency list.
+	 *
+	 * Note that the allocation must not be made with the spinlock acquired
+	 * as kmalloc can sleep. However, the unlock/relock is safe because no
+	 * new entries can be queued up during the unlock as the i915 driver
+	 * mutex is still held. Entries could be removed from the list but that
+	 * just means the dep_list will be over-allocated which is fine.
+	 */
+	spin_lock_irq(&scheduler->lock);
+	incomplete = i915_scheduler_count_incomplete(scheduler);
+
+	/* Temporarily unlock to allocate memory: */
+	spin_unlock_irq(&scheduler->lock);
+	if (incomplete) {
+		node->dep_list = kmalloc_array(incomplete,
+					       sizeof(*node->dep_list),
+					       GFP_KERNEL);
+		if (!node->dep_list) {
+			kfree(node);
+			return -ENOMEM;
+		}
+	} else
+		node->dep_list = NULL;
+
+	spin_lock_irq(&scheduler->lock);
+	node->num_deps = 0;
+
+	if (node->dep_list) {
+		for (e = 0; e < I915_NUM_ENGINES; e++)
+			i915_generate_dependencies(scheduler, node, e);
+
+		WARN_ON(node->num_deps > incomplete);
+	}
+
+	node->priority = clamp(node->priority,
+			       scheduler->priority_level_min,
+			       scheduler->priority_level_max);
+
+	if ((node->priority > 0) && node->num_deps) {
+		i915_scheduler_priority_bump_clear(scheduler);
+
+		for (i = 0; i < node->num_deps; i++)
+			i915_scheduler_priority_bump(scheduler,
+					node->dep_list[i], node->priority);
+	}
+
+	list_add_tail(&node->link, &scheduler->node_queue[engine->id]);
+
+	not_flying = i915_scheduler_count_flying(scheduler, engine) <
+						 scheduler->min_flying;
+
+	scheduler->counts[engine->id].queued++;
+
+	spin_unlock_irq(&scheduler->lock);
+
+	if (not_flying)
+		i915_scheduler_submit(engine);
+
+	return 0;
+}
+
+/**
+ * i915_scheduler_notify_request - Notify the scheduler that the given
+ * request has completed on the hardware.
+ * @req: Request structure which has completed
+ * A sequence number has popped out of the hardware and the request handling
+ * code has mapped it back to a request and will mark that request complete.
+ * It also calls this function to notify the scheduler about the completion
+ * so the scheduler's node can be updated appropriately.
+ * Returns true if the request is scheduler managed, false if not. The return
+ * value is combined for all freshly completed requests and if any were true
+ * then i915_scheduler_wakeup() is called so the scheduler can do further
+ * processing (submit more work) at the end.
+ */
+bool i915_scheduler_notify_request(struct drm_i915_gem_request *req)
+{
+	struct drm_i915_private *dev_priv = to_i915(req->engine->dev);
+	struct i915_scheduler *scheduler = dev_priv->scheduler;
+	struct i915_scheduler_queue_entry *node = req->scheduler_qe;
+	unsigned long flags;
+
+	if (!node)
+		return false;
+
+	spin_lock_irqsave(&scheduler->lock, flags);
+
+	WARN_ON(!I915_SQS_IS_FLYING(node));
+
+	/* Node was in flight so mark it as complete. */
+	if (req->cancelled)
+		node->status = I915_SQS_DEAD;
+	else
+		node->status = I915_SQS_COMPLETE;
+
+	scheduler->counts[req->engine->id].flying--;
+
+	spin_unlock_irqrestore(&scheduler->lock, flags);
+
+	return true;
+}
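Note the kerneldoc originally documented a @preempt parameter which the function
does not take, so I have assumed that line is stale. Based on the remaining
kerneldoc, I'm assuming the seqno-handling side ends up looking something like
this (sketch only, the surrounding loop is hypothetical):

	bool wake = false;

	/* for each request found to be complete on this interrupt ... */
	wake |= i915_scheduler_notify_request(req);

	/* ... and then once, at the end of the interrupt processing: */
	if (wake)
		i915_scheduler_wakeup(dev);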
+
+static int i915_scheduler_remove_dependent(struct i915_scheduler *scheduler,
+				struct i915_scheduler_queue_entry *remove)
+{
+	struct i915_scheduler_queue_entry *node;
+	int i, r;
+	int count = 0;
+
+	/*
+	 * Ensure that a node is not being removed which is still dependent
+	 * upon other (not completed) work. If that happens, it implies
+	 * something has gone very wrong with the dependency tracking! Note
+	 * that there is no need to worry if this node has been explicitly
+	 * killed for some reason - it might be being killed before it got
+	 * sent to the hardware.
+	 */
+	if (remove->status != I915_SQS_DEAD) {
+		for (i = 0; i < remove->num_deps; i++)
+			if ((remove->dep_list[i]) &&
+			    (!I915_SQS_IS_COMPLETE(remove->dep_list[i])))
+				count++;
+		WARN_ON(count);
+	}
+
+	/*
+	 * Remove this node from the dependency lists of any other node which
+	 * might be waiting on it.
+	 */
+	for (r = 0; r < I915_NUM_ENGINES; r++) {
+		for_each_scheduler_node(node, r) {
+			for (i = 0; i < node->num_deps; i++) {
+				if (node->dep_list[i] != remove)
+					continue;
+
+				node->dep_list[i] = NULL;
+			}
+		}
+	}
+
+	return 0;
+}
+
+/**
+ * i915_scheduler_wakeup - wake the scheduler's worker thread
+ * @dev: DRM device
+ * Called at the end of seqno interrupt processing if any request has
+ * completed that corresponds to a scheduler node.
+ */
+void i915_scheduler_wakeup(struct drm_device *dev)
+{
+	/* XXX: Need to call i915_scheduler_remove() via work handler. */
+}
+
+/**
+ * i915_scheduler_clean_node - free up any allocations/references
+ * associated with the given scheduler queue entry.
+ * @node: Queue entry structure which is complete
+ * After a given batch buffer completes on the hardware, all the information
+ * required to resubmit it is no longer needed. However, the node entry
+ * itself might still be required for tracking purposes for a while longer.
+ * This function should be called as soon as the node is known to be complete
+ * so that these resources may be freed even though the node itself might
+ * hang around.
+ */
+void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node)
+{
+	if (!I915_SQS_IS_COMPLETE(node)) {
+		WARN(!node->params.request->cancelled,
+		     "Cleaning active node: %d!\n", node->status);
+		return;
+	}
+
+	if (node->params.batch_obj) {
+		/*
+		 * The batch buffer must be unpinned before it is unreferenced
+		 * otherwise the unpin fails with a missing vma!?
+		 */
+		if (node->params.dispatch_flags & I915_DISPATCH_SECURE)
+			i915_gem_execbuf_release_batch_obj(node->params.batch_obj);
+
+		node->params.batch_obj = NULL;
+	}
+
+	/* And anything else owned by the node: */
+	if (node->params.cliprects) {
+		kfree(node->params.cliprects);
+		node->params.cliprects = NULL;
+	}
+}
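Presumably the XXX in i915_scheduler_wakeup() ends up queuing the deferred worker
mentioned in the commit message, something along the lines of (sketch only, the
work item and its handler are invented names, not in this patch):

	void i915_scheduler_wakeup(struct drm_device *dev)
	{
		struct drm_i915_private *dev_priv = to_i915(dev);

		/* hypothetical work item, handler would call
		 * i915_scheduler_process_work() for each engine */
		queue_work(dev_priv->wq, &dev_priv->mm.scheduler_work);
	}

Is that the plan, or will it piggy-back on an existing worker?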
+
+static bool i915_scheduler_remove(struct i915_scheduler *scheduler,
+				  struct intel_engine_cs *engine,
+				  struct list_head *remove)
+{
+	struct i915_scheduler_queue_entry *node, *node_next;
+	bool do_submit;
+
+	spin_lock_irq(&scheduler->lock);
+
+	INIT_LIST_HEAD(remove);
+	list_for_each_entry_safe(node, node_next, &scheduler->node_queue[engine->id], link) {
+		if (!I915_SQS_IS_COMPLETE(node))
+			break;
+
+		list_del(&node->link);
+		list_add(&node->link, remove);
+
+		/* Strip the dependency info while the scheduler lock is still held */
+		i915_scheduler_remove_dependent(scheduler, node);
+
+		continue;
+	}
+
+	/*
+	 * Release the interrupt reference count if there are no longer any
+	 * nodes to worry about.
+	 */
+	if (list_empty(&scheduler->node_queue[engine->id]) &&
+	    (scheduler->flags[engine->id] & I915_SF_INTERRUPTS_ENABLED)) {
+		engine->irq_put(engine);
+		scheduler->flags[engine->id] &= ~I915_SF_INTERRUPTS_ENABLED;
+	}
+
+	/* Launch more packets now? */
+	do_submit = (scheduler->counts[engine->id].queued > 0) &&
+		    (scheduler->counts[engine->id].flying < scheduler->min_flying);
+
+	spin_unlock_irq(&scheduler->lock);
+
+	return do_submit;
+}
+
+void i915_scheduler_process_work(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *dev_priv = to_i915(engine->dev);
+	struct i915_scheduler *scheduler = dev_priv->scheduler;
+	struct i915_scheduler_queue_entry *node;
+	bool do_submit;
+	struct list_head remove;
+
+	if (list_empty(&scheduler->node_queue[engine->id]))
+		return;
+
+	/* Remove completed nodes. */
+	do_submit = i915_scheduler_remove(scheduler, engine, &remove);
+
+	if (!do_submit && list_empty(&remove))
+		return;
+
+	/* Need to take the runtime pm reference outside of the mutex lock */
+	if (do_submit)
+		intel_runtime_pm_get(dev_priv);
+
+	mutex_lock(&engine->dev->struct_mutex);
+
+	if (do_submit)
+		i915_scheduler_submit(engine);
+
+	while (!list_empty(&remove)) {
+		node = list_first_entry(&remove, typeof(*node), link);
+		list_del(&node->link);
+
+		/* Free up all the DRM references */
+		i915_scheduler_clean_node(node);
+
+		/* And anything else owned by the node: */
+		node->params.request->scheduler_qe = NULL;
+		i915_gem_request_unreference(node->params.request);
+		kfree(node->dep_list);
+		kfree(node);
+	}
+
+	mutex_unlock(&engine->dev->struct_mutex);
+
+	if (do_submit)
+		intel_runtime_pm_put(dev_priv);
+}
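The runtime pm vs struct_mutex ordering looks right to me, i.e. the reference is
taken before the mutex and released after it:

	intel_runtime_pm_get(dev_priv);
	mutex_lock(&dev->struct_mutex);
	/* ... submit / clean up ... */
	mutex_unlock(&dev->struct_mutex);
	intel_runtime_pm_put(dev_priv);

since, IIRC, taking the pm reference can itself end up needing the mutex on the
resume path.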
diff --git a/drivers/gpu/drm/i915/i915_scheduler.h b/drivers/gpu/drm/i915/i915_scheduler.h
new file mode 100644
index 0000000..c895c4c
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_scheduler.h
@@ -0,0 +1,113 @@
+/*
+ * Copyright (c) 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef _I915_SCHEDULER_H_
+#define _I915_SCHEDULER_H_
+
+enum i915_scheduler_queue_status {
+	/* Limbo: */
+	I915_SQS_NONE = 0,
+	/* Not yet submitted to hardware: */
+	I915_SQS_QUEUED,
+	/* Popped from queue, ready to fly: */
+	I915_SQS_POPPED,
+	/* Sent to hardware for processing: */
+	I915_SQS_FLYING,
+	/* Finished processing on the hardware: */
+	I915_SQS_COMPLETE,
+	/* Killed by watchdog or catastrophic submission failure: */
+	I915_SQS_DEAD,
+	/* Limit value for use with arrays/loops */
+	I915_SQS_MAX
+};
+
+#define I915_SQS_IS_QUEUED(node)	(((node)->status == I915_SQS_QUEUED))
+#define I915_SQS_IS_FLYING(node)	(((node)->status == I915_SQS_FLYING))
+#define I915_SQS_IS_COMPLETE(node)	(((node)->status == I915_SQS_COMPLETE) || \
+					 ((node)->status == I915_SQS_DEAD))
+
+struct i915_scheduler_obj_entry {
+	struct drm_i915_gem_object *obj;
+	bool read_only;
+};
+
+struct i915_scheduler_queue_entry {
+	/* Any information required to submit this batch buffer to the hardware */
+	struct i915_execbuffer_params params;
+
+	/* -1023 = lowest priority, 0 = default, 1023 = highest */
+	int32_t priority;
+	bool bumped;
+
+	/* Objects referenced by this batch buffer */
+	struct i915_scheduler_obj_entry *objs;
+	int num_objs;
+
+	/* Batch buffers this one is dependent upon */
+	struct i915_scheduler_queue_entry **dep_list;
+	int num_deps;
+
+	enum i915_scheduler_queue_status status;
+	unsigned long stamp;
+
+	/* List of all scheduler queue entry nodes */
+	struct list_head link;
+};
+
+struct i915_scheduler_node_states {
+	uint32_t flying;
+	uint32_t queued;
+};
+
+struct i915_scheduler {
+	struct list_head node_queue[I915_NUM_ENGINES];
+	uint32_t flags[I915_NUM_ENGINES];
+	spinlock_t lock;
+
+	/* Node counts: */
+	struct i915_scheduler_node_states counts[I915_NUM_ENGINES];
+
+	/* Tuning parameters: */
+	int32_t priority_level_min;
+	int32_t priority_level_max;
+	int32_t priority_level_bump;
+	int32_t priority_level_preempt;
+	uint32_t min_flying;
+};
+
+/* Flag bits for i915_scheduler::flags */
+enum {
+	I915_SF_INTERRUPTS_ENABLED	= (1 << 0),
+	I915_SF_SUBMITTING		= (1 << 1),
+};
+
+bool i915_scheduler_is_enabled(struct drm_device *dev);
+int i915_scheduler_init(struct drm_device *dev);
+void i915_scheduler_destroy(struct drm_i915_private *dev_priv);
+void i915_scheduler_clean_node(struct i915_scheduler_queue_entry *node);
+int i915_scheduler_queue_execbuffer(struct i915_scheduler_queue_entry *qe);
+bool i915_scheduler_notify_request(struct drm_i915_gem_request *req);
+void i915_scheduler_wakeup(struct drm_device *dev);
+
+#endif /* _I915_SCHEDULER_H_ */
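One last note, mostly to confirm my reading of the node lifecycle implied by the
enum and the code above:

	NONE -> QUEUED -> POPPED -> FLYING -> COMPLETE or DEAD
	(FLYING presumably drops back to QUEUED when the back end asks
	 for a requeue, and QUEUED/FLYING nodes can be killed straight
	 to DEAD by the watchdog or cancellation)

with I915_SQS_IS_COMPLETE() deliberately covering both COMPLETE and DEAD so that
dead nodes still get cleaned up by the normal removal path. Is that the
intention?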