[v6,06/34] drm/i915: Start of GPU scheduler

From: John Harrison <John.C.Harrison@Intel.com>

Initial creation of scheduler source files. Note that this patch
implements most of the scheduler functionality but does not hook it in
to the driver yet. It also leaves the scheduler code in 'pass through'
mode so that even when it is hooked in, it will not actually do very
much. This allows the hooks to be added one at a time in bite size
chunks and only when the scheduler is finally enabled at the end does
anything start happening.

The general theory of operation is that when batch buffers are
submitted to the driver, the execbuffer() code packages up all the
information required to execute the batch buffer at a later time. This
package is given over to the scheduler which adds it to an internal
node list. The scheduler also scans the list of objects associated
with the batch buffer and compares them against the objects already in
use by other buffers in the node list. If matches are found then the
new batch buffer node is marked as being dependent upon the matching
node. The same is done for the context object. The scheduler also
bumps up the priority of such matching nodes on the grounds that the
more dependencies a given batch buffer has the more important it is
likely to be.
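
As a rough illustration of the dependency generation described above,
the following stand-alone sketch compares a new node against each node
already in the list. It is not the code from this patch: the struct
layout, the fixed-size arrays, the helper names and the "+1" priority
bump are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EX_MAX_OBJS 8
#define EX_MAX_DEPS 8

/* Hypothetical, simplified stand-in for a scheduler queue entry. */
struct ex_dep_node {
	const void *objs[EX_MAX_OBJS];	/* objects used by the batch */
	size_t num_objs;
	const void *ctx;		/* context the batch runs in */
	struct ex_dep_node *deps[EX_MAX_DEPS];
	size_t num_deps;
	int priority;
};

/* True if the two nodes use the same context or share any object. */
static bool ex_nodes_overlap(const struct ex_dep_node *a,
			     const struct ex_dep_node *b)
{
	size_t i, j;

	if (a->ctx == b->ctx)
		return true;

	for (i = 0; i < a->num_objs; i++)
		for (j = 0; j < b->num_objs; j++)
			if (a->objs[i] == b->objs[j])
				return true;

	return false;
}

/*
 * Mark 'node' as dependent on every overlapping node already in the
 * list and bump the priority of each node it depends on. The size of
 * the bump (+1) is an assumption of this sketch.
 */
static void ex_generate_deps(struct ex_dep_node *node,
			     struct ex_dep_node **list, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (!ex_nodes_overlap(node, list[i]))
			continue;

		assert(node->num_deps < EX_MAX_DEPS);
		node->deps[node->num_deps++] = list[i];
		list[i]->priority++;
	}
}
```

The caller would run ex_generate_deps() on the new node before adding
it to the list, so that a node never ends up depending on itself.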

The scheduler aims to have a given (tuneable) number of batch buffers
in flight on the hardware at any given time. If fewer than this are
currently executing when a new node is queued, then the node is passed
straight through to the submit function. Otherwise it is simply added
to the queue and the driver returns to user land.
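
The in-flight limit can be pictured with the minimal sketch below. The
struct, its field names and the function name are hypothetical and
chosen only for this example; the real code tracks nodes rather than
bare counters.

```c
#include <stdbool.h>

/* Hypothetical counters; the names are not taken from the patch. */
struct ex_sched_state {
	int flying;	/* batches currently on the hardware */
	int queued;	/* batches waiting in the node list */
	int min_flying;	/* tuneable target for batches in flight */
};

/*
 * Queue one batch. Returns true if the hardware was below the target
 * and the batch was passed straight through to submission, false if
 * it was left in the queue for later.
 */
static bool ex_queue_batch(struct ex_sched_state *s)
{
	if (s->flying < s->min_flying) {
		s->flying++;
		return true;
	}

	s->queued++;
	return false;
}
```

With a target of two, the first two calls submit immediately and the
third leaves its batch queued until a completion frees a slot.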

The scheduler is notified when each batch buffer completes and updates
its internal tracking accordingly. At the end of the completion
interrupt processing, if any scheduler tracked batches were processed,
the scheduler's deferred worker thread is woken up. This can do more
involved processing such as actually removing completed nodes from the
queue and freeing up the resources associated with them (internal
memory allocations, DRM object references, context reference, etc.).
The work handler also checks the in flight count and calls the
submission code if a new slot has appeared.

When the scheduler's submit code is called, it scans the queued node
list for the highest priority node that has no unmet dependencies.
Note that the dependency calculation is complex as it must take
inter-ring dependencies and potential preemptions into account. Note
also that in the future this will be extended to include external
dependencies such as the Android Native Sync file descriptors and/or
the Linux dma-buf synchronisation scheme.

If a suitable node is found then it is sent to execbuff_final() for
submission to the hardware. The in flight count is then re-checked and
a new node popped from the list if appropriate. All nodes that are not
submitted have their priority bumped. This ensures that low priority
tasks do not get starved out by busy higher priority ones - everything
will eventually get its turn to run.
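
The selection and aging steps might look roughly like the sketch
below. It is a simplification under stated assumptions: a dependency
counts as met only once the node it points at has completed (the real
calculation also has to consider inter-ring ordering), nodes are aged
only when something was actually submitted, and all names are
hypothetical. The 1023 cap mirrors the priority range mentioned in the
v2 notes.

```c
#include <stdbool.h>
#include <stddef.h>

#define EX_PRIORITY_MAX	1023
#define EX_NODE_MAX_DEPS 4

enum ex_node_status { EX_QUEUED, EX_FLYING, EX_COMPLETE };

/* Hypothetical, simplified queue entry. */
struct ex_node {
	enum ex_node_status status;
	int priority;
	const struct ex_node *deps[EX_NODE_MAX_DEPS];
	size_t num_deps;
};

/* Simplification: a dependency is met only when it has completed. */
static bool ex_node_ready(const struct ex_node *node)
{
	size_t i;

	for (i = 0; i < node->num_deps; i++)
		if (node->deps[i]->status != EX_COMPLETE)
			return false;

	return true;
}

/*
 * Pick the highest priority queued node with no unmet dependencies,
 * mark it as flying and bump the priority of every queued node that
 * was passed over. Returns NULL if nothing is ready to run.
 */
static struct ex_node *ex_pop_best(struct ex_node *nodes, size_t count)
{
	struct ex_node *best = NULL;
	size_t i;

	for (i = 0; i < count; i++) {
		struct ex_node *n = &nodes[i];

		if (n->status != EX_QUEUED || !ex_node_ready(n))
			continue;
		if (!best || n->priority > best->priority)
			best = n;
	}

	if (!best)
		return NULL;

	/* Age the nodes left behind so they cannot starve. */
	for (i = 0; i < count; i++) {
		struct ex_node *n = &nodes[i];

		if (n == best || n->status != EX_QUEUED)
			continue;
		if (n->priority < EX_PRIORITY_MAX)
			n->priority++;
	}

	best->status = EX_FLYING;
	return best;
}
```

Because every skipped node gains priority on each pass, a low priority
node with met dependencies eventually overtakes newer high priority
work.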

Note that this patch does not implement pre-emptive scheduling. Only
basic scheduling by re-ordering batch buffer submission is currently
implemented. Pre-emption of actively executing batch buffers comes in
the next patch series.

v2: Changed priority levels to +/-1023 due to feedback from Chris
Wilson.

Removed redundant index from scheduler node.

Changed time stamps to use jiffies instead of raw monotonic. This
provides lower resolution but improved compatibility with other i915
code.

Major re-write of completion tracking code due to struct fence
conversion. The scheduler no longer has its own private IRQ handler
but just lets the existing request code handle completion events.
Instead, the scheduler now hooks into the request notify code to be
told when a request has completed.

Reduced driver mutex locking scope. Removal of scheduler nodes no
longer grabs the mutex lock.

v3: Refactor of dependency generation to make the code more readable.
Also added in read-read optimisation support - i.e., don't treat a
shared read-only buffer as being a dependency.
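
The read-read rule can be expressed as a small predicate, sketched
here with a hypothetical per-object usage record (the struct and
function names are assumptions, not identifiers from the patch).

```c
#include <stdbool.h>

/* Hypothetical record of how one batch uses one object. */
struct ex_obj_use {
	const void *obj;
	bool write;	/* true if the batch writes to the object */
};

/*
 * Two uses conflict only if they refer to the same object and at
 * least one of them is a write. Two reads of a shared buffer do not
 * create a dependency.
 */
static bool ex_uses_conflict(const struct ex_obj_use *a,
			     const struct ex_obj_use *b)
{
	return a->obj == b->obj && (a->write || b->write);
}
```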

Allowed the killing of queued nodes rather than only flying ones.

v4: Updated the commit message to better reflect the current state of
the code. Downgraded some BUG_ONs to WARN_ONs. Used the correct array
memory allocator function (kmalloc_array instead of kmalloc).
Corrected the format of some comments. Wrapped some lines differently
to keep the style checker happy.

Fixed a WARN_ON when killing nodes. The dependency removal code checks
that nodes being destroyed do not have any outstanding dependencies
(which would imply they should not have been executed yet). When
nodes are killed before running, e.g. due to context banning, this
might well be the case: they have not been executed and do indeed
have outstanding dependencies.

Re-instated the code to disable interrupts when not in use. The
underlying problem causing broken IRQ reference counts seems to have
been fixed now.

v5: Shuffled various functions around to remove forward declarations
as apparently these are frowned upon. Removed lots of white space as
apparently having easy to read code is also frowned upon. Split the
direct submission scheduler bypass code out into a separate function.
Squashed down the i915_scheduler.c sections of various patches into
this patch. Thus the later patches simply hook in existing code into
various parts of the driver rather than adding the code as well. Added
documentation to various functions. Re-worked the submit function in
terms of mutex locking, error handling and exit paths. Split the
delayed work handler function in half. Made use of the kernel 'clamp'
macro. [Joonas Lahtinen]

Added runtime PM calls as these must be done at the top level before
acquiring the driver mutex lock. [Chris Wilson]

Removed some obsolete debug code that had been forgotten about.

Moved more clean up code into the 'i915_gem_scheduler_clean_node()'
function rather than replicating it in multiple places.

Used lighter weight spinlocks.

v6: Updated to newer nightly (lots of ring -> engine renaming).

Added 'for_each_scheduler_node()' and 'assert_scheduler_lock_held()'
helper macros. Renamed 'i915_gem_execbuff_release_batch_obj' to
'i915_gem_execbuf_release_batch_obj'. Updated to use 'to_i915()'
instead of dev_private. Converted all enum labels to uppercase.
Removed various unnecessary WARNs. Renamed 'saved_objects' to just
'objs'. Split code for counting incomplete nodes out into a separate
function. Removed even more white space. Added a destroy() function.
[review feedback from Joonas Lahtinen]

Added running totals of 'flying' and 'queued' nodes rather than
re-calculating each time as a minor CPU performance optimisation.

Removed support for out of order seqno completion. All the prep work
patch series (seqno to request conversion, late seqno assignment,
etc.) that has now been done means that the scheduler no longer
generates out of order seqno completions. Thus all the complex code
for coping with such is no longer required and can be removed.

Fixed a bug in scheduler bypass mode introduced in the clean up code
refactoring of v5. The clean up function was seeing the node in the
wrong state and thus refusing to process it.

For: VIZ-1587
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile         |   1 +
 drivers/gpu/drm/i915/i915_dma.c       |   3 +
 drivers/gpu/drm/i915/i915_drv.h       |   6 +
 drivers/gpu/drm/i915/i915_gem.c       |   5 +
 drivers/gpu/drm/i915/i915_scheduler.c | 867 ++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_scheduler.h | 113 +++++
 6 files changed, 995 insertions(+)
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler.c
 create mode 100644 drivers/gpu/drm/i915/i915_scheduler.h
