From patchwork Fri Jun 13 15:37:44 2014
From: Oscar Mateo <oscar.mateo@intel.com>
To: intel-gfx@lists.freedesktop.org
Date: Fri, 13 Jun 2014 16:37:44 +0100
Message-Id: <1402673891-14618-27-git-send-email-oscar.mateo@intel.com>
In-Reply-To: <1402673891-14618-1-git-send-email-oscar.mateo@intel.com>
References: <1402673891-14618-1-git-send-email-oscar.mateo@intel.com>
Subject: [Intel-gfx] [PATCH 26/53] drm/i915/bdw: New logical ring submission mechanism

Well, new-ish: if all this code looks familiar, that's because it is a clone
of the existing submission mechanism (with some modifications here and there
to adapt it to LRCs and Execlists). And why do this? Execlists offer several
advantages that could help simplify the submission mechanism, like control
over when the GPU is done with a given workload, but I am interested in
getting Execlists to work first and foremost. As we are creating a parallel
submission mechanism (even if it is just a clone for now), we can start
improving it without fear of breaking the old gens.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 214 +++++++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.h |  18 ++++
 2 files changed, 232 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 02fc3d0..89aed7a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -86,6 +86,220 @@ bool intel_enable_execlists(struct drm_device *dev)
 	return HAS_LOGICAL_RING_CONTEXTS(dev) && USES_PPGTT(dev);
 }
 
+static inline struct intel_ringbuffer *
+logical_ringbuf_get(struct intel_engine_cs *ring, struct intel_context *ctx)
+{
+	return ctx->engine[ring->id].ringbuf;
+}
+
+void intel_logical_ring_advance_and_submit(struct intel_engine_cs *ring,
+					   struct intel_context *ctx)
+{
+	struct intel_ringbuffer *ringbuf = logical_ringbuf_get(ring, ctx);
+
+	intel_logical_ring_advance(ringbuf);
+
+	if (intel_ring_stopped(ring))
+		return;
+
+	ring->submit_ctx(ring, ctx, ringbuf->tail);
+}
+
+static int logical_ring_alloc_seqno(struct intel_engine_cs *ring,
+				    struct intel_context *ctx)
+{
+	if (ring->outstanding_lazy_seqno)
+		return 0;
+
+	if (ring->preallocated_lazy_request == NULL) {
+		struct drm_i915_gem_request *request;
+
+		request = kmalloc(sizeof(*request), GFP_KERNEL);
+		if (request == NULL)
+			return -ENOMEM;
+
+		ring->preallocated_lazy_request = request;
+	}
+
+	return i915_gem_get_seqno(ring->dev, &ring->outstanding_lazy_seqno);
+}
+
+static int logical_ring_wait_request(struct intel_engine_cs *ring,
+				     struct intel_ringbuffer *ringbuf,
+				     struct intel_context *ctx,
+				     int bytes)
+{
+	struct drm_i915_gem_request *request;
+	u32 seqno = 0;
+	int ret;
+
+	if (ringbuf->last_retired_head != -1) {
+		ringbuf->head = ringbuf->last_retired_head;
+		ringbuf->last_retired_head = -1;
+
+		ringbuf->space = intel_ring_space(ringbuf);
+		if (ringbuf->space >= bytes)
+			return 0;
+	}
+
+	list_for_each_entry(request, &ring->request_list, list) {
+		if (__intel_ring_space(request->tail, ringbuf->tail,
+				       ringbuf->size) >= bytes) {
+			seqno = request->seqno;
+			break;
+		}
+	}
+
+	if (seqno == 0)
+		return -ENOSPC;
+
+	ret = i915_wait_seqno(ring, seqno);
+	if (ret)
+		return ret;
+
+	/* TODO: make sure we update the right ringbuffer's last_retired_head
+	 * when retiring requests */
+	i915_gem_retire_requests_ring(ring);
+	ringbuf->head = ringbuf->last_retired_head;
+	ringbuf->last_retired_head = -1;
+
+	ringbuf->space = intel_ring_space(ringbuf);
+	return 0;
+}
+
+static int logical_ring_wait_for_space(struct intel_engine_cs *ring,
+				       struct intel_ringbuffer *ringbuf,
+				       struct intel_context *ctx,
+				       int bytes)
+{
+	struct drm_device *dev = ring->dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned long end;
+	int ret;
+
+	ret = logical_ring_wait_request(ring, ringbuf, ctx, bytes);
+	if (ret != -ENOSPC)
+		return ret;
+
+	/* Force the context submission in case we have been skipping it */
+	intel_logical_ring_advance_and_submit(ring, ctx);
+
+	/* With GEM the hangcheck timer should kick us out of the loop,
+	 * leaving it early runs the risk of corrupting GEM state (due
+	 * to running on almost untested codepaths). But on resume
+	 * timers don't work yet, so prevent a complete hang in that
+	 * case by choosing an insanely large timeout.
+	 */
+	end = jiffies + 60 * HZ;
+
+	do {
+		ringbuf->head = I915_READ_HEAD(ring);
+		ringbuf->space = intel_ring_space(ringbuf);
+		if (ringbuf->space >= bytes) {
+			ret = 0;
+			break;
+		}
+
+		if (!drm_core_check_feature(dev, DRIVER_MODESET) &&
+		    dev->primary->master) {
+			struct drm_i915_master_private *master_priv = dev->primary->master->driver_priv;
+			if (master_priv->sarea_priv)
+				master_priv->sarea_priv->perf_boxes |= I915_BOX_WAIT;
+		}
+
+		msleep(1);
+
+		if (dev_priv->mm.interruptible && signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
+
+		ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+					   dev_priv->mm.interruptible);
+		if (ret)
+			break;
+
+		if (time_after(jiffies, end)) {
+			ret = -EBUSY;
+			break;
+		}
+	} while (1);
+
+	return ret;
+}
+
+static int logical_ring_wrap_buffer(struct intel_engine_cs *ring,
+				    struct intel_ringbuffer *ringbuf,
+				    struct intel_context *ctx)
+{
+	uint32_t __iomem *virt;
+	int rem = ringbuf->size - ringbuf->tail;
+
+	if (ringbuf->space < rem) {
+		int ret = logical_ring_wait_for_space(ring, ringbuf, ctx, rem);
+		if (ret)
+			return ret;
+	}
+
+	virt = ringbuf->virtual_start + ringbuf->tail;
+	rem /= 4;
+	while (rem--)
+		iowrite32(MI_NOOP, virt++);
+
+	ringbuf->tail = 0;
+	ringbuf->space = intel_ring_space(ringbuf);
+
+	return 0;
+}
+
+static int logical_ring_prepare(struct intel_engine_cs *ring,
+				struct intel_ringbuffer *ringbuf,
+				struct intel_context *ctx,
+				int bytes)
+{
+	int ret;
+
+	if (unlikely(ringbuf->tail + bytes > ringbuf->effective_size)) {
+		ret = logical_ring_wrap_buffer(ring, ringbuf, ctx);
+		if (unlikely(ret))
+			return ret;
+	}
+
+	if (unlikely(ringbuf->space < bytes)) {
+		ret = logical_ring_wait_for_space(ring, ringbuf, ctx, bytes);
+		if (unlikely(ret))
+			return ret;
+	}
+
+	return 0;
+}
+
+int intel_logical_ring_begin(struct intel_engine_cs *ring,
+			     struct intel_context *ctx,
+			     int num_dwords)
+{
+	struct drm_i915_private *dev_priv = ring->dev->dev_private;
+	struct intel_ringbuffer *ringbuf = logical_ringbuf_get(ring, ctx);
+	int ret;
+
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+				   dev_priv->mm.interruptible);
+	if (ret)
+		return ret;
+
+	ret = logical_ring_prepare(ring, ringbuf, ctx,
+				   num_dwords * sizeof(uint32_t));
+	if (ret)
+		return ret;
+
+	/* Preallocate the olr before touching the ring */
+	ret = logical_ring_alloc_seqno(ring, ctx);
+	if (ret)
+		return ret;
+
+	ringbuf->space -= num_dwords * sizeof(uint32_t);
+	return 0;
+}
+
 static int gen8_init_common_ring(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 26b0949..686ebf5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -5,6 +5,24 @@
 void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
 int intel_logical_rings_init(struct drm_device *dev);
 
+void intel_logical_ring_advance_and_submit(struct intel_engine_cs *ring,
+					   struct intel_context *ctx);
+
+static inline void intel_logical_ring_advance(struct intel_ringbuffer *ringbuf)
+{
+	ringbuf->tail &= ringbuf->size - 1;
+}
+
+static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf, u32 data)
+{
+	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
+	ringbuf->tail += 4;
+}
+
+int intel_logical_ring_begin(struct intel_engine_cs *ring,
+			     struct intel_context *ctx,
+			     int num_dwords);
+
 /* Logical Ring Contexts */
 void intel_lr_context_free(struct intel_context *ctx);
 int intel_lr_context_deferred_create(struct intel_context *ctx,
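
For reference, a minimal sketch of how a caller is expected to drive the new
API: reserve space with intel_logical_ring_begin(), write dwords with
intel_logical_ring_emit(), then update the tail and submit the context with
intel_logical_ring_advance_and_submit(). Only those three functions (plus
MI_NOOP) come from this patch; the helper below, its name and its place in
the driver are hypothetical, shown purely to illustrate the flow.

/*
 * Hypothetical caller, for illustration only: emits four MI_NOOPs
 * through the new logical ring submission API.
 */
static int gen8_emit_noops(struct intel_engine_cs *ring,
			   struct intel_context *ctx)
{
	/* Same lookup that logical_ringbuf_get() does internally */
	struct intel_ringbuffer *ringbuf = ctx->engine[ring->id].ringbuf;
	int ret, i;

	/* Reserve 4 dwords: waits for space (wrapping if needed) and
	 * preallocates the lazy request/seqno before touching the ring */
	ret = intel_logical_ring_begin(ring, ctx, 4);
	if (ret)
		return ret;

	/* Write the commands into the context's ringbuffer... */
	for (i = 0; i < 4; i++)
		intel_logical_ring_emit(ringbuf, MI_NOOP);

	/* ...then advance the tail and submit the context via Execlists */
	intel_logical_ring_advance_and_submit(ring, ctx);

	return 0;
}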