From patchwork Fri Jan 15 14:35:41 2016
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 8041421
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Fri, 15 Jan 2016 14:35:41 +0000
Message-Id: <1452868545-19586-3-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 2.7.0.rc3
In-Reply-To: <1452868545-19586-1-git-send-email-chris@chris-wilson.co.uk>
References: <1452868545-19586-1-git-send-email-chris@chris-wilson.co.uk>
Cc: Mika Kuoppala
Subject: [Intel-gfx] [PATCH 2/6] drm/i915: Separate out the seqno-barrier from engine->get_seqno

In order to simplify future patches, extract the lazy_coherency
optimisation out of the engine->get_seqno() vfunc into its own callback.

v2: Rename the barrier to engine->irq_seqno_barrier to try and better
reflect that the barrier is only required after the user interrupt before
reading the seqno (to ensure that the seqno update lands in time as we do
not have strict seqno-irq ordering on all platforms).
Reviewed-by: Dave Gordon [#v2]

v3: Comments for hangcheck paranoia. Mika wanted to keep the extra
barrier inside the hangcheck, just in case. I can argue that it doesn't
provide a barrier against anything, but the side-effects of applying the
barrier may prevent a false declaration of a hung GPU.

Signed-off-by: Chris Wilson
Cc: Mika Kuoppala
Cc: Dave Gordon
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  6 +++---
 drivers/gpu/drm/i915/i915_drv.h         | 12 +++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/i915_irq.c         | 14 +++++++++++--
 drivers/gpu/drm/i915/i915_trace.h       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 18 ++++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 36 +++++++++++++++++----------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 8 files changed, 52 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index e3377abc0d4d..b421b53ca128 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 				   ring->name,
 				   i915_gem_request_get_seqno(work->flip_queued_req),
 				   dev_priv->next_seqno,
-				   ring->get_seqno(ring, true),
+				   ring->get_seqno(ring),
 				   i915_gem_request_completed(work->flip_queued_req, true));
 		} else
 			seq_printf(m, "Flip not associated with any ring\n");
@@ -732,7 +732,7 @@ static void i915_ring_seqno_info(struct seq_file *m,
 {
 	if (ring->get_seqno) {
 		seq_printf(m, "Current sequence (%s): %x\n",
-			   ring->name, ring->get_seqno(ring, false));
+			   ring->name, ring->get_seqno(ring));
 	}
 }
 
@@ -1342,8 +1342,8 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	intel_runtime_pm_get(dev_priv);
 
 	for_each_ring(ring, dev_priv, i) {
-		seqno[i] = ring->get_seqno(ring, false);
 		acthd[i] = intel_ring_get_active_head(ring);
+		seqno[i] = ring->get_seqno(ring);
 	}
 
 	i915_get_extra_instdone(dev, instdone);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index eb7bb97f7316..0cdaed66b786 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2961,15 +2961,19 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
 					    bool lazy_coherency)
 {
-	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-	return i915_seqno_passed(seqno, req->previous_seqno);
+	if (!lazy_coherency && req->ring->irq_seqno_barrier)
+		req->ring->irq_seqno_barrier(req->ring);
+	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+				 req->previous_seqno);
 }
 
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
 					      bool lazy_coherency)
 {
-	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-	return i915_seqno_passed(seqno, req->seqno);
+	if (!lazy_coherency && req->ring->irq_seqno_barrier)
+		req->ring->irq_seqno_barrier(req->ring);
+	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+				 req->seqno);
 }
 
 int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 06ca4082735b..b978febf67a0 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -902,8 +902,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 
 	ering->waiting = waitqueue_active(&ring->irq_queue);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
-	ering->seqno = ring->get_seqno(ring, false);
 	ering->acthd = intel_ring_get_active_head(ring);
+	ering->seqno = ring->get_seqno(ring);
 	ering->start = I915_READ_START(ring);
 	ering->head = I915_READ_HEAD(ring);
 	ering->tail = I915_READ_TAIL(ring);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 25a89373df63..07bc2cdd6252 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2937,7 +2937,7 @@ static int semaphore_passed(struct intel_engine_cs *ring)
 	if (signaller->hangcheck.deadlock >= I915_NUM_RINGS)
 		return -1;
 
-	if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno))
+	if (i915_seqno_passed(signaller->get_seqno(signaller), seqno))
 		return 1;
 
 	/* cursory check for an unkickable deadlock */
@@ -3101,8 +3101,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		semaphore_clear_deadlocks(dev_priv);
 
-		seqno = ring->get_seqno(ring, false);
+		/* We don't strictly need an irq-barrier here, as we are not
+		 * serving an interrupt request, be paranoid in case the
+		 * barrier has side-effects (such as preventing a broken
+		 * cacheline snoop) and so be sure that we can see the seqno
+		 * advance. If the seqno should stick, due to a stale
+		 * cacheline, we would erroneously declare the GPU hung.
+		 */
+		if (ring->irq_seqno_barrier)
+			ring->irq_seqno_barrier(ring);
+
 		acthd = intel_ring_get_active_head(ring);
+		seqno = ring->get_seqno(ring);
 
 		if (ring->hangcheck.seqno == seqno) {
 			if (ring_idle(ring, seqno)) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 52b2d409945d..cfb5f78a6e84 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -573,7 +573,7 @@ TRACE_EVENT(i915_gem_request_notify,
 	    TP_fast_assign(
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
-			   __entry->seqno = ring->get_seqno(ring, false);
+			   __entry->seqno = ring->get_seqno(ring);
 			   ),
 
 	    TP_printk("dev=%u, ring=%u, seqno=%u",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index f5d89c845ede..3ed4ab7f571e 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1781,7 +1781,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	return 0;
 }
 
-static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static u32 gen8_get_seqno(struct intel_engine_cs *ring)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
@@ -1791,9 +1791,8 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
-static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static void bxt_seqno_barrier(struct intel_engine_cs *ring)
 {
-
 	/*
 	 * On BXT A steppings there is a HW coherency issue whereby the
 	 * MI_STORE_DATA_IMM storing the completed request's seqno
@@ -1804,11 +1803,7 @@ static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 	 * bxt_a_set_seqno(), where we also do a clflush after the write. So
 	 * this clflush in practice becomes an invalidate operation.
 	 */
-
-	if (!lazy_coherency)
-		intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
-
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
@@ -1954,12 +1949,11 @@ logical_ring_default_vfuncs(struct drm_device *dev,
 	ring->irq_get = gen8_logical_ring_get_irq;
 	ring->irq_put = gen8_logical_ring_put_irq;
 	ring->emit_bb_start = gen8_emit_bb_start;
+	ring->get_seqno = gen8_get_seqno;
+	ring->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		ring->get_seqno = bxt_a_get_seqno;
+		ring->irq_seqno_barrier = bxt_seqno_barrier;
 		ring->set_seqno = bxt_a_set_seqno;
-	} else {
-		ring->get_seqno = gen8_get_seqno;
-		ring->set_seqno = gen8_set_seqno;
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 935add1422ae..13ef3e956901 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1487,8 +1487,8 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static u32
-gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static void
+gen6_seqno_barrier(struct intel_engine_cs *ring)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
@@ -1502,18 +1502,14 @@ gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 	 * a delay after every batch i.e. much more frequent than a delay
 	 * when waiting for the interrupt (with the same net latency).
 	 */
-	if (!lazy_coherency) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		POSTING_READ_FW(RING_ACTHD(ring->mmio_base));
-
-		intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
-	}
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	POSTING_READ_FW(RING_ACTHD(ring->mmio_base));
 
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static u32
-ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+ring_get_seqno(struct intel_engine_cs *ring)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
@@ -1525,7 +1521,7 @@ ring_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 }
 
 static u32
-pc_render_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+pc_render_get_seqno(struct intel_engine_cs *ring)
 {
 	return ring->scratch.cpu_page[0];
 }
@@ -2717,7 +2713,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->irq_seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			WARN_ON(!dev_priv->semaphore_obj);
@@ -2734,7 +2731,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->irq_seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
@@ -2848,7 +2846,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->write_tail = gen6_bsd_ring_write_tail;
 		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->irq_seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
 			ring->irq_enable_mask =
@@ -2920,7 +2919,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->mmio_base = GEN8_BSD2_RING_BASE;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->irq_seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
@@ -2950,7 +2950,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->irq_seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
 		ring->irq_enable_mask =
@@ -3007,7 +3008,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->irq_seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 7349d9258191..8fb02b21e75d 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -193,8 +193,8 @@ struct intel_engine_cs {
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
-	u32		(*get_seqno)(struct intel_engine_cs *ring,
-				     bool lazy_coherency);
+	void		(*irq_seqno_barrier)(struct intel_engine_cs *ring);
+	u32		(*get_seqno)(struct intel_engine_cs *ring);
 	void		(*set_seqno)(struct intel_engine_cs *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct drm_i915_gem_request *req,
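
[Editor's aside, not part of the patch: below is a minimal, self-contained sketch of the calling convention the split introduces -- an optional irq_seqno_barrier() hook that callers apply before get_seqno() whenever they need a coherent read, mirroring the i915_gem_request_completed() hunk above. Every name here (toy_engine, toy_seqno_passed, ...) is a hypothetical stand-in, not i915 code.]

/*
 * Toy model of the engine->irq_seqno_barrier / engine->get_seqno split.
 * Compile with: cc -o toy toy.c
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct toy_engine {
	uint32_t hws_seqno;	/* stands in for the hardware status page slot */
	void (*irq_seqno_barrier)(struct toy_engine *engine);	/* optional, may be NULL */
	uint32_t (*get_seqno)(struct toy_engine *engine);
};

static void toy_barrier(struct toy_engine *engine)
{
	/* A real implementation would flush/invalidate the status page here. */
	printf("seqno barrier applied\n");
}

static uint32_t toy_get_seqno(struct toy_engine *engine)
{
	/* Plain read; coherency is the caller's problem via the barrier. */
	return engine->hws_seqno;
}

/* Mirrors the shape of i915_gem_request_completed() after this patch. */
static bool toy_seqno_passed(struct toy_engine *engine, uint32_t target,
			     bool lazy_coherency)
{
	if (!lazy_coherency && engine->irq_seqno_barrier)
		engine->irq_seqno_barrier(engine);
	/* Wrap-safe comparison, as in i915_seqno_passed(). */
	return (int32_t)(engine->get_seqno(engine) - target) >= 0;
}

int main(void)
{
	struct toy_engine engine = {
		.hws_seqno = 42,
		.irq_seqno_barrier = toy_barrier,
		.get_seqno = toy_get_seqno,
	};

	printf("completed: %d\n", toy_seqno_passed(&engine, 40, false));
	return 0;
}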