From patchwork Mon Feb 23 15:44:07 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michel Thierry X-Patchwork-Id: 5866291 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id ED5E29F695 for ; Mon, 23 Feb 2015 15:44:44 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id E5D2620615 for ; Mon, 23 Feb 2015 15:44:43 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by mail.kernel.org (Postfix) with ESMTP id BD82820643 for ; Mon, 23 Feb 2015 15:44:42 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 30FAA6E43F; Mon, 23 Feb 2015 07:44:42 -0800 (PST) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTP id 8A29B6E43F for ; Mon, 23 Feb 2015 07:44:40 -0800 (PST) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga101.fm.intel.com with ESMTP; 23 Feb 2015 07:44:39 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.09,631,1418112000"; d="scan'208";a="682051049" Received: from michelth-linux.isw.intel.com ([10.102.226.150]) by fmsmga002.fm.intel.com with ESMTP; 23 Feb 2015 07:44:38 -0800 From: Michel Thierry To: intel-gfx@lists.freedesktop.org Date: Mon, 23 Feb 2015 15:44:07 +0000 Message-Id: <1424706272-3016-8-git-send-email-michel.thierry@intel.com> X-Mailer: git-send-email 2.1.1 In-Reply-To: <1424706272-3016-1-git-send-email-michel.thierry@intel.com> References: <1418922621-25818-1-git-send-email-michel.thierry@intel.com> <1424706272-3016-1-git-send-email-michel.thierry@intel.com> Subject: [Intel-gfx] [PATCH v5 07/32] drm/i915: Track page table reload need X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Ben Widawsky This patch was formerly known as, "Force pd restore when PDEs change, gen6-7." I had to change the name because it is needed for GEN8 too. The real issue this is trying to solve is when a new object is mapped into the current address space. The GPU does not snoop the new mapping so we must do the gen specific action to reload the page tables. GEN8 and GEN7 do differ in the way they load page tables for the RCS. GEN8 does so with the context restore, while GEN7 requires the proper load commands in the command streamer. Non-render is similar for both. Caveat for GEN7 The docs say you cannot change the PDEs of a currently running context. We never map new PDEs of a running context, and expect them to be present - so I think this is okay. (We can unmap, but this should also be okay since we only unmap unreferenced objects that the GPU shouldn't be tryingto va->pa xlate.) The MI_SET_CONTEXT command does have a flag to signal that even if the context is the same, force a reload. It's unclear exactly what this does, but I have a hunch it's the right thing to do. The logic assumes that we always emit a context switch after mapping new PDEs, and before we submit a batch. This is the case today, and has been the case since the inception of hardware contexts. A note in the comment let's the user know. It's not just for gen8. If the current context has mappings change, we need a context reload to switch v2: Rebased after ppgtt clean up patches. Split the warning for aliasing and true ppgtt options. And do not break aliasing ppgtt, where to->ppgtt is always null. v3: Invalidate PPGTT TLBs inside alloc_va_range. v4: Rename ppgtt_invalidate_tlbs to mark_tlbs_dirty and move pd_dirty_rings from i915_address_space to i915_hw_ppgtt. Fixes when neither ctx->ppgtt and aliasing_ppgtt exist. v5: Removed references to teardown_va_range. Signed-off-by: Ben Widawsky Signed-off-by: Michel Thierry (v2+) --- drivers/gpu/drm/i915/i915_gem_context.c | 29 ++++++++++++++++++++++++----- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 11 +++++++++++ drivers/gpu/drm/i915/i915_gem_gtt.c | 11 +++++++++++ drivers/gpu/drm/i915/i915_gem_gtt.h | 1 + 4 files changed, 47 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c index 6206d27..437cdcc 100644 --- a/drivers/gpu/drm/i915/i915_gem_context.c +++ b/drivers/gpu/drm/i915/i915_gem_context.c @@ -569,8 +569,20 @@ static inline bool should_skip_switch(struct intel_engine_cs *ring, struct intel_context *from, struct intel_context *to) { - if (from == to && !to->remap_slice) - return true; + struct drm_i915_private *dev_priv = ring->dev->dev_private; + + if (to->remap_slice) + return false; + + if (to->ppgtt) { + if (from == to && !test_bit(ring->id, + &to->ppgtt->pd_dirty_rings)) + return true; + } else if (dev_priv->mm.aliasing_ppgtt) { + if (from == to && !test_bit(ring->id, + &dev_priv->mm.aliasing_ppgtt->pd_dirty_rings)) + return true; + } return false; } @@ -587,9 +599,8 @@ needs_pd_load_pre(struct intel_engine_cs *ring, struct intel_context *to) static bool needs_pd_load_post(struct intel_engine_cs *ring, struct intel_context *to) { - return (!to->legacy_hw_ctx.initialized || - i915_gem_context_is_default(to)) && - to->ppgtt && IS_GEN8(ring->dev); + return IS_GEN8(ring->dev) && + (to->ppgtt || &to->ppgtt->pd_dirty_rings); } static int do_switch(struct intel_engine_cs *ring, @@ -634,6 +645,12 @@ static int do_switch(struct intel_engine_cs *ring, ret = to->ppgtt->switch_mm(to->ppgtt, ring); if (ret) goto unpin_out; + + /* Doing a PD load always reloads the page dirs */ + if (to->ppgtt) + clear_bit(ring->id, &to->ppgtt->pd_dirty_rings); + else + clear_bit(ring->id, &dev_priv->mm.aliasing_ppgtt->pd_dirty_rings); } if (ring != &dev_priv->ring[RCS]) { @@ -672,6 +689,8 @@ static int do_switch(struct intel_engine_cs *ring, */ if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) hw_flags |= MI_RESTORE_INHIBIT; + else if (to->ppgtt && test_and_clear_bit(ring->id, &to->ppgtt->pd_dirty_rings)) + hw_flags |= MI_FORCE_RESTORE; ret = mi_set_context(ring, to, hw_flags); if (ret) diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index b773368..1961107 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -1198,6 +1198,13 @@ i915_gem_ringbuffer_submission(struct drm_device *dev, struct drm_file *file, if (ret) goto error; + if (ctx->ppgtt) + WARN(ctx->ppgtt->pd_dirty_rings & (1<id), + "%s didn't clear reload\n", ring->name); + else if (dev_priv->mm.aliasing_ppgtt) + WARN(dev_priv->mm.aliasing_ppgtt->pd_dirty_rings & + (1<id), "%s didn't clear reload\n", ring->name); + instp_mode = args->flags & I915_EXEC_CONSTANTS_MASK; instp_mask = I915_EXEC_CONSTANTS_MASK; switch (instp_mode) { @@ -1467,6 +1474,10 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) goto err; + /* XXX: Reserve has possibly change PDEs which means we must do a + * context switch before we can coherently read some of the reserved + * VMAs. */ + /* The objects are in their final locations, apply the relocations. */ if (need_relocs) ret = i915_gem_execbuffer_relocate(eb); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 5ee92ce..18d7b28 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -1151,6 +1151,16 @@ static void gen6_ppgtt_unmap_pages(struct i915_hw_ppgtt *ppgtt) 4096, PCI_DMA_BIDIRECTIONAL); } +/* PDE TLBs are a pain invalidate pre GEN8. It requires a context reload. If we + * are switching between contexts with the same LRCA, we also must do a force + * restore. + */ +static inline void mark_tlbs_dirty(struct i915_hw_ppgtt *ppgtt) +{ + /* If current vm != vm, */ + ppgtt->pd_dirty_rings = INTEL_INFO(ppgtt->base.dev)->ring_mask; +} + static int gen6_alloc_va_range(struct i915_address_space *vm, uint64_t start, uint64_t length) { @@ -1170,6 +1180,7 @@ static int gen6_alloc_va_range(struct i915_address_space *vm, I915_PPGTT_PT_ENTRIES); } + mark_tlbs_dirty(ppgtt); return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h index 1b15fc9..eaf530f 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.h +++ b/drivers/gpu/drm/i915/i915_gem_gtt.h @@ -297,6 +297,7 @@ struct i915_hw_ppgtt { struct i915_address_space base; struct kref ref; struct drm_mm_node node; + unsigned long pd_dirty_rings; unsigned num_pd_entries; unsigned num_pd_pages; /* gen8+ */ union {