From patchwork Sat May 25 19:26:43 2013
X-Patchwork-Submitter: Ben Widawsky
X-Patchwork-Id: 2613971
From: Ben Widawsky
To: Intel GFX
Cc: Ben Widawsky
Date: Sat, 25 May 2013 12:26:43 -0700
Message-Id: <1369510028-3343-10-git-send-email-ben@bwidawsk.net>
X-Mailer: git-send-email 1.8.2.3
In-Reply-To: <1369510028-3343-1-git-send-email-ben@bwidawsk.net>
References: <1369510028-3343-1-git-send-email-ben@bwidawsk.net>
Subject: [Intel-gfx] [PATCH 09/34] drm/i915: Use PDEs as the guard page
List-Id: Intel graphics driver community testing & development
Scary alert. AFAICT, we simply do not need the guard page if we have the
PDEs at the top, since all prefetching is CS related and it should always
be safe to prefetch into a PDE (provided the PDE is valid). The PDE
fetching itself should not be subject to the prefetching problem, though
without full PPGTT support we may never know.

Potentially this is achievable even without the PPGTT reworks I've done
prior to this, but I figure there is no point in rocking the boat without
the PPGTT generalizations I've submitted.

There is a very simple bool left in to help test whether things break when
we suspect guard page related issues. Of course the commit is easily
revertible as well.

One reason this is desirable: with the drm_mm_node allocation I've done
for the PDEs, we fragment a bit of space finding the first properly
aligned address (if my math is right, we get 11 spare pages fragmented at
the top of our address space). This problem didn't exist with the original
implementation, which stole the PDEs out from under the drm_mm.
Signed-off-by: Ben Widawsky
---
 drivers/gpu/drm/i915/i915_drv.h     |  3 +-
 drivers/gpu/drm/i915/i915_gem.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c | 80 +++++++++++++++++++++++--------------
 3 files changed, 54 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 59f98ec..d6dd26a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1750,7 +1750,8 @@ void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj);
 void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj);
 void i915_gem_init_global_gtt(struct drm_device *dev);
 void i915_gem_setup_global_gtt(struct drm_device *dev, unsigned long start,
-			       unsigned long mappable_end, unsigned long end);
+			       unsigned long mappable_end, unsigned long end,
+			       bool guard_page);
 int i915_gem_gtt_init(struct drm_device *dev);
 
 static inline void i915_gem_chipset_flush(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2fc9b93..f85fb0e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -161,7 +161,7 @@ i915_gem_init_ioctl(struct drm_device *dev, void *data,
 
 	mutex_lock(&dev->struct_mutex);
 	i915_gem_setup_global_gtt(dev, args->gtt_start, args->gtt_end,
-				  args->gtt_end);
+				  args->gtt_end, true);
 	dev_priv->gtt.mappable_end = args->gtt_end;
 	mutex_unlock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 9ff8769..49467c1 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -282,7 +282,7 @@ static int gen6_ppgtt_init(struct i915_hw_ppgtt *ppgtt)
 						  &ppgtt->node, GEN6_PD_SIZE,
 						  GEN6_PD_ALIGN, 0,
 						  dev_priv->gtt.mappable_end,
-						  dev_priv->gtt.total-PAGE_SIZE,
+						  dev_priv->gtt.total,
 						  DRM_MM_TOPDOWN);
 	if (ret)
 		return ret;
@@ -360,10 +360,9 @@ err_pt_alloc:
 	return ret;
 }
 
-static int i915_gem_init_aliasing_ppgtt(struct drm_device *dev)
+int i915_gem_ppgtt_init(struct drm_device *dev, struct i915_hw_ppgtt *ppgtt)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct i915_hw_ppgtt *ppgtt;
 	int ret;
 
 	ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
@@ -630,10 +629,26 @@ static void i915_gtt_color_adjust(struct drm_mm_node *node,
 		*end -= 4096;
 	}
 }
+
+static bool intel_enable_ppgtt(struct drm_device *dev)
+{
+	if (i915_enable_ppgtt >= 0)
+		return i915_enable_ppgtt;
+
+#ifdef CONFIG_INTEL_IOMMU
+	/* Disable ppgtt on SNB if VT-d is on. */
+	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
+		return false;
+#endif
+
+	return true;
+}
+
 void i915_gem_setup_global_gtt(struct drm_device *dev,
 			       unsigned long start,
 			       unsigned long mappable_end,
-			       unsigned long end)
+			       unsigned long end,
+			       bool guard_page)
 {
 	/* Let GEM Manage all of the aperture.
 	 *
@@ -651,8 +666,12 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 
 	BUG_ON(mappable_end > end);
 
-	/* Subtract the guard page ... */
-	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
+	if (!guard_page)
+		drm_mm_init(&dev_priv->mm.gtt_space, start, end - start);
+	else
+		drm_mm_init(&dev_priv->mm.gtt_space, start,
+			    end - start - PAGE_SIZE); /* Guard page */
+
 	if (!HAS_LLC(dev))
 		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
 
@@ -681,46 +700,49 @@ void i915_gem_setup_global_gtt(struct drm_device *dev,
 			      (hole_end-hole_start) / PAGE_SIZE);
 	}
 
-	/* And finally clear the reserved guard page */
-	dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
-}
-
-static bool
-intel_enable_ppgtt(struct drm_device *dev)
-{
-	if (i915_enable_ppgtt >= 0)
-		return i915_enable_ppgtt;
-
-#ifdef CONFIG_INTEL_IOMMU
-	/* Disable ppgtt on SNB if VT-d is on. */
-	if (INTEL_INFO(dev)->gen == 6 && intel_iommu_gfx_mapped)
-		return false;
-#endif
-
-	return true;
+	/* And finally clear the reserved guard page (if exists) */
+	if (guard_page)
+		dev_priv->gtt.gtt_clear_range(dev, end / PAGE_SIZE - 1, 1);
 }
 
 void i915_gem_init_global_gtt(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long gtt_size, mappable_size;
+	int ret = 0;
 
 	gtt_size = dev_priv->gtt.total;
 	mappable_size = dev_priv->gtt.mappable_end;
 
 	if (intel_enable_ppgtt(dev) && HAS_ALIASING_PPGTT(dev)) {
-		int ret;
+		struct i915_hw_ppgtt *ppgtt;
+
+		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size,
+					  false);
+
+		ppgtt = kzalloc(sizeof(*ppgtt), GFP_KERNEL);
+		if (!ppgtt) {
+			ret = -ENOMEM;
+			goto ggtt_only;
+		}
 
-		i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
+		ret = i915_gem_ppgtt_init(dev, ppgtt);
+		if (ret)
+			goto ggtt_only;
 
-		ret = i915_gem_init_aliasing_ppgtt(dev);
-		if (!ret)
-			return;
+		return;
+	}
 
+/* XXX: We need to takedown the drm_mm and have this fall back because of the
+ * conditional use of the guard page.
+ * TODO: It's a bit hackish and could use cleanup.
+ */
+ggtt_only:
+	if (ret) {
 		DRM_ERROR("Aliased PPGTT setup failed %d\n", ret);
 		drm_mm_takedown(&dev_priv->mm.gtt_space);
 	}
 
-	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size);
+	i915_gem_setup_global_gtt(dev, 0, mappable_size, gtt_size, true);
 }
 
 static int setup_scratch_page(struct drm_device *dev)