From patchwork Sun Aug 12 11:04:39 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chris Wilson
X-Patchwork-Id: 1309531
Return-Path:
X-Original-To: patchwork-intel-gfx@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	by patchwork1.kernel.org (Postfix) with ESMTP id 1D92F3FC33
	for ; Sun, 12 Aug 2012 11:06:15 +0000 (UTC)
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id BDF589E77A
	for ; Sun, 12 Aug 2012 04:06:14 -0700 (PDT)
X-Original-To: intel-gfx@lists.freedesktop.org
Delivered-To: intel-gfx@lists.freedesktop.org
Received: from fireflyinternet.com (smtp.fireflyinternet.com [109.228.6.236])
	by gabe.freedesktop.org (Postfix) with ESMTP id 6BF959E746
	for ; Sun, 12 Aug 2012 04:05:14 -0700 (PDT)
X-Default-Received-SPF: pass (skip=forwardok (res=PASS))
	x-ip-name=78.156.73.22;
Received: from arrandale.alporthouse.com (unverified [78.156.73.22])
	by fireflyinternet.com (Firefly Internet SMTP) with ESMTP id 120101649-1500050
	for multiple; Sun, 12 Aug 2012 12:05:04 +0100
From: Chris Wilson
To: intel-gfx@lists.freedesktop.org
Date: Sun, 12 Aug 2012 12:04:39 +0100
Message-Id: <1344769479-3237-1-git-send-email-chris@chris-wilson.co.uk>
X-Mailer: git-send-email 1.7.10.4
X-Originating-IP: 78.156.73.22
Subject: [Intel-gfx] [PATCH] agp/intel,
	drm/i915: Use a write-combining map for updating PTEs
X-BeenThere: intel-gfx@lists.freedesktop.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Intel graphics driver community testing & development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
MIME-Version: 1.0
Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org
Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org

In order to be able to ioremap_wc the GTT space, we need to remove the
conflicting pci_iomap from drm/i915, so we limit the register map in
drm/i915 to the suitable range for each generation.

The benefit of doing this is an order of magnitude reduction in time
spent rewriting the GTT entries when inserting and removing objects.
For example, this halves the CPU time spent in X when pushing pixels
for chromium through a userptr (chromium has a bug where it likes to
recreate its ShmPixmap on every draw).

Signed-off-by: Chris Wilson
---
 drivers/char/agp/intel-gtt.c    |   13 ++++++++++---
 drivers/gpu/drm/i915/i915_dma.c |   14 ++++++++++++--
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 76103aa..73bdb74 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -666,8 +666,14 @@ static int intel_gtt_init(void)
 
 	gtt_map_size = intel_private.base.gtt_total_entries * 4;
 
-	intel_private.gtt = ioremap(intel_private.gtt_bus_addr,
-				    gtt_map_size);
+	intel_private.gtt = ioremap_wc(intel_private.gtt_bus_addr,
+				       gtt_map_size);
+	if (!intel_private.gtt) {
+		dev_err(&intel_private.bridge_dev->dev,
+			"failed to map GATT as wc, falling back to uc-\n");
+		intel_private.gtt = ioremap(intel_private.gtt_bus_addr,
+					    gtt_map_size);
+	}
 	if (!intel_private.gtt) {
 		intel_private.driver->cleanup();
 		iounmap(intel_private.registers);
@@ -1233,12 +1239,13 @@ static inline int needs_idle_maps(void)
 static int i9xx_setup(void)
 {
 	u32 reg_addr;
-	int size = KB(512);
+	int size;
 
 	pci_read_config_dword(intel_private.pcidev, I915_MMADDR, &reg_addr);
 
 	reg_addr &= 0xfff80000;
 
+	size = KB(512);
 	if (INTEL_GTT_GEN >= 7)
 		size = MB(2);
 
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a21e0b0..c453304 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1458,7 +1458,7 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 {
 	struct drm_i915_private *dev_priv;
 	struct intel_device_info *info;
-	int ret = 0, mmio_bar;
+	int ret = 0, mmio_bar, mmio_size;
 	uint32_t aperture_size;
 
 	info = (struct intel_device_info *) flags;
@@ -1522,8 +1522,18 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags)
 	if (IS_BROADWATER(dev) || IS_CRESTLINE(dev))
 		dma_set_coherent_mask(&dev->pdev->dev, DMA_BIT_MASK(32));
 
+	/* Restrict iomap to avoid clobbering the GTT which we want WC mapped.
+	 * Do not attempt to map the whole BAR!
+	 */
 	mmio_bar = IS_GEN2(dev) ? 1 : 0;
-	dev_priv->regs = pci_iomap(dev->pdev, mmio_bar, 0);
+	if (info->gen < 3)
+		mmio_size = 64*1024;
+	else if (info->gen < 5)
+		mmio_size = 512*1024;
+	else
+		mmio_size = 2*1024*1024;
+
+	dev_priv->regs = pci_iomap(dev->pdev, mmio_bar, mmio_size);
 	if (!dev_priv->regs) {
 		DRM_ERROR("failed to map registers\n");
 		ret = -EIO;
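
Illustrative note, not part of the patch: the sizing rules used above can
be checked in isolation with a small userspace sketch. The helper names
mmio_size_for_gen() and gtt_map_size() are hypothetical and do not exist
in the kernel; they only mirror the per-generation ladder added to
i915_driver_load() and the "entries * 4" computation in intel_gtt_init(),
assuming one 4-byte PTE per GTT entry as in the hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): size in bytes of the register
 * window to map for a hardware generation, mirroring the if/else ladder
 * that the patch adds to i915_driver_load().
 */
static size_t mmio_size_for_gen(int gen)
{
	if (gen < 3)
		return 64 * 1024;	/* gen2 */
	else if (gen < 5)
		return 512 * 1024;	/* gen3 and gen4 */
	return 2 * 1024 * 1024;		/* gen5 and later */
}

/*
 * Hypothetical helper (not in the patch): bytes of GTT to map, assuming
 * one 4-byte PTE per entry as in intel_gtt_init().
 */
static size_t gtt_map_size(uint32_t total_entries)
{
	/* Guard against overflow where size_t is only 32 bits wide. */
	assert(total_entries <= SIZE_MAX / 4);
	return (size_t)total_entries * 4;
}
```

Under these assumptions, a 131072-entry GTT needs a 512 KiB mapping,
while the register mapping never exceeds 2 MiB, so pci_iomap() no longer
spans the whole BAR.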