From patchwork Fri Oct 26 20:54:24 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ben Widawsky X-Patchwork-Id: 1653851 Return-Path: X-Original-To: patchwork-intel-gfx@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork2.kernel.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by patchwork2.kernel.org (Postfix) with ESMTP id 7785ADF2F6 for ; Fri, 26 Oct 2012 20:55:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 646F2A0DC9 for ; Fri, 26 Oct 2012 13:55:30 -0700 (PDT) X-Original-To: intel-gfx@lists.freedesktop.org Delivered-To: intel-gfx@lists.freedesktop.org Received: from shiva.chad-versace.us (209-20-75-48.static.cloud-ips.com [209.20.75.48]) by gabe.freedesktop.org (Postfix) with ESMTP id D9B509E91F for ; Fri, 26 Oct 2012 13:54:09 -0700 (PDT) Received: by shiva.chad-versace.us (Postfix, from userid 1005) id 52EC3885AD; Fri, 26 Oct 2012 20:55:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on shiva.chad-versace.us X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,BAYES_00 autolearn=unavailable version=3.3.2 Received: from localhost.localdomain (c-71-237-161-120.hsd1.or.comcast.net [71.237.161.120]) by shiva.chad-versace.us (Postfix) with ESMTPSA id 176F8884E8; Fri, 26 Oct 2012 20:55:24 +0000 (UTC) From: Ben Widawsky To: intel-gfx@lists.freedesktop.org Date: Fri, 26 Oct 2012 13:54:24 -0700 Message-Id: <1351284864-3555-1-git-send-email-ben@bwidawsk.net> X-Mailer: git-send-email 1.8.0 In-Reply-To: <1350956055-3224-7-git-send-email-ben@bwidawsk.net> References: <1350956055-3224-7-git-send-email-ben@bwidawsk.net> Cc: Ben Widawsky Subject: [Intel-gfx] [PATCH 06/10 v2] drm/i915: Stop using AGP layer for GEN6+ X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org Errors-To: intel-gfx-bounces+patchwork-intel-gfx=patchwork.kernel.org@lists.freedesktop.org As a quick hack we make the old intel_gtt structure mutable so we can fool a bunch of the existing code which depends on elements in that data structure. We can/should try to remove this in a subsequent patch. This should preserve the old gtt init behavior which upon writing these patches seems incorrect. The next patch will fix these things. The one exception is VLV which doesn't have the preserved flush control write behavior. Since we want to do that for all GEN6+ stuff, we'll handle that in a later patch. Mainstream VLV support doesn't actually exist yet anyway. v2: Update the comment to remove the "voodoo" Check that the last pte written matches what we readback Signed-off-by: Ben Widawsky --- drivers/char/agp/intel-gtt.c | 2 +- drivers/gpu/drm/i915/i915_dma.c | 16 +-- drivers/gpu/drm/i915/i915_drv.h | 10 +- drivers/gpu/drm/i915/i915_gem.c | 12 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c | 2 +- drivers/gpu/drm/i915/i915_gem_gtt.c | 214 +++++++++++++++++++++++++++-- drivers/gpu/drm/i915/i915_reg.h | 6 + include/drm/intel-gtt.h | 3 +- 8 files changed, 234 insertions(+), 31 deletions(-) diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c index 38390f7..4dfbb80 100644 --- a/drivers/char/agp/intel-gtt.c +++ b/drivers/char/agp/intel-gtt.c @@ -1686,7 +1686,7 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev, } EXPORT_SYMBOL(intel_gmch_probe); -const struct intel_gtt *intel_gtt_get(void) +struct intel_gtt *intel_gtt_get(void) { return &intel_private.base; } diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index ffbc915..eda983b 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1492,19 +1492,9 @@ int i915_driver_load(struct drm_device *dev, unsigned long flags) goto free_priv; } - ret = intel_gmch_probe(dev_priv->bridge_dev, dev->pdev, NULL); - if (!ret) { - DRM_ERROR("failed to set up gmch\n"); - ret = -EIO; + ret = i915_gem_gtt_init(dev); + if (ret) goto put_bridge; - } - - dev_priv->mm.gtt = intel_gtt_get(); - if (!dev_priv->mm.gtt) { - DRM_ERROR("Failed to initialize GTT\n"); - ret = -ENODEV; - goto put_gmch; - } i915_kick_out_firmware_fb(dev_priv); @@ -1678,7 +1668,7 @@ out_mtrrfree: out_rmmap: pci_iounmap(dev->pdev, dev_priv->regs); put_gmch: - intel_gmch_remove(); + i915_gem_gtt_fini(dev); put_bridge: pci_dev_put(dev_priv->bridge_dev); free_priv: diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index bf628c4..534d282 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -687,7 +687,7 @@ typedef struct drm_i915_private { struct { /** Bridge to intel-gtt-ko */ - const struct intel_gtt *gtt; + struct intel_gtt *gtt; /** Memory allocator for GTT stolen memory */ struct drm_mm stolen; /** Memory allocator for GTT */ @@ -1507,6 +1507,14 @@ void i915_gem_init_global_gtt(struct drm_device *dev, unsigned long start, unsigned long mappable_end, unsigned long end); +int i915_gem_gtt_init(struct drm_device *dev); +void i915_gem_gtt_fini(struct drm_device *dev); +extern inline void i915_gem_chipset_flush(struct drm_device *dev) +{ + if (INTEL_INFO(dev)->gen < 6) + intel_gtt_chipset_flush(); +} + /* i915_gem_evict.c */ int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size, diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index e04820c..ae3d4c1 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -846,12 +846,12 @@ out: * domain anymore. */ if (obj->base.write_domain != I915_GEM_DOMAIN_CPU) { i915_gem_clflush_object(obj); - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(dev); } } if (needs_clflush_after) - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(dev); return ret; } @@ -3059,7 +3059,7 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj) return; i915_gem_clflush_object(obj); - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(obj->base.dev); old_write_domain = obj->base.write_domain; obj->base.write_domain = 0; @@ -3960,7 +3960,7 @@ i915_gem_init_hw(struct drm_device *dev) drm_i915_private_t *dev_priv = dev->dev_private; int ret; - if (!intel_enable_gtt()) + if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt()) return -EIO; if (IS_HASWELL(dev) && (I915_READ(0x120010) == 1)) @@ -4295,7 +4295,7 @@ void i915_gem_detach_phys_object(struct drm_device *dev, page_cache_release(page); } } - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(dev); obj->phys_obj->cur_obj = NULL; obj->phys_obj = NULL; @@ -4382,7 +4382,7 @@ i915_gem_phys_pwrite(struct drm_device *dev, return -EFAULT; } - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(dev); return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 6a2f3e5..1af00b5 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -673,7 +673,7 @@ i915_gem_execbuffer_move_to_gpu(struct intel_ring_buffer *ring, } if (flush_domains & I915_GEM_DOMAIN_CPU) - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(ring->dev); if (flush_domains & I915_GEM_DOMAIN_GTT) wmb(); diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c index 89c273f..96c8ef3 100644 --- a/drivers/gpu/drm/i915/i915_gem_gtt.c +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c @@ -305,13 +305,38 @@ static void undo_idling(struct drm_i915_private *dev_priv, bool interruptible) dev_priv->mm.interruptible = interruptible; } + +static void i915_ggtt_clear_range(struct drm_device *dev, + unsigned first_entry, + unsigned num_entries) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + gtt_pte_t scratch_pte; + volatile void __iomem *gtt_base = dev_priv->mm.gtt->gtt + first_entry; + const int max_entries = dev_priv->mm.gtt->gtt_total_entries - first_entry; + + if (INTEL_INFO(dev)->gen < 6) { + intel_gtt_clear_range(first_entry, num_entries); + return; + } + + if (WARN(num_entries > max_entries, + "First entry = %d; Num entries = %d (max=%d)\n", + first_entry, num_entries, max_entries)) + num_entries = max_entries; + + scratch_pte = pte_encode(dev, dev_priv->mm.gtt->scratch_page_dma, I915_CACHE_LLC); + memset_io(gtt_base, scratch_pte, num_entries * sizeof(scratch_pte)); + readl(gtt_base); +} + void i915_gem_restore_gtt_mappings(struct drm_device *dev) { struct drm_i915_private *dev_priv = dev->dev_private; struct drm_i915_gem_object *obj; /* First fill our portion of the GTT with scratch pages */ - intel_gtt_clear_range(dev_priv->mm.gtt_start / PAGE_SIZE, + i915_ggtt_clear_range(dev, dev_priv->mm.gtt_start / PAGE_SIZE, (dev_priv->mm.gtt_end - dev_priv->mm.gtt_start) / PAGE_SIZE); list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) { @@ -319,7 +344,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev) i915_gem_gtt_bind_object(obj, obj->cache_level); } - intel_gtt_chipset_flush(); + i915_gem_chipset_flush(dev); } int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj) @@ -335,21 +360,70 @@ int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj) return 0; } +/* + * Binds an object into the global gtt with the specified cache level. The object + * will be accessible to the GPU via commands whose operands reference offsets + * within the global GTT as well as accessible by the GPU through the GMADR + * mapped BAR (dev_priv->mm.gtt->gtt). + */ +static void gen6_ggtt_bind_object(struct drm_i915_gem_object *obj, + enum i915_cache_level level) +{ + struct drm_device *dev = obj->base.dev; + struct drm_i915_private *dev_priv = dev->dev_private; + struct sg_table *st = obj->pages; + struct scatterlist *sg = st->sgl; + const int first_entry = obj->gtt_space->start >> PAGE_SHIFT; + const int max_entries = dev_priv->mm.gtt->gtt_total_entries - first_entry; + gtt_pte_t __iomem *gtt_entries = dev_priv->mm.gtt->gtt + first_entry; + int unused, i = 0; + unsigned int len, m = 0; + dma_addr_t addr; + + for_each_sg(st->sgl, sg, st->nents, unused) { + len = sg_dma_len(sg) >> PAGE_SHIFT; + for (m = 0; m < len; m++) { + addr = sg_dma_address(sg) + (m << PAGE_SHIFT); + gtt_entries[i] = pte_encode(dev, addr, level); + i++; + if (WARN_ON(i > max_entries)) + goto out; + } + } + +out: + if (WARN(i == 0, "No ptes updated for %p\n", obj)) + return; + + /* XXX: This serves as a posting read to make sure that the PTE has + * actually been updated. There is some concern that even though + * registers and PTEs are within the same BAR that they are potentially + * of NUMA access patterns. Therefore, even with the way we assume + * hardware should work, we must keep this posting read for paranoia. + */ + WARN_ON(readl(>t_entries[i-1]) != pte_encode(dev, addr, level)); +} + void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj, enum i915_cache_level cache_level) { struct drm_device *dev = obj->base.dev; - unsigned int agp_type = cache_level_to_agp_type(dev, cache_level); + if (INTEL_INFO(dev)->gen < 6) { + unsigned int agp_type = cache_level_to_agp_type(dev, cache_level); + intel_gtt_insert_sg_entries(obj->pages, + obj->gtt_space->start >> PAGE_SHIFT, + agp_type); + } else { + gen6_ggtt_bind_object(obj, cache_level); + } - intel_gtt_insert_sg_entries(obj->pages, - obj->gtt_space->start >> PAGE_SHIFT, - agp_type); obj->has_global_gtt_mapping = 1; } void i915_gem_gtt_unbind_object(struct drm_i915_gem_object *obj) { - intel_gtt_clear_range(obj->gtt_space->start >> PAGE_SHIFT, + i915_ggtt_clear_range(obj->base.dev, + obj->gtt_space->start >> PAGE_SHIFT, obj->base.size >> PAGE_SHIFT); obj->has_global_gtt_mapping = 0; @@ -407,5 +481,129 @@ void i915_gem_init_global_gtt(struct drm_device *dev, dev_priv->mm.mappable_gtt_total = min(end, mappable_end) - start; /* ... but ensure that we clear the entire range. */ - intel_gtt_clear_range(start / PAGE_SIZE, (end-start) / PAGE_SIZE); + i915_ggtt_clear_range(dev, start / PAGE_SIZE, (end-start) / PAGE_SIZE); +} + +static int setup_scratch_page(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + struct page *page; + dma_addr_t dma_addr; + + page = alloc_page(GFP_KERNEL | GFP_DMA32 | __GFP_ZERO); + if (page == NULL) + return -ENOMEM; + get_page(page); + set_pages_uc(page, 1); + +#ifdef CONFIG_INTEL_IOMMU + dma_addr = pci_map_page(dev->pdev, page, 0, PAGE_SIZE, + PCI_DMA_BIDIRECTIONAL); + if (pci_dma_mapping_error(dev->pdev, dma_addr)) + return -EINVAL; +#else + dma_addr = page_to_phys(page); +#endif + dev_priv->mm.gtt->scratch_page = page; + dev_priv->mm.gtt->scratch_page_dma = dma_addr; + + return 0; +} + +static void teardown_scratch_page(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + set_pages_wb(dev_priv->mm.gtt->scratch_page, 1); + pci_unmap_page(dev->pdev, dev_priv->mm.gtt->scratch_page_dma, + PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); + put_page(dev_priv->mm.gtt->scratch_page); + __free_page(dev_priv->mm.gtt->scratch_page); +} + +static inline unsigned int gen6_get_total_gtt_size(u16 snb_gmch_ctl) +{ + snb_gmch_ctl >>= SNB_GMCH_GGMS_SHIFT; + snb_gmch_ctl &= SNB_GMCH_GGMS_MASK; + return snb_gmch_ctl << 20; +} + +static inline unsigned int gen6_get_stolen_size(u16 snb_gmch_ctl) +{ + snb_gmch_ctl >>= SNB_GMCH_GMS_SHIFT; + snb_gmch_ctl &= SNB_GMCH_GMS_MASK; + return snb_gmch_ctl << 25; /* 32 MB units */ +} + +int i915_gem_gtt_init(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + phys_addr_t gtt_bus_addr; + u16 snb_gmch_ctl; + u32 tmp; + int ret; + + /* On modern platforms we need not worry ourself with the legacy + * hostbridge query stuff. Skip it entirely + */ + if (INTEL_INFO(dev)->gen < 6) { + ret = intel_gmch_probe(dev_priv->bridge_dev, dev->pdev, NULL); + if (!ret) { + DRM_ERROR("failed to set up gmch\n"); + return -EIO; + } + + dev_priv->mm.gtt = intel_gtt_get(); + if (!dev_priv->mm.gtt) { + DRM_ERROR("Failed to initialize GTT\n"); + intel_gmch_remove(); + return -ENODEV; + } + return 0; + } + + + dev_priv->mm.gtt = kzalloc(sizeof(*dev_priv->mm.gtt), GFP_KERNEL); + if (!dev_priv->mm.gtt) + return -ENOMEM; + + if (!pci_set_dma_mask(dev->pdev, DMA_BIT_MASK(40))) + pci_set_consistent_dma_mask(dev->pdev, DMA_BIT_MASK(40)); + + pci_read_config_dword(dev->pdev, PCI_BASE_ADDRESS_0, &tmp); + /* For GEN6+ the PTEs for the ggtt live at 2MB + BAR0 */ + gtt_bus_addr = (tmp & PCI_BASE_ADDRESS_MEM_MASK) + (2<<20); + + dev_priv->mm.gtt->gtt_mappable_entries = pci_resource_len(dev->pdev, 2) >> PAGE_SHIFT; + pci_read_config_dword(dev->pdev, PCI_BASE_ADDRESS_2, &tmp); + dev_priv->mm.gtt->gma_bus_addr = tmp & PCI_BASE_ADDRESS_MEM_MASK; + + /* i9xx_setup */ + pci_read_config_word(dev->pdev, SNB_GMCH_CTRL, &snb_gmch_ctl); + dev_priv->mm.gtt->gtt_total_entries = + gen6_get_total_gtt_size(snb_gmch_ctl) / sizeof(gtt_pte_t); + dev_priv->mm.gtt->stolen_size = gen6_get_stolen_size(snb_gmch_ctl); + + ret = setup_scratch_page(dev); + if (ret) + return ret; + + dev_priv->mm.gtt->gtt = ioremap(gtt_bus_addr, + dev_priv->mm.gtt->gtt_total_entries * sizeof(gtt_pte_t)); + + /* GMADR is the PCI aperture used by SW to access tiled GFX surfaces in a linear fashion. */ + DRM_DEBUG_DRIVER("GMADR size = %dM\n", dev_priv->mm.gtt->gtt_mappable_entries >> 8); + DRM_DEBUG_DRIVER("GTT total entries = %dK\n", dev_priv->mm.gtt->gtt_total_entries >> 10); + DRM_DEBUG_DRIVER("GTT stolen size = %dM\n", dev_priv->mm.gtt->stolen_size >> 20); + + return 0; +} + +void i915_gem_gtt_fini(struct drm_device *dev) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + iounmap(dev_priv->mm.gtt->gtt); + teardown_scratch_page(dev); + if (INTEL_INFO(dev)->gen < 6) + intel_gmch_remove(); + kfree(dev_priv->mm.gtt); } diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index f4d70b0..10a6e9b 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -40,6 +40,12 @@ */ #define INTEL_GMCH_CTRL 0x52 #define INTEL_GMCH_VGA_DISABLE (1 << 1) +#define SNB_GMCH_CTRL 0x50 +#define SNB_GMCH_GGMS_SHIFT 8 /* GTT Graphics Memory Size */ +#define SNB_GMCH_GGMS_MASK 0x3 +#define SNB_GMCH_GMS_SHIFT 3 /* Graphics Mode Select */ +#define SNB_GMCH_GMS_MASK 0x1f + /* PCI config space */ diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h index 2e37e9f..94e8f2c 100644 --- a/include/drm/intel-gtt.h +++ b/include/drm/intel-gtt.h @@ -3,7 +3,7 @@ #ifndef _DRM_INTEL_GTT_H #define _DRM_INTEL_GTT_H -const struct intel_gtt { +struct intel_gtt { /* Size of memory reserved for graphics by the BIOS */ unsigned int stolen_size; /* Total number of gtt entries. */ @@ -17,6 +17,7 @@ const struct intel_gtt { unsigned int do_idle_maps : 1; /* Share the scratch page dma with ppgtts. */ dma_addr_t scratch_page_dma; + struct page *scratch_page; /* for ppgtt PDE access */ u32 __iomem *gtt; /* needed for ioremap in drm/i915 */