
drm/i915: Redirect GTT mappings to the CPU page if cache-coherent

Message ID 1302719752-11605-1-git-send-email-chris@chris-wilson.co.uk (mailing list archive)
State New, archived

Commit Message

Chris Wilson April 13, 2011, 6:35 p.m. UTC
... or if we will need to perform a cache-flush on the object anyway.
Unless, of course, we need to use a fence register to perform tiling
operations during the transfer (in which case we are no longer on a
chipset for which we need to be extra careful not to write through the
GTT to a snooped page).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |   41 ++++++++++++++++++++++++++++++++++++++-
 1 files changed, 40 insertions(+), 1 deletions(-)

Comments

Daniel Vetter April 13, 2011, 7:13 p.m. UTC | #1
On Wed, Apr 13, 2011 at 07:35:52PM +0100, Chris Wilson wrote:
> ... or if we will need to perform a cache-flush on the object anyway.
> Unless, of course, we need to use a fence register to perform tiling
> operations during the transfer (in which case we are no longer on a
> chipset for which we need to be extra careful not to write through the
> GTT to a snooped page).

So either we are on snb, where gtt writes should work on llc cached
objects (otherwise we'd have a giant problem with uploads to tiled
buffers). Or we are on pre-gen6, where tiling on snooped mem doesn't work
and we have a few other restrictions like this one. Either way userspace
needs to be aware of what's going on. Hence we might as well
SIGBUS/disallow gtt mappings for such vmapped buffers and teach userspace
to use the cpu mappings (again).

I don't know but maybe using snooped buffers to directly write to vbos and
stuff like that is better on snb. Currently we're using pwrite everywhere,
so again a userspace change seems required; why not use cpu mappings
directly?

Hence I'd like to weasel myself out from reviewing this: Do we really need
this complexity?
-Daniel
Chris Wilson April 13, 2011, 7:47 p.m. UTC | #2
On Wed, 13 Apr 2011 21:13:24 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> Hence I'd like to weasel myself out from reviewing this: Do we really need
> this complexity?

Good idea. At the moment I'd rather restrict this to the minimum needed to
protect ourselves against future breakage, so killing the driver/app
with a SIGBUS for doing something illegal sounds sane.
-Chris
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8b3007c..3c7443d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1211,12 +1211,43 @@  int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	trace_i915_gem_object_fault(obj, page_offset, true, write);
 
-	/* Now bind it into the GTT if needed */
 	if (!obj->map_and_fenceable) {
 		ret = i915_gem_object_unbind(obj);
 		if (ret)
 			goto unlock;
 	}
+
+	/* If it is unbound or we are currently writing through the CPU
+	 * domain, continue to do so.  On older chipsets it is
+	 * particularly important to avoid writing through the GTT to
+	 * snooped pages or face dire consequences. At least that's what
+	 * the docs say...
+	 */
+	if (obj->tiling_mode == I915_TILING_NONE &&
+	    (obj->cache_level != I915_CACHE_NONE ||
+	     obj->base.write_domain == I915_GEM_DOMAIN_CPU)) {
+		struct page *page;
+
+		ret = i915_gem_object_set_to_cpu_domain(obj, write);
+		if (ret)
+			goto unlock;
+
+		obj->dirty = 1;
+		obj->fault_mappable = true;
+		mutex_unlock(&dev->struct_mutex);
+
+		page = read_cache_page_gfp(obj->base.filp->f_path.dentry->d_inode->i_mapping,
+					   page_offset,
+					   GFP_HIGHUSER | __GFP_RECLAIMABLE);
+		if (IS_ERR(page)) {
+			ret = PTR_ERR(page);
+			goto out;
+		}
+
+		vmf->page = page;
+		return VM_FAULT_LOCKED;
+	}
+
 	if (!obj->gtt_space) {
 		ret = i915_gem_object_bind_to_gtt(obj, 0, true);
 		if (ret)
@@ -1699,6 +1730,11 @@  i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 {
 	struct inode *inode;
 
+	/* We may have inserted the backing pages into our vma
+	 * when fulfilling a pagefault whilst in the CPU domain.
+	 */
+	i915_gem_release_mmap(obj);
+
 	/* Our goal here is to return as much of the memory as
 	 * is possible back to the system as we are called from OOM.
 	 * To do this we must instruct the shmfs to drop all of its
@@ -3691,6 +3727,9 @@  void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	if (obj->phys_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
+	/* Discard all references to the backing storage for this object */
+	i915_gem_object_truncate(obj);
+
 	i915_gem_free_object_tail(obj);
 }