[6/6] drm/i915: Migrate stolen objects before hibernation

Message ID 1449665182-10054-7-git-send-email-ankitprasad.r.sharma@intel.com (mailing list archive)
State New, archived

Commit Message

ankitprasad.r.sharma@intel.com Dec. 9, 2015, 12:46 p.m. UTC
From: Chris Wilson <chris@chris-wilson.co.uk>

Ville reminded us that stolen memory is not preserved across
hibernation, and a result of this was that context objects now being
allocated from stolen were being corrupted on S4 and promptly hanging
the GPU on resume.

We want to utilise stolen for as much as possible (nothing else will use
that wasted memory otherwise), so we need a strategy for handling
general objects allocated from stolen and hibernation. A simple solution
is to do a CPU copy through the GTT of the stolen object into a fresh
shmemfs backing store and thenceforth treat it as a normal object. This
can be refined in future either to use a GPU copy to avoid the slow
uncached reads (though it's hibernation!) or to recreate stolen objects
upon resume/first-use. For now, a simple approach should suffice for
testing the object migration.

v2:
Swap PTE for pinned bindings over to the shmemfs. This adds a
complicated dance, but is required as many stolen objects are likely to
be pinned for use by the hardware. Swapping the PTEs should not result
in externally visible behaviour, as each PTE update should be atomic and
the two pages identical. (danvet)

safe-by-default, or the principle of least surprise. We need a new flag
to mark objects that we can wilfully discard and recreate across
hibernation. (danvet)

Just use the global_list rather than invent a new stolen_list. This is
the slowpath hibernate and so adding a new list and the associated
complexity isn't worth it.

v3: Rebased on drm-intel-nightly (Ankit)

v4: Use insert_page to map stolen memory backed pages for migration to
shmem (Chris)

v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
 drivers/gpu/drm/i915/i915_drv.h         |   7 +
 drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_display.c    |   3 +
 drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
 drivers/gpu/drm/i915/intel_pm.c         |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
 7 files changed, 261 insertions(+), 12 deletions(-)
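
For readers skimming the thread: the core of the patch is a per-page copy
of the stolen object out through a temporary GTT mapping into a fresh
shmemfs backing store. A simplified sketch of that loop, condensed from
the diff quoted in the replies below (error handling and the PTE
swap-over for pinned bindings are omitted, and the sketch reuses a single
scratch PTE at node.start on every pass):

	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
		struct page *page;
		void __iomem *src;
		void *dst;

		/* Repoint the scratch GTT PTE at stolen page i */
		i915->gtt.base.insert_page(&i915->gtt.base,
					   i915_gem_object_get_dma_address(obj, i),
					   node.start, I915_CACHE_NONE, 0);
		wmb();

		/* Allocate the matching shmemfs page and copy into it */
		page = shmem_read_mapping_page(mapping, i);
		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start);
		dst = kmap_atomic(page);
		memcpy_fromio(dst, src, PAGE_SIZE);	/* uncached read via GTT */
		kunmap_atomic(dst);
		io_mapping_unmap_atomic(src);
		page_cache_release(page);	/* drop the caller's reference */
	}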

Comments

Tvrtko Ursulin Dec. 9, 2015, 5:25 p.m. UTC | #1
Hi,

On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Ville reminded us that stolen memory is not preserved across
> hibernation, and a result of this was that context objects now being
> allocated from stolen were being corrupted on S4 and promptly hanging
> the GPU on resume.
>
> We want to utilise stolen for as much as possible (nothing else will use
> that wasted memory otherwise), so we need a strategy for handling
> general objects allocated from stolen and hibernation. A simple solution
> is to do a CPU copy through the GTT of the stolen object into a fresh
> shmemfs backing store and thenceforth treat it as a normal object. This
> can be refined in future either to use a GPU copy to avoid the slow
> uncached reads (though it's hibernation!) or to recreate stolen objects
> upon resume/first-use. For now, a simple approach should suffice for
> testing the object migration.
>
> v2:
> Swap PTE for pinned bindings over to the shmemfs. This adds a
> complicated dance, but is required as many stolen objects are likely to
> be pinned for use by the hardware. Swapping the PTEs should not result
> in externally visible behaviour, as each PTE update should be atomic and
> the two pages identical. (danvet)
>
> safe-by-default, or the principle of least surprise. We need a new flag
> to mark objects that we can wilfully discard and recreate across
> hibernation. (danvet)
>
> Just use the global_list rather than invent a new stolen_list. This is
> the slowpath hibernate and so adding a new list and the associated
> complexity isn't worth it.
>
> v3: Rebased on drm-intel-nightly (Ankit)
>
> v4: Use insert_page to map stolen memory backed pages for migration to
> shmem (Chris)
>
> v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
>   drivers/gpu/drm/i915/i915_drv.h         |   7 +
>   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/intel_display.c    |   3 +
>   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
>   drivers/gpu/drm/i915/intel_pm.c         |   2 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
>   7 files changed, 261 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 9f55209..2bb9e9e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>   	return i915_drm_suspend(drm_dev);
>   }
>
> +static int i915_pm_freeze(struct device *dev)
> +{
> +	int ret;
> +
> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> +	if (ret)
> +		return ret;

Can we distinguish between S3 and S4 if the stolen corruption only 
happens in S4? Not to spend all the extra effort for nothing in S3? Or 
maybe this is not even called for S3?

> +
> +	ret = i915_pm_suspend(dev);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
>   static int i915_pm_suspend_late(struct device *dev)
>   {
>   	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
> @@ -1700,7 +1715,7 @@ static const struct dev_pm_ops i915_pm_ops = {
>   	 * @restore, @restore_early : called after rebooting and restoring the
>   	 *                            hibernation image [PMSG_RESTORE]
>   	 */
> -	.freeze = i915_pm_suspend,
> +	.freeze = i915_pm_freeze,
>   	.freeze_late = i915_pm_suspend_late,
>   	.thaw_early = i915_pm_resume_early,
>   	.thaw = i915_pm_resume,
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e0b09b0..0d18b07 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2080,6 +2080,12 @@ struct drm_i915_gem_object {
>   	 * Advice: are the backing pages purgeable?
>   	 */
>   	unsigned int madv:2;
> +	/**
> +	 * Whereas madv is for userspace, there are certain situations
> +	 * where we want I915_MADV_DONTNEED behaviour on internal objects
> +	 * without conflating the userspace setting.
> +	 */
> +	unsigned int internal_volatile:1;
>
>   	/**
>   	 * Current tiling mode for the object.
> @@ -3006,6 +3012,7 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
>   void i915_gem_init_swizzling(struct drm_device *dev);
>   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
>   int __must_check i915_gpu_idle(struct drm_device *dev);
> +int __must_check i915_gem_freeze(struct drm_device *dev);
>   int __must_check i915_gem_suspend(struct drm_device *dev);
>   void __i915_add_request(struct drm_i915_gem_request *req,
>   			struct drm_i915_gem_object *batch_obj,
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 68ed67a..1f134b0 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4511,12 +4511,27 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
>   	.put_pages = i915_gem_object_put_pages_gtt,
>   };
>
> +static struct address_space *
> +i915_gem_set_inode_gfp(struct drm_device *dev, struct file *file)
> +{
> +	struct address_space *mapping = file_inode(file)->i_mapping;
> +	gfp_t mask;
> +
> +	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> +	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
> +		/* 965gm cannot relocate objects above 4GiB. */
> +		mask &= ~__GFP_HIGHMEM;
> +		mask |= __GFP_DMA32;
> +	}
> +	mapping_set_gfp_mask(mapping, mask);
> +
> +	return mapping;
> +}
> +
>   struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>   						  size_t size)
>   {
>   	struct drm_i915_gem_object *obj;
> -	struct address_space *mapping;
> -	gfp_t mask;
>   	int ret;
>
>   	obj = i915_gem_object_alloc(dev);
> @@ -4529,15 +4544,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>   		return ERR_PTR(ret);
>   	}
>
> -	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> -	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
> -		/* 965gm cannot relocate objects above 4GiB. */
> -		mask &= ~__GFP_HIGHMEM;
> -		mask |= __GFP_DMA32;
> -	}
> -
> -	mapping = file_inode(obj->base.filp)->i_mapping;
> -	mapping_set_gfp_mask(mapping, mask);
> +	i915_gem_set_inode_gfp(dev, obj->base.filp);
>
>   	i915_gem_object_init(obj, &i915_gem_object_ops);
>
> @@ -4714,6 +4721,209 @@ i915_gem_stop_ringbuffers(struct drm_device *dev)
>   		dev_priv->gt.stop_ring(ring);
>   }
>
> +static int
> +i915_gem_object_migrate_stolen_to_shmemfs(struct drm_i915_gem_object *obj)
> +{

Some documentation for this function would be good.

> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct i915_vma *vma, *vn;
> +	struct drm_mm_node node;
> +	struct file *file;
> +	struct address_space *mapping;
> +	struct sg_table *stolen_pages, *shmemfs_pages;
> +	int ret, i;
> +
> +	if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
> +		return -EINVAL;

I am no expert in hibernation or swizzling but this looks really bad to me.

It is both platform and user controlled and it will cause hibernation to 
fail in a very noisy way, correct?

At least it needs to be WARN_ON_ONCE, but if my thinking is correct it 
should really be that either:

a) hibernation is prevented in a quieter way (DRM_ERROR, once) 
altogether when dev_priv->mm.bit_6_swizzle_x == 
I915_BIT_6_SWIZZLE_9_10_17, or

b) set_tiling fails on the same platforms which also support hibernation.

Comments?

> +
> +	ret = i915_gem_object_set_to_gtt_domain(obj, false);
> +	if (ret)
> +		return ret;
> +
> +	file = shmem_file_setup("drm mm object", obj->base.size, VM_NORESERVE);
> +	if (IS_ERR(file))
> +		return PTR_ERR(file);
> +	mapping = i915_gem_set_inode_gfp(obj->base.dev, file);
> +
> +	list_for_each_entry_safe(vma, vn, &obj->vma_list, vma_link)
> +		if (i915_vma_unbind(vma))
> +			continue;
> +
> +	if (obj->madv != I915_MADV_WILLNEED && list_empty(&obj->vma_list)) {
> +		/* Discard the stolen reservation, and replace with
> +		 * an unpopulated shmemfs object.
> +		 */
> +		obj->madv = __I915_MADV_PURGED;
> +		goto swap_pages;
> +	}

Maybe put a comment before this block saying "no need to copy 
content/something for objects...", if I got it right.

> +
> +	/* stolen objects are already pinned to prevent shrinkage */
> +	memset(&node, 0, sizeof(node));
> +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm,
> +						  &node,
> +						  4096, 0, I915_CACHE_NONE,
> +						  0, i915->gtt.mappable_end,
> +						  DRM_MM_SEARCH_DEFAULT,
> +						  DRM_MM_CREATE_DEFAULT);
> +	if (ret)
> +		return ret;

If there is a likelihood the global GTT can be full, would it be worth
trying to evict something before attempting hibernation?

Also leaks file.

> +
> +	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
> +		struct page *page;
> +		void *__iomem src;
> +		void *dst;
> +
> +		wmb();

What is this one for? If it is for the memcpy_fromio would it be more 
obvious to put it after that call?

> +		i915->gtt.base.insert_page(&i915->gtt.base,
> +					   i915_gem_object_get_dma_address(obj, i),
> +					   node.start,
> +					   I915_CACHE_NONE,
> +					   0);
> +		wmb();
> +
> +		page = shmem_read_mapping_page(mapping, i);
> +		if (IS_ERR(page)) {
> +			ret = PTR_ERR(page);
> +			goto err_node;
> +		}
> +
> +		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start + PAGE_SIZE * i);
> +		dst = kmap_atomic(page);
> +		memcpy_fromio(dst, src, PAGE_SIZE);
> +		kunmap_atomic(dst);
> +		io_mapping_unmap_atomic(src);
> +
> +		page_cache_release(page);

I assume shmem_file_setup takes one reference to each page, 
shmem_read_mapping_page another and then here we release that extra one? Or?

> +	}
> +
> +	wmb();
> +	i915->gtt.base.clear_range(&i915->gtt.base,
> +				   node.start, node.size,
> +				   true);
> +	drm_mm_remove_node(&node);

Maybe move the whole copy content loop into a helper for readability?

> +
> +swap_pages:
> +	stolen_pages = obj->pages;
> +	obj->pages = NULL;
> +
> +	obj->base.filp = file;
> +	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> +	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> +
> +	/* Recreate any pinned binding with pointers to the new storage */
> +	if (!list_empty(&obj->vma_list)) {
> +		ret = i915_gem_object_get_pages_gtt(obj);
> +		if (ret) {
> +			obj->pages = stolen_pages;
> +			goto err_file;
> +		}
> +
> +		ret = i915_gem_object_set_to_gtt_domain(obj, true);
> +		if (ret) {
> +			i915_gem_object_put_pages_gtt(obj);
> +			obj->pages = stolen_pages;
> +			goto err_file;
> +		}
> +
> +		obj->get_page.sg = obj->pages->sgl;
> +		obj->get_page.last = 0;
> +
> +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +			if (!drm_mm_node_allocated(&vma->node))
> +				continue;
> +
> +			WARN_ON(i915_vma_bind(vma,
> +					      obj->cache_level,
> +					      PIN_UPDATE));
> +		}
> +	} else
> +		list_del(&obj->global_list);

Hm, can it be bound if there were no VMAs?

> +
> +	/* drop the stolen pin and backing */
> +	shmemfs_pages = obj->pages;
> +	obj->pages = stolen_pages;
> +
> +	i915_gem_object_unpin_pages(obj);
> +	obj->ops->put_pages(obj);
> +	if (obj->ops->release)
> +		obj->ops->release(obj);
> +
> +	obj->ops = &i915_gem_object_ops;
> +	obj->pages = shmemfs_pages;
> +
> +	return 0;
> +
> +err_node:
> +	wmb();
> +	i915->gtt.base.clear_range(&i915->gtt.base,
> +				   node.start, node.size,
> +				   true);
> +	drm_mm_remove_node(&node);
> +err_file:
> +	fput(file);
> +	obj->base.filp = NULL;
> +	return ret;
> +}
> +
> +int
> +i915_gem_freeze(struct drm_device *dev)
> +{
> +	/* Called before i915_gem_suspend() when hibernating */
> +	struct drm_i915_private *i915 = to_i915(dev);
> +	struct drm_i915_gem_object *obj, *tmp;
> +	struct list_head *phase[] = {
> +		&i915->mm.unbound_list, &i915->mm.bound_list, NULL
> +	}, **p;
> +	int ret;
> +
> +	ret = i915_mutex_lock_interruptible(dev);
> +	if (ret)
> +		return ret;
> +	/* Across hibernation, the stolen area is not preserved.
> +	 * Anything inside stolen must copied back to normal
> +	 * memory if we wish to preserve it.
> +	 */
> +	for (p = phase; *p; p++) {
> +		struct list_head migrate;
> +		int ret;
> +
> +		INIT_LIST_HEAD(&migrate);
> +		list_for_each_entry_safe(obj, tmp, *p, global_list) {
> +			if (obj->stolen == NULL)
> +				continue;
> +
> +			if (obj->internal_volatile)
> +				continue;
> +
> +			/* In the general case, this object may only be alive
> +			 * due to an active reference, and that may disappear
> +			 * when we unbind any of the objects (and so wait upon
> +			 * the GPU and retire requests). To prevent one of the
> +			 * objects from disappearing beneath us, we need to
> +			 * take a reference to each as we build the migration
> +			 * list.
> +			 *
> +			 * This is similar to the strategy required whilst
> +			 * shrinking or evicting objects (for the same reason).
> +			 */
> +			drm_gem_object_reference(&obj->base);
> +			list_move(&obj->global_list, &migrate);
> +		}
> +
> +		ret = 0;
> +		list_for_each_entry_safe(obj, tmp, &migrate, global_list) {
> +			if (ret == 0)
> +				ret = i915_gem_object_migrate_stolen_to_shmemfs(obj);
> +			drm_gem_object_unreference(&obj->base);
> +		}
> +		list_splice(&migrate, *p);

Hmmm, are these some clever games with obj->global_list?

> +		if (ret)
> +			break;
> +	}
> +
> +	mutex_unlock(&dev->struct_mutex);
> +	return ret;
> +}
> +
>   int
>   i915_gem_suspend(struct drm_device *dev)
>   {
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index f281e0b..0803922 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -2549,6 +2549,9 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
>   	if (IS_ERR(obj))
>   		return false;
>
> +	/* Not to be preserved across hibernation */
> +	obj->internal_volatile = true;
> +
>   	obj->tiling_mode = plane_config->tiling;
>   	if (obj->tiling_mode == I915_TILING_X)
>   		obj->stride = fb->pitches[0];
> diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> index f43681e..1d89253 100644
> --- a/drivers/gpu/drm/i915/intel_fbdev.c
> +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> @@ -154,6 +154,12 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
>   		goto out;
>   	}
>
> +	/* Discard the contents of the BIOS fb across hibernation.
> +	 * We really want to completely throwaway the earlier fbdev
> +	 * and reconfigure it anyway.
> +	 */
> +	obj->internal_volatile = true;
> +
>   	fb = __intel_framebuffer_create(dev, &mode_cmd, obj);
>   	if (IS_ERR(fb)) {
>   		ret = PTR_ERR(fb);
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 03ad276..6ddc20a 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5181,6 +5181,8 @@ static void valleyview_setup_pctx(struct drm_device *dev)
>   	I915_WRITE(VLV_PCBR, pctx_paddr);
>
>   out:
> +	/* The power context need not be preserved across hibernation */
> +	pctx->internal_volatile = true;
>   	DRM_DEBUG_DRIVER("PCBR: 0x%08x\n", I915_READ(VLV_PCBR));
>   	dev_priv->vlv_pctx = pctx;
>   }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 5eabaf6..370d96a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2090,6 +2090,12 @@ static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
>   	if (IS_ERR(obj))
>   		return PTR_ERR(obj);
>
> +	/* Ringbuffer objects are by definition volatile - only the commands
> +	 * between HEAD and TAIL need to be preserved and whilst there are
> +	 * any commands there, the ringbuffer is pinned by activity.
> +	 */
> +	obj->internal_volatile = true;
> +

What does this mean? It gets correctly re-initialized by existing code 
on resume? Don't see anything specific about HEAD and TAIL in this patch.

>   	/* mark ring buffers as read-only from GPU side by default */
>   	obj->gt_ro = 1;
>
>

Regards,

Tvrtko
Ville Syrjälä Dec. 9, 2015, 7:24 p.m. UTC | #2
On Wed, Dec 09, 2015 at 05:25:19PM +0000, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Ville reminded us that stolen memory is not preserved across
> > hibernation, and a result of this was that context objects now being
> > allocated from stolen were being corrupted on S4 and promptly hanging
> > the GPU on resume.
> >
> > We want to utilise stolen for as much as possible (nothing else will use
> > that wasted memory otherwise), so we need a strategy for handling
> > general objects allocated from stolen and hibernation. A simple solution
> > is to do a CPU copy through the GTT of the stolen object into a fresh
> > shmemfs backing store and thenceforth treat it as a normal object. This
> > can be refined in future either to use a GPU copy to avoid the slow
> > uncached reads (though it's hibernation!) or to recreate stolen objects
> > upon resume/first-use. For now, a simple approach should suffice for
> > testing the object migration.
> >
> > v2:
> > Swap PTE for pinned bindings over to the shmemfs. This adds a
> > complicated dance, but is required as many stolen objects are likely to
> > be pinned for use by the hardware. Swapping the PTEs should not result
> > in externally visible behaviour, as each PTE update should be atomic and
> > the two pages identical. (danvet)
> >
> > safe-by-default, or the principle of least surprise. We need a new flag
> > to mark objects that we can wilfully discard and recreate across
> > hibernation. (danvet)
> >
> > Just use the global_list rather than invent a new stolen_list. This is
> > the slowpath hibernate and so adding a new list and the associated
> > complexity isn't worth it.
> >
> > v3: Rebased on drm-intel-nightly (Ankit)
> >
> > v4: Use insert_page to map stolen memory backed pages for migration to
> > shmem (Chris)
> >
> > v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
> >   drivers/gpu/drm/i915/i915_drv.h         |   7 +
> >   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
> >   drivers/gpu/drm/i915/intel_display.c    |   3 +
> >   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
> >   drivers/gpu/drm/i915/intel_pm.c         |   2 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
> >   7 files changed, 261 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 9f55209..2bb9e9e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
> >   	return i915_drm_suspend(drm_dev);
> >   }
> >
> > +static int i915_pm_freeze(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> > +	if (ret)
> > +		return ret;
> 
> Can we distinguish between S3 and S4 if the stolen corruption only 
> happens in S4? Not to spend all the extra effort for nothing in S3? Or 
> maybe this is not even called for S3?

The hook is only for hibernation as explained in the nice comment
Imre added next to the function pointer assignments.

It actually gets called for both the freeze and quiesce transitions.
We should only need it for freeze. I'm not sure if the PMSG_ thing
gets stored anywhere that we could look it up and skip this for 
quiesce. And not sure if anyone really cares that much. I don't,
since I don't even load i915 for the loader kernel.

https://bugs.freedesktop.org/show_bug.cgi?id=91295 actually says
we might need this for S3 too if rapid start is enabled. I have a
laptop that supports it, but I don't have a clue what kind of
partition it would need. Not that I would be willing to repartition
the disk anyway. Judging by drivers/platform/x86/intel-rst.c,
maybe we could just look for the INT3392 ACPI device, or something?
Dave Gordon Dec. 9, 2015, 7:35 p.m. UTC | #3
On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Ville reminded us that stolen memory is not preserved across
> hibernation, and a result of this was that context objects now being
> allocated from stolen were being corrupted on S4 and promptly hanging
> the GPU on resume.
>
> We want to utilise stolen for as much as possible (nothing else will use
> that wasted memory otherwise), so we need a strategy for handling
> general objects allocated from stolen and hibernation. A simple solution
> is to do a CPU copy through the GTT of the stolen object into a fresh
> shmemfs backing store and thenceforth treat it as a normal object. This
> can be refined in future either to use a GPU copy to avoid the slow
> uncached reads (though it's hibernation!) or to recreate stolen objects
> upon resume/first-use. For now, a simple approach should suffice for
> testing the object migration.
>
> v2:
> Swap PTE for pinned bindings over to the shmemfs. This adds a
> complicated dance, but is required as many stolen objects are likely to
> be pinned for use by the hardware. Swapping the PTEs should not result
> in externally visible behaviour, as each PTE update should be atomic and
> the two pages identical. (danvet)
>
> safe-by-default, or the principle of least surprise. We need a new flag
> to mark objects that we can wilfully discard and recreate across
> hibernation. (danvet)
>
> Just use the global_list rather than invent a new stolen_list. This is
> the slowpath hibernate and so adding a new list and the associated
> complexity isn't worth it.
>
> v3: Rebased on drm-intel-nightly (Ankit)
>
> v4: Use insert_page to map stolen memory backed pages for migration to
> shmem (Chris)
>
> v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
>   drivers/gpu/drm/i915/i915_drv.h         |   7 +
>   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/intel_display.c    |   3 +
>   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
>   drivers/gpu/drm/i915/intel_pm.c         |   2 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
>   7 files changed, 261 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 9f55209..2bb9e9e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>   	return i915_drm_suspend(drm_dev);
>   }
>
> +static int i915_pm_freeze(struct device *dev)
> +{
> +	int ret;
> +
> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> +	if (ret)
> +		return ret;
> +
> +	ret = i915_pm_suspend(dev);
> +	if (ret)
> +		return ret;
> +
> +	return 0;
> +}
> +
>   static int i915_pm_suspend_late(struct device *dev)
>   {
>   	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
> @@ -1700,7 +1715,7 @@ static const struct dev_pm_ops i915_pm_ops = {
>   	 * @restore, @restore_early : called after rebooting and restoring the
>   	 *                            hibernation image [PMSG_RESTORE]
>   	 */
> -	.freeze = i915_pm_suspend,
> +	.freeze = i915_pm_freeze,
>   	.freeze_late = i915_pm_suspend_late,
>   	.thaw_early = i915_pm_resume_early,
>   	.thaw = i915_pm_resume,
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e0b09b0..0d18b07 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2080,6 +2080,12 @@ struct drm_i915_gem_object {
>   	 * Advice: are the backing pages purgeable?
>   	 */
>   	unsigned int madv:2;
> +	/**
> +	 * Whereas madv is for userspace, there are certain situations
> +	 * where we want I915_MADV_DONTNEED behaviour on internal objects
> +	 * without conflating the userspace setting.
> +	 */
> +	unsigned int internal_volatile:1;

Does this new flag need to be examined by other code that currently 
checks 'madv', e.g. put_pages()? Or does this indicate 
not-really-volatile-in-normal-use-only-across-hibernation?

.Dave.
Tvrtko Ursulin Dec. 10, 2015, 9:43 a.m. UTC | #4
Hi,

Two more comments below:

On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
>
> Ville reminded us that stolen memory is not preserved across
> hibernation, and a result of this was that context objects now being
> allocated from stolen were being corrupted on S4 and promptly hanging
> the GPU on resume.
>
> We want to utilise stolen for as much as possible (nothing else will use
> that wasted memory otherwise), so we need a strategy for handling
> general objects allocated from stolen and hibernation. A simple solution
> is to do a CPU copy through the GTT of the stolen object into a fresh
> shmemfs backing store and thenceforth treat it as a normal object. This
> can be refined in future either to use a GPU copy to avoid the slow
> uncached reads (though it's hibernation!) or to recreate stolen objects
> upon resume/first-use. For now, a simple approach should suffice for
> testing the object migration.

Mention of "testing" in the commit message and absence of a path to 
migrate the objects back to stolen memory on resume makes me think this 
is kind of half finished and not really ready for review / merge?

Because I don't see how it is useful to migrate it one way and never 
move back?

>
> v2:
> Swap PTE for pinned bindings over to the shmemfs. This adds a
> complicated dance, but is required as many stolen objects are likely to
> be pinned for use by the hardware. Swapping the PTEs should not result
> in externally visible behaviour, as each PTE update should be atomic and
> the two pages identical. (danvet)
>
> safe-by-default, or the principle of least surprise. We need a new flag
> to mark objects that we can wilfully discard and recreate across
> hibernation. (danvet)
>
> Just use the global_list rather than invent a new stolen_list. This is
> the slowpath hibernate and so adding a new list and the associated
> complexity isn't worth it.
>
> v3: Rebased on drm-intel-nightly (Ankit)
>
> v4: Use insert_page to map stolen memory backed pages for migration to
> shmem (Chris)
>
> v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
>   drivers/gpu/drm/i915/i915_drv.h         |   7 +
>   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/intel_display.c    |   3 +
>   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
>   drivers/gpu/drm/i915/intel_pm.c         |   2 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
>   7 files changed, 261 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 9f55209..2bb9e9e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>   	return i915_drm_suspend(drm_dev);
>   }
>
> +static int i915_pm_freeze(struct device *dev)
> +{
> +	int ret;
> +
> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> +	if (ret)
> +		return ret;

One of the first steps in idling GEM seems to be idling the GPU and 
retiring requests.

Would it also make sense to do those steps before attempting to migrate 
the stolen objects?

Regards,

Tvrtko
ankitprasad.r.sharma@intel.com Dec. 10, 2015, 1:17 p.m. UTC | #5
On Wed, 2015-12-09 at 17:25 +0000, Tvrtko Ursulin wrote:
> Hi,
> 
> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Ville reminded us that stolen memory is not preserved across
> > hibernation, and a result of this was that context objects now being
> > allocated from stolen were being corrupted on S4 and promptly hanging
> > the GPU on resume.
> >
> > We want to utilise stolen for as much as possible (nothing else will use
> > that wasted memory otherwise), so we need a strategy for handling
> > general objects allocated from stolen and hibernation. A simple solution
> > is to do a CPU copy through the GTT of the stolen object into a fresh
> > shmemfs backing store and thenceforth treat it as a normal object. This
> > can be refined in future either to use a GPU copy to avoid the slow
> > uncached reads (though it's hibernation!) or to recreate stolen objects
> > upon resume/first-use. For now, a simple approach should suffice for
> > testing the object migration.
> >
> > v2:
> > Swap PTE for pinned bindings over to the shmemfs. This adds a
> > complicated dance, but is required as many stolen objects are likely to
> > be pinned for use by the hardware. Swapping the PTEs should not result
> > in externally visible behaviour, as each PTE update should be atomic and
> > the two pages identical. (danvet)
> >
> > safe-by-default, or the principle of least surprise. We need a new flag
> > to mark objects that we can wilfully discard and recreate across
> > hibernation. (danvet)
> >
> > Just use the global_list rather than invent a new stolen_list. This is
> > the slowpath hibernate and so adding a new list and the associated
> > complexity isn't worth it.
> >
> > v3: Rebased on drm-intel-nightly (Ankit)
> >
> > v4: Use insert_page to map stolen memory backed pages for migration to
> > shmem (Chris)
> >
> > v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
> >   drivers/gpu/drm/i915/i915_drv.h         |   7 +
> >   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
> >   drivers/gpu/drm/i915/intel_display.c    |   3 +
> >   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
> >   drivers/gpu/drm/i915/intel_pm.c         |   2 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
> >   7 files changed, 261 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 9f55209..2bb9e9e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
> >   	return i915_drm_suspend(drm_dev);
> >   }
> >
> > +static int i915_pm_freeze(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> > +	if (ret)
> > +		return ret;
> 
> Can we distinguish between S3 and S4 if the stolen corruption only 
> happens in S4? Not to spend all the extra effort for nothing in S3? Or 
> maybe this is not even called for S3?
For S3, i915_pm_suspend will be called. 
i915_pm_freeze will be called in the hibernation (which corresponds to
S4?)
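
To restate the wiring (summarising the dev_pm_ops hunk above; the
.suspend assignment is not shown in the quoted diff but is the existing
driver behaviour):

	/*
	 * S3, suspend-to-RAM: .suspend = i915_pm_suspend
	 * S4, hibernate:      .freeze  = i915_pm_freeze, which calls
	 *                     i915_gem_freeze() and then i915_pm_suspend()
	 */

So the migration cost is only paid on the hibernation path.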

> 
> > +
> > +	ret = i915_pm_suspend(dev);
> > +	if (ret)
> > +		return ret;
> > +
> > +	return 0;
> > +}
> > +
> >   static int i915_pm_suspend_late(struct device *dev)
> >   {
> >   	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
> > @@ -1700,7 +1715,7 @@ static const struct dev_pm_ops i915_pm_ops = {
> >   	 * @restore, @restore_early : called after rebooting and restoring the
> >   	 *                            hibernation image [PMSG_RESTORE]
> >   	 */
> > -	.freeze = i915_pm_suspend,
> > +	.freeze = i915_pm_freeze,
> >   	.freeze_late = i915_pm_suspend_late,
> >   	.thaw_early = i915_pm_resume_early,
> >   	.thaw = i915_pm_resume,
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index e0b09b0..0d18b07 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -2080,6 +2080,12 @@ struct drm_i915_gem_object {
> >   	 * Advice: are the backing pages purgeable?
> >   	 */
> >   	unsigned int madv:2;
> > +	/**
> > +	 * Whereas madv is for userspace, there are certain situations
> > +	 * where we want I915_MADV_DONTNEED behaviour on internal objects
> > +	 * without conflating the userspace setting.
> > +	 */
> > +	unsigned int internal_volatile:1;
> >
> >   	/**
> >   	 * Current tiling mode for the object.
> > @@ -3006,6 +3012,7 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
> >   void i915_gem_init_swizzling(struct drm_device *dev);
> >   void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
> >   int __must_check i915_gpu_idle(struct drm_device *dev);
> > +int __must_check i915_gem_freeze(struct drm_device *dev);
> >   int __must_check i915_gem_suspend(struct drm_device *dev);
> >   void __i915_add_request(struct drm_i915_gem_request *req,
> >   			struct drm_i915_gem_object *batch_obj,
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 68ed67a..1f134b0 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4511,12 +4511,27 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
> >   	.put_pages = i915_gem_object_put_pages_gtt,
> >   };
> >
> > +static struct address_space *
> > +i915_gem_set_inode_gfp(struct drm_device *dev, struct file *file)
> > +{
> > +	struct address_space *mapping = file_inode(file)->i_mapping;
> > +	gfp_t mask;
> > +
> > +	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> > +	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
> > +		/* 965gm cannot relocate objects above 4GiB. */
> > +		mask &= ~__GFP_HIGHMEM;
> > +		mask |= __GFP_DMA32;
> > +	}
> > +	mapping_set_gfp_mask(mapping, mask);
> > +
> > +	return mapping;
> > +}
> > +
> >   struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
> >   						  size_t size)
> >   {
> >   	struct drm_i915_gem_object *obj;
> > -	struct address_space *mapping;
> > -	gfp_t mask;
> >   	int ret;
> >
> >   	obj = i915_gem_object_alloc(dev);
> > @@ -4529,15 +4544,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
> >   		return ERR_PTR(ret);
> >   	}
> >
> > -	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
> > -	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
> > -		/* 965gm cannot relocate objects above 4GiB. */
> > -		mask &= ~__GFP_HIGHMEM;
> > -		mask |= __GFP_DMA32;
> > -	}
> > -
> > -	mapping = file_inode(obj->base.filp)->i_mapping;
> > -	mapping_set_gfp_mask(mapping, mask);
> > +	i915_gem_set_inode_gfp(dev, obj->base.filp);
> >
> >   	i915_gem_object_init(obj, &i915_gem_object_ops);
> >
> > @@ -4714,6 +4721,209 @@ i915_gem_stop_ringbuffers(struct drm_device *dev)
> >   		dev_priv->gt.stop_ring(ring);
> >   }
> >
> > +static int
> > +i915_gem_object_migrate_stolen_to_shmemfs(struct drm_i915_gem_object *obj)
> > +{
> 
> Some documentation for this function would be good.
Yes, will do it.
> 
> > +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> > +	struct i915_vma *vma, *vn;
> > +	struct drm_mm_node node;
> > +	struct file *file;
> > +	struct address_space *mapping;
> > +	struct sg_table *stolen_pages, *shmemfs_pages;
> > +	int ret, i;
> > +
> > +	if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
> > +		return -EINVAL;
> 
> I am no expert in hibernation or swizzling but this looks really bad to me.
> 
> It is both platform and user controlled and it will cause hibernation to 
> fail in a very noisy way, correct?
> 
> At least it needs to be WARN_ON_ONCE, but if my thinking is correct it 
> should really be that either:
> 
> a) hibernation is prevented in a quieter way (DRM_ERROR, once) 
> altogether when dev_priv->mm.bit_6_swizzle_x == 
> I915_BIT_6_SWIZZLE_9_10_17, or
> 
> b) set_tiling fails on the same platforms which also support hibernation.
> 
Either we can disallow the set_tiling call if both swizzling and
hibernation are supported on the platform, or we can exit quietly.
Chris can further suggest on this.
> Comments?
> 
> > +
> > +	ret = i915_gem_object_set_to_gtt_domain(obj, false);
> > +	if (ret)
> > +		return ret;
> > +
> > +	file = shmem_file_setup("drm mm object", obj->base.size, VM_NORESERVE);
> > +	if (IS_ERR(file))
> > +		return PTR_ERR(file);
> > +	mapping = i915_gem_set_inode_gfp(obj->base.dev, file);
> > +
> > +	list_for_each_entry_safe(vma, vn, &obj->vma_list, vma_link)
> > +		if (i915_vma_unbind(vma))
> > +			continue;
> > +
> > +	if (obj->madv != I915_MADV_WILLNEED && list_empty(&obj->vma_list)) {
> > +		/* Discard the stolen reservation, and replace with
> > +		 * an unpopulated shmemfs object.
> > +		 */
> > +		obj->madv = __I915_MADV_PURGED;
> > +		goto swap_pages;
> > +	}
> 
> Maybe put a comment before this block saying "no need to copy 
> content/something for objects...", if I got it right.
> 
> > +
> > +	/* stolen objects are already pinned to prevent shrinkage */
> > +	memset(&node, 0, sizeof(node));
> > +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm,
> > +						  &node,
> > +						  4096, 0, I915_CACHE_NONE,
> > +						  0, i915->gtt.mappable_end,
> > +						  DRM_MM_SEARCH_DEFAULT,
> > +						  DRM_MM_CREATE_DEFAULT);
> > +	if (ret)
> > +		return ret;
> 
> If there is a likelihood the global GTT can be full, would it be worth
> trying to evict something before attempting hibernation?
Yes, but it is very unlikely to happen.
> 
> Also leaks file.
Yes, will fix this.
> 
> > +
> > +	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
> > +		struct page *page;
> > +		void *__iomem src;
> > +		void *dst;
> > +
> > +		wmb();
> 
> What is this one for? If it is for the memcpy_fromio would it be more 
> obvious to put it after that call?
As replied in a separate mail, this is to ensure that program order is
followed strictly, with minimal reordering during this loop.
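
In other words, the intended per-iteration ordering (annotating the
existing code quoted below, not new logic):

	wmb();		/* order prior CPU writes before the PTE update */
	i915->gtt.base.insert_page(&i915->gtt.base,
				   i915_gem_object_get_dma_address(obj, i),
				   node.start, I915_CACHE_NONE, 0);
	wmb();		/* make the PTE write visible before reading through it */
	memcpy_fromio(dst, src, PAGE_SIZE);	/* now safe to read the stolen page */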
> 
> > +		i915->gtt.base.insert_page(&i915->gtt.base,
> > +					   i915_gem_object_get_dma_address(obj, i),
> > +					   node.start,
> > +					   I915_CACHE_NONE,
> > +					   0);
> > +		wmb();
> > +
> > +		page = shmem_read_mapping_page(mapping, i);
> > +		if (IS_ERR(page)) {
> > +			ret = PTR_ERR(page);
> > +			goto err_node;
> > +		}
> > +
> > +		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start + PAGE_SIZE * i);
> > +		dst = kmap_atomic(page);
> > +		memcpy_fromio(dst, src, PAGE_SIZE);
> > +		kunmap_atomic(dst);
> > +		io_mapping_unmap_atomic(src);
> > +
> > +		page_cache_release(page);
> 
> I assume shmem_file_setup takes one reference to each page, 
> shmem_read_mapping_page another and then here we release that extra one? Or?
> 
1. Only file instantiation happens during shmem_file_setup, no pages are
allocated.
2. shmem_read_mapping_page does the allocation of pages and returns with
a refcount of 2 (1 for shmem-internal purpose and another for the
driver/caller)
3. page_cache_release releases the refcount from the driver side as we
don't necessarily want the new page to be pinned in RAM, once copy is
done.
4. Later on, if we need to pin the object, get_pages_gtt again increases
the refcount to 2. 
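
Annotated as a sketch for clarity (these are the calls already in the
patch; the comments restate the refcounting described above):

	file = shmem_file_setup("drm mm object", size, VM_NORESERVE);
	/* (1) inode created, no pages allocated yet */

	page = shmem_read_mapping_page(mapping, i);
	/* (2) page allocated; refcount 2: one reference held by the
	 *     page cache, one by us, the caller */

	/* ... copy the stolen contents into the page ... */

	page_cache_release(page);
	/* (3) drop the caller reference so the page is not pinned in RAM;
	 * (4) a later i915_gem_object_get_pages_gtt() raises the refcount
	 *     again when the object is pinned */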
> > +	}
> > +
> > +	wmb();
> > +	i915->gtt.base.clear_range(&i915->gtt.base,
> > +				   node.start, node.size,
> > +				   true);
> > +	drm_mm_remove_node(&node);
> 
> Maybe move the whole copy content loop into a helper for readability?
This can be done.
> 
> > +
> > +swap_pages:
> > +	stolen_pages = obj->pages;
> > +	obj->pages = NULL;
> > +
> > +	obj->base.filp = file;
> > +	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> > +	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
> > +
> > +	/* Recreate any pinned binding with pointers to the new storage */
> > +	if (!list_empty(&obj->vma_list)) {
> > +		ret = i915_gem_object_get_pages_gtt(obj);
> > +		if (ret) {
> > +			obj->pages = stolen_pages;
> > +			goto err_file;
> > +		}
> > +
> > +		ret = i915_gem_object_set_to_gtt_domain(obj, true);
> > +		if (ret) {
> > +			i915_gem_object_put_pages_gtt(obj);
> > +			obj->pages = stolen_pages;
> > +			goto err_file;
> > +		}
> > +
> > +		obj->get_page.sg = obj->pages->sgl;
> > +		obj->get_page.last = 0;
> > +
> > +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> > +			if (!drm_mm_node_allocated(&vma->node))
> > +				continue;
> > +
> > +			WARN_ON(i915_vma_bind(vma,
> > +					      obj->cache_level,
> > +					      PIN_UPDATE));
> > +		}
> > +	} else
> > +		list_del(&obj->global_list);
> 
> Hm, can it be bound if there were no VMAs?
Object will not be bound as there are no VMAs, but 'obj->global_list'
will be part of the unbound list and we need to unlink it from there.
> 
> > +
> > +	/* drop the stolen pin and backing */
> > +	shmemfs_pages = obj->pages;
> > +	obj->pages = stolen_pages;
> > +
> > +	i915_gem_object_unpin_pages(obj);
> > +	obj->ops->put_pages(obj);
> > +	if (obj->ops->release)
> > +		obj->ops->release(obj);
> > +
> > +	obj->ops = &i915_gem_object_ops;
> > +	obj->pages = shmemfs_pages;
> > +
> > +	return 0;
> > +
> > +err_node:
> > +	wmb();
> > +	i915->gtt.base.clear_range(&i915->gtt.base,
> > +				   node.start, node.size,
> > +				   true);
> > +	drm_mm_remove_node(&node);
> > +err_file:
> > +	fput(file);
> > +	obj->base.filp = NULL;
> > +	return ret;
> > +}
> > +
> > +int
> > +i915_gem_freeze(struct drm_device *dev)
> > +{
> > +	/* Called before i915_gem_suspend() when hibernating */
> > +	struct drm_i915_private *i915 = to_i915(dev);
> > +	struct drm_i915_gem_object *obj, *tmp;
> > +	struct list_head *phase[] = {
> > +		&i915->mm.unbound_list, &i915->mm.bound_list, NULL
> > +	}, **p;
> > +	int ret;
> > +
> > +	ret = i915_mutex_lock_interruptible(dev);
> > +	if (ret)
> > +		return ret;
> > +	/* Across hibernation, the stolen area is not preserved.
> > +	 * Anything inside stolen must copied back to normal
> > +	 * memory if we wish to preserve it.
> > +	 */
> > +	for (p = phase; *p; p++) {
> > +		struct list_head migrate;
> > +		int ret;
> > +
> > +		INIT_LIST_HEAD(&migrate);
> > +		list_for_each_entry_safe(obj, tmp, *p, global_list) {
> > +			if (obj->stolen == NULL)
> > +				continue;
> > +
> > +			if (obj->internal_volatile)
> > +				continue;
> > +
> > +			/* In the general case, this object may only be alive
> > +			 * due to an active reference, and that may disappear
> > +			 * when we unbind any of the objects (and so wait upon
> > +			 * the GPU and retire requests). To prevent one of the
> > +			 * objects from disappearing beneath us, we need to
> > +			 * take a reference to each as we build the migration
> > +			 * list.
> > +			 *
> > +			 * This is similar to the strategy required whilst
> > +			 * shrinking or evicting objects (for the same reason).
> > +			 */
> > +			drm_gem_object_reference(&obj->base);
> > +			list_move(&obj->global_list, &migrate);
> > +		}
> > +
> > +		ret = 0;
> > +		list_for_each_entry_safe(obj, tmp, &migrate, global_list) {
> > +			if (ret == 0)
> > +				ret = i915_gem_object_migrate_stolen_to_shmemfs(obj);
> > +			drm_gem_object_unreference(&obj->base);
> > +		}
> > +		list_splice(&migrate, *p);
> 
> Hmmm, are these some clever games with obj->global_list?
If the migration was unsuccessful, we are just moving the objects back
to their original list (bound or unbound).
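
That is, the pattern is (restating the loop above with comments; note
that i915_gem_object_migrate_stolen_to_shmemfs() itself unlinks unbound
objects, so the splice only returns what is still on the private list):

	INIT_LIST_HEAD(&migrate);
	list_for_each_entry_safe(obj, tmp, *p, global_list) {
		drm_gem_object_reference(&obj->base);	/* keep obj alive */
		list_move(&obj->global_list, &migrate);	/* onto private list */
	}

	ret = 0;
	list_for_each_entry_safe(obj, tmp, &migrate, global_list) {
		if (ret == 0)
			ret = i915_gem_object_migrate_stolen_to_shmemfs(obj);
		drm_gem_object_unreference(&obj->base);
	}
	list_splice(&migrate, *p);	/* back onto the bound/unbound list */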
> 
> > +		if (ret)
> > +			break;
> > +	}
> > +
> > +	mutex_unlock(&dev->struct_mutex);
> > +	return ret;
> > +}
> > +
> >   int
> >   i915_gem_suspend(struct drm_device *dev)
> >   {
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index f281e0b..0803922 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -2549,6 +2549,9 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
> >   	if (IS_ERR(obj))
> >   		return false;
> >
> > +	/* Not to be preserved across hibernation */
> > +	obj->internal_volatile = true;
> > +
> >   	obj->tiling_mode = plane_config->tiling;
> >   	if (obj->tiling_mode == I915_TILING_X)
> >   		obj->stride = fb->pitches[0];
> > diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
> > index f43681e..1d89253 100644
> > --- a/drivers/gpu/drm/i915/intel_fbdev.c
> > +++ b/drivers/gpu/drm/i915/intel_fbdev.c
> > @@ -154,6 +154,12 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
> >   		goto out;
> >   	}
> >
> > +	/* Discard the contents of the BIOS fb across hibernation.
> > +	 * We really want to completely throwaway the earlier fbdev
> > +	 * and reconfigure it anyway.
> > +	 */
> > +	obj->internal_volatile = true;
> > +
> >   	fb = __intel_framebuffer_create(dev, &mode_cmd, obj);
> >   	if (IS_ERR(fb)) {
> >   		ret = PTR_ERR(fb);
> > diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> > index 03ad276..6ddc20a 100644
> > --- a/drivers/gpu/drm/i915/intel_pm.c
> > +++ b/drivers/gpu/drm/i915/intel_pm.c
> > @@ -5181,6 +5181,8 @@ static void valleyview_setup_pctx(struct drm_device *dev)
> >   	I915_WRITE(VLV_PCBR, pctx_paddr);
> >
> >   out:
> > +	/* The power context need not be preserved across hibernation */
> > +	pctx->internal_volatile = true;
> >   	DRM_DEBUG_DRIVER("PCBR: 0x%08x\n", I915_READ(VLV_PCBR));
> >   	dev_priv->vlv_pctx = pctx;
> >   }
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 5eabaf6..370d96a 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -2090,6 +2090,12 @@ static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
> >   	if (IS_ERR(obj))
> >   		return PTR_ERR(obj);
> >
> > +	/* Ringbuffer objects are by definition volatile - only the commands
> > +	 * between HEAD and TAIL need to be preserved and whilst there are
> > +	 * any commands there, the ringbuffer is pinned by activity.
> > +	 */
> > +	obj->internal_volatile = true;
> > +
> 
> What does this mean? It gets correctly re-initialized by existing code 
> on resume? Don't see anything specific about HEAD and TAIL in this patch.
The HEAD and TAIL will be the same for the ringbuffer before the system
goes into hibernation, which will be taken care of by vma_unbind to
complete all requests.
> 
> >   	/* mark ring buffers as read-only from GPU side by default */
> >   	obj->gt_ro = 1;
> >
> >

Thanks,
Ankit
ankitprasad.r.sharma@intel.com Dec. 10, 2015, 1:17 p.m. UTC | #6
On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
> Hi,
> 
> Two more comments below:
> 
> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> > From: Chris Wilson <chris@chris-wilson.co.uk>
> >
> > Ville reminded us that stolen memory is not preserved across
> > hibernation, and a result of this was that context objects now being
> > allocated from stolen were being corrupted on S4 and promptly hanging
> > the GPU on resume.
> >
> > We want to utilise stolen for as much as possible (nothing else will use
> > that wasted memory otherwise), so we need a strategy for handling
> > general objects allocated from stolen and hibernation. A simple solution
> > is to do a CPU copy through the GTT of the stolen object into a fresh
> > shmemfs backing store and thenceforth treat it as a normal object. This
> > can be refined in future either to use a GPU copy to avoid the slow
> > uncached reads (though it's hibernation!) or to recreate stolen objects
> > upon resume/first-use. For now, a simple approach should suffice for
> > testing the object migration.
> 
> Mention of "testing" in the commit message and absence of a path to 
> migrate the objects back to stolen memory on resume makes me think this 
> is kind of half finished and not really ready for review / merge?
> 
> Because I don't see how it is useful to migrate it one way and never 
> move back?
I think that this is not much of a problem, as the purpose here is to
keep the object intact, to avoid breaking anything.
So as far as objects are concerned they will be in shmem and can be used
without any issue, and the stolen memory will be free again for other
use by the user.
> 
> >
> > v2:
> > Swap PTE for pinned bindings over to the shmemfs. This adds a
> > complicated dance, but is required as many stolen objects are likely to
> > be pinned for use by the hardware. Swapping the PTEs should not result
> > in externally visible behaviour, as each PTE update should be atomic and
> > the two pages identical. (danvet)
> >
> > safe-by-default, or the principle of least surprise. We need a new flag
> > to mark objects that we can wilfully discard and recreate across
> > hibernation. (danvet)
> >
> > Just use the global_list rather than invent a new stolen_list. This is
> > the slowpath hibernate and so adding a new list and the associated
> > complexity isn't worth it.
> >
> > v3: Rebased on drm-intel-nightly (Ankit)
> >
> > v4: Use insert_page to map stolen memory backed pages for migration to
> > shmem (Chris)
> >
> > v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > ---
> >   drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
> >   drivers/gpu/drm/i915/i915_drv.h         |   7 +
> >   drivers/gpu/drm/i915/i915_gem.c         | 232 ++++++++++++++++++++++++++++++--
> >   drivers/gpu/drm/i915/intel_display.c    |   3 +
> >   drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
> >   drivers/gpu/drm/i915/intel_pm.c         |   2 +
> >   drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
> >   7 files changed, 261 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 9f55209..2bb9e9e 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
> >   	return i915_drm_suspend(drm_dev);
> >   }
> >
> > +static int i915_pm_freeze(struct device *dev)
> > +{
> > +	int ret;
> > +
> > +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> > +	if (ret)
> > +		return ret;
> 
> One of the first steps in idling GEM seems to be idling the GPU and 
> retiring requests.
> 
> Would it also make sense to do those steps before attempting to migrate 
> the stolen objects?
Here, we do that implicitly when trying to do a vma_unbind for the
object.

Thanks,
Ankit
Tvrtko Ursulin Dec. 10, 2015, 2:15 p.m. UTC | #7
On 10/12/15 13:17, Ankitprasad Sharma wrote:
> On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
>> Hi,
>>
>> Two more comments below:
>>
>> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
>>> From: Chris Wilson <chris@chris-wilson.co.uk>
>>>
>>> Ville reminded us that stolen memory is not preserved across
>>> hibernation, and a result of this was that context objects now being
>>> allocated from stolen were being corrupted on S4 and promptly hanging
>>> the GPU on resume.
>>>
>>> We want to utilise stolen for as much as possible (nothing else will use
>>> that wasted memory otherwise), so we need a strategy for handling
>>> general objects allocated from stolen and hibernation. A simple solution
>>> is to do a CPU copy through the GTT of the stolen object into a fresh
>>> shmemfs backing store and thenceforth treat it as a normal object. This
>>> can be refined in future either to use a GPU copy to avoid the slow
>>> uncached reads (though it's hibernation!) or to recreate stolen objects
>>> upon resume/first-use. For now, a simple approach should suffice for
>>> testing the object migration.
>>
>> Mention of "testing" in the commit message and absence of a path to
>> migrate the objects back to stolen memory on resume makes me think this
>> is kind of half finished and not really ready for review / merge?
>>
>> Because I don't see how it is useful to migrate it one way and never
>> move back?
> I think that this is not much of a problem, as the purpose here is to
> keep the object intact, to avoid breaking anything.
> So as far as objects are concerned they will be in shmem and can be used
> without any issue, and the stolen memory will be free again for other
> use by the user.

I am not sure that is a good state of things.

One of the things it means is that when a user wanted to create an object 
in stolen memory, after resume it will no longer be there. So what is the 
point in failing stolen object creation when the area is full in the first 
place? We could just return a normal object instead.

Then the question of objects which are allocated in stolen by the 
driver. Are they being re-allocated on resume or will they also be stuck in 
shmemfs from then onward?

And finally, one corner case might be that shmemfs plus stolen is a 
larger sum which will be attempted to be restored in shmemfs only on 
resume. Will that always work if everything is fully populated and what 
will happen if we run out of space?

At minimum all this should be discussed and explicitly documented in the 
commit message.

Would it be difficult to implement the reverse path?

>>> [snip]
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>> index 9f55209..2bb9e9e 100644
>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>>>    	return i915_drm_suspend(drm_dev);
>>>    }
>>>
>>> +static int i915_pm_freeze(struct device *dev)
>>> +{
>>> +	int ret;
>>> +
>>> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
>>> +	if (ret)
>>> +		return ret;
>>
>> One of the first steps in idling GEM seems to be idling the GPU and
>> retiring requests.
>>
>> Would it also make sense to do those steps before attempting to migrate
>> the stolen objects?
> Here, we do that implicitly when trying to do a vma_unbind for the
> object.

Code paths are not the same so it makes me uncomfortable.  It looks more 
logical to do the migration after the existing i915_gem_suspend. It 
would mean some code duplication, true (maybe split i915_drm_suspend in 
two and call i915_gem_freeze in between), but to me it looks more like a 
proper place to do it.
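
Roughly something like this (sketch only; the two halves of
i915_drm_suspend() are hypothetical helpers):

static int i915_pm_freeze(struct device *dev)
{
	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
	int ret;

	/* Idle the GPU and retire requests via the normal suspend path */
	ret = i915_drm_suspend_first_half(drm_dev);
	if (ret)
		return ret;

	/* Migrate stolen objects while everything is quiescent */
	ret = i915_gem_freeze(drm_dev);
	if (ret)
		return ret;

	/* Finish the remainder of the suspend sequence */
	return i915_drm_suspend_second_half(drm_dev);
}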

Do Chris or Ville have any opinions here?

Regards,

Tvrtko
Dave Gordon Dec. 10, 2015, 6 p.m. UTC | #8
On 10/12/15 14:15, Tvrtko Ursulin wrote:
>
> On 10/12/15 13:17, Ankitprasad Sharma wrote:
>> On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
>>> Hi,
>>>
>>> Two more comments below:
>>>
>>> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
>>>> [snip]
>>>
>>> Mention of "testing" in the commit message and absence of a path to
>>> migrate the objects back to stolen memory on resume makes me think this
>>> is kind of half finished and not really ready for review/merge?
>>>
>>> Because I don't see how it is useful to migrate it one way and never
>>> move back?
>> I think that this is not much of a problem, as the purpose here is to
>> keep the object intact, to avoid breaking anything.
>> So as far as objects are concerned they will be in shmem and can be used
>> without any issue, and the stolen memory will be free again for other
>> usage from the user.
>
> I am not sure that is a good state of things.
>
> One of the things it means is that when the user wanted to create an
> object in stolen memory, after resume it will no longer be there. So
> what is the point in failing stolen object creation when the area is
> full in the first place? We could just return a normal object instead.
>
> Then the question of objects which are allocated in stolen by the
> driver. Are they being re-allocated on resume or will also be stuck in
> shmemfs from then onward?
>
> And finally, one corner case might be that shmemfs plus stolen is a
> larger sum which will be attempted to be restored in shmemfs only on
> resume. Will that always work if everything is fully populated and what
> will happen if we run out of space?
>
> At minimum all this should be discussed and explicitly documented in the
> commit message.
>
> Would it be difficult to implement the reverse path?

Please don't migrate random objects to stolen! It has all sorts of 
limitations that make it unsuitable for some types of object (e.g. 
contexts).

Only objects that were originally placed in stolen should ever be 
candidates for the reverse migration ...

.Dave.
ankitprasad.r.sharma@intel.com Dec. 11, 2015, 5:16 a.m. UTC | #9
On Thu, 2015-12-10 at 14:15 +0000, Tvrtko Ursulin wrote:
> On 10/12/15 13:17, Ankitprasad Sharma wrote:
> > On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
> >> Hi,
> >>
> >> Two more comments below:
> >>
> >> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> >>> [snip]
> >>
> >> Mention of "testing" in the commit message and absence of a path to
> >> migrate the objects back to stolen memory on resume makes me think this
> >> is kind of half finished and not really ready for review/merge?
> >>
> >> Because I don't see how it is useful to migrate it one way and never
> >> move back?
> > I think that this is not much of a problem, as the purpose here is to
> > keep the object intact, to avoid breaking anything.
> > So as far as objects are concerned they will be in shmem and can be used
> > without any issue, and the stolen memory will be free again for other
> > usage from the user.
> 
> I am not sure that is a good state of things.
> 
> One of the things it means is that when the user wanted to create an 
> object in stolen memory, after resume it will no longer be there. So 
> what is the point in failing stolen object creation when the area is 
> full in the first place? We could just return a normal object instead.
I agree with you, but the absence of a reverse path will not affect the
user in any way, though the user may be under the wrong impression that
the buffer is residing inside the stolen area.

> 
> Then the question of objects which are allocated in stolen by the 
> driver. Are they being re-allocated on resume or will also be stuck in 
> shmemfs from then onward?
Objects allocated by the driver need not be preserved (we use an
internal_volatile flag for those). These are not migrated to shmemfs
and are later re-populated by the driver when used again after resume.
> 
> And finally, one corner case might be that shmemfs plus stolen is a 
> > larger sum which will be attempted to be restored in shmemfs only on 
> resume. Will that always work if everything is fully populated and what 
> will happen if we run out of space?
As per my understanding, the shmemfs size will grow, due to the
migration, before the hibernation itself. And if not everything in
shmemfs can be kept in RAM, swap-out will take care of it.
Whatever was in RAM will be restored on resume; the rest will remain
in swap.
> 
> At minimum all this should be discussed and explicitly documented in the 
> commit message.
> 
> Would it be difficult to implement the reverse path?
I will try to explore the reverse path as well. But that can be
submitted separately as a follow-up patch.
> 
> >>> [snip]
> >>>
> >>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> >>> index 9f55209..2bb9e9e 100644
> >>> --- a/drivers/gpu/drm/i915/i915_drv.c
> >>> +++ b/drivers/gpu/drm/i915/i915_drv.c
> >>> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
> >>>    	return i915_drm_suspend(drm_dev);
> >>>    }
> >>>
> >>> +static int i915_pm_freeze(struct device *dev)
> >>> +{
> >>> +	int ret;
> >>> +
> >>> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
> >>> +	if (ret)
> >>> +		return ret;
> >>
> >> One of the first steps in idling GEM seems to be idling the GPU and
> >> retiring requests.
> >>
> >> Would it also make sense to do those steps before attempting to migrate
> >> the stolen objects?
> > Here, we do that implicitly when trying to do a vma_unbind for the
> > object.
> 
> Code paths are not the same so it makes me uncomfortable.  It looks more 
> logical to do the migration after the existing i915_gem_suspend. It 
> would mean some code duplication, true (maybe split i915_drm_suspend in 
> two and call i915_gem_freeze in between), but to me it looks more like a 
> proper place to do it.
> 
All inactive stolen objects will be migrated immediately and the active
ones will be implicitly synchronized in vma_unbind.
Would it be more appropriate to rename i915_gem_freeze to
i915_gem_migrate_stolen?

But anyway, we will wait for Chris or Ville's inputs.
> Do Chris or Ville have any opinions here?
> 
> Regards,
> 
> Tvrtko

Thanks,
Ankit
ankitprasad.r.sharma@intel.com Dec. 11, 2015, 5:19 a.m. UTC | #10
On Thu, 2015-12-10 at 18:00 +0000, Dave Gordon wrote:
> On 10/12/15 14:15, Tvrtko Ursulin wrote:
> >
> > On 10/12/15 13:17, Ankitprasad Sharma wrote:
> >> On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
> >>> Hi,
> >>>
> >>> Two more comments below:
> >>>
> >>> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
> >>>> [snip]
> >>>
> >>> Mention of "testing" in the commit message and absence of a path to
> >>> migrate the objects back to stolen memory on resume makes me think this
> >>> is kind of half finished and not really ready for review/merge?
> >>>
> >>> Because I don't see how it is useful to migrate it one way and never
> >>> move back?
> >> I think that this is not much of a problem, as the purpose here is to
> >> keep the object intact, to avoid breaking anything.
> >> So as far as objects are concerned they will be in shmem and can be used
> >> without any issue, and the stolen memory will be free again for other
> >> usage from the user.
> >
> > I am not sure that is a good state of things.
> >
> > One of the things it means is that when the user wanted to create an
> > object in stolen memory, after resume it will no longer be there. So
> > what is the point in failing stolen object creation when the area is
> > full in the first place? We could just return a normal object instead.
> >
> > Then the question of objects which are allocated in stolen by the
> > driver. Are they being re-allocated on resume or will also be stuck in
> > shmemfs from then onward?
> >
> > And finally, one corner case might be that shmemfs plus stolen is a
> > larger sum which will be attempted to be restored in shmemfs only on
> > resume. Will that always work if everything is fully populated and what
> > will happen if we run out of space?
> >
> > At minimum all this should be discussed and explicitly documented in the
> > commit message.
> >
> > Would it be difficult to implement the reverse path?
> 
> Please don't migrate random objects to stolen! It has all sorts of 
> limitations that make it unsuitable for some types of object (e.g. 
> contexts).
> 
> Only objects that were originally placed in stolen should ever be 
> candidates for the reverse migration ...
Yes, obviously. We will consider only those objects which were
originally placed in the stolen area.
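
e.g. by recording the origin at migration time (sketch; the flag name
is made up):

	/* In i915_gem_object_migrate_stolen_to_shmemfs(), before dropping
	 * the stolen backing store, remember where the object came from,
	 * so that only these objects become candidates for the reverse
	 * migration on resume.
	 */
	obj->migrated_from_stolen = true;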
> 
> .Dave.
> 
Thanks,
Ankit
Tvrtko Ursulin Dec. 11, 2015, 12:33 p.m. UTC | #11
On 11/12/15 05:16, Ankitprasad Sharma wrote:
> On Thu, 2015-12-10 at 14:15 +0000, Tvrtko Ursulin wrote:
>> On 10/12/15 13:17, Ankitprasad Sharma wrote:
>>> On Thu, 2015-12-10 at 09:43 +0000, Tvrtko Ursulin wrote:
>>>> Hi,
>>>>
>>>> Two more comments below:
>>>>
>>>> On 09/12/15 12:46, ankitprasad.r.sharma@intel.com wrote:
>>>>> [snip]
>>>>
>>>> Mention of "testing" in the commit message and absence of a path to
>>>> migrate the objects back to stolen memory on resume makes me think this
>>>> is kind of half finished and not really ready for review/merge?
>>>>
>>>> Because I don't see how it is useful to migrate it one way and never
>>>> move back?
>>> I think that this is not much of a problem, as the purpose here is to
>>> keep the object intact, to avoid breaking anything.
>>> So as far as objects are concerned they will be in shmem and can be used
>>> without any issue, and the stolen memory will be free again for other
>>> usage from the user.
>>
>> I am not sure that is a good state of things.
>>
>> One of the things it means is that when the user wanted to create an
>> object in stolen memory, after resume it will no longer be there. So
>> what is the point in failing stolen object creation when the area is
>> full in the first place? We could just return a normal object instead.
> I agree with you, but the absence of a reverse path will not affect the
> user in any way, though the user may be under the wrong impression that
> the buffer is residing inside the stolen area.

Yes, and since suspend-resume is rather a frequent use-case, it brings 
the whole point of stolen into question for me unless implemented fully.

>
>>
>> Then the question of objects which are allocated in stolen by the
>> driver. Are they being re-allocated on resume or will also be stuck in
>> shmemfs from then onward?
> Objects allocated by the driver need not be preserved (we use an
> internal_volatile flag for those). These are not migrated to shmemfs
> and are later re-populated by the driver when used again after resume.

Good then, it wasn't clear to me from that comment about HEAD and TAIL etc.

>>
>> And finally, one corner case might be that shmemfs plus stolen is a
>> larger sum which will be attempted to be restored in shmemfs only on
>> resume. Will that always work if everything is fully populated and what
>> will happen if we run out of space?
> As per my understanding, the shmemfs size will grow, due to the
> migration, before the hibernation itself. And if not everything in
> shmemfs can be kept in RAM, swap-out will take care of it.
> Whatever was in RAM will be restored on resume; the rest will remain
> in swap.

Yes, but still there is a chance it won't fit if someone is running at 
the limit and maybe with no swap, no?
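
A crude feasibility check before starting the migration could look
something like this (illustrative sketch only, not something the patch
currently does):

static bool stolen_migration_may_fit(u64 bytes_to_migrate)
{
	struct sysinfo si;

	/* Very rough: free RAM plus free swap versus the stolen payload */
	si_meminfo(&si);
	return bytes_to_migrate <=
	       (u64)si.freeram * si.mem_unit +
	       (u64)get_nr_swap_pages() * PAGE_SIZE;
}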

>>
>> At minimum all this should be discussed and explicitly documented in the
>> commit message.
>>
>> Would it be difficult to implement the reverse path?
> I will try to explore the reverse path as well. But that can be
> submitted separately as a follow-up patch.

I don't think so, because of what I wrote above. Especially since Ville 
said corruption can even happen in S3; but even if we only think of S4, 
for me that is such a common and frequent use-case that having the 
ability to request an object be in stolen, only for that to be silently 
changed on resume, is not worth it.

So until the migration works both ways I don't see the point of 
allowing stolen objects, never mind failing the creation when there is 
not enough space there.

>>
>>>>> [snip]
>>>>>
>>>>> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
>>>>> index 9f55209..2bb9e9e 100644
>>>>> --- a/drivers/gpu/drm/i915/i915_drv.c
>>>>> +++ b/drivers/gpu/drm/i915/i915_drv.c
>>>>> @@ -1036,6 +1036,21 @@ static int i915_pm_suspend(struct device *dev)
>>>>>     	return i915_drm_suspend(drm_dev);
>>>>>     }
>>>>>
>>>>> +static int i915_pm_freeze(struct device *dev)
>>>>> +{
>>>>> +	int ret;
>>>>> +
>>>>> +	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
>>>>> +	if (ret)
>>>>> +		return ret;
>>>>
>>>> One of the first steps in idling GEM seems to be idling the GPU and
>>>> retiring requests.
>>>>
>>>> Would it also make sense to do those steps before attempting to migrate
>>>> the stolen objects?
>>> Here, we do that implicitly when trying to do a vma_unbind for the
>>> object.
>>
>> Code paths are not the same so it makes me uncomfortable.  It looks more
>> logical to do the migration after the existing i915_gem_suspend. It
>> would mean some code duplication, true (maybe split i915_drm_suspend in
>> two and call i915_gem_freeze in between), but to me it looks more like a
>> proper place to do it.
>>
> All inactive stolen objects will be migrated immediately and the active
> ones will be implicitly synchronized in vma_unbind.
> Would it be more appropriate to rename i915_gem_freeze to
> i915_gem_migrate_stolen?
>
> But anyway, we will wait for Chris or Ville's inputs.

Ping! :)

Regards,

Tvrtko
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9f55209..2bb9e9e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -1036,6 +1036,21 @@  static int i915_pm_suspend(struct device *dev)
 	return i915_drm_suspend(drm_dev);
 }
 
+static int i915_pm_freeze(struct device *dev)
+{
+	int ret;
+
+	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
+	if (ret)
+		return ret;
+
+	ret = i915_pm_suspend(dev);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
 static int i915_pm_suspend_late(struct device *dev)
 {
 	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
@@ -1700,7 +1715,7 @@  static const struct dev_pm_ops i915_pm_ops = {
 	 * @restore, @restore_early : called after rebooting and restoring the
 	 *                            hibernation image [PMSG_RESTORE]
 	 */
-	.freeze = i915_pm_suspend,
+	.freeze = i915_pm_freeze,
 	.freeze_late = i915_pm_suspend_late,
 	.thaw_early = i915_pm_resume_early,
 	.thaw = i915_pm_resume,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e0b09b0..0d18b07 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2080,6 +2080,12 @@  struct drm_i915_gem_object {
 	 * Advice: are the backing pages purgeable?
 	 */
 	unsigned int madv:2;
+	/**
+	 * Whereas madv is for userspace, there are certain situations
+	 * where we want I915_MADV_DONTNEED behaviour on internal objects
+	 * without conflating the userspace setting.
+	 */
+	unsigned int internal_volatile:1;
 
 	/**
 	 * Current tiling mode for the object.
@@ -3006,6 +3012,7 @@  int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
+int __must_check i915_gem_freeze(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void __i915_add_request(struct drm_i915_gem_request *req,
 			struct drm_i915_gem_object *batch_obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 68ed67a..1f134b0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4511,12 +4511,27 @@  static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 	.put_pages = i915_gem_object_put_pages_gtt,
 };
 
+static struct address_space *
+i915_gem_set_inode_gfp(struct drm_device *dev, struct file *file)
+{
+	struct address_space *mapping = file_inode(file)->i_mapping;
+	gfp_t mask;
+
+	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
+		/* 965gm cannot relocate objects above 4GiB. */
+		mask &= ~__GFP_HIGHMEM;
+		mask |= __GFP_DMA32;
+	}
+	mapping_set_gfp_mask(mapping, mask);
+
+	return mapping;
+}
+
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size)
 {
 	struct drm_i915_gem_object *obj;
-	struct address_space *mapping;
-	gfp_t mask;
 	int ret;
 
 	obj = i915_gem_object_alloc(dev);
@@ -4529,15 +4544,7 @@  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 		return ERR_PTR(ret);
 	}
 
-	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
-	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
-		/* 965gm cannot relocate objects above 4GiB. */
-		mask &= ~__GFP_HIGHMEM;
-		mask |= __GFP_DMA32;
-	}
-
-	mapping = file_inode(obj->base.filp)->i_mapping;
-	mapping_set_gfp_mask(mapping, mask);
+	i915_gem_set_inode_gfp(dev, obj->base.filp);
 
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
@@ -4714,6 +4721,209 @@  i915_gem_stop_ringbuffers(struct drm_device *dev)
 		dev_priv->gt.stop_ring(ring);
 }
 
+static int
+i915_gem_object_migrate_stolen_to_shmemfs(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_vma *vma, *vn;
+	struct drm_mm_node node;
+	struct file *file;
+	struct address_space *mapping;
+	struct sg_table *stolen_pages, *shmemfs_pages;
+	int ret, i;
+
+	if (WARN_ON(i915_gem_object_needs_bit17_swizzle(obj)))
+		return -EINVAL;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (ret)
+		return ret;
+
+	file = shmem_file_setup("drm mm object", obj->base.size, VM_NORESERVE);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+	mapping = i915_gem_set_inode_gfp(obj->base.dev, file);
+
+	list_for_each_entry_safe(vma, vn, &obj->vma_list, vma_link)
+		if (i915_vma_unbind(vma))
+			continue;
+
+	if (obj->madv != I915_MADV_WILLNEED && list_empty(&obj->vma_list)) {
+		/* Discard the stolen reservation, and replace with
+		 * an unpopulated shmemfs object.
+		 */
+		obj->madv = __I915_MADV_PURGED;
+		goto swap_pages;
+	}
+
+	/* stolen objects are already pinned to prevent shrinkage */
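+	/* Reserve a temporary single-page GTT window; each stolen page is
+	 * mapped into it in turn and copied out through the CPU aperture.
+	 */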
+	memset(&node, 0, sizeof(node));
+	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm,
+						  &node,
+						  4096, 0, I915_CACHE_NONE,
+						  0, i915->gtt.mappable_end,
+						  DRM_MM_SEARCH_DEFAULT,
+						  DRM_MM_CREATE_DEFAULT);
+	if (ret)
+		goto err_file;
+
+	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
+		struct page *page;
+		void __iomem *src;
+		void *dst;
+
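+		/* The wmb() pair brackets the PTE update in insert_page() so
+		 * that the new binding is visible before we read the page
+		 * through the aperture window below.
+		 */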
+		wmb();
+		i915->gtt.base.insert_page(&i915->gtt.base,
+					   i915_gem_object_get_dma_address(obj, i),
+					   node.start,
+					   I915_CACHE_NONE,
+					   0);
+		wmb();
+
+		page = shmem_read_mapping_page(mapping, i);
+		if (IS_ERR(page)) {
+			ret = PTR_ERR(page);
+			goto err_node;
+		}
+
+		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start);
+		dst = kmap_atomic(page);
+		memcpy_fromio(dst, src, PAGE_SIZE);
+		kunmap_atomic(dst);
+		io_mapping_unmap_atomic(src);
+
+		page_cache_release(page);
+	}
+
+	wmb();
+	i915->gtt.base.clear_range(&i915->gtt.base,
+				   node.start, node.size,
+				   true);
+	drm_mm_remove_node(&node);
+
+swap_pages:
+	stolen_pages = obj->pages;
+	obj->pages = NULL;
+
+	obj->base.filp = file;
+	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+
+	/* Recreate any pinned binding with pointers to the new storage */
+	if (!list_empty(&obj->vma_list)) {
+		ret = i915_gem_object_get_pages_gtt(obj);
+		if (ret) {
+			obj->pages = stolen_pages;
+			goto err_file;
+		}
+
+		ret = i915_gem_object_set_to_gtt_domain(obj, true);
+		if (ret) {
+			i915_gem_object_put_pages_gtt(obj);
+			obj->pages = stolen_pages;
+			goto err_file;
+		}
+
+		obj->get_page.sg = obj->pages->sgl;
+		obj->get_page.last = 0;
+
+		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+			if (!drm_mm_node_allocated(&vma->node))
+				continue;
+
+			WARN_ON(i915_vma_bind(vma,
+					      obj->cache_level,
+					      PIN_UPDATE));
+		}
+	} else
+		list_del(&obj->global_list);
+
+	/* drop the stolen pin and backing */
+	shmemfs_pages = obj->pages;
+	obj->pages = stolen_pages;
+
+	i915_gem_object_unpin_pages(obj);
+	obj->ops->put_pages(obj);
+	if (obj->ops->release)
+		obj->ops->release(obj);
+
+	obj->ops = &i915_gem_object_ops;
+	obj->pages = shmemfs_pages;
+
+	return 0;
+
+err_node:
+	wmb();
+	i915->gtt.base.clear_range(&i915->gtt.base,
+				   node.start, node.size,
+				   true);
+	drm_mm_remove_node(&node);
+err_file:
+	fput(file);
+	obj->base.filp = NULL;
+	return ret;
+}
+
+int
+i915_gem_freeze(struct drm_device *dev)
+{
+	/* Called before i915_gem_suspend() when hibernating */
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_gem_object *obj, *tmp;
+	struct list_head *phase[] = {
+		&i915->mm.unbound_list, &i915->mm.bound_list, NULL
+	}, **p;
+	int ret;
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		return ret;
+	/* Across hibernation, the stolen area is not preserved.
+	 * Anything inside stolen must be copied back to normal
+	 * memory if we wish to preserve it.
+	 */
+	for (p = phase; *p; p++) {
+		struct list_head migrate;
+
+		INIT_LIST_HEAD(&migrate);
+		list_for_each_entry_safe(obj, tmp, *p, global_list) {
+			if (obj->stolen == NULL)
+				continue;
+
+			if (obj->internal_volatile)
+				continue;
+
+			/* In the general case, this object may only be alive
+			 * due to an active reference, and that may disappear
+			 * when we unbind any of the objects (and so wait upon
+			 * the GPU and retire requests). To prevent one of the
+			 * objects from disappearing beneath us, we need to
+			 * take a reference to each as we build the migration
+			 * list.
+			 *
+			 * This is similar to the strategy required whilst
+			 * shrinking or evicting objects (for the same reason).
+			 */
+			drm_gem_object_reference(&obj->base);
+			list_move(&obj->global_list, &migrate);
+		}
+
+		ret = 0;
+		list_for_each_entry_safe(obj, tmp, &migrate, global_list) {
+			if (ret == 0)
+				ret = i915_gem_object_migrate_stolen_to_shmemfs(obj);
+			drm_gem_object_unreference(&obj->base);
+		}
+		list_splice(&migrate, *p);
+		if (ret)
+			break;
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
+}
+
 int
 i915_gem_suspend(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index f281e0b..0803922 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2549,6 +2549,9 @@  intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 	if (IS_ERR(obj))
 		return false;
 
+	/* Not to be preserved across hibernation */
+	obj->internal_volatile = true;
+
 	obj->tiling_mode = plane_config->tiling;
 	if (obj->tiling_mode == I915_TILING_X)
 		obj->stride = fb->pitches[0];
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index f43681e..1d89253 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -154,6 +154,12 @@  static int intelfb_alloc(struct drm_fb_helper *helper,
 		goto out;
 	}
 
+	/* Discard the contents of the BIOS fb across hibernation.
+	 * We really want to completely throwaway the earlier fbdev
+	 * and reconfigure it anyway.
+	 */
+	obj->internal_volatile = true;
+
 	fb = __intel_framebuffer_create(dev, &mode_cmd, obj);
 	if (IS_ERR(fb)) {
 		ret = PTR_ERR(fb);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 03ad276..6ddc20a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5181,6 +5181,8 @@  static void valleyview_setup_pctx(struct drm_device *dev)
 	I915_WRITE(VLV_PCBR, pctx_paddr);
 
 out:
+	/* The power context need not be preserved across hibernation */
+	pctx->internal_volatile = true;
 	DRM_DEBUG_DRIVER("PCBR: 0x%08x\n", I915_READ(VLV_PCBR));
 	dev_priv->vlv_pctx = pctx;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5eabaf6..370d96a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2090,6 +2090,12 @@  static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	/* Ringbuffer objects are by definition volatile - only the commands
+	 * between HEAD and TAIL need to be preserved and whilst there are
+	 * any commands there, the ringbuffer is pinned by activity.
+	 */
+	obj->internal_volatile = true;
+
 	/* mark ring buffers as read-only from GPU side by default */
 	obj->gt_ro = 1;