Message ID | 1459775891-32442-2-git-send-email-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
On ma, 2016-04-04 at 14:18 +0100, Chris Wilson wrote:
> From: Akash Goel <akash.goel@intel.com>
>
> On a long run of more than 2-3 days, physical memory tends to get
> severely fragmented, which considerably slows down the system. In such a
> scenario the shrinker is also unable to help, as lack of memory is not
> the actual problem: it has been observed that there are enough free
> pages of order 0. This also manifests itself when an individual zone in
> the mm runs out of pages: if we cannot migrate pages between zones,
> the kernel hits an out-of-memory condition even though there are free
> pages (and often all of swap) available.
>
> To address the issue of external fragmentation, the kernel does a
> compaction (which involves migration of pages), but its efficacy depends
> upon how many pages are marked as MOVABLE, as only those pages can be
> migrated.
>
> Currently the backing pages for GFX buffers are allocated from shmemfs
> with the GFP_RECLAIMABLE flag, in units of 4KB pages. With limited swap
> space, it may not always be possible to reclaim or swap out pages of all
> the inactive objects to make way for free space, which would allow the
> formation of higher-order groups of physically contiguous pages on
> compaction.
>
> Just marking the GPU pages as MOVABLE will not suffice: i915.ko has to
> pin the pages while they are in use by the GPU, which prevents their
> migration. So the migratepage callback in shmem is also hooked up, to get
> a notification when the kernel initiates page migration. On that
> notification, i915.ko unpins the pages as appropriate. With this we can
> effectively mark the GPU pages as MOVABLE and hence mitigate the
> fragmentation problem.
>
> v2:
>  - Rename the migration routine to gem_shrink_migratepage, move it to the
>    shrinker file, and use the existing constructs (Chris)
>  - To clean up, add a new helper function to encapsulate all page-migration
>    skip conditions (Chris)
>  - Add a new local helper function in the shrinker file for dropping the
>    backing pages, and call it from gem_shrink() also (Chris)
>
> v3:
>  - Fix/invert the check on the return value of unsafe_drop_pages (Chris)
>
> v4:
>  - Minor tidy
>
> Testcase: igt/gem_shrink
> Bugzilla: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=90254
> Cc: Hugh Dickins <hughd@google.com>
> Cc: linux-mm@kvack.org
> Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
> Signed-off-by: Akash Goel <akash.goel@intel.com>
> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

Could this patch be re-spun on top of current nightly?

After removing:

> WARN(page_count(newpage) != 1, "Unexpected ref count for newpage\n")

and

> if (ret)
> 	DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);

this is:

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
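As a minimal illustration of the allocation-side change the commit message describes: the gfp mask on the object's shmem mapping is switched so new backing pages land in MOVABLE pageblocks. The helper name below is invented for illustration (the real patch does this inline in i915_gem_alloc_object(), see the diff further down); mapping_set_gfp_mask() and GFP_HIGHUSER_MOVABLE are the real kernel APIs.

#include <linux/pagemap.h>

/* Hypothetical helper, for illustration only: switch a shmem mapping's
 * allocation mask so newly allocated backing pages are placed in MOVABLE
 * pageblocks, where compaction is allowed to migrate them. The real patch
 * additionally clears __GFP_HIGHMEM for 965g/gm, which cannot relocate
 * objects above 4GiB.
 */
static void example_make_mapping_movable(struct address_space *mapping)
{
	/* Was GFP_HIGHUSER | __GFP_RECLAIMABLE before this patch. */
	mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE);
}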
On 10/18/2016 5:35 PM, Joonas Lahtinen wrote:
> On ma, 2016-04-04 at 14:18 +0100, Chris Wilson wrote:
>> From: Akash Goel <akash.goel@intel.com>
>>
>> [commit message snipped]
>
> Could this patch be re-spun on top of current nightly?
>
Sure, will rebase it on top of nightly.

> After removing:
>
>> WARN(page_count(newpage) != 1, "Unexpected ref count for newpage\n")
>
> and
>
>> if (ret)
>> 	DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);
>
> this is:
>
> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Thanks much for the review. But there is also a precursor patch, on which
there has been no traction:

[1/2] shmem: Support for registration of Driver/file owner specific ops
https://patchwork.freedesktop.org/patch/77935/

Best regards
Akash

> Regards, Joonas
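For readers without patch 1/2 at hand: judging from how the i915 diff below uses it, the precursor presumably adds an interface along these lines to <linux/shmem_fs.h>. This is a sketch inferred from the call sites (shmem_set_device_ops(), shmem_info.dev_private_data, .dev_migratepage); the actual struct layout and registration mechanism may differ.

/* Sketch of the interface patch 1/2 presumably introduces; inferred from
 * the i915 call sites, not copied from the real patch.
 */
struct shmem_dev_info {
	void *dev_private_data;
	int (*dev_migratepage)(struct address_space *mapping,
			       struct page *newpage, struct page *page,
			       enum migrate_mode mode, void *private_data);
};

static inline void shmem_set_device_ops(struct address_space *mapping,
					struct shmem_dev_info *info)
{
	/* Remember the driver's callbacks on the mapping so shmem can
	 * notify the owner when the kernel migrates one of its pages.
	 */
	mapping->private_data = info;
}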
On Tue, Oct 18, 2016 at 06:55:12PM +0530, Goel, Akash wrote:
> On 10/18/2016 5:35 PM, Joonas Lahtinen wrote:
> > On ma, 2016-04-04 at 14:18 +0100, Chris Wilson wrote:
> >> From: Akash Goel <akash.goel@intel.com>
> >>
> >> [commit message snipped]
> >
> > Could this patch be re-spun on top of current nightly?
> >
> Sure, will rebase it on top of nightly.

It's in my tree (on top of nightly) already with Joonas' r-b.
-Chris
On ti, 2016-10-18 at 14:39 +0100, Chris Wilson wrote:
> It's in my tree (on top of nightly) already with Joonas' r-b.
Patch 1/2 seems to have my comments already; it could be addressed and
respun too.
Regards, Joonas
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dd187727c813..2c1d5c3af780 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -52,6 +52,7 @@
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/pm_qos.h>
+#include <linux/shmem_fs.h>
 
 #include "intel_guc.h"
 #include "intel_dpll_mgr.h"
@@ -1234,6 +1235,8 @@ struct intel_l3_parity {
 };
 
 struct i915_gem_mm {
+	struct shmem_dev_info shmem_info;
+
 	/** Memory allocator for GTT stolen memory */
 	struct drm_mm stolen;
 	/** Protects the usage of the GTT stolen memory allocator. This is
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ca96fc12cdf4..7cef03efb539 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2206,6 +2206,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 		if (obj->madv == I915_MADV_WILLNEED)
 			mark_page_accessed(page);
 
+		set_page_private(page, 0);
 		page_cache_release(page);
 	}
 	obj->dirty = 0;
@@ -2320,6 +2321,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			sg->length += PAGE_SIZE;
 		}
 		last_pfn = page_to_pfn(page);
+		set_page_private(page, (unsigned long)obj);
 
 		/* Check that the i965g/gm workaround works. */
 		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ -2345,8 +2347,11 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
 err_pages:
 	sg_mark_end(sg);
-	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0)
-		page_cache_release(sg_page_iter_page(&sg_iter));
+	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
+		page = sg_page_iter_page(&sg_iter);
+		set_page_private(page, 0);
+		page_cache_release(page);
+	}
 	sg_free_table(st);
 	kfree(st);
 
@@ -4468,6 +4473,7 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
						  size_t size)
 {
+	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
 	gfp_t mask;
@@ -4481,7 +4487,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 		return NULL;
 	}
 
-	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+	mask = GFP_HIGHUSER_MOVABLE;
 	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
 		/* 965gm cannot relocate objects above 4GiB. */
 		mask &= ~__GFP_HIGHMEM;
@@ -4490,6 +4496,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 
 	mapping = file_inode(obj->base.filp)->i_mapping;
 	mapping_set_gfp_mask(mapping, mask);
+	shmem_set_device_ops(mapping, &dev_priv->mm.shmem_info);
 
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index d3c473ffb90a..b29584149bdf 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -24,6 +24,7 @@
 
 #include <linux/oom.h>
 #include <linux/shmem_fs.h>
+#include <linux/migrate.h>
 #include <linux/slab.h>
 #include <linux/swap.h>
 #include <linux/pci.h>
@@ -87,6 +88,70 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
 	return swap_available() || obj->madv == I915_MADV_DONTNEED;
 }
 
+static bool can_migrate_page(struct drm_i915_gem_object *obj)
+{
+	/* Avoid migrating the page if it is actively in use by the GPU */
+	if (obj->active)
+		return false;
+
+	/* Skip the migration for purgeable objects, otherwise there
+	 * will be a deadlock when shmem tries to lock the page for
+	 * truncation, as the page was already locked by the caller
+	 * before migration.
+	 */
+	if (obj->madv == I915_MADV_DONTNEED)
+		return false;
+
+	/* Skip the migration for a pinned object */
+	if (obj->pages_pin_count != num_vma_bound(obj))
+		return false;
+
+	return true;
+}
+
+static int
+unsafe_drop_pages(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma, *next;
+	int ret;
+
+	drm_gem_object_reference(&obj->base);
+	list_for_each_entry_safe(vma, next, &obj->vma_list, obj_link)
+		if (i915_vma_unbind(vma))
+			break;
+
+	ret = i915_gem_object_put_pages(obj);
+	drm_gem_object_unreference(&obj->base);
+
+	return ret;
+}
+
+static int
+do_migrate_page(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+	int ret = 0;
+
+	if (!can_migrate_page(obj))
+		return -EBUSY;
+
+	/* HW access would be required for a GGTT-bound object, for which
+	 * the device has to be kept awake. But a deadlock can arise if we
+	 * attempt to resume the device while a suspend or resume operation
+	 * is already happening concurrently on some other path that also
+	 * triggers compaction. So only unbind if the device is currently
+	 * awake.
+	 */
+	if (!intel_runtime_pm_get_if_in_use(dev_priv))
+		return -EBUSY;
+
+	if (unsafe_drop_pages(obj))
+		ret = -EBUSY;
+
+	intel_runtime_pm_put(dev_priv);
+	return ret;
+}
+
 /**
  * i915_gem_shrink - Shrink buffer object caches
  * @dev_priv: i915 device
@@ -156,7 +221,6 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 		INIT_LIST_HEAD(&still_in_list);
 		while (count < target && !list_empty(phase->list)) {
 			struct drm_i915_gem_object *obj;
-			struct i915_vma *vma, *v;
 
 			obj = list_first_entry(phase->list,
					       typeof(*obj), global_list);
@@ -172,18 +236,8 @@ i915_gem_shrink(struct drm_i915_private *dev_priv,
 			if (!can_release_pages(obj))
 				continue;
 
-			drm_gem_object_reference(&obj->base);
-
-			/* For the unbound phase, this should be a no-op! */
-			list_for_each_entry_safe(vma, v,
						 &obj->vma_list, obj_link)
-				if (i915_vma_unbind(vma))
-					break;
-
-			if (i915_gem_object_put_pages(obj) == 0)
+			if (unsafe_drop_pages(obj) == 0)
 				count += obj->base.size >> PAGE_SHIFT;
-
-			drm_gem_object_unreference(&obj->base);
 		}
 		list_splice(&still_in_list, phase->list);
 	}
@@ -356,6 +410,97 @@ i915_gem_shrinker_oom(struct notifier_block *nb, unsigned long event, void *ptr)
 	return NOTIFY_DONE;
 }
 
+#ifdef CONFIG_MIGRATION
+static int i915_gem_shrinker_migratepage(struct address_space *mapping,
+					 struct page *newpage,
+					 struct page *page,
+					 enum migrate_mode mode,
+					 void *dev_priv_data)
+{
+	struct drm_i915_private *dev_priv = dev_priv_data;
+	struct drm_device *dev = dev_priv->dev;
+	unsigned long timeout = msecs_to_jiffies(10) + 1;
+	bool unlock;
+	int ret;
+
+	WARN((page_count(newpage) != 1), "Unexpected ref count for newpage\n");
+
+	/*
+	 * Clear the private field of the new target page, as it could hold
+	 * a stale value. Otherwise, if this page itself later gets migrated
+	 * without the driver referring to it in between, the stale value
+	 * would derail the migratepage function, since the object pointer
+	 * is derived from it. This is safe because, at the time of
+	 * migration, the private field of the new page (which is still an
+	 * independent free 4KB page at this point) is a don't-care for the
+	 * kernel.
+	 */
+	set_page_private(newpage, 0);
+
+	if (!page_private(page))
+		goto migrate;
+
+	/*
+	 * Check the page count: if the driver also holds a reference, it
+	 * should be more than 2, as shmem has one reference and the
+	 * migration path itself has taken another. So if the reference
+	 * count is <= 2, we can directly invoke the migration function.
+	 */
+	if (page_count(page) <= 2)
+		goto migrate;
+
+	/*
+	 * Use trylock here, with a timeout, for struct_mutex, as otherwise
+	 * there is a possibility of deadlock due to lock inversion. This
+	 * path, which tries to migrate a particular page after locking that
+	 * page, can race with a path which truncates/purges pages of the
+	 * corresponding object (after acquiring struct_mutex). Since page
+	 * truncation will also try to lock the page, a deadlock can arise.
+	 */
+	while (!i915_gem_shrinker_lock(dev, &unlock) && --timeout)
+		schedule_timeout_killable(1);
+	if (timeout == 0) {
+		DRM_DEBUG_DRIVER("Unable to acquire device mutex.\n");
+		return -EBUSY;
+	}
+
+	ret = 0;
+	if (!PageSwapCache(page) && page_private(page)) {
+		struct drm_i915_gem_object *obj =
+			(struct drm_i915_gem_object *)page_private(page);
+
+		ret = do_migrate_page(obj);
+	}
+
+	if (unlock)
+		mutex_unlock(&dev->struct_mutex);
+
+	if (ret)
+		return ret;
+
+	/*
+	 * Ideally we would not expect the page count to be > 2 here, as the
+	 * driver has dropped its reference, but occasionally it has been
+	 * seen coming in as 3 or 4. This leads to an unexpected page count,
+	 * causing migration failure with an -EAGAIN error, which in turn
+	 * makes the kernel repeatedly attempt to migrate the same set of
+	 * pages; sometimes the repeated attempts prove detrimental to
+	 * stability. Since we don't know who the other owner is, or for how
+	 * long it will keep the reference, it is better to return -EBUSY.
+	 */
+	if (page_count(page) > 2)
+		return -EBUSY;
+
+migrate:
+	ret = migrate_page(mapping, newpage, page, mode);
+	if (ret)
+		DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);
+
+	return ret;
+}
+#endif
+
 /**
  * i915_gem_shrinker_init - Initialize i915 shrinker
  * @dev_priv: i915 device
@@ -371,6 +516,11 @@ void i915_gem_shrinker_init(struct drm_i915_private *dev_priv)
 
 	dev_priv->mm.oom_notifier.notifier_call = i915_gem_shrinker_oom;
 	WARN_ON(register_oom_notifier(&dev_priv->mm.oom_notifier));
+
+	dev_priv->mm.shmem_info.dev_private_data = dev_priv;
+#ifdef CONFIG_MIGRATION
+	dev_priv->mm.shmem_info.dev_migratepage = i915_gem_shrinker_migratepage;
+#endif
 }
 
 /**
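For completeness, the shmem side of the hand-off lives in patch 1/2 and is not shown here; presumably it dispatches to the registered callback from its migratepage address_space operation, roughly as sketched below. The function body and the private_data plumbing are assumptions based on the registration code above, not the actual patch; migrate_page() is the real generic fallback.

/* Hedged sketch of the shmem-side dispatch presumably added by patch 1/2:
 * if a driver registered a dev_migratepage callback via
 * shmem_set_device_ops(), let it handle (or veto, with -EBUSY, as i915
 * does for pinned/active pages) the migration; otherwise fall back to
 * the generic migrate_page().
 */
static int shmem_migrate_page(struct address_space *mapping,
			      struct page *newpage, struct page *page,
			      enum migrate_mode mode)
{
	struct shmem_dev_info *info = mapping->private_data;

	if (info && info->dev_migratepage)
		return info->dev_migratepage(mapping, newpage, page,
					     mode, info->dev_private_data);

	return migrate_page(mapping, newpage, page, mode);
}

This keeps the policy in the driver: only i915 knows whether a given page is pinned for GPU access, so shmem merely forwards the notification and lets the owner decide whether the page can be dropped and migrated.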