
[2/2] drm/i915: Make pages of GFX allocations movable

Message ID 1458713384-25688-2-git-send-email-akash.goel@intel.com (mailing list archive)

Commit Message

akash.goel@intel.com March 23, 2016, 6:09 a.m. UTC
From: Chris Wilson <chris@chris-wilson.co.uk>

After the system has been running for a few days (typically 2-3),
physical memory tends to become severely fragmented, which considerably
slows the system down. The shrinker cannot help in this scenario, as
lack of memory is not the actual problem: it has been observed that
there are still plenty of free 0-order pages, just few of higher order.

To address external fragmentation, the kernel performs compaction
(which involves migrating pages), but its efficacy depends on how many
pages are marked MOVABLE, as only those pages can be migrated.

Currently the backing pages for GFX buffers are allocated from shmemfs
with the __GFP_RECLAIMABLE flag, in units of 4KB pages. With limited
swap space, it may not always be possible to reclaim or swap out the
pages of all inactive objects to make way for the free space that would
allow compaction to form higher-order groups of physically contiguous
pages.

Just marking the GFX pages as MOVABLE does not suffice: the i915 driver
has to pin the pages while they are in use by the GPU, which prevents
their migration. So the migratepage callback in shmem is also hooked up,
so that the driver gets notified when the kernel initiates migration of
a page; on that notification, the driver unpins the page if it safely
can. With this, the driver can effectively mark the GFX pages as MOVABLE
and thereby mitigate the fragmentation problem.
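
For reference, this patch relies on the shmem hook added in patch 1/2 of
the series (not included here). Judging from how it is used below, that
interface presumably looks something like the following sketch; the
exact definitions live in the shmem patch:

	/*
	 * Hypothetical reconstruction, inferred from the usage of
	 * shmem_set_device_ops() and the migrate_info fields in this
	 * patch; not a verbatim copy of patch 1/2.
	 */
	struct shmem_dev_info {
		void *dev_private_data;
		int (*dev_migratepage)(struct address_space *mapping,
				       struct page *newpage,
				       struct page *page,
				       enum migrate_mode mode,
				       void *dev_private_data);
	};

	/*
	 * Register the driver's migration callback on a shmem mapping,
	 * so that shmem's address_space_operations.migratepage can
	 * forward migration requests for that mapping's pages to the
	 * driver.
	 */
	void shmem_set_device_ops(struct address_space *mapping,
				  struct shmem_dev_info *info);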

Cc: Hugh Dickins <hughd@google.com>
Cc: linux-mm@kvack.org
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
Signed-off-by: Akash Goel <akash.goel@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |   3 +
 drivers/gpu/drm/i915/i915_gem.c | 128 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 128 insertions(+), 3 deletions(-)

Comments

Chris Wilson March 23, 2016, 7:58 a.m. UTC | #1
On Wed, Mar 23, 2016 at 11:39:44AM +0530, akash.goel@intel.com wrote:
> +#ifdef CONFIG_MIGRATION
> +static int i915_migratepage(struct address_space *mapping,
> +			    struct page *newpage, struct page *page,
> +			    enum migrate_mode mode, void *dev_priv_data)

If we move this to i915_gem_shrink_migratepage (i.e. i915_gem_shrink),
we can

> +	/*
> +	 * Use trylock here, with a timeout, for struct_mutex as
> +	 * otherwise there is a possibility of deadlock due to lock
> +	 * inversion. This path, which tries to migrate a particular
> +	 * page after locking that page, can race with a path which
> +	 * truncate/purge pages of the corresponding object (after
> +	 * acquiring struct_mutex). Since page truncation will also
> +	 * try to lock the page, a scenario of deadlock can arise.
> +	 */
> +	while (!mutex_trylock(&dev->struct_mutex) && --timeout)
> +		schedule_timeout_killable(1);

replace this with i915_gem_shrinker_lock() and like constructs with the
other shrinkers. Any reason for dropping the early
if (!page_private(obj)) skip?
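
(Roughly, from memory, the existing construct is shaped like the sketch
below; treat it as an approximation rather than a verbatim quote of
i915_gem_shrinker.c:)

	/*
	 * Approximate shape of the existing shrinker lock helper: try
	 * to take struct_mutex, but also cope with being called from a
	 * path that already holds it. mutex_is_locked_by() is a helper
	 * local to i915_gem_shrinker.c that checks whether the current
	 * task is the lock owner.
	 */
	static bool i915_gem_shrinker_lock(struct drm_device *dev, bool *unlock)
	{
		if (!mutex_trylock(&dev->struct_mutex)) {
			if (!mutex_is_locked_by(&dev->struct_mutex, current))
				return false;
			*unlock = false;
		} else {
			*unlock = true;
		}

		return true;
	}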

Similarly there are other patterns here that would benefit from
integration with existing shrinker logic. However, things like tidying
up the pin_display, unbinding, rpm lock inversion are still only on
list.
-Chris
akash.goel@intel.com March 23, 2016, 8:25 a.m. UTC | #2
On 3/23/2016 1:28 PM, Chris Wilson wrote:
> On Wed, Mar 23, 2016 at 11:39:44AM +0530, akash.goel@intel.com wrote:
>> +#ifdef CONFIG_MIGRATION
>> +static int i915_migratepage(struct address_space *mapping,
>> +			    struct page *newpage, struct page *page,
>> +			    enum migrate_mode mode, void *dev_priv_data)
>
> If we move this to i915_gem_shrink_migratepage (i.e. i915_gem_shrink),
> we can
>
>> +	/*
>> +	 * Use trylock here, with a timeout, for struct_mutex as
>> +	 * otherwise there is a possibility of deadlock due to lock
>> +	 * inversion. This path, which tries to migrate a particular
>> +	 * page after locking that page, can race with a path which
>> +	 * truncate/purge pages of the corresponding object (after
>> +	 * acquiring struct_mutex). Since page truncation will also
>> +	 * try to lock the page, a scenario of deadlock can arise.
>> +	 */
>> +	while (!mutex_trylock(&dev->struct_mutex) && --timeout)
>> +		schedule_timeout_killable(1);
>
> replace this with i915_gem_shrinker_lock() and like constructs with the
> other shrinkers.

fine, will rename the function to gem_shrink_migratepage, move it inside 
the gem_shrinker.c file, and use the existing constructs.

> Any reason for dropping the early
> if (!page_private(obj)) skip?
>

Would this sequence be fine?

	if (!page_private(page))
		goto migrate; /* skip */

	Loop for locking mutex

	obj = (struct drm_i915_gem_object *)page_private(page);

	if (!PageSwapCache(page) && obj) {


> Similarly there are other patterns here that would benefit from
> integration with existing shrinker logic. However, things like tidying
> up the pin_display, unbinding, rpm lock inversion are still only on
> list.

Tidying, like splitting that one single if condition into multiple
if / else-if blocks?

Best regards
Akash
> -Chris
>
Chris Wilson March 24, 2016, 8:13 a.m. UTC | #3
On Wed, Mar 23, 2016 at 01:55:14PM +0530, Goel, Akash wrote:
> 
> 
> On 3/23/2016 1:28 PM, Chris Wilson wrote:
> >On Wed, Mar 23, 2016 at 11:39:44AM +0530, akash.goel@intel.com wrote:
> >>+#ifdef CONFIG_MIGRATION
> >>+static int i915_migratepage(struct address_space *mapping,
> >>+			    struct page *newpage, struct page *page,
> >>+			    enum migrate_mode mode, void *dev_priv_data)
> >
> >If we move this to i915_gem_shrink_migratepage (i.e. i915_gem_shrink),
> >we can
> >
> >>+	/*
> >>+	 * Use trylock here, with a timeout, for struct_mutex as
> >>+	 * otherwise there is a possibility of deadlock due to lock
> >>+	 * inversion. This path, which tries to migrate a particular
> >>+	 * page after locking that page, can race with a path which
> >>+	 * truncate/purge pages of the corresponding object (after
> >>+	 * acquiring struct_mutex). Since page truncation will also
> >>+	 * try to lock the page, a scenario of deadlock can arise.
> >>+	 */
> >>+	while (!mutex_trylock(&dev->struct_mutex) && --timeout)
> >>+		schedule_timeout_killable(1);
> >
> >replace this with i915_gem_shrinker_lock() and like constructs with the
> >other shrinkers.
> 
> fine, will rename the function to gem_shrink_migratepage, move it
> inside the gem_shrinker.c file, and use the existing constructs.
> 
> > Any reason for dropping the early
> > if (!page_private(obj)) skip?
> >
> 
> Would this sequence be fine?
> 
> 	if (!page_private(page))
> 		goto migrate; /* skip */
> 
> 	Loop for locking mutex
> 
> 	obj = (struct drm_i915_gem_object *)page_private(page);
> 
> 	if (!PageSwapCache(page) && obj) {

Yes.

> >Similarly there are other patterns here that would benefit from
> >integration with existing shrinker logic. However, things like tidying
> >up the pin_display, unbinding, rpm lock inversion are still only on
> >list.
> 
> Tidying, like splitting that one single if condition into multiple
> if / else-if blocks?

Just outstanding patches that simplify the condition and the work we
have to do here.
-Chris

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f330a53..28e50c7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -52,6 +52,7 @@
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/pm_qos.h>
+#include <linux/shmem_fs.h>
 #include "intel_guc.h"
 #include "intel_dpll_mgr.h"
 
@@ -1966,6 +1967,8 @@ struct drm_i915_private {
 
 	struct intel_encoder *dig_port_map[I915_MAX_PORTS];
 
+	struct shmem_dev_info migrate_info;
+
 	/*
 	 * NOTE: This is the dri1/ums dungeon, don't add stuff here. Your patch
 	 * will be rejected. Instead look for a better place.
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8588c83..a4af5b6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -33,6 +33,7 @@
 #include "i915_trace.h"
 #include "intel_drv.h"
 #include <linux/shmem_fs.h>
+#include <linux/migrate.h>
 #include <linux/slab.h>
 #include <linux/swap.h>
 #include <linux/pci.h>
@@ -2204,6 +2205,7 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 		if (obj->madv == I915_MADV_WILLNEED)
 			mark_page_accessed(page);
 
+		set_page_private(page, 0);
 		page_cache_release(page);
 	}
 	obj->dirty = 0;
@@ -2318,6 +2320,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			sg->length += PAGE_SIZE;
 		}
 		last_pfn = page_to_pfn(page);
+		set_page_private(page, (unsigned long)obj);
 
 		/* Check that the i965g/gm workaround works. */
 		WARN_ON((gfp & __GFP_DMA32) && (last_pfn >= 0x00100000UL));
@@ -2343,8 +2346,11 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 
 err_pages:
 	sg_mark_end(sg);
-	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0)
-		page_cache_release(sg_page_iter_page(&sg_iter));
+	for_each_sg_page(st->sgl, &sg_iter, st->nents, 0) {
+		page = sg_page_iter_page(&sg_iter);
+		set_page_private(page, 0);
+		page_cache_release(page);
+	}
 	sg_free_table(st);
 	kfree(st);
 
@@ -4465,9 +4471,116 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 	.put_pages = i915_gem_object_put_pages_gtt,
 };
 
+#ifdef CONFIG_MIGRATION
+static int i915_migratepage(struct address_space *mapping,
+			    struct page *newpage, struct page *page,
+			    enum migrate_mode mode, void *dev_priv_data)
+{
+	struct drm_i915_private *dev_priv = dev_priv_data;
+	struct drm_device *dev = dev_priv->dev;
+	struct drm_i915_gem_object *obj;
+	unsigned long timeout = msecs_to_jiffies(10) + 1;
+	int ret = 0;
+
+	WARN((page_count(newpage) != 1), "Unexpected ref count for newpage\n");
+
+	/*
+	 * Clear the private field of the new target page, as it could
+	 * contain a stale value. Otherwise, if this page itself later
+	 * gets migrated, without the driver having touched it in between,
+	 * the stale value would confuse i915_migratepage, which derives
+	 * the object pointer from it. This should be safe, since at the
+	 * time of migration the private field of the new page (which is
+	 * actually an independent, free 4KB page at this point) is a
+	 * don't-care for the kernel.
+	 */
+	set_page_private(newpage, 0);
+
+	/*
+	 * Check the page count: if the driver also holds a reference, the
+	 * count will be greater than 2, since shmem holds one reference
+	 * and the migration path itself has taken another. So if the
+	 * count is <= 2, we can invoke the migration function directly.
+	 */
+	if (page_count(page) <= 2)
+		goto migrate;
+
+	/*
+	 * Use trylock here, with a timeout, for struct_mutex as
+	 * otherwise there is a possibility of deadlock due to lock
+	 * inversion. This path, which tries to migrate a particular
+	 * page after locking that page, can race with a path which
+	 * truncate/purge pages of the corresponding object (after
+	 * acquiring struct_mutex). Since page truncation will also
+	 * try to lock the page, a scenario of deadlock can arise.
+	 */
+	while (!mutex_trylock(&dev->struct_mutex) && --timeout)
+		schedule_timeout_killable(1);
+	if (timeout == 0) {
+		DRM_DEBUG_DRIVER("Unable to acquire device mutex.\n");
+		return -EBUSY;
+	}
+
+	obj = (struct drm_i915_gem_object *)page_private(page);
+
+	if (!PageSwapCache(page) && obj) {
+		/*
+		 * Avoid migrating pages that are actively in use by the
+		 * GPU or pinned for display.
+		 * Also skip migration for purgeable objects; otherwise
+		 * there will be a deadlock when shmem tries to lock the
+		 * page for truncation, as the caller already locked it
+		 * before migration.
+		 * Also, HW access would be required to unbind a bound
+		 * object, for which the device has to be runtime active.
+		 * But attempting to resume the device can deadlock if a
+		 * suspend or resume operation is already in progress on
+		 * some other path, and that operation is itself what
+		 * triggered the compaction. So only unbind if the device
+		 * is currently runtime active.
+		 */
+		if (obj->active || obj->pin_display ||
+		    obj->madv == I915_MADV_DONTNEED ||
+		    !intel_runtime_pm_get_if_in_use(dev_priv)) {
+			ret = -EBUSY;
+		} else {
+			ret = drop_pages(obj);
+			intel_runtime_pm_put(dev_priv);
+		}
+
+		BUG_ON(!ret && page_private(page));
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+	if (ret)
+		return ret;
+
+	/*
+	 * Ideally the page count should not be > 2 here, as the driver
+	 * has dropped its reference by now, but counts of 3 and 4 have
+	 * occasionally been observed. Such an unexpected page count makes
+	 * the migration fail with -EAGAIN, which in turn leads the kernel
+	 * to retry migration of the same set of pages repeatedly, and the
+	 * repeated attempts have sometimes proved detrimental to
+	 * stability. Since we do not know who the other owner is, nor how
+	 * long it will keep its reference, it is better to return -EBUSY.
+	 */
+	if (page_count(page) > 2)
+		return -EBUSY;
+
+migrate:
+	ret = migrate_page(mapping, newpage, page, mode);
+	if (ret)
+		DRM_DEBUG_DRIVER("page=%p migration returned %d\n", page, ret);
+
+	return ret;
+}
+#endif
+
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
 	gfp_t mask;
@@ -4481,7 +4594,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 		return NULL;
 	}
 
-	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+	mask = GFP_HIGHUSER_MOVABLE;
 	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
 		/* 965gm cannot relocate objects above 4GiB. */
 		mask &= ~__GFP_HIGHMEM;
@@ -4491,6 +4604,10 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	mapping = file_inode(obj->base.filp)->i_mapping;
 	mapping_set_gfp_mask(mapping, mask);
 
+#ifdef CONFIG_MIGRATION
+	shmem_set_device_ops(mapping, &dev_priv->migrate_info);
+#endif
+
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
@@ -4993,6 +5110,11 @@ int i915_gem_init(struct drm_device *dev)
 		ret = 0;
 	}
 
+#ifdef CONFIG_MIGRATION
+	dev_priv->migrate_info.dev_private_data = dev_priv;
+	dev_priv->migrate_info.dev_migratepage = i915_migratepage;
+#endif
+
 out_unlock:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	mutex_unlock(&dev->struct_mutex);