diff mbox

[RFC,3/3] drm/i915: Add the truncation logic for Stolen objects.

Message ID 1394019340-8811-4-git-send-email-sourab.gupta@intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

sourab.gupta@intel.com March 5, 2014, 11:35 a.m. UTC
From: Sourab Gupta <sourab.gupta@intel.com>

Adding the truncation logic for buffer objects with backing storage
from stolen memory. The objects will be truncated when user marks them
as purgeable and they are part of the inactive list.

Testcase: igt/gem_stolen_mem

Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>

Signed-off-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Sourab Gupta <sourab.gupta@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem.c        |   11 ++++++++++-
 drivers/gpu/drm/i915/i915_gem_stolen.c |   32 ++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 1 deletion(-)

Comments

Chris Wilson March 5, 2014, 11:49 a.m. UTC | #1
On Wed, Mar 05, 2014 at 05:05:40PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Adding the truncation logic for buffer objects with backing storage
> from stolen memory. The objects will be truncated when user marks them
> as purgeable and they are part of the inactive list.
> 
> Testcase: igt/gem_stolen_mem

This code still has the same obvious bugs.
-Chris
sourab.gupta@intel.com March 5, 2014, 12:26 p.m. UTC | #2
Hi Chris,

Can you please through some light on the bug here, so that we're able to handle it.
Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

Thanks,
Sourab

-----Original Message-----
From: Chris Wilson [mailto:chris@chris-wilson.co.uk] 
Sent: Wednesday, March 05, 2014 5:19 PM
To: Gupta, Sourab
Cc: intel-gfx@lists.freedesktop.org; Vetter, Daniel; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On Wed, Mar 05, 2014 at 05:05:40PM +0530, sourab.gupta@intel.com wrote:
> From: Sourab Gupta <sourab.gupta@intel.com>
> 
> Adding the truncation logic for buffer objects with backing storage 
> from stolen memory. The objects will be truncated when user marks them 
> as purgeable and they are part of the inactive list.
> 
> Testcase: igt/gem_stolen_mem

This code still has the same obvious bugs.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Chris Wilson March 5, 2014, 12:45 p.m. UTC | #3
On Wed, Mar 05, 2014 at 12:26:26PM +0000, Gupta, Sourab wrote:
> Hi Chris,
> 
> Can you please through some light on the bug here, so that we're able to handle it.
> Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

You truncate the stolen region whilst it is still in use and then hand
it out ot a new object. Rinse and repeat.
-Chris
sourab.gupta@intel.com March 5, 2014, 7:24 p.m. UTC | #4
Hi Chris,

We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:

1) The user creates the (non-cpu mappable)object --> the gem object is created. Shmem filep is allocated. No backing storage present
2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return. 
	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
4) If object is not marked as purgeable, the backing pages of the object will remain.
5) When object is freed, backing pages are released, if the object has not been purged already.

Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
We may be missing out on some condition regarding when the object will be in use.
Can you please point us to the flaws in above assumption regarding when we can truncate the object.

Thanks and Regards,
Sourab


-----Original Message-----
From: Chris Wilson [mailto:chris@chris-wilson.co.uk] 
Sent: Wednesday, March 05, 2014 6:15 PM
To: Gupta, Sourab
Cc: intel-gfx@lists.freedesktop.org; Vetter, Daniel; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On Wed, Mar 05, 2014 at 12:26:26PM +0000, Gupta, Sourab wrote:
> Hi Chris,
> 
> Can you please through some light on the bug here, so that we're able to handle it.
> Is it about the handling required for the pread/pwrite ioctls, or is it about the format of mentioning the testcase?

You truncate the stolen region whilst it is still in use and then hand it out ot a new object. Rinse and repeat.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre
Vetter, Daniel March 5, 2014, 7:47 p.m. UTC | #5
On 05/03/2014 20:24, Gupta, Sourab wrote:
> We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:
>
> 1) The user creates the (non-cpu mappable)object --> the gem object is created. Shmem filep is allocated. No backing storage present
> 2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
> 3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
> 	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return.
> 	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
> 	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
> 4) If object is not marked as purgeable, the backing pages of the object will remain.
> 5) When object is freed, backing pages are released, if the object has not been purged already.
>
> Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
> We may be missing out on some condition regarding when the object will be in use.
> Can you please point us to the flaws in above assumption regarding when we can truncate the object.
Immediately purging the backing storage is not the idea of the madvise 
ioctl. The idea is to purge it as late as possible, i.e. when we've 
tried to allocate more stolen space but failed. For comparison see how 
purgeable works for normal objects: Those only get purged in the 
shrinker, i.e. when the linux mm is short on memory.
-Daniel
Intel Semiconductor AG
Registered No. 020.30.913.786-7
Registered Office: World Trade Center, Leutschenbachstrasse 95, 8050 Zurich, Switzerland

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
sourab.gupta@intel.com March 6, 2014, 9:39 a.m. UTC | #6
Hi Daniel,

We had also considered the two points where the stolen mem objs can be truncated -
	a) at the point of a new bo allocation, if stolen mem space is exhausted. 
	b) at the point when the bo is marked as purgeable (madvise ioctl)

We decided upon the second case because:
1) For normal objects, there is a bo cacheing at libdrm level. So even after a bo is marked as purgeable, 
it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation. 
Currently for stolen objects we have not enabled this cacheing. So once a stolen bo is marked as purgeable by User, 
Driver can immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to 
a later stage (when there is a failure in allocation of a new bo). 
Actually we thought that since Stolen mem is a relatively much smaller space and has a limited usage,  
there may not be any additional benefit gained by enabling bo cacheing on libdrm side. 
Also the effort in getting the backing space from stolen mem shall be much less than from shmem. 

With the first approach, it would have required us to scan the entire list of bound & unbound objects to search for the 
purgeable stolen objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have
in our hand. 

2) Also, there is no coupling between the linux shrinker & the stolen memory. Shrinker invocation will depend upon the  
system memory usage. So, we cannot rely on shrinker to timely trigger the purging of objects from stolen mem.

We understand that we are diverging from the basic nature of madvise ioctl here (which is to only mark objs
as purgeable and so eligible for truncation if needed), but probably because of the above way of using the stolen memory, 
we thought madvise could also be one of the good checkpoint where we can truncate the stolen bos.

What are your views regarding above.
We will try to revisit our design if its digressing too much from the basic tenets regarding the madvise ioctl 

Thanks,
Sourab


-----Original Message-----
From: Vetter, Daniel 
Sent: Thursday, March 06, 2014 1:18 AM
To: Gupta, Sourab; Chris Wilson
Cc: intel-gfx@lists.freedesktop.org; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On 05/03/2014 20:24, Gupta, Sourab wrote:
> We have assumed the following lifecycle of the (stolen)object, w.r.t the truncation usecase:
>
> 1) The user creates the (non-cpu mappable)object --> the gem object is 
> created. Shmem filep is allocated. No backing storage present
> 2) GPU operations performed (after mmap_gtt ) --> object is moved to stolen memory and the backing pages are allocated from stolen mem area.
> 3) Object is marked as purgeable (by madvise_ioctl)-->  here two scenarios can occur:
> 	a) Object has not yet been processed by GPU. In this case it will be in the active list. So, madvise_ioctl will mark it as purgeable and return.
> 	     Subsequently, when the retire handler runs, and object is being moved to inactive list, then, the purgeable object is truncated and marked as purged.
> 	b) Object is already processed by the GPU. In this case, it will be in the inactive list. So, madvise_ioctl will do the object truncation and mark it as purged.
> 4) If object is not marked as purgeable, the backing pages of the object will remain.
> 5) When object is freed, backing pages are released, if the object has not been purged already.
>
> Is the object being in the inactive list, the necessary and sufficient condition of it not being in use?
> We may be missing out on some condition regarding when the object will be in use.
> Can you please point us to the flaws in above assumption regarding when we can truncate the object.
Immediately purging the backing storage is not the idea of the madvise ioctl. The idea is to purge it as late as possible, i.e. when we've tried to allocate more stolen space but failed. For comparison see how purgeable works for normal objects: Those only get purged in the shrinker, i.e. when the linux mm is short on memory.
-Daniel
Vetter, Daniel March 6, 2014, 9:48 a.m. UTC | #7
On 06/03/2014 10:39, Gupta, Sourab wrote:
> Hi Daniel,
>
> We had also considered the two points where the stolen mem objs can be truncated -
> 	a) at the point of a new bo allocation, if stolen mem space is exhausted.
> 	b) at the point when the bo is marked as purgeable (madvise ioctl)
>
> We decided upon the second case because:
> 1) For normal objects, there is a bo cacheing at libdrm level. So even after a bo is marked as purgeable,
> it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation.
> Currently for stolen objects we have not enabled this cacheing. So once a stolen bo is marked as purgeable by User,
> Driver can immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to
> a later stage (when there is a failure in allocation of a new bo).
> Actually we thought that since Stolen mem is a relatively much smaller space and has a limited usage,
> there may not be any additional benefit gained by enabling bo cacheing on libdrm side.
> Also the effort in getting the backing space from stolen mem shall be much less than from shmem.
>
> With the first approach, it would have required us to scan the entire list of bound & unbound objects to search for the
> purgeable stolen objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have
> in our hand.
Well, approach b) breaks the libdrm cache, since libdrm sets the madvise 
flag to purgeable when insert an unused object into it's cache. So you 
really can't truncate the object right away, since that completely 
defeats the point of the cache.
-Daniel
Intel Semiconductor AG
Registered No. 020.30.913.786-7
Registered Office: World Trade Center, Leutschenbachstrasse 95, 8050 Zurich, Switzerland

This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). Any review or distribution by others is strictly prohibited. If you are not the intended recipient, please contact the sender and delete all copies.
sourab.gupta@intel.com March 6, 2014, 10:48 a.m. UTC | #8
Hi Daniel,

For stolen objects, we have not enabled the libdrm cacheing, so that when the user creates bo with this flag, the objects created are not cached.
Then the scenario of breaking the libdrm cacheing will not arise in this case.
This is ensured in the libdrm patches uploaded together with these patches. Can you please have a look at those patches.

Also, do we need to enable cacheing for stolen objects at libdrm ( therefore deferring the stolen object truncation to later) ?

Regards,
Sourab

-----Original Message-----
From: Vetter, Daniel 
Sent: Thursday, March 06, 2014 3:19 PM
To: Gupta, Sourab; Chris Wilson
Cc: intel-gfx@lists.freedesktop.org; Barnes, Jesse; Wilson, Chris; Goel, Akash
Subject: Re: [RFC 3/3] drm/i915: Add the truncation logic for Stolen objects.

On 06/03/2014 10:39, Gupta, Sourab wrote:
> Hi Daniel,
>
> We had also considered the two points where the stolen mem objs can be truncated -
> 	a) at the point of a new bo allocation, if stolen mem space is exhausted.
> 	b) at the point when the bo is marked as purgeable (madvise ioctl)
>
> We decided upon the second case because:
> 1) For normal objects, there is a bo cacheing at libdrm level. So even 
> after a bo is marked as purgeable, it can still get reused by libdrm (if not truncated already), thereby reducing the effort in new object creation.
> Currently for stolen objects we have not enabled this cacheing. So 
> once a stolen bo is marked as purgeable by User, Driver can 
> immediately reclaim the stolen space and there will be no obvious benefit in deferring the object truncation to a later stage (when there is a failure in allocation of a new bo).
> Actually we thought that since Stolen mem is a relatively much smaller 
> space and has a limited usage, there may not be any additional benefit gained by enabling bo cacheing on libdrm side.
> Also the effort in getting the backing space from stolen mem shall be much less than from shmem.
>
> With the first approach, it would have required us to scan the entire 
> list of bound & unbound objects to search for the purgeable stolen 
> objects and purge them. With the 2nd approach, we could avoid this & purge the objects we already have in our hand.
Well, approach b) breaks the libdrm cache, since libdrm sets the madvise flag to purgeable when insert an unused object into it's cache. So you really can't truncate the object right away, since that completely defeats the point of the cache.
-Daniel
Daniel Vetter March 6, 2014, 11:17 a.m. UTC | #9
On Thu, Mar 6, 2014 at 11:48 AM, Gupta, Sourab <sourab.gupta@intel.com> wrote:
> Also, do we need to enable cacheing for stolen objects at libdrm ( therefore deferring the stolen object truncation to later) ?

That was kinda the entire point of making stolen objects purgeable ...
-Daniel
diff mbox

Patch

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b5f603f..02142c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2443,6 +2443,7 @@  i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 size);
 void i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj);
 void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
+void i915_gem_object_truncate_stolen(struct drm_i915_gem_object *obj);
 
 /* i915_gem_tiling.c */
 static inline bool i915_gem_object_needs_bit17_swizzle(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f57ca31..d106806 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2071,6 +2071,11 @@  i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	obj->fenced_gpu_access = false;
 
 	obj->active = 0;
+
+	/* Truncate the stolen obj immediately if it is marked as purgeable */
+	if(obj->stolen && (i915_gem_object_is_purgeable(obj) ) )
+		i915_gem_object_truncate_stolen(obj);
+
 	drm_gem_object_unreference(&obj->base);
 
 	WARN_ON(i915_verify_lists(dev));
@@ -4069,6 +4074,10 @@  i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (i915_gem_object_is_purgeable(obj) && obj->pages == NULL)
 		i915_gem_object_truncate(obj);
 
+	/* if the stolen mem object is no longer active, discard its backing storage */
+	if (obj->stolen && i915_gem_object_is_purgeable(obj) && i915_gem_object_is_inactive(obj))
+		i915_gem_object_truncate_stolen(obj);
+
 	args->retained = obj->madv != __I915_MADV_PURGED;
 
 out:
@@ -4187,7 +4196,7 @@  void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	/* Stolen objects don't hold a ref, but do hold pin count. Fix that up
 	 * before progressing. */
-	if (obj->stolen)
+	if ((obj->stolen) && (obj->madv != __I915_MADV_PURGED))
 		i915_gem_object_unpin_pages(obj);
 
 	if (WARN_ON(obj->pages_pin_count))
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 6758ba4..1ecc447 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -471,6 +471,11 @@  i915_gem_object_move_to_stolen(struct drm_i915_gem_object *obj)
 	if (size == 0)
 		return;
 
+	if (obj->madv != I915_MADV_WILLNEED) {
+		DRM_ERROR("Attempting to allocate a purgeable object\n");
+		return;
+	}
+
 	/* Check if already shmem space has been allocated for the object
 	 * or not. We cannot rely upon on the value of 'pages' field for this.
 	 * As even though if the 'pages' field is NULL, it does not actually
@@ -549,3 +554,30 @@  i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
 		obj->stolen = NULL;
 	}
 }
+
+/*
+ * This function truncates stolen objects to reclaim their backing storage
+ * from stolen mem area. Currently objects are truncated when they are
+ * marked as purgeable and are in inactive list.
+ * Stolen objects are not considered for truncation by shrinker, as their
+ * physical pages are kept as pinned throughout their existance.
+ */
+void
+i915_gem_object_truncate_stolen(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma_temp,*next;
+
+	BUG_ON(!obj->stolen);
+
+	list_for_each_entry_safe(vma_temp, next, &obj->vma_list, vma_link)
+		WARN_ON(i915_vma_unbind(vma_temp));
+
+	i915_gem_object_unpin_pages(obj);
+	i915_gem_object_put_pages(obj);
+
+	/* Release the stolen space */
+	i915_gem_object_release_stolen(obj);
+
+	obj->madv = __I915_MADV_PURGED;
+
+}