
drm/ttm: fix adding foreign BOs to the LRU during init

Message ID 1452260518-2716-1-git-send-email-deathsimple@vodafone.de (mailing list archive)
State New, archived

Commit Message

Christian König Jan. 8, 2016, 1:41 p.m. UTC
From: Christian König <christian.koenig@amd.com>

If we import a BO with an external reservation object we don't
reserve/unreserve it. So we never add it to the LRU, causing a possible
denial of service.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Thomas Hellström (VMware) Jan. 9, 2016, 7:05 a.m. UTC | #1
Hi!

I might be misunderstanding the use-case here, but IIRC the
discussion with TTM vs imported / exported buffer objects is that a
buffer object needs to be marked NO_EVICT in TTM before it's exported
and an imported object should never end up on a LRU list in TTM because
TTM wouldn't know how to evict it.

/Thomas
 

On 01/08/2016 02:41 PM, Christian König wrote:
> From: Christian König <christian.koenig@amd.com>
>
> If we import a BO with an eternal reservation object we don't
> reserve/unreserve it. So we never add it to the LRU causing a possible
> deny of service.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
> index 745e996..a98a5d5 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>  	if (likely(!ret))
>  		ret = ttm_bo_validate(bo, placement, interruptible, false);
>  
> -	if (!resv)
> +	if (!resv) {
>  		ttm_bo_unreserve(bo);
>  
> +	} else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> +		spin_lock(&bo->glob->lru_lock);
> +		ttm_bo_add_to_lru(bo);
> +		spin_unlock(&bo->glob->lru_lock);
> +	}
> +
>  	if (unlikely(ret))
>  		ttm_bo_unref(&bo);
>
Christian König Jan. 9, 2016, 10:46 a.m. UTC | #2
It's correct that exported buffers can't be moved to another domain or
swapped to disk while another device is using them.

But for imports that's a different story:
> an imported object should never end up on a LRU list in TTM because
> TTM wouldn't know how to evict it.
Even currently the imported objects are actually added to the LRU. The 
problem is just that they are not added immediately during creation, but 
only with the first command submission.

E.g. what you can do as userspace process is importing a lot of buffers 
from a different device into the GFX drivers and never use it. This way 
you can force the kernel into running out of GTT address space.
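To make that failure mode concrete, here is a small userspace model (plain C, no actual TTM code; all names and numbers are invented for illustration) of why BOs that are bound at creation but only reach the LRU on first command submission can exhaust a fixed GTT:

```c
#include <assert.h>
#include <stdio.h>

#define GTT_PAGES 16

struct bo { int pages; int on_lru; };

static int gtt_used;

/* Try to find room for 'pages'; evict LRU-listed BOs if necessary. */
static int gtt_alloc(struct bo *bos, int n, int pages)
{
	int i;

	if (gtt_used + pages <= GTT_PAGES) {
		gtt_used += pages;
		return 0;
	}
	for (i = 0; i < n; i++) {
		if (!bos[i].on_lru)
			continue;		/* invisible to eviction */
		gtt_used -= bos[i].pages;	/* unmap from GTT */
		bos[i].on_lru = 0;
		if (gtt_used + pages <= GTT_PAGES) {
			gtt_used += pages;
			return 0;
		}
	}
	return -1;	/* nothing evictable left: -ENOMEM */
}

/* Import four idle BOs that fill the GTT, then attempt one allocation. */
static int simulate(int add_to_lru_at_init)
{
	struct bo imported[4];
	int i;

	gtt_used = 0;
	for (i = 0; i < 4; i++) {
		imported[i].pages = 4;
		imported[i].on_lru = add_to_lru_at_init;
		gtt_used += 4;	/* bound into the GTT at import time */
	}
	return gtt_alloc(imported, 4, 4);
}
```

Without the fix (`simulate(0)`) the allocation fails even though the GTT is full of idle imports, because none of them are visible to eviction; with BOs added to the LRU at init (`simulate(1)`) one of them can be evicted to make room.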

Regards,
Christian.

Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
> Hi!
>
> I might be misunderderstanding the use-case here, but IIRC the
> discussion with TTM vs imported / exported buffer objects is that a
> buffer object needs to be marked NO_EVICT in TTM before it's exported
> and an imported object should never end up on a LRU list in TTM because
> TTM wouldn't know how to evict it.
>
> /Thomas
>   
>
> On 01/08/2016 02:41 PM, Christian König wrote:
>> From: Christian König <christian.koenig@amd.com>
>>
>> If we import a BO with an eternal reservation object we don't
>> reserve/unreserve it. So we never add it to the LRU causing a possible
>> deny of service.
>>
>> Signed-off-by: Christian König <christian.koenig@amd.com>
>> ---
>>   drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
>> index 745e996..a98a5d5 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>   	if (likely(!ret))
>>   		ret = ttm_bo_validate(bo, placement, interruptible, false);
>>   
>> -	if (!resv)
>> +	if (!resv) {
>>   		ttm_bo_unreserve(bo);
>>   
>> +	} else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>> +		spin_lock(&bo->glob->lru_lock);
>> +		ttm_bo_add_to_lru(bo);
>> +		spin_unlock(&bo->glob->lru_lock);
>> +	}
>> +
>>   	if (unlikely(ret))
>>   		ttm_bo_unref(&bo);
>>   
>
>
Thomas Hellström (VMware) Jan. 10, 2016, 3:10 p.m. UTC | #3
On 01/09/2016 11:46 AM, Christian König wrote:
> It's correct that exported buffers can't be moved to another domain or
> swapped to disk while another device is using it.
>
> But for imports that's a different story:
>> an imported object should never end up on a LRU list in TTM because
>> TTM wouldn't know how to evict it.
> Even currently the imported objects are actually added to the LRU. The
> problem is just that they are not added immediately during creation,
> but only with the first command submission.

Hmm. The fact that they are added to LRU at all sounds weird. What
happens when TTM tries to evict or even swap out an imported buffer?

/Thomas


>
> Regards,
> Christian.
>
> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>> Hi!
>>
>> I might be misunderderstanding the use-case here, but IIRC the
>> discussion with TTM vs imported / exported buffer objects is that a
>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>> and an imported object should never end up on a LRU list in TTM because
>> TTM wouldn't know how to evict it.
>>
>> /Thomas
>>  
>> On 01/08/2016 02:41 PM, Christian König wrote:
>>> From: Christian König <christian.koenig@amd.com>
>>>
>>> If we import a BO with an eternal reservation object we don't
>>> reserve/unreserve it. So we never add it to the LRU causing a possible
>>> deny of service.
>>>
>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>> ---
>>>   drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>> index 745e996..a98a5d5 100644
>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>       if (likely(!ret))
>>>           ret = ttm_bo_validate(bo, placement, interruptible, false);
>>>   -    if (!resv)
>>> +    if (!resv) {
>>>           ttm_bo_unreserve(bo);
>>>   +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>> +        spin_lock(&bo->glob->lru_lock);
>>> +        ttm_bo_add_to_lru(bo);
>>> +        spin_unlock(&bo->glob->lru_lock);
>>> +    }
>>> +
>>>       if (unlikely(ret))
>>>           ttm_bo_unref(&bo);
>>>   
>>
>>
>
Chunming Zhou Jan. 11, 2016, 3:38 a.m. UTC | #4
On 2016年01月10日 23:10, Thomas Hellstrom wrote:
> On 01/09/2016 11:46 AM, Christian König wrote:
>> It's correct that exported buffers can't be moved to another domain or
>> swapped to disk while another device is using it.
>>
>> But for imports that's a different story:
>>> an imported object should never end up on a LRU list in TTM because
>>> TTM wouldn't know how to evict it.
>> Even currently the imported objects are actually added to the LRU. The
>> problem is just that they are not added immediately during creation,
>> but only with the first command submission.
> Hmm. The fact that they are added to LRU at all sounds weird. What
> happens when TTM tries to evict or even swap out an imported buffer?
It's not only imported buffers that need this patch, but also buffers
that share a reservation object. E.g. amdgpu uses one shared reservation
object for all page table BOs. We need to add such a BO to the LRU when
creating it, otherwise the BO will be lost on system suspend.

Thanks,
David Zhou
>
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>>> Hi!
>>>
>>> I might be misunderderstanding the use-case here, but IIRC the
>>> discussion with TTM vs imported / exported buffer objects is that a
>>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>>> and an imported object should never end up on a LRU list in TTM because
>>> TTM wouldn't know how to evict it.
>>>
>>> /Thomas
>>>   
>>> On 01/08/2016 02:41 PM, Christian König wrote:
>>>> From: Christian König <christian.koenig@amd.com>
>>>>
>>>> If we import a BO with an eternal reservation object we don't
>>>> reserve/unreserve it. So we never add it to the LRU causing a possible
>>>> deny of service.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 745e996..a98a5d5 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>        if (likely(!ret))
>>>>            ret = ttm_bo_validate(bo, placement, interruptible, false);
>>>>    -    if (!resv)
>>>> +    if (!resv) {
>>>>            ttm_bo_unreserve(bo);
>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>> +        spin_lock(&bo->glob->lru_lock);
>>>> +        ttm_bo_add_to_lru(bo);
>>>> +        spin_unlock(&bo->glob->lru_lock);
>>>> +    }
>>>> +
>>>>        if (unlikely(ret))
>>>>            ttm_bo_unref(&bo);
>>>>    
>>>
>
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Christian König Jan. 11, 2016, 12:39 p.m. UTC | #5
Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
> On 01/09/2016 11:46 AM, Christian König wrote:
>> It's correct that exported buffers can't be moved to another domain or
>> swapped to disk while another device is using it.
>>
>> But for imports that's a different story:
>>> an imported object should never end up on a LRU list in TTM because
>>> TTM wouldn't know how to evict it.
>> Even currently the imported objects are actually added to the LRU. The
>> problem is just that they are not added immediately during creation,
>> but only with the first command submission.
> Hmm. The fact that they are added to LRU at all sounds weird. What
> happens when TTM tries to evict or even swap out an imported buffer?
Adding them to the domain LRU is perfectly normal for the imported 
buffers and also works fine as far as I know.

When TTM evicts them it just gets unmapped from GTT to make room in the 
address space, which at least for Radeon and Amdgpu is exactly what we want.

That imported buffers get added to the swap LRU is indeed nonsense, but
not harmful as far as I can see. Going to fix that in a follow-up patch.
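The distinction being drawn here can be sketched as follows (illustrative userspace model only, with invented names; this is not the actual TTM eviction path): evicting an imported BO may unbind its GTT mapping, but the backing pages belong to the exporter and must never be moved or swapped by the importer's TTM instance.

```c
#include <assert.h>

enum backing { BO_NATIVE, BO_IMPORTED };

struct model_bo {
	enum backing kind;
	int gtt_bound;		/* has a GTT mapping */
	int pages_resident;	/* backing pages under our control */
};

/* Evict only to the extent this TTM instance is allowed to. */
static void model_evict(struct model_bo *bo)
{
	bo->gtt_bound = 0;	/* unmapping from GTT is always safe */
	if (bo->kind == BO_NATIVE)
		bo->pages_resident = 0;	/* we own the pages: may swap them */
	/*
	 * BO_IMPORTED: the pages belong to the exporter, so there is
	 * nothing more we can do here -- which is why such BOs do not
	 * belong on the swap LRU.
	 */
}
```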

Thanks for the comment,
Christian.

>
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>>> Hi!
>>>
>>> I might be misunderderstanding the use-case here, but IIRC the
>>> discussion with TTM vs imported / exported buffer objects is that a
>>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>>> and an imported object should never end up on a LRU list in TTM because
>>> TTM wouldn't know how to evict it.
>>>
>>> /Thomas
>>>   
>>> On 01/08/2016 02:41 PM, Christian König wrote:
>>>> From: Christian König <christian.koenig@amd.com>
>>>>
>>>> If we import a BO with an eternal reservation object we don't
>>>> reserve/unreserve it. So we never add it to the LRU causing a possible
>>>> deny of service.
>>>>
>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>> ---
>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> index 745e996..a98a5d5 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>        if (likely(!ret))
>>>>            ret = ttm_bo_validate(bo, placement, interruptible, false);
>>>>    -    if (!resv)
>>>> +    if (!resv) {
>>>>            ttm_bo_unreserve(bo);
>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>> +        spin_lock(&bo->glob->lru_lock);
>>>> +        ttm_bo_add_to_lru(bo);
>>>> +        spin_unlock(&bo->glob->lru_lock);
>>>> +    }
>>>> +
>>>>        if (unlikely(ret))
>>>>            ttm_bo_unref(&bo);
>>>>    
>>>
>
>
Thomas Hellström (VMware) Jan. 11, 2016, 1 p.m. UTC | #6
On 01/11/2016 01:39 PM, Christian König wrote:
> Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
>> On 01/09/2016 11:46 AM, Christian König wrote:
>>> It's correct that exported buffers can't be moved to another domain or
>>> swapped to disk while another device is using it.
>>>
>>> But for imports that's a different story:
>>>> an imported object should never end up on a LRU list in TTM because
>>>> TTM wouldn't know how to evict it.
>>> Even currently the imported objects are actually added to the LRU. The
>>> problem is just that they are not added immediately during creation,
>>> but only with the first command submission.
>> Hmm. The fact that they are added to LRU at all sounds weird. What
>> happens when TTM tries to evict or even swap out an imported buffer?
> Adding them to the domain LRU is perfectly normal for the imported
> buffers and also works fine as far as I know.
>
> When TTM evicts them it just gets unmapped from GTT to make room in
> the address space, which at least for Radeon and Amdgpu is exactly
> what we want.
>
> That imported buffers get added to the swap LRU is indeed nonsense,
> but not harmful as far as I can see. Going to fix that in a follow up
> patch.
>

So the way TTM was designed when it was written was to be used to place
data in memory types where it could later be mapped by CPU and GPU in a
coherent fashion.

The actual GPU mapping was primarily thought to be done by the driver as
part of the validation process *after* that placement if needed.

Now many (most, all) drivers don't need that second step since the
memory type is directly mappable by the GPU, but in a situation where
the core TTM functionality (placement and swapping) is completely
replaced by another mechanism (such as imported buffers), not
implementing the GPU binding and eviction in the driver as a separate
step naturally generates a conflict.

Now I'm not at all against TTM being used for GPU mapping only if that
simplifies things for imported buffers but in that case IMO one should
be aware of the limitations and we should find a way to mark those
buffers GPU_MAP_ONLY to avoid having TTM compete with the exporter
over the placement, and to avoid putting them on the swap LRU.

While this would be one way to work around the GTT space DOS, it
wouldn't work around the DOS problem of the exporter pinning system
pages that probably never get accounted. That's a limitation of the
export-import mechanism (dma-buf).

/Thomas



> Thanks for the comment,
> Christian.
>
>>
>> /Thomas
>>
>>
>>> Regards,
>>> Christian.
>>>
>>> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>>>> Hi!
>>>>
>>>> I might be misunderderstanding the use-case here, but IIRC the
>>>> discussion with TTM vs imported / exported buffer objects is that a
>>>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>>>> and an imported object should never end up on a LRU list in TTM
>>>> because
>>>> TTM wouldn't know how to evict it.
>>>>
>>>> /Thomas
>>>>   On 01/08/2016 02:41 PM, Christian König wrote:
>>>>> From: Christian König <christian.koenig@amd.com>
>>>>>
>>>>> If we import a BO with an eternal reservation object we don't
>>>>> reserve/unreserve it. So we never add it to the LRU causing a
>>>>> possible
>>>>> deny of service.
>>>>>
>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>> ---
>>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> index 745e996..a98a5d5 100644
>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>>        if (likely(!ret))
>>>>>            ret = ttm_bo_validate(bo, placement, interruptible,
>>>>> false);
>>>>>    -    if (!resv)
>>>>> +    if (!resv) {
>>>>>            ttm_bo_unreserve(bo);
>>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>>> +        spin_lock(&bo->glob->lru_lock);
>>>>> +        ttm_bo_add_to_lru(bo);
>>>>> +        spin_unlock(&bo->glob->lru_lock);
>>>>> +    }
>>>>> +
>>>>>        if (unlikely(ret))
>>>>>            ttm_bo_unref(&bo);
>>>>>    
>>>>
>>
>>
>
Christian König Jan. 11, 2016, 2:38 p.m. UTC | #7
> While this would be one way to work around the GTT space DOS, it
> wouldn't work around the DOS problem of the exporter pinning system
> pages that probably never get accounted. That's a limitation of the
> export-import mechanism (dma-buf).

Yeah, completely agree. That's a problem we sooner or later need to 
address as well.

But currently I worry mostly about that specific problem at hand and how 
to fix it, the larger design problems need to wait.

Just send five patches out to the list and CCing you as well. You can 
ignore the last two which are amdgpu specific, but would be nice if you 
could take a look at the first three.

Thanks in advance,
Christian.

Am 11.01.2016 um 14:00 schrieb Thomas Hellstrom:
> On 01/11/2016 01:39 PM, Christian König wrote:
>> Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
>>> On 01/09/2016 11:46 AM, Christian König wrote:
>>>> It's correct that exported buffers can't be moved to another domain or
>>>> swapped to disk while another device is using it.
>>>>
>>>> But for imports that's a different story:
>>>>> an imported object should never end up on a LRU list in TTM because
>>>>> TTM wouldn't know how to evict it.
>>>> Even currently the imported objects are actually added to the LRU. The
>>>> problem is just that they are not added immediately during creation,
>>>> but only with the first command submission.
>>> Hmm. The fact that they are added to LRU at all sounds weird. What
>>> happens when TTM tries to evict or even swap out an imported buffer?
>> Adding them to the domain LRU is perfectly normal for the imported
>> buffers and also works fine as far as I know.
>>
>> When TTM evicts them it just gets unmapped from GTT to make room in
>> the address space, which at least for Radeon and Amdgpu is exactly
>> what we want.
>>
>> That imported buffers get added to the swap LRU is indeed nonsense,
>> but not harmful as far as I can see. Going to fix that in a follow up
>> patch.
>>
> So the way TTM was designed when it was written was to be used to place
> data in memory types where it could later be mapped by CPU and GPU in a
> coherent fashion.
>
> The actual GPU mapping was primarily thought to be done by the driver as
> part of the validation process *after* that placement if needed.
>
> Now many (most, all) drivers don't need that second step since the
> memory type is directly mappable by the GPU, but in a situation where
> the core TTM functionality (placement and swapping) is completely
> replaced by another mechanism (such as imported buffers), not
> implementing the GPU binding and eviction in the driver as a separate
> step naturally generates a conflict.
>
> Now I'm not at all against TTM being used for GPU mapping only if that
> simplifies things for imported buffers but in that case IMO one should
> be aware of the limitations and we should find a way to mark those
> buffers GPU_MAP_ONLY to avoid having TTM competing with the exporter
> about the placement ad avoid putting it on the swap LRU.
>
> While this would be one way to work around the GTT space DOS, it
> wouldn't work around the DOS problem of the exporter pinning system
> pages that probably never get accounted. That's a limitation of the
> export-import mechanism (dma-buf).
>
> /Thomas
>
>
>
>> Thanks for the comment,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>>>>> Hi!
>>>>>
>>>>> I might be misunderderstanding the use-case here, but IIRC the
>>>>> discussion with TTM vs imported / exported buffer objects is that a
>>>>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>>>>> and an imported object should never end up on a LRU list in TTM
>>>>> because
>>>>> TTM wouldn't know how to evict it.
>>>>>
>>>>> /Thomas
>>>>>    On 01/08/2016 02:41 PM, Christian König wrote:
>>>>>> From: Christian König <christian.koenig@amd.com>
>>>>>>
>>>>>> If we import a BO with an eternal reservation object we don't
>>>>>> reserve/unreserve it. So we never add it to the LRU causing a
>>>>>> possible
>>>>>> deny of service.
>>>>>>
>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>> ---
>>>>>>     drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>>>     1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> index 745e996..a98a5d5 100644
>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>>>         if (likely(!ret))
>>>>>>             ret = ttm_bo_validate(bo, placement, interruptible,
>>>>>> false);
>>>>>>     -    if (!resv)
>>>>>> +    if (!resv) {
>>>>>>             ttm_bo_unreserve(bo);
>>>>>>     +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>>>> +        spin_lock(&bo->glob->lru_lock);
>>>>>> +        ttm_bo_add_to_lru(bo);
>>>>>> +        spin_unlock(&bo->glob->lru_lock);
>>>>>> +    }
>>>>>> +
>>>>>>         if (unlikely(ret))
>>>>>>             ttm_bo_unref(&bo);
>>>>>>     
>>>
>
>
Daniel Vetter Jan. 11, 2016, 2:55 p.m. UTC | #8
On Mon, Jan 11, 2016 at 03:38:23PM +0100, Christian König wrote:
> >While this would be one way to work around the GTT space DOS, it
> >wouldn't work around the DOS problem of the exporter pinning system
> >pages that probably never get accounted. That's a limitation of the
> >export-import mechanism (dma-buf).
> 
> Yeah, completely agree. That's a problem we sooner or later need to address
> as well.
> 
> But currently I worry mostly about that specific problem at hand and how to
> fix it, the larger design problems need to wait.
> 
> Just send five patches out to the list and CCing you as well. You can ignore
> the last two which are amdgpu specific, but would be nice if you could take
> a look at the first three.

Out of curiosity, where do you generate so many foreign dma-buf backed
objects? With prime I expect at most a few framebuffers on each side to be
shared, and that shouldn't really cause much trouble. Just sharing
dma-bufs with your own device should result in the underlying native
objects being referenced, so it works like flink.

Wrt fixing this for real: I think step 1 would be to stop pinning dma-bufs
on import and instead implement all the map/unmap vfunc hooks ttm already
implements (it's fully intentional that they match really closely) by
essentially virtualizing ttm_bo_api.h.

After that we need to allow exporters to evict mappings importers have
created (using the ww_mutex locking and all that like ttm does for purely
native eviction). That likely needs a callback to dma_buf_attachment. With
that we should be 90% there, since most likely when the importer tries to
bind more stuff exported from somewhere we'll run into limits of the
exporter. It's not perfect since it's not (yet) a truly global evict, but
should get us a large step towards that goal.
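The callback idea could look roughly like this (a userspace model, not a proposal for the actual dma-buf API; the `evict` hook and helper names are invented): under memory pressure the exporter walks its attachments and asks each importer to drop its mapping.

```c
#include <assert.h>

/* Modelled attachment: one importer's mapping of an exported buffer. */
struct model_attachment {
	int mapped;					/* importer holds a GPU mapping */
	void (*evict)(struct model_attachment *att);	/* assumed hook */
};

static void importer_unmap(struct model_attachment *att)
{
	att->mapped = 0;	/* importer releases its GPU mapping */
}

/* Exporter under memory pressure: revoke all importer mappings. */
static int exporter_reclaim(struct model_attachment *atts, int n)
{
	int i, reclaimed = 0;

	for (i = 0; i < n; i++) {
		if (atts[i].mapped && atts[i].evict) {
			atts[i].evict(&atts[i]);	/* ask importer to let go */
			reclaimed++;
		}
	}
	return reclaimed;
}
```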

If we really need global eviction across all drivers in a system (for e.g.
system memory, nothing else should be shared like that really I think)
then I guess we could look into either extending ttm_global or adding
something similar at the dma-buf level.

Cheers, Daniel

> 
> Thanks in advance,
> Christian.
> 
> Am 11.01.2016 um 14:00 schrieb Thomas Hellstrom:
> >On 01/11/2016 01:39 PM, Christian König wrote:
> >>Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
> >>>On 01/09/2016 11:46 AM, Christian König wrote:
> >>>>It's correct that exported buffers can't be moved to another domain or
> >>>>swapped to disk while another device is using it.
> >>>>
> >>>>But for imports that's a different story:
> >>>>>an imported object should never end up on a LRU list in TTM because
> >>>>>TTM wouldn't know how to evict it.
> >>>>Even currently the imported objects are actually added to the LRU. The
> >>>>problem is just that they are not added immediately during creation,
> >>>>but only with the first command submission.
> >>>Hmm. The fact that they are added to LRU at all sounds weird. What
> >>>happens when TTM tries to evict or even swap out an imported buffer?
> >>Adding them to the domain LRU is perfectly normal for the imported
> >>buffers and also works fine as far as I know.
> >>
> >>When TTM evicts them it just gets unmapped from GTT to make room in
> >>the address space, which at least for Radeon and Amdgpu is exactly
> >>what we want.
> >>
> >>That imported buffers get added to the swap LRU is indeed nonsense,
> >>but not harmful as far as I can see. Going to fix that in a follow up
> >>patch.
> >>
> >So the way TTM was designed when it was written was to be used to place
> >data in memory types where it could later be mapped by CPU and GPU in a
> >coherent fashion.
> >
> >The actual GPU mapping was primarily thought to be done by the driver as
> >part of the validation process *after* that placement if needed.
> >
> >Now many (most, all) drivers don't need that second step since the
> >memory type is directly mappable by the GPU, but in a situation where
> >the core TTM functionality (placement and swapping) is completely
> >replaced by another mechanism (such as imported buffers), not
> >implementing the GPU binding and eviction in the driver as a separate
> >step naturally generates a conflict.
> >
> >Now I'm not at all against TTM being used for GPU mapping only if that
> >simplifies things for imported buffers but in that case IMO one should
> >be aware of the limitations and we should find a way to mark those
> >buffers GPU_MAP_ONLY to avoid having TTM competing with the exporter
> >about the placement ad avoid putting it on the swap LRU.
> >
> >While this would be one way to work around the GTT space DOS, it
> >wouldn't work around the DOS problem of the exporter pinning system
> >pages that probably never get accounted. That's a limitation of the
> >export-import mechanism (dma-buf).
> >
> >/Thomas
> >
> >
> >
> >>Thanks for the comment,
> >>Christian.
> >>
> >>>/Thomas
> >>>
> >>>
> >>>>Regards,
> >>>>Christian.
> >>>>
> >>>>Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
> >>>>>Hi!
> >>>>>
> >>>>>I might be misunderderstanding the use-case here, but IIRC the
> >>>>>discussion with TTM vs imported / exported buffer objects is that a
> >>>>>buffer object needs to be marked NO_EVICT in TTM before it's exported
> >>>>>and an imported object should never end up on a LRU list in TTM
> >>>>>because
> >>>>>TTM wouldn't know how to evict it.
> >>>>>
> >>>>>/Thomas
> >>>>>   On 01/08/2016 02:41 PM, Christian König wrote:
> >>>>>>From: Christian König <christian.koenig@amd.com>
> >>>>>>
> >>>>>>If we import a BO with an eternal reservation object we don't
> >>>>>>reserve/unreserve it. So we never add it to the LRU causing a
> >>>>>>possible
> >>>>>>deny of service.
> >>>>>>
> >>>>>>Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>>>---
> >>>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
> >>>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
> >>>>>>
> >>>>>>diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>index 745e996..a98a5d5 100644
> >>>>>>--- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>+++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>@@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
> >>>>>>        if (likely(!ret))
> >>>>>>            ret = ttm_bo_validate(bo, placement, interruptible,
> >>>>>>false);
> >>>>>>    -    if (!resv)
> >>>>>>+    if (!resv) {
> >>>>>>            ttm_bo_unreserve(bo);
> >>>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> >>>>>>+        spin_lock(&bo->glob->lru_lock);
> >>>>>>+        ttm_bo_add_to_lru(bo);
> >>>>>>+        spin_unlock(&bo->glob->lru_lock);
> >>>>>>+    }
> >>>>>>+
> >>>>>>        if (unlikely(ret))
> >>>>>>            ttm_bo_unref(&bo);
> >>>
> >
> >
> 
Christian König Jan. 11, 2016, 3:54 p.m. UTC | #9
> Out of curiosity, where do you generate so many foreign dma-buf backes
> objects?
That was just an experiment sharing buffers between two AMD devices.
A stupid reference count bug meant I leaked 60 full HD frames a second.

It took a bit over two minutes to eat up 1GB of GTT space, and the box
died rather spectacularly.

Regards,
Christian.

Am 11.01.2016 um 15:55 schrieb Daniel Vetter:
> On Mon, Jan 11, 2016 at 03:38:23PM +0100, Christian König wrote:
>>> While this would be one way to work around the GTT space DOS, it
>>> wouldn't work around the DOS problem of the exporter pinning system
>>> pages that probably never get accounted. That's a limitation of the
>>> export-import mechanism (dma-buf).
>> Yeah, completely agree. That's a problem we sooner or later need to address
>> as well.
>>
>> But currently I worry mostly about that specific problem at hand and how to
>> fix it, the larger design problems need to wait.
>>
>> Just send five patches out to the list and CCing you as well. You can ignore
>> the last two which are amdgpu specific, but would be nice if you could take
>> a look at the first three.
> Out of curiosity, where do you generate so many foreign dma-buf backes
> objects? With prime I expect at most a few framebuffers on each side to be
> shared, and that shouldn't really cause much trouble. Just sharing
> dma-bufs with your own device should result in the underlying native
> objects being referenced, so work like flink.
>
> Wrt fixing this for real: I think step 1 would be stop pinning dma-bufs on
> import and instead implementing all the map/unmap vfunc hooks ttm already
> implements (it's fully intentional that they match really closely
> between) by virtualizing essentially ttm_bo_api.h.
>
> After that we need to allow exporters to evict mappings importers have
> created (using the ww_mutex locking and all that like ttm does for purely
> native eviction). That likely needs a callback to dma_buf_attachment. With
> that we should be 90% there, since most likely when the importer tries to
> bind more stuff exported from somewhere we'll run into limits of the
> exporter. It's not perfect since it's not (yet) a truly global evict, but
> should get us a large step towards that goal.
>
> If we really need global eviction across all drivers in a system (for e.g.
> system memory, nothing else should be shared like that really I think)
> then I guess we could look into either extending ttm_global (or adding
> something similar at the dma-buf level).
>
> Cheers, Daniel
>
>> Thanks in advance,
>> Christian.
>>
>> Am 11.01.2016 um 14:00 schrieb Thomas Hellstrom:
>>> On 01/11/2016 01:39 PM, Christian König wrote:
>>>> Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
>>>>> On 01/09/2016 11:46 AM, Christian König wrote:
>>>>>> It's correct that exported buffers can't be moved to another domain or
>>>>>> swapped to disk while another device is using it.
>>>>>>
>>>>>> But for imports that's a different story:
>>>>>>> an imported object should never end up on a LRU list in TTM because
>>>>>>> TTM wouldn't know how to evict it.
>>>>>> Even currently the imported objects are actually added to the LRU. The
>>>>>> problem is just that they are not added immediately during creation,
>>>>>> but only with the first command submission.
>>>>> Hmm. The fact that they are added to LRU at all sounds weird. What
>>>>> happens when TTM tries to evict or even swap out an imported buffer?
>>>> Adding them to the domain LRU is perfectly normal for the imported
>>>> buffers and also works fine as far as I know.
>>>>
>>>> When TTM evicts them it just gets unmapped from GTT to make room in
>>>> the address space, which at least for Radeon and Amdgpu is exactly
>>>> what we want.
>>>>
>>>> That imported buffers get added to the swap LRU is indeed nonsense,
>>>> but not harmful as far as I can see. Going to fix that in a follow up
>>>> patch.
>>>>
>>> So the way TTM was designed when it was written was to be used to place
>>> data in memory types where it could later be mapped by CPU and GPU in a
>>> coherent fashion.
>>>
>>> The actual GPU mapping was primarily thought to be done by the driver as
>>> part of the validation process *after* that placement if needed.
>>>
>>> Now many (most, all) drivers don't need that second step since the
>>> memory type is directly mappable by the GPU, but in a situation where
>>> the core TTM functionality (placement and swapping) is completely
>>> replaced by another mechanism (such as imported buffers), not
>>> implementing the GPU binding and eviction in the driver as a separate
>>> step naturally generates a conflict.
>>>
>>> Now I'm not at all against TTM being used for GPU mapping only if that
>>> simplifies things for imported buffers but in that case IMO one should
>>> be aware of the limitations and we should find a way to mark those
>>> buffers GPU_MAP_ONLY to avoid having TTM competing with the exporter
>>> about the placement and avoid putting it on the swap LRU.
>>>
>>> While this would be one way to work around the GTT space DOS, it
>>> wouldn't work around the DOS problem of the exporter pinning system
>>> pages that probably never get accounted. That's a limitation of the
>>> export-import mechanism (dma-buf).
>>>
>>> /Thomas
>>>
>>>
>>>
>>>> Thanks for the comment,
>>>> Christian.
>>>>
>>>>> /Thomas
>>>>>
>>>>>
>>>>>> Regards,
>>>>>> Christian.
>>>>>>
>>>>>> Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I might be misunderstanding the use-case here, but IIRC the
>>>>>>> discussion with TTM vs imported / exported buffer objects is that a
>>>>>>> buffer object needs to be marked NO_EVICT in TTM before it's exported
>>>>>>> and an imported object should never end up on a LRU list in TTM
>>>>>>> because
>>>>>>> TTM wouldn't know how to evict it.
>>>>>>>
>>>>>>> /Thomas
>>>>>>>    On 01/08/2016 02:41 PM, Christian König wrote:
>>>>>>>> From: Christian König <christian.koenig@amd.com>
>>>>>>>>
>>>>>>>> If we import a BO with an external reservation object we don't
>>>>>>>> reserve/unreserve it. So we never add it to the LRU, causing a
>>>>>>>> possible denial of service.
>>>>>>>>
>>>>>>>> Signed-off-by: Christian König <christian.koenig@amd.com>
>>>>>>>> ---
>>>>>>>>     drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
>>>>>>>>     1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> index 745e996..a98a5d5 100644
>>>>>>>> --- a/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo.c
>>>>>>>> @@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
>>>>>>>>         if (likely(!ret))
>>>>>>>>             ret = ttm_bo_validate(bo, placement, interruptible,
>>>>>>>> false);
>>>>>>>>     -    if (!resv)
>>>>>>>> +    if (!resv) {
>>>>>>>>             ttm_bo_unreserve(bo);
>>>>>>>>     +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
>>>>>>>> +        spin_lock(&bo->glob->lru_lock);
>>>>>>>> +        ttm_bo_add_to_lru(bo);
>>>>>>>> +        spin_unlock(&bo->glob->lru_lock);
>>>>>>>> +    }
>>>>>>>> +
>>>>>>>>         if (unlikely(ret))
>>>>>>>>             ttm_bo_unref(&bo);
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/dri-devel
Daniel Vetter Jan. 11, 2016, 7:37 p.m. UTC | #10
On Mon, Jan 11, 2016 at 04:54:19PM +0100, Christian König wrote:
> >Out of curiosity, where do you generate so many foreign dma-buf backed
> >objects?
> That was just an experiment of sharing buffers between two AMD devices. A
> stupid reference count bug and I've leaked 60 full HD frames a second.
> 
> Took a bit over two minutes to eat up 1GB of GTT space and the box dying
> rather spectacularly.

Ah ok, that explains it ;-)
-Daniel

> 
> Regards,
> Christian.
> 
> Am 11.01.2016 um 15:55 schrieb Daniel Vetter:
> >On Mon, Jan 11, 2016 at 03:38:23PM +0100, Christian König wrote:
> >>>While this would be one way to work around the GTT space DOS, it
> >>>wouldn't work around the DOS problem of the exporter pinning system
> >>>pages that probably never get accounted. That's a limitation of the
> >>>export-import mechanism (dma-buf).
> >>Yeah, completely agree. That's a problem we sooner or later need to address
> >>as well.
> >>
> >>But currently I worry mostly about that specific problem at hand and how to
> >>fix it, the larger design problems need to wait.
> >>
> >>Just send five patches out to the list and CCing you as well. You can ignore
> >>the last two which are amdgpu specific, but would be nice if you could take
> >>a look at the first three.
> >Out of curiosity, where do you generate so many foreign dma-buf backed
> >objects? With prime I expect at most a few framebuffers on each side to be
> >shared, and that shouldn't really cause much trouble. Just sharing
> >dma-bufs with your own device should result in the underlying native
> >objects being referenced, so work like flink.
> >
> >Wrt fixing this for real: I think step 1 would be stop pinning dma-bufs on
> >import and instead implementing all the map/unmap vfunc hooks ttm already
> >implements (it's fully intentional that the two match really closely),
> >essentially by virtualizing ttm_bo_api.h.
> >
> >After that we need to allow exporters to evict mappings importers have
> >created (using the ww_mutex locking and all that like ttm does for purely
> >native eviction). That likely needs a callback to dma_buf_attachment. With
> >that we should be 90% there, since most likely when the importer tries to
> >bind more stuff exported from somewhere we'll run into limits of the
> >exporter. It's not perfect since it's not (yet) a truly global evict, but
> >should get us a large step towards that goal.
> >
> >If we really need global eviction across all drivers in a system (for e.g.
> >system memory, nothing else should be shared like that really I think)
> >then I guess we could look into either extending ttm_global or adding
> >something similar at the dma-buf level.
> >
> >Cheers, Daniel
> >
> >>Thanks in advance,
> >>Christian.
> >>
> >>Am 11.01.2016 um 14:00 schrieb Thomas Hellstrom:
> >>>On 01/11/2016 01:39 PM, Christian König wrote:
> >>>>Am 10.01.2016 um 16:10 schrieb Thomas Hellstrom:
> >>>>>On 01/09/2016 11:46 AM, Christian König wrote:
> >>>>>>It's correct that exported buffers can't be moved to another domain or
> >>>>>>swapped to disk while another device is using them.
> >>>>>>
> >>>>>>But for imports that's a different story:
> >>>>>>>an imported object should never end up on a LRU list in TTM because
> >>>>>>>TTM wouldn't know how to evict it.
> >>>>>>Even currently the imported objects are actually added to the LRU. The
> >>>>>>problem is just that they are not added immediately during creation,
> >>>>>>but only with the first command submission.
> >>>>>Hmm. The fact that they are added to LRU at all sounds weird. What
> >>>>>happens when TTM tries to evict or even swap out an imported buffer?
> >>>>Adding them to the domain LRU is perfectly normal for the imported
> >>>>buffers and also works fine as far as I know.
> >>>>
> >>>>When TTM evicts them it just gets unmapped from GTT to make room in
> >>>>the address space, which at least for Radeon and Amdgpu is exactly
> >>>>what we want.
> >>>>
> >>>>That imported buffers get added to the swap LRU is indeed nonsense,
> >>>>but not harmful as far as I can see. Going to fix that in a follow up
> >>>>patch.
> >>>>
> >>>So the way TTM was designed when it was written was to be used to place
> >>>data in memory types where it could later be mapped by CPU and GPU in a
> >>>coherent fashion.
> >>>
> >>>The actual GPU mapping was primarily thought to be done by the driver as
> >>>part of the validation process *after* that placement if needed.
> >>>
> >>>Now many (most, all) drivers don't need that second step since the
> >>>memory type is directly mappable by the GPU, but in a situation where
> >>>the core TTM functionality (placement and swapping) is completely
> >>>replaced by another mechanism (such as imported buffers), not
> >>>implementing the GPU binding and eviction in the driver as a separate
> >>>step naturally generates a conflict.
> >>>
> >>>Now I'm not at all against TTM being used for GPU mapping only if that
> >>>simplifies things for imported buffers but in that case IMO one should
> >>>be aware of the limitations and we should find a way to mark those
> >>>buffers GPU_MAP_ONLY to avoid having TTM competing with the exporter
> >>>about the placement and avoid putting it on the swap LRU.
> >>>
> >>>While this would be one way to work around the GTT space DOS, it
> >>>wouldn't work around the DOS problem of the exporter pinning system
> >>>pages that probably never get accounted. That's a limitation of the
> >>>export-import mechanism (dma-buf).
> >>>
> >>>/Thomas
> >>>
> >>>
> >>>
> >>>>Thanks for the comment,
> >>>>Christian.
> >>>>
> >>>>>/Thomas
> >>>>>
> >>>>>
> >>>>>>Regards,
> >>>>>>Christian.
> >>>>>>
> >>>>>>Am 09.01.2016 um 08:05 schrieb Thomas Hellstrom:
> >>>>>>>Hi!
> >>>>>>>
> >>>>>>>I might be misunderstanding the use-case here, but IIRC the
> >>>>>>>discussion with TTM vs imported / exported buffer objects is that a
> >>>>>>>buffer object needs to be marked NO_EVICT in TTM before it's exported
> >>>>>>>and an imported object should never end up on a LRU list in TTM
> >>>>>>>because
> >>>>>>>TTM wouldn't know how to evict it.
> >>>>>>>
> >>>>>>>/Thomas
> >>>>>>>   On 01/08/2016 02:41 PM, Christian König wrote:
> >>>>>>>>From: Christian König <christian.koenig@amd.com>
> >>>>>>>>
> >>>>>>>>If we import a BO with an external reservation object we don't
> >>>>>>>>reserve/unreserve it. So we never add it to the LRU, causing a
> >>>>>>>>possible denial of service.
> >>>>>>>>
> >>>>>>>>Signed-off-by: Christian König <christian.koenig@amd.com>
> >>>>>>>>---
> >>>>>>>>    drivers/gpu/drm/ttm/ttm_bo.c | 8 +++++++-
> >>>>>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
> >>>>>>>>
> >>>>>>>>diff --git a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>index 745e996..a98a5d5 100644
> >>>>>>>>--- a/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>+++ b/drivers/gpu/drm/ttm/ttm_bo.c
> >>>>>>>>@@ -1170,9 +1170,15 @@ int ttm_bo_init(struct ttm_bo_device *bdev,
> >>>>>>>>        if (likely(!ret))
> >>>>>>>>            ret = ttm_bo_validate(bo, placement, interruptible,
> >>>>>>>>false);
> >>>>>>>>    -    if (!resv)
> >>>>>>>>+    if (!resv) {
> >>>>>>>>            ttm_bo_unreserve(bo);
> >>>>>>>>    +    } else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
> >>>>>>>>+        spin_lock(&bo->glob->lru_lock);
> >>>>>>>>+        ttm_bo_add_to_lru(bo);
> >>>>>>>>+        spin_unlock(&bo->glob->lru_lock);
> >>>>>>>>+    }
> >>>>>>>>+
> >>>>>>>>        if (unlikely(ret))
> >>>>>>>>            ttm_bo_unref(&bo);
> >>>
> >>_______________________________________________
> >>dri-devel mailing list
> >>dri-devel@lists.freedesktop.org
> >>http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

Patch

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 745e996..a98a5d5 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -1170,9 +1170,15 @@  int ttm_bo_init(struct ttm_bo_device *bdev,
 	if (likely(!ret))
 		ret = ttm_bo_validate(bo, placement, interruptible, false);
 
-	if (!resv)
+	if (!resv) {
 		ttm_bo_unreserve(bo);
 
+	} else if (!(bo->mem.placement & TTM_PL_FLAG_NO_EVICT)) {
+		spin_lock(&bo->glob->lru_lock);
+		ttm_bo_add_to_lru(bo);
+		spin_unlock(&bo->glob->lru_lock);
+	}
+
 	if (unlikely(ret))
 		ttm_bo_unref(&bo);