drm/amdgpu: improve GTT BO alloc speed in OGL

Message ID 1473698681-28637-1-git-send-email-alexander.deucher@amd.com (mailing list archive)
State New, archived

Commit Message

Alex Deucher Sept. 12, 2016, 4:44 p.m. UTC
From: "monk.liu" <monk.liu@amd.com>

Originally we used the ttm_dma path to allocate GTT BOs, which is much
slower than the ttm_pool path in most cases.

The swiotlb checks don't seem to work and we always end up in the
slow path even when an IOMMU is available.

Signed-off-by: monk.liu <Monk.Liu@amd.com>
Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 -------------
 1 file changed, 13 deletions(-)

Comments

Christian König Sept. 12, 2016, 6:26 p.m. UTC | #1
On 12.09.2016 at 18:44, Alex Deucher wrote:
> From: "monk.liu" <monk.liu@amd.com>
>
> Originally we used the ttm_dma path to allocate GTT BOs, which is much
> slower than the ttm_pool path in most cases.
>
> The swiotlb checks don't seem to work and we always end up in the
> slow path even when an IOMMU is available.

While the check is clearly not correct, simply always using the direct
mapping and not checking the fallback path can break as well.

So this patch is clearly not a good idea and needs to be fixed before it 
is pushed.

Christian.
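
One way to read the objection above is that the fast path should be tried first, with the slower path kept as a fallback when a page cannot be mapped directly. The sketch below is a small user-space model of that idea, not amdgpu or TTM code. The names model_dev, model_map_page and model_populate are hypothetical, and "mapping" is reduced to a comparison against a DMA mask, which is an assumption made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: none of these names are kernel APIs. */
enum populate_path { PATH_POOL, PATH_DMA_FALLBACK };

struct model_dev {
	uint64_t dma_mask;	/* highest bus address the device can reach */
};

/* Models a direct mapping: succeeds only if the page is reachable. */
static bool model_map_page(const struct model_dev *dev, uint64_t phys,
			   uint64_t *dma)
{
	if (phys > dev->dma_mask)
		return false;
	*dma = phys;
	return true;
}

/*
 * Try the fast path for every page. If any page cannot be mapped,
 * undo the mappings made so far and report that the caller has to
 * take the slower fallback path instead.
 */
static enum populate_path model_populate(const struct model_dev *dev,
					 const uint64_t *pages, size_t n,
					 uint64_t *dma_out)
{
	assert(dev && pages && dma_out);

	for (size_t i = 0; i < n; i++) {
		if (!model_map_page(dev, pages[i], &dma_out[i])) {
			while (i--)
				dma_out[i] = 0;	/* unwind earlier mappings */
			return PATH_DMA_FALLBACK;
		}
	}
	return PATH_POOL;
}
```

With a 32-bit mask, a page list that stays below 4 GiB takes PATH_POOL, while a list containing a page above 4 GiB is unwound and reports PATH_DMA_FALLBACK. The point of the model is only that the fallback decision is driven by an actual mapping failure rather than by whether a bounce buffer happens to exist.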

> Signed-off-by: monk.liu <Monk.Liu@amd.com>
> Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 -------------
>   1 file changed, 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index 3beb10b..e2fcd39 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -783,12 +783,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
>   
>   	adev = amdgpu_get_adev(ttm->bdev);
>   
> -#ifdef CONFIG_SWIOTLB
> -	if (swiotlb_nr_tbl()) {
> -		return ttm_dma_populate(&gtt->ttm, adev->dev);
> -	}
> -#endif
> -
>   	r = ttm_pool_populate(ttm);
>   	if (r) {
>   		return r;
> @@ -829,13 +823,6 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
>   
>   	adev = amdgpu_get_adev(ttm->bdev);
>   
> -#ifdef CONFIG_SWIOTLB
> -	if (swiotlb_nr_tbl()) {
> -		ttm_dma_unpopulate(&gtt->ttm, adev->dev);
> -		return;
> -	}
> -#endif
> -
>   	for (i = 0; i < ttm->num_pages; i++) {
>   		if (gtt->ttm.dma_address[i]) {
>   			pci_unmap_page(adev->pdev, gtt->ttm.dma_address[i],
Alex Deucher Sept. 12, 2016, 7:04 p.m. UTC | #2
On Mon, Sep 12, 2016 at 2:26 PM, Christian König
<deathsimple@vodafone.de> wrote:
> On 12.09.2016 at 18:44, Alex Deucher wrote:
>>
>> From: "monk.liu" <monk.liu@amd.com>
>>
>> Originally we used the ttm_dma path to allocate GTT BOs, which is much
>> slower than the ttm_pool path in most cases.
>>
>> The swiotlb checks don't seem to work and we always end up in the
>> slow path even when an IOMMU is available.
>
>
> While the check is clearly not correct, simply always using the direct
> mapping and not checking the fallback path can break as well.
>
> So this patch is clearly not a good idea and needs to be fixed before it is
> pushed.

Jerome looked into it when Monk first debugged this, but I don't think
anything ever came of it:
https://patchwork.kernel.org/patch/7079521/

Alex

>
> Christian.
>
>
>> Signed-off-by: monk.liu <Monk.Liu@amd.com>
>> Reviewed-by: Jammy Zhou <jammy.zhou@amd.com>
>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>> ---
>>   drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 13 -------------
>>   1 file changed, 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> index 3beb10b..e2fcd39 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
>> @@ -783,12 +783,6 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
>>
>>         adev = amdgpu_get_adev(ttm->bdev);
>>
>> -#ifdef CONFIG_SWIOTLB
>> -       if (swiotlb_nr_tbl()) {
>> -               return ttm_dma_populate(&gtt->ttm, adev->dev);
>> -       }
>> -#endif
>> -
>>         r = ttm_pool_populate(ttm);
>>         if (r) {
>>                 return r;
>> @@ -829,13 +823,6 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
>>
>>         adev = amdgpu_get_adev(ttm->bdev);
>>
>> -#ifdef CONFIG_SWIOTLB
>> -       if (swiotlb_nr_tbl()) {
>> -               ttm_dma_unpopulate(&gtt->ttm, adev->dev);
>> -               return;
>> -       }
>> -#endif
>> -
>>         for (i = 0; i < ttm->num_pages; i++) {
>>                 if (gtt->ttm.dma_address[i]) {
>>                         pci_unmap_page(adev->pdev, gtt->ttm.dma_address[i],
>
>
>
Michel Dänzer Sept. 13, 2016, 1:17 a.m. UTC | #3
On 13/09/16 01:44 AM, Alex Deucher wrote:
> From: "monk.liu" <monk.liu@amd.com>
> 
> Originally we used the ttm_dma path to allocate GTT BOs, which is much
> slower than the ttm_pool path in most cases.
> 
> The swiotlb checks don't seem to work and we always end up in the
> slow path even when an IOMMU is available.

This change will break any cases where SWIOTLB is actually necessary
though, won't it?
Alex Deucher Sept. 13, 2016, 2:53 a.m. UTC | #4
On Mon, Sep 12, 2016 at 9:17 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 13/09/16 01:44 AM, Alex Deucher wrote:
>> From: "monk.liu" <monk.liu@amd.com>
>>
>> Originally we used the ttm_dma path to allocate GTT BOs, which is much
>> slower than the ttm_pool path in most cases.
>>
>> The swiotlb checks don't seem to work and we always end up in the
>> slow path even when an IOMMU is available.
>
> This change will break any cases where SWIOTLB is actually necessary
> though, won't it?

Yes, theoretically.

Alex
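
To make the "theoretically" concrete, here is a rough model of the difference between the check the patch removes and the condition under which bounce buffering is actually required. It is a simplified sketch under stated assumptions, not kernel code. model_old_check mirrors the removed `if (swiotlb_nr_tbl())` test, which only asks whether a bounce buffer has been set up. model_needs_bounce is a hypothetical predicate that assumes bouncing is needed only when no IOMMU translates for the device and some RAM lies above the device's DMA mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: these helpers do not exist in the kernel. */

/* The removed check: slow path whenever a bounce buffer exists. */
static bool model_old_check(unsigned long swiotlb_slabs)
{
	return swiotlb_slabs != 0;
}

/*
 * Assumed condition for really needing bounce buffers: no IOMMU
 * translation and physical memory above what the device can address.
 */
static bool model_needs_bounce(uint64_t max_phys_addr, uint64_t dma_mask,
			       bool iommu_translates)
{
	assert(dma_mask != 0);

	if (iommu_translates)
		return false;
	return max_phys_addr > dma_mask;
}
```

In this model, a device with a 40-bit mask on a 16 GiB machine gets the slow path from model_old_check as soon as any bounce buffer is allocated, although model_needs_bounce says it is unnecessary. A device with a 32-bit mask and no IOMMU on the same machine is the case that dropping the check entirely would break.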

>
>
> --
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index 3beb10b..e2fcd39 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -783,12 +783,6 @@  static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
 
 	adev = amdgpu_get_adev(ttm->bdev);
 
-#ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
-		return ttm_dma_populate(&gtt->ttm, adev->dev);
-	}
-#endif
-
 	r = ttm_pool_populate(ttm);
 	if (r) {
 		return r;
@@ -829,13 +823,6 @@  static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
 
 	adev = amdgpu_get_adev(ttm->bdev);
 
-#ifdef CONFIG_SWIOTLB
-	if (swiotlb_nr_tbl()) {
-		ttm_dma_unpopulate(&gtt->ttm, adev->dev);
-		return;
-	}
-#endif
-
 	for (i = 0; i < ttm->num_pages; i++) {
 		if (gtt->ttm.dma_address[i]) {
 			pci_unmap_page(adev->pdev, gtt->ttm.dma_address[i],