[2/2] drm/amdgpu: Use new TTM flag to avoid OOM triggering.

Message ID MWHPR1201MB012784C4B5F4E7E90FE49F2FFDEA0@MWHPR1201MB0127.namprd12.prod.outlook.com (mailing list archive)
State New, archived

Commit Message

He, Hongbo Jan. 16, 2018, 6:18 a.m. UTC
-----Original Message-----
From: Andrey Grodzovsky [mailto:andrey.grodzovsky@amd.com] 
Sent: Saturday, January 13, 2018 6:29 AM
To: dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Cc: Koenig, Christian <Christian.Koenig@amd.com>; He, Roger <Hongbo.He@amd.com>; Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>
Subject: [PATCH 2/2] drm/amdgpu: Use new TTM flag to avoid OOM triggering.

This is to have a load-time option to avoid triggering OOM on RAM allocations.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 1 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 4 ++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 4 ++++
 3 files changed, 9 insertions(+)

Comments

Christian König Jan. 16, 2018, 8:54 a.m. UTC | #1
On 16.01.2018 at 07:18, He, Roger wrote:
> +int amdgpu_alloc_no_oom = -1; /* auto */
>
> How about turning it on by default?

I think we can even go a step further, drop the module parameter and 
just turn it always on for amdgpu.

Christian.

> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Andrey Grodzovsky Jan. 16, 2018, 12:43 p.m. UTC | #2
On 01/16/2018 03:54 AM, Christian König wrote:
> On 16.01.2018 at 07:18, He, Roger wrote:
>> +int amdgpu_alloc_no_oom = -1; /* auto */
>>
>> How about turning it on by default?
>
> I think we can even go a step further, drop the module parameter and 
> just turn it always on for amdgpu.
>
> Christian.

Will fix. Just a reminder that Roger's patches -
[PATCH 1/2] drm/ttm: don't update global memory count for some special cases
[PATCH 2/2] drm/ttm: only free pages rather than update global memory count together

need to be merged before my patches, since they fix a TTM bug on
allocation failure.

Thanks,
Andrey

Christian König Jan. 16, 2018, 12:46 p.m. UTC | #3
On 16.01.2018 at 13:43, Andrey Grodzovsky wrote:
> Will fix. Just a reminder that Roger's patches -
> [PATCH 1/2] drm/ttm: don't update global memory count for some special cases
> [PATCH 2/2] drm/ttm: only free pages rather than update global memory count together
>
> need to be merged before my patches, since they fix a TTM bug on
> allocation failure.

The second is merged, but I had some comments on the first and Roger 
hasn't replied yet.

Roger what's the status on that one?

Regards,
Christian.

He, Hongbo Jan. 17, 2018, 2:09 a.m. UTC | #4
-----Original Message-----
From: Christian König [mailto:ckoenig.leichtzumerken@gmail.com]
Sent: Tuesday, January 16, 2018 8:46 PM
To: Grodzovsky, Andrey <Andrey.Grodzovsky@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; He, Roger <Hongbo.He@amd.com>; dri-devel@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
Subject: Re: [PATCH 2/2] drm/amdgpu: Use new TTM flag to avoid OOM triggering.

On 16.01.2018 at 13:43, Andrey Grodzovsky wrote:
> Will fix. Just a reminder that Roger's patches -
> [PATCH 1/2] drm/ttm: don't update global memory count for some special cases
> [PATCH 2/2] drm/ttm: only free pages rather than update global memory count together
>
> need to be merged before my patches, since they fix a TTM bug on
> allocation failure.

	The second is merged, but I had some comments on the first and Roger hasn't replied yet.

	Roger what's the status on that one?

Already fixed locally, but not tested yet. I will try to send it out today.

Thanks
Roger(Hongbo.He)


Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index b7c181e..1387239 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -127,6 +127,7 @@ extern int amdgpu_job_hang_limit;
 extern int amdgpu_lbpw;
 extern int amdgpu_compute_multipipe;
 extern int amdgpu_gpu_recovery;
+extern int amdgpu_alloc_no_oom;
 
 #ifdef CONFIG_DRM_AMDGPU_SI
 extern int amdgpu_si_support;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index d96f9ac..6e98189 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -130,6 +130,7 @@ int amdgpu_job_hang_limit = 0;
 int amdgpu_lbpw = -1;
 int amdgpu_compute_multipipe = -1;
 int amdgpu_gpu_recovery = -1; /* auto */
+int amdgpu_alloc_no_oom = -1; /* auto */

How about turning it on by default?

Thanks
Roger(Hongbo.He)

 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
@@ -285,6 +286,9 @@ module_param_named(compute_multipipe, amdgpu_compute_multipipe, int, 0444);
 MODULE_PARM_DESC(gpu_recovery, "Enable GPU recovery mechanism, (1 = enable, 0 = disable, -1 = auto");
 module_param_named(gpu_recovery, amdgpu_gpu_recovery, int, 0444);
 
+MODULE_PARM_DESC(alloc_no_oom, "Allocate RAM without triggering OOM killer, (1 = enable, 0 = disable, -1 = auto");
+module_param_named(alloc_no_oom, amdgpu_alloc_no_oom, int, 0444);
+
 #ifdef CONFIG_DRM_AMDGPU_SI
 
 #if defined(CONFIG_DRM_RADEON) || defined(CONFIG_DRM_RADEON_MODULE)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 5c4c3e0..fc27164 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -420,6 +420,10 @@ static int amdgpu_bo_do_create(struct amdgpu_device *adev,
 #endif
 
 	bo->tbo.bdev = &adev->mman.bdev;
+
+	if (amdgpu_alloc_no_oom == 1)
+		bo->tbo.bdev->no_retry = true;
+
 	amdgpu_ttm_placement_from_domain(bo, domain);
 
 	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
--
2.7.4
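
Editorial note: the hunk in amdgpu_bo_do_create() sets no_retry only when
amdgpu_alloc_no_oom is exactly 1, while the parameter defaults to -1 (auto).
The userspace sketch below models that check next to the two alternatives
raised in review. It is an illustration under stated assumptions, not driver
code: the function names are hypothetical, and the "default on" and "always on"
variants are only one possible reading of the review comments.

```c
#include <assert.h>
#include <stdbool.h>

/* Policy as posted: the flag is set only for an explicit alloc_no_oom=1. */
static bool no_retry_as_posted(int alloc_no_oom)
{
	return alloc_no_oom == 1;
}

/* Assumed "on by default" variant: auto (-1) enables, explicit 0 disables. */
static bool no_retry_default_on(int alloc_no_oom)
{
	return alloc_no_oom != 0;
}

/* Assumed "always on" variant: the module parameter no longer exists. */
static bool no_retry_always_on(void)
{
	return true;
}

/* Walk the three documented parameter values through each policy. */
void demo_policies(void)
{
	/* As posted, auto (-1) behaves exactly like disabled (0). */
	assert(!no_retry_as_posted(-1));
	assert(!no_retry_as_posted(0));
	assert(no_retry_as_posted(1));

	/* With a default-on reading, only an explicit 0 opts out. */
	assert(no_retry_default_on(-1));
	assert(!no_retry_default_on(0));
	assert(no_retry_default_on(1));

	/* Without a parameter there is nothing left to configure. */
	assert(no_retry_always_on());
}
```

Calling demo_policies() from a small main() exercises all three policies. The
point it makes is that, with the patch as posted, a user who does not pass
alloc_no_oom=1 at load time gets no change in behavior.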