[4/4] drm/amdgpu: stop removing BOs from the LRU during CS

Message ID 20190510141316.1746-4-christian.koenig@amd.com (mailing list archive)
State New, archived
Series [1/4] drm/ttm: Make LRU removal optional.

Commit Message

Christian König May 10, 2019, 2:13 p.m. UTC
This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Marek Olšák May 11, 2019, 1:08 a.m. UTC | #1
Hi,

This patch series doesn't help with the OOM errors due to GDS. Reproducible
with:

AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears

Marek


On Fri, May 10, 2019 at 10:13 AM Christian König <ckoenig.leichtzumerken@gmail.com> wrote:

> This avoids OOM situations when we have lots of threads
> submitting at the same time.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index a1d6a0721e53..8828d30cd409 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>         }
>
>         r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -                                  &duplicates, true);
> +                                  &duplicates, false);
>         if (unlikely(r != 0)) {
>                 if (r != -ERESTARTSYS)
>                         DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> --
> 2.17.1
>
Liang, Prike May 13, 2019, 2:01 p.m. UTC | #2
Hi Christian,

This patch series resolves the Abaqus pin failure issue.
Could you push the four fixes to the drm-next branch?

Thanks,
Prike
-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com> 
Sent: Friday, May 10, 2019 10:13 PM
To: Olsak, Marek <Marek.Olsak@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-devel@lists.freedesktop.org
Subject: [PATCH 4/4] drm/amdgpu: stop removing BOs from the LRU during CS

This avoids OOM situations when we have lots of threads submitting at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a1d6a0721e53..8828d30cd409 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
        }

        r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-                                  &duplicates, true);
+                                  &duplicates, false);
        if (unlikely(r != 0)) {
                if (r != -ERESTARTSYS)
                        DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1
Christian König May 13, 2019, 2:17 p.m. UTC | #3
Hi Prike,

Unfortunately, Marek came up with an even better test case, and this
series solves only about 80% of the cases where this problem can
happen.

So Abaqus might work in 4 of 5 runs, but then still fail. I'm currently 
working on trying to fix the remaining 20%.

Give me a day or two to figure things out,
Christian.

On 13.05.19 at 16:01, Liang, Prike wrote:
> Hi Christian,
>
> This patch series resolves the Abaqus pin failure issue.
> Could you push the four fixes to the drm-next branch?
>
> Thanks,
> Prike
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Friday, May 10, 2019 10:13 PM
> To: Olsak, Marek <Marek.Olsak@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-devel@lists.freedesktop.org
> Subject: [PATCH 4/4] drm/amdgpu: stop removing BOs from the LRU during CS
>
> This avoids OOM situations when we have lots of threads submitting at the same time.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index a1d6a0721e53..8828d30cd409 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>          }
>
>          r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -                                  &duplicates, true);
> +                                  &duplicates, false);
>          if (unlikely(r != 0)) {
>                  if (r != -ERESTARTSYS)
>                          DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> --
> 2.17.1
>
Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a1d6a0721e53..8828d30cd409 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@  static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}
 
 	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
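
The one-line change above flips the last argument of ttm_eu_reserve_buffers() from true to false. Going by the title of patch 1/4 ("Make LRU removal optional"), that argument presumably controls whether reserved BOs are taken off the LRU. The following is only a toy user-space sketch of why that could matter when many threads submit at once; it is not kernel code. The struct toy_bo type, the toy_* helpers and the parameter name del_lru are all hypothetical names chosen for illustration, and what the real eviction code does with a candidate that is still locked by another submission is outside this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BOS 16

/* Hypothetical stand-in for a buffer object; not the kernel's struct. */
struct toy_bo {
	bool resident;	/* occupies space in a small memory domain */
	bool on_lru;	/* visible to the eviction code */
	bool reserved;	/* locked by an in-flight submission */
};

/* Reserve a BO for a submission, optionally taking it off the LRU. */
static void toy_reserve(struct toy_bo *bo, bool del_lru)
{
	assert(!bo->reserved);
	bo->reserved = true;
	if (del_lru)
		bo->on_lru = false;
}

/* Count the resident BOs that the eviction code can still see. */
static size_t toy_eviction_candidates(const struct toy_bo *bos, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (bos[i].resident && bos[i].on_lru)
			count++;
	return count;
}

/*
 * Model nthreads submissions that have each reserved one resident BO at
 * the same moment, and return how many eviction candidates one further
 * allocation would find. In this toy model, zero candidates means the
 * allocation has nothing to evict and fails as out-of-memory.
 */
static size_t toy_concurrent_cs(size_t nthreads, bool del_lru)
{
	struct toy_bo bos[MAX_BOS] = { 0 };
	size_t i;

	assert(nthreads <= MAX_BOS);
	for (i = 0; i < nthreads; i++) {
		bos[i].resident = true;
		bos[i].on_lru = true;
		toy_reserve(&bos[i], del_lru);
	}
	return toy_eviction_candidates(bos, nthreads);
}
```

In this model, toy_concurrent_cs(4, true) returns 0 because every resident BO has been hidden from eviction by a concurrent submission, while toy_concurrent_cs(4, false) returns 4 because the BOs stay on the LRU. The sketch says nothing about the GDS case reported in comment #1.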