Message ID | 20190510141316.1746-4-christian.koenig@amd.com (mailing list archive) |
---|---|
State | New, archived |
Series | [1/4] drm/ttm: Make LRU removal optional. |
Hi,

This patch series doesn't help with the OOM errors due to GDS. Reproducible with:

AMD_DEBUG=testgdsmm glxgears & AMD_DEBUG=testgdsmm glxgears

Marek

On Fri, May 10, 2019 at 10:13 AM Christian König <ckoenig.leichtzumerken@gmail.com> wrote:

> This avoids OOM situations when we have lots of threads
> submitting at the same time.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index a1d6a0721e53..8828d30cd409 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>  	}
>
>  	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -				   &duplicates, true);
> +				   &duplicates, false);
>  	if (unlikely(r != 0)) {
>  		if (r != -ERESTARTSYS)
>  			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> --
> 2.17.1
Hi Christian,

This patch series resolves the Abaqus pinning failure.
Could you push the four fix patches to the drm-next branch?

Thanks,
Prike

-----Original Message-----
From: Christian König <ckoenig.leichtzumerken@gmail.com>
Sent: Friday, May 10, 2019 10:13 PM
To: Olsak, Marek <Marek.Olsak@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-devel@lists.freedesktop.org
Subject: [PATCH 4/4] drm/amdgpu: stop removing BOs from the LRU during CS

This avoids OOM situations when we have lots of threads submitting at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a1d6a0721e53..8828d30cd409 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}

 	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
--
2.17.1
Hi Prike,

unfortunately Marek came up with an even better test case, and this series solves only about 80% of the cases where this problem can happen.

So Abaqus might work in 4 of 5 runs, but then still fail.

I'm currently working on fixing the remaining 20%. Give me a day or two to figure things out,
Christian.

On 13.05.19 at 16:01, Liang, Prike wrote:
> Hi Christian,
>
> This patch series resolves the Abaqus pinning failure.
> Could you push the four fix patches to the drm-next branch?
>
> Thanks,
> Prike
> -----Original Message-----
> From: Christian König <ckoenig.leichtzumerken@gmail.com>
> Sent: Friday, May 10, 2019 10:13 PM
> To: Olsak, Marek <Marek.Olsak@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Liang, Prike <Prike.Liang@amd.com>; dri-devel@lists.freedesktop.org
> Subject: [PATCH 4/4] drm/amdgpu: stop removing BOs from the LRU during CS
>
> This avoids OOM situations when we have lots of threads submitting at the same time.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> index a1d6a0721e53..8828d30cd409 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
> @@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
>  	}
>
>  	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
> -				   &duplicates, true);
> +				   &duplicates, false);
>  	if (unlikely(r != 0)) {
>  		if (r != -ERESTARTSYS)
>  			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
> --
> 2.17.1
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index a1d6a0721e53..8828d30cd409 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -648,7 +648,7 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}

 	r = ttm_eu_reserve_buffers(&p->ticket, &p->validated, true,
-				   &duplicates, true);
+				   &duplicates, false);
 	if (unlikely(r != 0)) {
 		if (r != -ERESTARTSYS)
 			DRM_ERROR("ttm_eu_reserve_buffers failed.\n");
This avoids OOM situations when we have lots of threads
submitting at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
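
The commit message is terse, so the following may help picture why the last argument to ttm_eu_reserve_buffers() matters. It is only a toy model in plain C, not TTM code: every type and function here (toy_bo, toy_reserve, toy_find_victim, toy_scenario) is hypothetical, and it assumes that an evictor which can still see a busy buffer on the LRU is able to wait for it, while one that sees an empty LRU has to give up with an out-of-memory error. As noted in the thread, the real series covers only part of the failure cases, so treat this as an illustration of the idea rather than a description of the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and helpers, not TTM APIs. */
enum toy_result { TOY_OK, TOY_WAIT, TOY_OOM };

struct toy_bo {
	bool on_lru;   /* visible to the eviction scan */
	bool reserved; /* held by a command submission */
};

/* Reserve every BO for one submission; optionally take it off the LRU. */
static void toy_reserve(struct toy_bo *bos, size_t n, bool del_lru)
{
	for (size_t i = 0; i < n; i++) {
		bos[i].reserved = true;
		if (del_lru)
			bos[i].on_lru = false;
	}
}

/* What a second submitter finds when it needs to evict something. */
static enum toy_result toy_find_victim(const struct toy_bo *bos, size_t n)
{
	bool saw_busy = false;

	for (size_t i = 0; i < n; i++) {
		if (!bos[i].on_lru)
			continue;      /* invisible: cannot even be waited for */
		if (!bos[i].reserved)
			return TOY_OK; /* idle BO, can be evicted right away */
		saw_busy = true;       /* busy, but the scan knows it exists */
	}
	/* Assumption: a visible busy BO can be waited for; an empty LRU cannot. */
	return saw_busy ? TOY_WAIT : TOY_OOM;
}

/* One submitter reserves all four BOs, then another looks for a victim. */
static enum toy_result toy_scenario(bool del_lru)
{
	struct toy_bo bos[4] = {
		{ true, false }, { true, false }, { true, false }, { true, false },
	};

	toy_reserve(bos, 4, del_lru);
	return toy_find_victim(bos, 4);
}
```

In this model, toy_scenario(true) corresponds to the old call (last argument true) and ends in TOY_OOM, while toy_scenario(false) corresponds to the patched call and ends in TOY_WAIT.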