Message ID | 1448025816-25584-1-git-send-email-tvrtko.ursulin@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
On Fri, Nov 20, 2015 at 01:23:36PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Commit e9f24d5fb7cf3628b195b18ff3ac4e37937ceeae
> Author: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Date:   Mon Oct 5 13:26:36 2015 +0100
>
>     drm/i915: Clean up associated VMAs on context destruction
>
> Added a warning based on an incorrect assumption that all VMAs
> in a VM will be on the inactive list at the point last reference
> to a context and VM is dropped.
>
> This is not true because i915_gem_object_retire__read will not
> put VMA on the inactive list until all activities on the object
> in question (in all VMs) have been retired.
>
> As a consequence, whether or not a context/VM will be destroyed
> with its VMAs still on the active list, can depend on completely
> unrelated activities using the same object from a different
> context or engine.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92638
> Testcase: igt/gem_request_retire/retire-vma-not-inactive
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Michel Thierry <michel.thierry@intel.com>

Queued for -next, thanks for the patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 204dc7c0b2d6..59dba318213e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -141,8 +141,6 @@ static void i915_gem_context_clean(struct intel_context *ctx)
>  	if (!ppgtt)
>  		return;
>
> -	WARN_ON(!list_empty(&ppgtt->base.active_list));
> -
>  	list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
>  				 mm_list) {
>  		if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
> --
> 1.9.1
On Tue, Nov 24, 2015 at 11:58:22AM +0100, Daniel Vetter wrote:
> On Fri, Nov 20, 2015 at 01:23:36PM +0000, Tvrtko Ursulin wrote:
> > [snip]
>
> Queued for -next, thanks for the patch.

The WARN_ON is accurate though. The original patch fails to fix even the
limited aspect of the bug it claimed to.
-Chris
On 24/11/15 12:53, Chris Wilson wrote:
> On Tue, Nov 24, 2015 at 11:58:22AM +0100, Daniel Vetter wrote:
>> On Fri, Nov 20, 2015 at 01:23:36PM +0000, Tvrtko Ursulin wrote:
>>> [snip]
>>
>> Queued for -next, thanks for the patch.
>
> The WARN_ON is accurate though. The original patch fails to fix even the
> limited aspect of the bug it claimed to.

That is not true. It only makes it a bit more limited, and not by its
fault even. Even with that it makes things a bit better, not worse.

And does not impede your VMA rewrite at all. For which I did offer help
to review as you send out in manageable chunks.

If it is not realistically possible to split it out and do in
increments, then it would be more constructive to discuss how to do it
than to keep it in limbo for 15 months, as you say, and use it as a
reason to shoot down everything else.

Regards,

Tvrtko
On Tue, Nov 24, 2015 at 01:17:57PM +0000, Tvrtko Ursulin wrote:
> On 24/11/15 12:53, Chris Wilson wrote:
> >The WARN_ON is accurate though. The original patch fails to fix even the
> >limited aspect of the bug it claimed to.
>
> That is not true. It only makes it a bit more limited, and not by
> its fault even. Even with that it makes things a bit better, not
> worse.

It makes the code worse for very limited improvement, for which we did
not have a publicly reported bug, i.e. the impact is very small.

> And does not impede your VMA rewrite at all. For which I did offer
> help to review as you send out in manageable chunks.

I have been.
-Chris
Hi,

On 24 November 2015 at 13:27, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> On Tue, Nov 24, 2015 at 01:17:57PM +0000, Tvrtko Ursulin wrote:
>> On 24/11/15 12:53, Chris Wilson wrote:
>> >The WARN_ON is accurate though. The original patch fails to fix even the
>> >limited aspect of the bug it claimed to.
>>
>> That is not true. It only makes it a bit more limited, and not by
>> its fault even. Even with that it makes things a bit better, not
>> worse.
>
> It makes the code worse for very limited improvement, for which we did
> not have a publicly reported bug, i.e. the impact is very small.

I can get the person who reported it to me to raise a Bugzilla
complaining about the WARN_ON if you like ...

Cheers,
Daniel
On 24/11/15 13:27, Chris Wilson wrote:
> On Tue, Nov 24, 2015 at 01:17:57PM +0000, Tvrtko Ursulin wrote:
>> That is not true. It only makes it a bit more limited, and not by
>> its fault even. Even with that it makes things a bit better, not
>> worse.
>
> It makes the code worse for very limited improvement, for which we did
> not have a publicly reported bug, i.e. the impact is very small.

Well impact was huge for Android userspace but you are probably right
that BZ was not created for that. It was somewhat related to
https://bugs.freedesktop.org/show_bug.cgi?id=87477 on small memory
configurations if I remember correctly. Although that hasn't been
correctly captured in there or a new entry forked off.

We have on the other hand added an IGT for it,
gem_ppgtt/flink-and-close-vma-leak, so I don't think your argument is
fair. Especially if the rewrite of it all is imminent - so the worse
code, even if you think it is so much worse which I disagree with, is
only in there temporarily.

And the memory leak was real even with fbcon and Xorg which I am sure
you know.

>> And does not impede your VMA rewrite at all. For which I did offer
>> help to review as you send out in manageable chunks.
>
> I have been.

And I have reviewed some, no? Feel free to ping me if I missed some.

Regards,

Tvrtko
On Tue, Nov 24, 2015 at 01:29:07PM +0000, Daniel Stone wrote:
> Hi,
>
> On 24 November 2015 at 13:27, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> > It makes the code worse for very limited improvement, for which we did
> > not have a publicly reported bug, i.e. the impact is very small.
>
> I can get the person who reported it to me to raise a Bugzilla
> complaining about the WARN_ON if you like ...

This is about the original bug, for which the bugfix resulted in the
WARN_ON now being removed here. The underlying problem (I think, it's a
maze) is that our vma active tracking is a bit ... underwhelming.
-Daniel
Hey,

On 24 November 2015 at 13:59, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Nov 24, 2015 at 01:29:07PM +0000, Daniel Stone wrote:
>> I can get the person who reported it to me to raise a Bugzilla
>> complaining about the WARN_ON if you like ...
>
> This is about the original bug, for which the bugfix resulted in the
> WARN_ON now being removed here. The underlying problem (I think, it's a
> maze) is that our vma active tracking is a bit ... underwhelming.

Sure, which is fair enough, but OTOH is there an actual plan for
redoing the VMA tracking?

Cheers,
Daniel
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 204dc7c0b2d6..59dba318213e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -141,8 +141,6 @@ static void i915_gem_context_clean(struct intel_context *ctx)
 	if (!ppgtt)
 		return;
 
-	WARN_ON(!list_empty(&ppgtt->base.active_list));
-
 	list_for_each_entry_safe(vma, next, &ppgtt->base.inactive_list,
 				 mm_list) {
 		if (WARN_ON(__i915_vma_unbind_no_wait(vma)))
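
The retirement rule described in the commit message can be illustrated with a small userspace toy model. This is only a sketch and not the i915 code: every `model_*` name, the two-engine limit and the counter-based "lists" are assumptions made for illustration. It models one object shared by two VMs, where a VMA only leaves its VM's active list once the object is idle on all engines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_ENGINES 2
#define MAX_VMAS 4

/* Hypothetical, simplified stand-ins; not the real i915 structures. */
struct model_vm {
	int active;	/* VMAs currently on this VM's active list */
	int inactive;	/* VMAs currently on this VM's inactive list */
};

struct model_vma {
	struct model_vm *vm;
	bool on_active_list;
};

struct model_object {
	bool busy[NUM_ENGINES];			/* outstanding reads, per engine */
	struct model_vma *vmas[MAX_VMAS];	/* bindings in all VMs */
	size_t nr_vmas;
};

/* Bind the object into a VM; a fresh VMA starts on the inactive list. */
static void model_bind(struct model_object *obj, struct model_vma *vma,
		       struct model_vm *vm)
{
	assert(obj->nr_vmas < MAX_VMAS);
	vma->vm = vm;
	vma->on_active_list = false;
	vm->inactive++;
	obj->vmas[obj->nr_vmas++] = vma;
}

/* Use the object through one VMA on one engine. */
static void model_submit(struct model_object *obj, struct model_vma *vma,
			 int engine)
{
	obj->busy[engine] = true;
	if (!vma->on_active_list) {
		vma->on_active_list = true;
		vma->vm->inactive--;
		vma->vm->active++;
	}
}

/* Retire one engine's reads; VMAs move only once the object is fully idle. */
static void model_retire_read(struct model_object *obj, int engine)
{
	size_t i;
	int e;

	obj->busy[engine] = false;
	for (e = 0; e < NUM_ENGINES; e++)
		if (obj->busy[e])
			return;	/* still busy elsewhere: nothing is moved */

	for (i = 0; i < obj->nr_vmas; i++) {
		struct model_vma *vma = obj->vmas[i];

		if (vma->on_active_list) {
			vma->on_active_list = false;
			vma->vm->active--;
			vma->vm->inactive++;
		}
	}
}

/*
 * One object shared by two contexts: VM A uses it on engine 0, VM B on
 * engine 1. Returns how many VMAs VM A still has on its active list at
 * the point where context A would be destroyed.
 */
int model_scenario(bool retire_other_engine)
{
	struct model_vm vm_a = { 0, 0 }, vm_b = { 0, 0 };
	struct model_vma vma_a, vma_b;
	struct model_object obj = { { false, false }, { NULL }, 0 };

	model_bind(&obj, &vma_a, &vm_a);
	model_bind(&obj, &vma_b, &vm_b);
	model_submit(&obj, &vma_a, 0);
	model_submit(&obj, &vma_b, 1);

	model_retire_read(&obj, 0);		/* all of context A's work is done */
	if (retire_other_engine)
		model_retire_read(&obj, 1);	/* unrelated work finishes too */

	return vm_a.active;
}
```

In this model, `model_scenario(false)` leaves one VMA on VM A's active list even though context A has no outstanding work of its own, which is the state the removed WARN_ON would have flagged. `model_scenario(true)` leaves none, because the unrelated work on the other engine has also retired.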