[i-g-t,v2] tests/kms_frontbuffer_tracking: increase FBC wait timeout to 5s

Message ID 20170825104029.18440-1-marta.lofstedt@intel.com (mailing list archive)
State New, archived

Commit Message

Marta Lofstedt Aug. 25, 2017, 10:40 a.m. UTC
From: "Lofstedt, Marta" <marta.lofstedt@intel.com>

The subtests igt@kms_frontbuffer_tracking@fbc-*draw*
have inconsistent results, flipping between fail and pass.
The failures are always due to "FBC disabled".
With this increase in timeout the flip-flop behavior is no
longer reproducible.

This is a partial revert of:
64590c7b768dc8d8dd962f812d5ff5a39e7e8b54,
where the timeout was decreased from 5s to 2s.
After investigating the timeout needed, the conclusion is that
the longer timeout is only needed when the test swaps between
some specific draw domains, typically blt vs. mmap_cpu.
The objective of the FBC part of the tests is not to benchmark
draw domain changes; it is to check that FBC was (re-)enabled.

V2: Added documentation

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101623
Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
---
 tests/kms_frontbuffer_tracking.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Petri Latvala Aug. 25, 2017, 10:46 a.m. UTC | #1
On 08/25/2017 01:40 PM, Marta Lofstedt wrote:
> After investigating the timeout needed, the conclusion is that
> the longer timeout is only needed when the test swaps between
> some specific draw domains, typically blt vs. mmap_cpu.
> The objective of the FBC part of the tests is not to benchmark
> draw domain changes, it is to check that FBC was (re-)enabled.

Can this explanation be added to the code as a comment too?
Chris Wilson Aug. 25, 2017, 10:47 a.m. UTC | #2
Quoting Marta Lofstedt (2017-08-25 11:40:29)
> diff --git a/tests/kms_frontbuffer_tracking.c b/tests/kms_frontbuffer_tracking.c
> index e03524f1..2538450c 100644
> --- a/tests/kms_frontbuffer_tracking.c
> +++ b/tests/kms_frontbuffer_tracking.c
> @@ -924,7 +924,7 @@ static bool fbc_stride_not_supported(void)
>  
>  static bool fbc_wait_until_enabled(void)
>  {

Try igt_drop_caches_set(device, DROP_RETIRE); instead of relaxing the
timeout.
-Chris
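
The suggestion above can be sketched as a small simulation. This is not the IGT patch: `struct fake_device`, `fake_retire()`, `fake_tick()` and `wait_for_fbc()` are hypothetical stand-ins invented for illustration. In the real test the retire step would be the `igt_drop_caches_set(device, DROP_RETIRE)` call mentioned above, and the poll would read the FBC status from debugfs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the driver state discussed in this thread. */
struct fake_device {
	bool retire_pending;  /* GPU work finished but not yet retired */
	bool fbc_enabled;     /* what the test polls for */
	int ms_until_worker;  /* time until the lazy background retire runs */
};

/* Stand-in for igt_drop_caches_set(device, DROP_RETIRE): retire now. */
static void fake_retire(struct fake_device *dev)
{
	if (dev->retire_pending) {
		dev->retire_pending = false;
		dev->fbc_enabled = true; /* the flush lets FBC be re-enabled */
	}
}

/* Advance simulated time; the background worker retires when it fires. */
static void fake_tick(struct fake_device *dev, int ms)
{
	dev->ms_until_worker -= ms;
	if (dev->ms_until_worker <= 0)
		fake_retire(dev);
}

/*
 * Poll for FBC with a timeout. Returns the simulated milliseconds waited,
 * or -1 on timeout. With force_retire set, retire before polling instead
 * of relying on the background worker.
 */
static int wait_for_fbc(struct fake_device *dev, int timeout_ms,
			bool force_retire)
{
	const int step_ms = 1;
	int waited = 0;

	assert(dev && timeout_ms >= 0);

	if (force_retire)
		fake_retire(dev);

	while (!dev->fbc_enabled) {
		if (waited >= timeout_ms)
			return -1;
		fake_tick(dev, step_ms);
		waited += step_ms;
	}
	return waited;
}
```

In this model, a worker that fires after 2500 ms makes a 2000 ms wait time out, a 5000 ms wait succeeds only after 2500 ms, and a wait that retires first succeeds immediately.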
Marta Lofstedt Aug. 25, 2017, 11:54 a.m. UTC | #3
OK, I will test that and do a V3 if it works!
/Marta
Marta Lofstedt Aug. 25, 2017, 12:50 p.m. UTC | #4
I did some initial testing with igt_drop_caches_set inside
fbc_wait_until_enabled and it looks good; I will add this to my weekend
tests to get more results. This also appears to improve the runtime of
the tests quite a bit. So maybe igt_drop_caches_set should be placed
somewhere else, so that it gives runtime improvements not only for the
FBC-related subtests.
/Marta
Chris Wilson Aug. 25, 2017, 1:11 p.m. UTC | #5
Quoting Lofstedt, Marta (2017-08-25 13:50:16)
> I did some initial testing with igt_drop_caches_set inside fbc_wait_until_enabled and it looks good, I will add this to my weekend tests to get more results. This also appear to improve the runtime of the tests quite a bit. So, maybe the igt_drop_caches_set should be placed somewhere else so it will give runtime improvements not only for the FBC related sub-tests.

Sure, all the waits could do with a retire first; give it a common
function and a comment for the rationale (which should be pretty much the
same as given in the changelog). Any time we use the GPU to invalidate
the frontbuffer tracking, we have to wait for a retire to do the flush.
Retirement is lazy and is normally driven by GPU activity, but we have a
background kworker to make sure we notice when the system becomes idle
independent of userspace. However, that worker runs at a low frequency.
-Chris
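
A common helper along these lines might look like the sketch below. It is a self-contained simulation under stated assumptions, not IGT code: `struct fake_state`, `retire_now()`, `wait_until()` and the `other_ready` flag are hypothetical names, and `retire_now()` merely stands in for `igt_drop_caches_set(device, DROP_RETIRE)`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical feature state; the real tests read debugfs status files. */
struct fake_state {
	bool retire_pending;
	bool fbc_enabled;
	bool other_ready;  /* some other feature that waits on the same flush */
	int retire_calls;  /* counts explicit retires, for illustration only */
};

typedef bool (*cond_fn)(const struct fake_state *);

static bool fbc_is_enabled(const struct fake_state *s)
{
	return s->fbc_enabled;
}

static bool other_is_ready(const struct fake_state *s)
{
	return s->other_ready;
}

/* Stand-in for igt_drop_caches_set(device, DROP_RETIRE). */
static void retire_now(struct fake_state *s)
{
	s->retire_calls++;
	if (s->retire_pending) {
		s->retire_pending = false;
		s->fbc_enabled = true;
		s->other_ready = true;
	}
}

/*
 * Common wait helper. Rationale (from the thread): whenever the GPU is
 * used to invalidate frontbuffer tracking, the flush only happens on
 * retire, and retirement is lazy. Retire explicitly first so the wait
 * does not depend on the low-frequency background worker.
 */
static bool wait_until(struct fake_state *s, cond_fn cond, int max_polls)
{
	assert(s && cond && max_polls > 0);

	retire_now(s);
	/* State does not change between polls in this simulation. */
	for (int i = 0; i < max_polls; i++)
		if (cond(s))
			return true;
	return false;
}
```

Passing the condition as a function pointer lets every wait share the retire step and its rationale comment, rather than repeating both at each call site.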
Marta Lofstedt Aug. 25, 2017, 1:33 p.m. UTC | #6
+paulo

Marta Lofstedt Aug. 29, 2017, 7:16 a.m. UTC | #7
I can no longer reproduce the flip/flopping "FBC disabled" on the kms_frontbuffer_tracking tests. 
Instead I hit:
WARNING: CPU: 2 PID: 25732 at drivers/gpu/drm/i915/intel_fbc.c:1173
WARNING: CPU: 2 PID: 25732 at drivers/gpu/drm/i915/intel_fbc.c:1141

/Marta

Zanoni, Paulo R Sept. 1, 2017, 7:12 p.m. UTC | #8
On Fri, 2017-08-25 at 14:11 +0100, Chris Wilson wrote:
> Sure, all the waits can do with the retire first, give it a common
> function and a comment for the rationale (which should pretty much
> the
> same as given in the changelog). 

We can do that, sure, especially if it makes the tests faster...

> Anytime we use the GPU to invalidate
> the frontbuffer tracking, we have to wait for a retire to do the
> flush.
> Retirement is lazy, and is normally driven by GPU activity but we
> have a
> background kworker to make sure we notice when the system becomes
> idle
> independent of userspace - except it's low frequency.

... but our current 2s timeout should have been enough for that,
shouldn't it? If I'm looking at the right part of the code, retirement
should happen once per second, so 2s should have been enough. But it
looks like it's not enough.

Unless I'm misinterpreting the round_up part, which could convert the
1s to 2s, which would still probably be fine...

Anyway, 3s looks definitely safe even in this case. Maybe we
could go with 3s?

We can both increase the timeout *and* do cache dropping. However, I
think the path without cache dropping definitely needs to be tested
too, so doing the cache dropping every time may not be a good idea.



Chris Wilson Sept. 4, 2017, 10:45 a.m. UTC | #9
Quoting Paulo Zanoni (2017-09-01 20:12:01)
> ... but our current 2s timeout should have been enough for that,
> shouldn't it? If I'm looking at the right part of the code, retirement
> should be once per second, so 2s should have been enough. But it looks
> like it's not enough
> 
> Unless I'm misinterpreting the round_up part, which could convert the
> 1s to 2s, which would still probably be fine...

It can bump the wait by up to a second (it tries to align wakeups on
second boundaries). And we may skip the work if the device is busy
elsewhere.
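
To make that arithmetic concrete, here is a rough model of the delay. It assumes a 1 s retire period, wakeups rounded up to the next whole-second boundary, and an optional number of skipped runs; the helper names are hypothetical and the rounding is a simplification, not the kernel's actual implementation.

```c
#include <assert.h>

/* Round a timestamp (in ms) up to the next whole-second boundary. */
static long round_up_to_second(long t_ms)
{
	return ((t_ms + 999) / 1000) * 1000;
}

/*
 * Delay until a worker fires when it is scheduled delay_ms after now_ms
 * and the wakeup is aligned to a second boundary.
 */
static long aligned_delay_ms(long now_ms, long delay_ms)
{
	assert(now_ms >= 0 && delay_ms >= 0);
	return round_up_to_second(now_ms + delay_ms) - now_ms;
}

/*
 * Delay until the retire actually happens if the worker skips its work
 * `skipped` times (device busy elsewhere) and re-arms for another period.
 */
static long delay_with_skips_ms(long now_ms, long period_ms, int skipped)
{
	long t = now_ms;

	assert(now_ms >= 0 && period_ms > 0 && skipped >= 0);
	for (int i = 0; i <= skipped; i++)
		t = round_up_to_second(t + period_ms);
	return t - now_ms;
}
```

Under these assumptions, a worker armed 1 ms past a second boundary fires 1999 ms later, just inside a 2 s timeout, and a single skipped run pushes that to 2999 ms.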

> Anyway, 3s looks like as definitely safe even in this case. Maybe we
> could go with 3s?
> 
> We can both increase the timeout *and* do cache dropping. Although I
> think not doing the cache dropping is definitely something that needs
> to be tested, so doing the cache dropping every time may not be a good
> idea.

You are not dropping the caches; it is just doing a retire.

The real question is what is the expectation? If we want the test to
simply state that when ready FBC et al will be re-enabled, then just add
a synchronous debugfs entry that establishes the condition in the driver
that FBC should be ready (atm that is DROP_RETIRE, but you will probably
want a better-specified knob). If the test is to make sure that FBC is
re-enabled automatically, then we need to think some more. In a normal
workload, this should be the case (since the retire worker you rely on
is for hostile userspace). If you simply look at the hostile userspace
(and you already are for the frontbuffer writes), then a longer timeout
is definitely acceptable, but how long? What is that limit?

If you define an upper bound for how long you allow fbc et al to remain
off, then we will need an explicit timer to match.
-Chris
Zanoni, Paulo R Sept. 4, 2017, 6:26 p.m. UTC | #10
On Mon, 2017-09-04 at 11:45 +0100, Chris Wilson wrote:
> Quoting Paulo Zanoni (2017-09-01 20:12:01)
> > Em Sex, 2017-08-25 às 14:11 +0100, Chris Wilson escreveu:
> > > Quoting Lofstedt, Marta (2017-08-25 13:50:16)
> > > > 
> > > > 
> > > > > -----Original Message-----
> > > > > From: Lofstedt, Marta
> > > > > Sent: Friday, August 25, 2017 2:54 PM
> > > > > To: 'Chris Wilson' <chris@chris-wilson.co.uk>; intel-gfx@list
> > > > > s.fr
> > > > > eedesktop.org
> > > > > Subject: RE: [Intel-gfx] [PATCH i-g-t v2]
> > > > > tests/kms_frontbuffer_tracking:
> > > > > increase FBC wait timeout to 5s
> > > > > 
> > > > > 
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Chris Wilson [mailto:chris@chris-wilson.co.uk]
> > > > > > Sent: Friday, August 25, 2017 1:47 PM
> > > > > > To: Lofstedt, Marta <marta.lofstedt@intel.com>; intel-
> > > > > > gfx@lists.freedesktop.org
> > > > > > Subject: Re: [Intel-gfx] [PATCH i-g-t v2]
> > > > > > tests/kms_frontbuffer_tracking:
> > > > > > increase FBC wait timeout to 5s
> > > > > > 
> > > > > > Quoting Marta Lofstedt (2017-08-25 11:40:29)
> > > > > > > From: "Lofstedt, Marta" <marta.lofstedt@intel.com>
> > > > > > > 
> > > > > > > The subtests: igt@kms_frontbuffer_tracking@fbc-*draw*
> > > > > > > has non-consistent results, pending between fail and
> > > > > > > pass.
> > > > > > > The fails are always due to "FBC disabled".
> > > > > > > With this increase in timeout the flip-flop behavior is
> > > > > > > no
> > > > > > > longer
> > > > > > > reproducible.
> > > > > > > 
> > > > > > > This is a partial revert of:
> > > > > > > 64590c7b768dc8d8dd962f812d5ff5a39e7e8b54,
> > > > > > > where the timeout was decreased from 5s to 2s.
> > > > > > > After investigating the timeout needed, the conclusion is
> > > > > > > that the
> > > > > > > longer timeout is only needed when the test swaps between
> > > > > > > some
> > > > > > > specific draw domains, typically blt vs. mmap_cpu.
> > > > > > > The objective of the FBC part of the tests is not to
> > > > > > > benchmark draw
> > > > > > > domain changes, it is to check that FBC was (re-)enabled.
> > > > > > > 
> > > > > > > V2: Added documentation
> > > > > > > 
> > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101623
> > > > > > > Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
> > > > > > > Acked-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
> > > > > > > ---
> > > > > > >  tests/kms_frontbuffer_tracking.c | 2 +-
> > > > > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/tests/kms_frontbuffer_tracking.c b/tests/kms_frontbuffer_tracking.c
> > > > > > > index e03524f1..2538450c 100644
> > > > > > > --- a/tests/kms_frontbuffer_tracking.c
> > > > > > > +++ b/tests/kms_frontbuffer_tracking.c
> > > > > > > @@ -924,7 +924,7 @@ static bool fbc_stride_not_supported(void)
> > > > > > > 
> > > > > > >  static bool fbc_wait_until_enabled(void)
> > > > > > >  {
> > > > > > 
> > > > > > Try igt_drop_caches_set(device, DROP_RETIRE); instead of
> > > > > > relaxing the
> > > > > > timeout.
> > > > > > -Chris
> > > > > 
> > > > > OK, I will test that and do a V3 if it works!
> > > > > /Marta
> > > > 
> > > > I did some initial testing with igt_drop_caches_set inside
> > > > fbc_wait_until_enabled and it looks good, I will add this to my
> > > > weekend tests to get more results. This also appears to improve
> > > > the
> > > > runtime of the tests quite a bit. So, maybe the
> > > > igt_drop_caches_set
> > > > should be placed somewhere else so it will give runtime
> > > > improvements not only for the FBC related sub-tests.
> > > 
> > > Sure, all the waits can do with the retire first, give it a
> > > common
> > > function and a comment for the rationale (which should pretty
> > > much
> > > the
> > > same as given in the changelog). 
> > 
> > We can do that, sure, especially if it makes the tests faster...
> > 
> > > Anytime we use the GPU to invalidate
> > > the frontbuffer tracking, we have to wait for a retire to do the
> > > flush.
> > > Retirement is lazy, and is normally driven by GPU activity but we
> > > have a
> > > background kworker to make sure we notice when the system becomes
> > > idle
> > > independent of userspace - except it's low frequency.
> > 
> > ... but our current 2s timeout should have been enough for that,
> > shouldn't it? If I'm looking at the right part of the code,
> > retirement
> > should be once per second, so 2s should have been enough. But it
> > looks
> > like it's not enough.
> > 
> > Unless I'm misinterpreting the round_up part, which could convert
> > the
> > 1s to 2s, which would still probably be fine...
> 
> It can bump the wait by up to a second (it tries to align wakeups on
> second boundaries). And we may skip the work if the device is busy
> elsewhere.

Okay, so you're saying that there's no amount of seconds we can wait
that will guarantee the retire handler will run, even in IGT's limited
environment where the only DRM client running is
kms_frontbuffer_tracking? If the answer is yes, then we definitely need
to patch kms_frontbuffer_tracking and do something about it. My
assumption was that 2s (or 5s here) would be enough.

Of course, since this is CI we need a 100% guarantee, 99.99999% is
unacceptable.

> 
> > Anyway, 3s looks definitely safe even in this case. Maybe we
> > could go with 3s?
> > 
> > We can both increase the timeout *and* do cache dropping. Although
> > I
> > think not doing the cache dropping is definitely something that
> > needs
> > to be tested, so doing the cache dropping every time may not be a
> > good
> > idea.
> 
> You are not dropping the caches, it is just doing a retire.
> 
> The real question is what is the expectation? If we want the test to
> simply state that when ready FBC et al will be re-enabled, then just
> add
> a synchronous debugfs that establishes the condition in the driver
> that
> FBC should be ready (atm that is DROP_RETIRE, but you will probably
> want
> a better specified knob). 

As much as that's a valid option, I'd prefer to do something that
didn't require adding more complex non-standard interactions between
kms_frontbuffer_tracking and the Kernel.


> If the test is to make sure that FBC is
> reenabled automatically, 

We definitely want to check that. A bug in how we receive/treat the
frontbuffer invalidate/flush calls can lead FBC to never get enabled
again.


> then we need to think some more. In a normal
> workload, this should be the case (since the retire worker you rely
> on
> is for hostile userspace). If you simply look at the hostile
> userspace
> (and you already are for the frontbuffer writes), then a longer
> timeout
> is definitely acceptable, but how long? What is that limit?
> 
> If you define an upper bound for how long you allow fbc et al to
> remain
> off, then we will need an explicit timer to match.

See above. I thought there existed an amount of time that we could wait
which would guarantee the retire handler would have run.

> -Chris

Patch

diff --git a/tests/kms_frontbuffer_tracking.c b/tests/kms_frontbuffer_tracking.c
index e03524f1..2538450c 100644
--- a/tests/kms_frontbuffer_tracking.c
+++ b/tests/kms_frontbuffer_tracking.c
@@ -924,7 +924,7 @@  static bool fbc_stride_not_supported(void)
 
 static bool fbc_wait_until_enabled(void)
 {
-	return igt_wait(fbc_is_enabled(), 2000, 1);
+	return igt_wait(fbc_is_enabled(), 5000, 1);
 }
 
 static bool psr_wait_until_enabled(void)
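
For comparison with the timeout bump above, the alternative raised in the
thread (force a retire first, then wait with the original 2s timeout) might
look roughly like the sketch below. This is only an illustration under
assumptions, not the IGT implementation: force_retire() is a hypothetical
stand-in for igt_drop_caches_set(device, DROP_RETIRE), wait_until() is a
hypothetical stand-in for the igt_wait() macro, and the retire_pending flag
merely simulates driver state so that the example builds without IGT or a GPU.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Simulated driver state (assumption): true while GPU work is unretired. */
static bool retire_pending = true;

/* Hypothetical stand-in for igt_drop_caches_set(device, DROP_RETIRE). */
static void force_retire(void)
{
	retire_pending = false;
}

/* Simulated check: FBC can only come back once the retire has happened. */
static bool fbc_is_enabled(void)
{
	return !retire_pending;
}

/* Milliseconds elapsed since *start on the monotonic clock. */
static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000L +
	       (now.tv_nsec - start->tv_nsec) / 1000000L;
}

/*
 * Hypothetical stand-in for igt_wait(): poll cond() every interval_ms
 * until it returns true or timeout_ms has elapsed.
 */
static bool wait_until(bool (*cond)(void), long timeout_ms, long interval_ms)
{
	struct timespec start;
	struct timespec pause;

	assert(interval_ms > 0);
	pause.tv_sec = interval_ms / 1000;
	pause.tv_nsec = (interval_ms % 1000) * 1000000L;

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		if (cond())
			return true;
		nanosleep(&pause, NULL);
	} while (elapsed_ms(&start) < timeout_ms);

	return cond();
}

/*
 * Retire synchronously before waiting, so the wait no longer depends on
 * when the low-frequency background retire worker happens to run.
 */
static bool fbc_wait_until_enabled(void)
{
	force_retire();
	return wait_until(fbc_is_enabled, 2000, 1);
}
```

As noted in the discussion, retiring before every wait would also hide the
automatic re-enable path from the test, so a helper of this shape only fits
if the test's goal is "FBC comes back once the driver is ready" rather than
"FBC comes back on its own".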