Message ID | 1390514090-17875-1-git-send-email-jeff.mcgee@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jan 23, 2014 at 03:54:50PM -0600, jeff.mcgee@intel.com wrote: > From: Jeff McGee <jeff.mcgee@intel.com> > > The current frequency should reach the minimum frequency within a > reasonable time during idle. > > v2: Not using forcewake for this particular subtest per Daniel's > suggestion. > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> Hm, I guess I wasn't clear enough: I've thought about adding a new subtest which just does the check that the actual frequency reaches the lowest level on idle a new subtest, not extend the existing one. Same somewhat holds for your first patch, atm the testcase is a very basic test of the kernel error checking. But even ignoring that I'm not really sure what you're aiming at. Imo the current coverage is good enough since it makes sure that we have at least a bit of error checking in place. Any extensions to this test should imo only be done when we add new features or to exercise bugs (or classes of bugs) that actually happened. Iirc (and I didn't check olds mails, so this might be wrong) we have some corner-cases across suspend/resume and some with the in-kernel code to adjust the requested frequency under load. So imo that's what we should test for. Reading through my test requirements write-up I've just noticed that I didn't emphasis that there's also too much testing possible imho. I'm not a big proponent of test driven developement and similar validate everything approaches. I guess I need to write up my thoughts about too much testing, too ;-) Anyway I'm a bit confused about what's the overall goal here, so please elaborate. Cheers, Daniel > --- tests/pm_rps.c | 22 ++++++++++++++++++---- 1 file changed, 18 > insertions(+), 4 deletions(-) > > diff --git a/tests/pm_rps.c b/tests/pm_rps.c index 7ae0438..24a1ad6 > 100644 --- a/tests/pm_rps.c +++ b/tests/pm_rps.c @@ -206,13 +206,27 @@ > static void min_max_config(void (*check)(void)) check(); } > > +#define IDLE_WAIT_TIMESTEP_MSEC 100 +#define IDLE_WAIT_TIMEOUT_MSEC > 3000 static void idle_check(void) { int freqs[NUMFREQ]; - - > read_freqs(freqs); - dump(freqs); - checkit(freqs); + int wait = > 0; + + /* Monitor frequencies until cur settles down to min, > which should + * happen within the allotted time */ + do { + > read_freqs(freqs); + dump(freqs); + checkit(freqs); + > if (freqs[CUR] == freqs[MIN]) + break; + > usleep(1000 * IDLE_WAIT_TIMESTEP_MSEC); + wait += > IDLE_WAIT_TIMESTEP_MSEC; + } while (wait < IDLE_WAIT_TIMEOUT_MSEC); + > + igt_assert(freqs[CUR] == freqs[MIN]); + log("Required %d msec to > reach cur=min\n", wait); } > > static void pm_rps_exit_handler(int sig) -- 1.8.5.2 > > _______________________________________________ Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Sat, Jan 25, 2014 at 08:46:45PM +0100, Daniel Vetter wrote: > On Thu, Jan 23, 2014 at 03:54:50PM -0600, jeff.mcgee@intel.com wrote: > > From: Jeff McGee <jeff.mcgee@intel.com> > > > > The current frequency should reach the minimum frequency within a > > reasonable time during idle. > > > > v2: Not using forcewake for this particular subtest per Daniel's > > suggestion. > > > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > > Hm, I guess I wasn't clear enough: I've thought about adding a new subtest > which just does the check that the actual frequency reaches the lowest > level on idle a new subtest, not extend the existing one. > > Same somewhat holds for your first patch, atm the testcase is a very basic > test of the kernel error checking. > > But even ignoring that I'm not really sure what you're aiming at. Imo the > current coverage is good enough since it makes sure that we have at least > a bit of error checking in place. Any extensions to this test should imo > only be done when we add new features or to exercise bugs (or classes of > bugs) that actually happened. Iirc (and I didn't check olds mails, so this > might be wrong) we have some corner-cases across suspend/resume and some > with the in-kernel code to adjust the requested frequency under load. So > imo that's what we should test for. > > Reading through my test requirements write-up I've just noticed that I > didn't emphasis that there's also too much testing possible imho. I'm not > a big proponent of test driven developement and similar validate > everything approaches. I guess I need to write up my thoughts about too > much testing, too ;-) > > Anyway I'm a bit confused about what's the overall goal here, so please > elaborate. > > Cheers, Daniel > Hi Daniel. I guess it would have helped if I described my overall goals with this test development in the beginning. I have two rps related driver patches on hold from a few months ago because you felt the igt testing wasn't adequate to accept these. They are: "Update rps interrupt limits" "Restore rps/rc6 on reset" It took me some time to get around to this, but now I am intent on expanding pm_rps with the proper subtests to expose the issues. An outline of the subtests I have in mind: min-max-config-at-idle - Manipulate min and max through sysfs and make sure driver does the right thing. Right thing includes accepting valid values, rejecting invalid values, ensuring that cur remains between min and max, and (for idle) ensuring that cur reaches min after such changes. This last requirement is not being met in part because interrupt limits are not being updated properly when min and max are changed. That will be justification for patch "Update rps interrupt limits". - I have been forming this subtest as an expansion of Ben's original test. His cases for min/max checking are a subset of this. min-max-config-at-load - Manipulate min and max as above, but do so during load. Check for the same basic requirements. When heavily loaded, cur should reach max. There is also potentially some problem with this scenario due to interrupt limits not being updated reliably. If cur is at max, and max is then increased, cur may stay at old max. - This will be a new subtest that re-uses the min/max manipulation function but just do so under load. restore-on-reset - Trigger gpu reset, then test that user-set (non-default) min/max settings were retained and that turbo functions correctly when load is applied and removed. This scenario will fail today and is the point of "Restore rps/rc6 on reset". - This will be another new subtest not yet submitted, So the subtests are focused on known issues. I'm open to any suggestions on their design or implementation. Thanks. Jeff
On Mon, Jan 27, 2014 at 10:24:53AM -0600, Jeff McGee wrote: > On Sat, Jan 25, 2014 at 08:46:45PM +0100, Daniel Vetter wrote: > > On Thu, Jan 23, 2014 at 03:54:50PM -0600, jeff.mcgee@intel.com wrote: > > > From: Jeff McGee <jeff.mcgee@intel.com> > > > > > > The current frequency should reach the minimum frequency within a > > > reasonable time during idle. > > > > > > v2: Not using forcewake for this particular subtest per Daniel's > > > suggestion. > > > > > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > > > > Hm, I guess I wasn't clear enough: I've thought about adding a new subtest > > which just does the check that the actual frequency reaches the lowest > > level on idle a new subtest, not extend the existing one. > > > > Same somewhat holds for your first patch, atm the testcase is a very basic > > test of the kernel error checking. > > > > But even ignoring that I'm not really sure what you're aiming at. Imo the > > current coverage is good enough since it makes sure that we have at least > > a bit of error checking in place. Any extensions to this test should imo > > only be done when we add new features or to exercise bugs (or classes of > > bugs) that actually happened. Iirc (and I didn't check olds mails, so this > > might be wrong) we have some corner-cases across suspend/resume and some > > with the in-kernel code to adjust the requested frequency under load. So > > imo that's what we should test for. > > > > Reading through my test requirements write-up I've just noticed that I > > didn't emphasis that there's also too much testing possible imho. I'm not > > a big proponent of test driven developement and similar validate > > everything approaches. I guess I need to write up my thoughts about too > > much testing, too ;-) > > > > Anyway I'm a bit confused about what's the overall goal here, so please > > elaborate. > > > > Cheers, Daniel > > > Hi Daniel. I guess it would have helped if I described my overall goals > with this test development in the beginning. I have two rps related > driver patches on hold from a few months ago because you felt the igt > testing wasn't adequate to accept these. They are: > > "Update rps interrupt limits" > "Restore rps/rc6 on reset" > > It took me some time to get around to this, but now I am intent on > expanding pm_rps with the proper subtests to expose the issues. An outline > of the subtests I have in mind: > > min-max-config-at-idle > - Manipulate min and max through sysfs and make sure driver does the right > thing. Right thing includes accepting valid values, rejecting invalid > values, ensuring that cur remains between min and max, and (for idle) > ensuring that cur reaches min after such changes. This last requirement > is not being met in part because interrupt limits are not being updated > properly when min and max are changed. That will be justification for > patch "Update rps interrupt limits". > - I have been forming this subtest as an expansion of Ben's original test. > His cases for min/max checking are a subset of this. Oh, I didn't realize that the idle limits are currently broken and that your first patches fixes that. I think I now also understand why do you the idle-check at each step, so that this actually gets properly exercised. Problem with your approach here is that this will cause a spurious regression report from our QA since the current min-max-config-at-idle will suddenly starts failing (since it will test more and currently broken things with your patches applied). And that might shadow other basic issues (yeah, unlikely but still). So I think the right approach here is to keep the current subtest as-is (maybe rename it to "basic-api" since that's pretty much the only thing it does), and then add a completely new set of subtests like you've laid out here. So just leave the existing code as-is and copypaste the code for all your new tests (where you make changes at least). It's probably simplest if you do this shuffling to avoid a rebase mess with your existing patches. > min-max-config-at-load > - Manipulate min and max as above, but do so during load. Check for the > same basic requirements. When heavily loaded, cur should reach max. There > is also potentially some problem with this scenario due to interrupt > limits not being updated reliably. If cur is at max, and max is then > increased, cur may stay at old max. > - This will be a new subtest that re-uses the min/max manipulation function > but just do so under load. > > restore-on-reset > - Trigger gpu reset, then test that user-set (non-default) min/max settings > were retained and that turbo functions correctly when load is applied and > removed. This scenario will fail today and is the point of "Restore rps/rc6 > on reset". > - This will be another new subtest not yet submitted, > > So the subtests are focused on known issues. I'm open to any suggestions on > their design or implementation. Thanks. Excellent test plan imo and thanks a lot for unconfusing me. Cheers, Daniel
On Mon, Jan 27, 2014 at 05:50:04PM +0100, Daniel Vetter wrote: > On Mon, Jan 27, 2014 at 10:24:53AM -0600, Jeff McGee wrote: > > On Sat, Jan 25, 2014 at 08:46:45PM +0100, Daniel Vetter wrote: > > > On Thu, Jan 23, 2014 at 03:54:50PM -0600, jeff.mcgee@intel.com wrote: > > > > From: Jeff McGee <jeff.mcgee@intel.com> > > > > > > > > The current frequency should reach the minimum frequency within a > > > > reasonable time during idle. > > > > > > > > v2: Not using forcewake for this particular subtest per Daniel's > > > > suggestion. > > > > > > > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > > > > > > Hm, I guess I wasn't clear enough: I've thought about adding a new subtest > > > which just does the check that the actual frequency reaches the lowest > > > level on idle a new subtest, not extend the existing one. > > > > > > Same somewhat holds for your first patch, atm the testcase is a very basic > > > test of the kernel error checking. > > > > > > But even ignoring that I'm not really sure what you're aiming at. Imo the > > > current coverage is good enough since it makes sure that we have at least > > > a bit of error checking in place. Any extensions to this test should imo > > > only be done when we add new features or to exercise bugs (or classes of > > > bugs) that actually happened. Iirc (and I didn't check olds mails, so this > > > might be wrong) we have some corner-cases across suspend/resume and some > > > with the in-kernel code to adjust the requested frequency under load. So > > > imo that's what we should test for. > > > > > > Reading through my test requirements write-up I've just noticed that I > > > didn't emphasis that there's also too much testing possible imho. I'm not > > > a big proponent of test driven developement and similar validate > > > everything approaches. I guess I need to write up my thoughts about too > > > much testing, too ;-) > > > > > > Anyway I'm a bit confused about what's the overall goal here, so please > > > elaborate. > > > > > > Cheers, Daniel > > > > > Hi Daniel. I guess it would have helped if I described my overall goals > > with this test development in the beginning. I have two rps related > > driver patches on hold from a few months ago because you felt the igt > > testing wasn't adequate to accept these. They are: > > > > "Update rps interrupt limits" > > "Restore rps/rc6 on reset" > > > > It took me some time to get around to this, but now I am intent on > > expanding pm_rps with the proper subtests to expose the issues. An outline > > of the subtests I have in mind: > > > > min-max-config-at-idle > > - Manipulate min and max through sysfs and make sure driver does the right > > thing. Right thing includes accepting valid values, rejecting invalid > > values, ensuring that cur remains between min and max, and (for idle) > > ensuring that cur reaches min after such changes. This last requirement > > is not being met in part because interrupt limits are not being updated > > properly when min and max are changed. That will be justification for > > patch "Update rps interrupt limits". > > - I have been forming this subtest as an expansion of Ben's original test. > > His cases for min/max checking are a subset of this. > > Oh, I didn't realize that the idle limits are currently broken and that > your first patches fixes that. I think I now also understand why do you > the idle-check at each step, so that this actually gets properly > exercised. > > Problem with your approach here is that this will cause a spurious > regression report from our QA since the current min-max-config-at-idle > will suddenly starts failing (since it will test more and currently broken > things with your patches applied). And that might shadow other basic > issues (yeah, unlikely but still). > > So I think the right approach here is to keep the current subtest as-is > (maybe rename it to "basic-api" since that's pretty much the only thing it > does), and then add a completely new set of subtests like you've laid out > here. So just leave the existing code as-is and copypaste the code for all > your new tests (where you make changes at least). > Ah yes, expanding the subtest to the point that it uncovers an old problem will appear as a regression from recent driver updates, which it is not. I didn't consider this. I agree with your approach to have the current subtest cover only basic min/max parameter validation. But I would recommend taking patches 1-3 of this set - they work within that scope and will not cause failures. Can we just drop patch 4? I will re-introduce it as part of adding the new subtests. -Jeff > It's probably simplest if you do this shuffling to avoid a rebase mess > with your existing patches. > > > min-max-config-at-load > > - Manipulate min and max as above, but do so during load. Check for the > > same basic requirements. When heavily loaded, cur should reach max. There > > is also potentially some problem with this scenario due to interrupt > > limits not being updated reliably. If cur is at max, and max is then > > increased, cur may stay at old max. > > - This will be a new subtest that re-uses the min/max manipulation function > > but just do so under load. > > > > restore-on-reset > > - Trigger gpu reset, then test that user-set (non-default) min/max settings > > were retained and that turbo functions correctly when load is applied and > > removed. This scenario will fail today and is the point of "Restore rps/rc6 > > on reset". > > - This will be another new subtest not yet submitted, > > > > So the subtests are focused on known issues. I'm open to any suggestions on > > their design or implementation. Thanks. > > Excellent test plan imo and thanks a lot for unconfusing me. > > Cheers, Daniel > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Mon, Jan 27, 2014 at 04:23:26PM -0600, Jeff McGee wrote: > On Mon, Jan 27, 2014 at 05:50:04PM +0100, Daniel Vetter wrote: > > On Mon, Jan 27, 2014 at 10:24:53AM -0600, Jeff McGee wrote: > > > On Sat, Jan 25, 2014 at 08:46:45PM +0100, Daniel Vetter wrote: > > > > On Thu, Jan 23, 2014 at 03:54:50PM -0600, jeff.mcgee@intel.com wrote: > > > > > From: Jeff McGee <jeff.mcgee@intel.com> > > > > > > > > > > The current frequency should reach the minimum frequency within a > > > > > reasonable time during idle. > > > > > > > > > > v2: Not using forcewake for this particular subtest per Daniel's > > > > > suggestion. > > > > > > > > > > Signed-off-by: Jeff McGee <jeff.mcgee@intel.com> > > > > > > > > Hm, I guess I wasn't clear enough: I've thought about adding a new subtest > > > > which just does the check that the actual frequency reaches the lowest > > > > level on idle a new subtest, not extend the existing one. > > > > > > > > Same somewhat holds for your first patch, atm the testcase is a very basic > > > > test of the kernel error checking. > > > > > > > > But even ignoring that I'm not really sure what you're aiming at. Imo the > > > > current coverage is good enough since it makes sure that we have at least > > > > a bit of error checking in place. Any extensions to this test should imo > > > > only be done when we add new features or to exercise bugs (or classes of > > > > bugs) that actually happened. Iirc (and I didn't check olds mails, so this > > > > might be wrong) we have some corner-cases across suspend/resume and some > > > > with the in-kernel code to adjust the requested frequency under load. So > > > > imo that's what we should test for. > > > > > > > > Reading through my test requirements write-up I've just noticed that I > > > > didn't emphasis that there's also too much testing possible imho. I'm not > > > > a big proponent of test driven developement and similar validate > > > > everything approaches. I guess I need to write up my thoughts about too > > > > much testing, too ;-) > > > > > > > > Anyway I'm a bit confused about what's the overall goal here, so please > > > > elaborate. > > > > > > > > Cheers, Daniel > > > > > > > Hi Daniel. I guess it would have helped if I described my overall goals > > > with this test development in the beginning. I have two rps related > > > driver patches on hold from a few months ago because you felt the igt > > > testing wasn't adequate to accept these. They are: > > > > > > "Update rps interrupt limits" > > > "Restore rps/rc6 on reset" > > > > > > It took me some time to get around to this, but now I am intent on > > > expanding pm_rps with the proper subtests to expose the issues. An outline > > > of the subtests I have in mind: > > > > > > min-max-config-at-idle > > > - Manipulate min and max through sysfs and make sure driver does the right > > > thing. Right thing includes accepting valid values, rejecting invalid > > > values, ensuring that cur remains between min and max, and (for idle) > > > ensuring that cur reaches min after such changes. This last requirement > > > is not being met in part because interrupt limits are not being updated > > > properly when min and max are changed. That will be justification for > > > patch "Update rps interrupt limits". > > > - I have been forming this subtest as an expansion of Ben's original test. > > > His cases for min/max checking are a subset of this. > > > > Oh, I didn't realize that the idle limits are currently broken and that > > your first patches fixes that. I think I now also understand why do you > > the idle-check at each step, so that this actually gets properly > > exercised. > > > > Problem with your approach here is that this will cause a spurious > > regression report from our QA since the current min-max-config-at-idle > > will suddenly starts failing (since it will test more and currently broken > > things with your patches applied). And that might shadow other basic > > issues (yeah, unlikely but still). > > > > So I think the right approach here is to keep the current subtest as-is > > (maybe rename it to "basic-api" since that's pretty much the only thing it > > does), and then add a completely new set of subtests like you've laid out > > here. So just leave the existing code as-is and copypaste the code for all > > your new tests (where you make changes at least). > > > Ah yes, expanding the subtest to the point that it uncovers an old problem > will appear as a regression from recent driver updates, which it is not. I > didn't consider this. I agree with your approach to have the current subtest > cover only basic min/max parameter validation. But I would recommend taking > patches 1-3 of this set - they work within that scope and will not cause > failures. Can we just drop patch 4? I will re-introduce it as part of adding > the new subtests. Works for me, too. I've also mashed the rename to "basic-api" on top while at it. Patches merged to i-g-t, thanks. -Daniel
diff --git a/tests/pm_rps.c b/tests/pm_rps.c index 7ae0438..24a1ad6 100644 --- a/tests/pm_rps.c +++ b/tests/pm_rps.c @@ -206,13 +206,27 @@ static void min_max_config(void (*check)(void)) check(); } +#define IDLE_WAIT_TIMESTEP_MSEC 100 +#define IDLE_WAIT_TIMEOUT_MSEC 3000 static void idle_check(void) { int freqs[NUMFREQ]; - - read_freqs(freqs); - dump(freqs); - checkit(freqs); + int wait = 0; + + /* Monitor frequencies until cur settles down to min, which should + * happen within the allotted time */ + do { + read_freqs(freqs); + dump(freqs); + checkit(freqs); + if (freqs[CUR] == freqs[MIN]) + break; + usleep(1000 * IDLE_WAIT_TIMESTEP_MSEC); + wait += IDLE_WAIT_TIMESTEP_MSEC; + } while (wait < IDLE_WAIT_TIMEOUT_MSEC); + + igt_assert(freqs[CUR] == freqs[MIN]); + log("Required %d msec to reach cur=min\n", wait); } static void pm_rps_exit_handler(int sig)