Message ID | cover.1726594684.git.jani.nikula@intel.com (mailing list archive)
---|---
Series | drm: revert some framebuffer API tests
On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
> The tests consistently trigger WARNs in drm_framebuffer code. I'm not
> sure what the point is with this type of belts and suspenders tests. The
> warnings *are* the way to flag erroneous API usage.
>
> Warnings in turn trigger failures in CI. Filtering the warnings is
> error prone, and, crucially, would also filter actual errors in case the
> kunit tests are not run.
>
> I acknowledge there may be complex test cases where you'd end up
> triggering warnings somewhere deep, but these are not it. These are
> simple.
>
> Revert the tests, back to the drawing board.

Yeah I think long-term we might want a kunit framework so that we can
catch dmesg warnings we expect and test for those, without those warnings
actually going to dmesg. Similar to how the lockdep tests also reroute
locking validation, so that the expected positive tests don't wreck
lockdep for real.

But until that exists, we can't have tests that splat in dmesg when they
work as intended.

Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch>

Cheers, Sima

> BR,
> Jani.
>
> Cc: Carlos Eduardo Gallo Filho <gcarlos@disroot.org>
> Cc: Maxime Ripard <mripard@kernel.org>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: Jeff Johnson <quic_jjohnson@quicinc.com>
>
> Jani Nikula (2):
>   Revert "drm/tests: Add test for drm_framebuffer_free()"
>   Revert "drm/tests: Add test for drm_framebuffer_init()"
>
>  drivers/gpu/drm/drm_framebuffer.c            |   1 -
>  drivers/gpu/drm/drm_mode_object.c            |   1 -
>  drivers/gpu/drm/tests/drm_framebuffer_test.c | 134 -------------------
>  3 files changed, 136 deletions(-)
>
> --
> 2.39.2
+Guenter

On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
> > The tests consistently trigger WARNs in drm_framebuffer code. I'm not
> > sure what the point is with this type of belts and suspenders tests. The
> > warnings *are* the way to flag erroneous API usage.
> >
> > Warnings in turn trigger failures in CI. Filtering the warnings is
> > error prone, and, crucially, would also filter actual errors in case the
> > kunit tests are not run.
> >
> > I acknowledge there may be complex test cases where you'd end up
> > triggering warnings somewhere deep, but these are not it. These are
> > simple.
> >
> > Revert the tests, back to the drawing board.
>
> Yeah I think long-term we might want a kunit framework so that we can
> catch dmesg warnings we expect and test for those, without those warnings
> actually going to dmesg. Similar to how the lockdep tests also reroute
> locking validation, so that the expected positive tests don't wreck
> lockdep for real.
>
> But until that exists, we can't have tests that splat in dmesg when they
> work as intended.

It should be pretty soon:
https://lore.kernel.org/dri-devel/20240403131936.787234-1-linux@roeck-us.net/

I'm not sure what happened to that series, but it looks like everybody
was mostly happy with it?

Maxime
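As a rough illustration of the kind of annotation being discussed, a test using such a framework might look like the sketch below. `KUNIT_EXPECT_WARNING()` is a hypothetical placeholder, not mainline KUnit API and not necessarily what the series linked above implements; the `drm_framebuffer_init()` misuse and the suite-provided device are likewise only assumed for illustration.

```c
#include <kunit/test.h>
#include <drm/drm_framebuffer.h>

/*
 * Hypothetical sketch only: KUNIT_EXPECT_WARNING() does not exist in
 * mainline KUnit and may differ from whatever the linked series adds.
 * The idea is that the test declares the warning it expects, the
 * framework keeps the backtrace out of dmesg, and the assertion fails
 * if the warning never fires.
 */
static void framebuffer_init_bad_usage_test(struct kunit *test)
{
	struct drm_device *dev = test->priv;	/* assumed set up by suite init */
	struct drm_framebuffer fb = { };

	/* drm_framebuffer_init() WARNs on this misuse; expect exactly that. */
	KUNIT_EXPECT_WARNING(test, drm_framebuffer_init,
			     drm_framebuffer_init(dev, &fb, NULL));
}
```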
On 9/24/24 04:54, Maxime Ripard wrote:
> +Guenter
>
> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>> On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
>>> The tests consistently trigger WARNs in drm_framebuffer code. I'm not
>>> sure what the point is with this type of belts and suspenders tests. The
>>> warnings *are* the way to flag erroneous API usage.
>>>
>>> Warnings in turn trigger failures in CI. Filtering the warnings is
>>> error prone, and, crucially, would also filter actual errors in case the
>>> kunit tests are not run.
>>>
>>> I acknowledge there may be complex test cases where you'd end up
>>> triggering warnings somewhere deep, but these are not it. These are
>>> simple.
>>>
>>> Revert the tests, back to the drawing board.
>>
>> Yeah I think long-term we might want a kunit framework so that we can
>> catch dmesg warnings we expect and test for those, without those warnings
>> actually going to dmesg. Similar to how the lockdep tests also reroute
>> locking validation, so that the expected positive tests don't wreck
>> lockdep for real.
>>
>> But until that exists, we can't have tests that splat in dmesg when they
>> work as intended.
>
> It should be pretty soon:
> https://lore.kernel.org/dri-devel/20240403131936.787234-1-linux@roeck-us.net/
>
> I'm not sure what happened to that series, but it looks like everybody
> was mostly happy with it?
>

Major subsystem maintainers did not provide any feedback at all, and then
there came the "it is not perfect enough" feedback, so I gave up on it.

Guenter
On Tue, Sep 24, 2024 at 06:37:59AM GMT, Guenter Roeck wrote:
> On 9/24/24 04:54, Maxime Ripard wrote:
> > +Guenter
> >
> > On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> > > On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
> > > > The tests consistently trigger WARNs in drm_framebuffer code. I'm not
> > > > sure what the point is with this type of belts and suspenders tests. The
> > > > warnings *are* the way to flag erroneous API usage.
> > > >
> > > > Warnings in turn trigger failures in CI. Filtering the warnings is
> > > > error prone, and, crucially, would also filter actual errors in case the
> > > > kunit tests are not run.
> > > >
> > > > I acknowledge there may be complex test cases where you'd end up
> > > > triggering warnings somewhere deep, but these are not it. These are
> > > > simple.
> > > >
> > > > Revert the tests, back to the drawing board.
> > >
> > > Yeah I think long-term we might want a kunit framework so that we can
> > > catch dmesg warnings we expect and test for those, without those warnings
> > > actually going to dmesg. Similar to how the lockdep tests also reroute
> > > locking validation, so that the expected positive tests don't wreck
> > > lockdep for real.
> > >
> > > But until that exists, we can't have tests that splat in dmesg when they
> > > work as intended.
> >
> > It should be pretty soon:
> > https://lore.kernel.org/dri-devel/20240403131936.787234-1-linux@roeck-us.net/
> >
> > I'm not sure what happened to that series, but it looks like everybody
> > was mostly happy with it?
> >
>
> Major subsystem maintainers did not provide any feedback at all, and then
> there came the "it is not perfect enough" feedback, so I gave up on it.

Well, if that means anything, we're interested even in something
imperfect if it solves the above case :)

Maxime
On 9/24/24 06:56, Maxime Ripard wrote:
> On Tue, Sep 24, 2024 at 06:37:59AM GMT, Guenter Roeck wrote:
>> On 9/24/24 04:54, Maxime Ripard wrote:
>>> +Guenter
>>>
>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>>>> On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
>>>>> The tests consistently trigger WARNs in drm_framebuffer code. I'm not
>>>>> sure what the point is with this type of belts and suspenders tests. The
>>>>> warnings *are* the way to flag erroneous API usage.
>>>>>
>>>>> Warnings in turn trigger failures in CI. Filtering the warnings is
>>>>> error prone, and, crucially, would also filter actual errors in case the
>>>>> kunit tests are not run.
>>>>>
>>>>> I acknowledge there may be complex test cases where you'd end up
>>>>> triggering warnings somewhere deep, but these are not it. These are
>>>>> simple.
>>>>>
>>>>> Revert the tests, back to the drawing board.
>>>>
>>>> Yeah I think long-term we might want a kunit framework so that we can
>>>> catch dmesg warnings we expect and test for those, without those warnings
>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
>>>> locking validation, so that the expected positive tests don't wreck
>>>> lockdep for real.
>>>>
>>>> But until that exists, we can't have tests that splat in dmesg when they
>>>> work as intended.
>>>

FWIW, that is arguable. More and more tests are added which do add such
splats, and I don't see any hesitance by developers to adding more. So far
I counted two alone in this commit window, and that does not include new
splats from tests which I had already disabled. I simply disable those
tests or don't enable them in the first place if they are new. I did the
same with the drm unit tests due to the splats generated by the scaling
unit tests, so any additional drm unit test splats don't make a difference
for me since the tests are already disabled.

>>> It should be pretty soon:
>>> https://lore.kernel.org/dri-devel/20240403131936.787234-1-linux@roeck-us.net/
>>>
>>> I'm not sure what happened to that series, but it looks like everybody
>>> was mostly happy with it?
>>>
>>
>> Major subsystem maintainers did not provide any feedback at all, and then
>> there came the "it is not perfect enough" feedback, so I gave up on it.
>
> Well, if that means anything, we're interested even in something
> imperfect if it solves the above case :)
>

Maybe someone else is interested in picking it up (and no need for credits).
I am buried in work and don't have the time (nor, frankly, much interest)
to revive it. Also, just for reference: The patch series touches a total
of 8 architectures, and unless I missed some I only got feedback from two
architecture maintainers. That doesn't include arm - I don't recall if it
doesn't need changes or if I never got there.

Guenter
On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
>>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>>>>> Yeah I think long-term we might want a kunit framework so that we can
>>>>> catch dmesg warnings we expect and test for those, without those warnings
>>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
>>>>> locking validation, so that the expected positive tests don't wreck
>>>>> lockdep for real.
>>>>>
>>>>> But until that exists, we can't have tests that splat in dmesg when they
>>>>> work as intended.
>
> FWIW, that is arguable. More and more tests are added which do add such splats,
> and I don't see any hesitance by developers to adding more. So far I counted
> two alone in this commit window, and that does not include new splats from
> tests which I had already disabled. I simply disable those tests or don't
> enable them in the first place if they are new. I did the same with the drm
> unit tests due to the splats generated by the scaling unit tests, so any
> additional drm unit test splats don't make a difference for me since the
> tests are already disabled.

What's the point of having unit tests that CI systems routinely have to
filter out of test runs? Or filter warnings generated by the tests,
potentially missing new warnings. Who is going to run the tests if the
existing CI systems choose to ignore them?

Automation on a massive scale is key here, and making that harder is
counter-productive.

BR,
Jani.
On 9/24/24 08:56, Jani Nikula wrote:
> On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
>>>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>>>>>> Yeah I think long-term we might want a kunit framework so that we can
>>>>>> catch dmesg warnings we expect and test for those, without those warnings
>>>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
>>>>>> locking validation, so that the expected positive tests don't wreck
>>>>>> lockdep for real.
>>>>>>
>>>>>> But until that exists, we can't have tests that splat in dmesg when they
>>>>>> work as intended.
>>
>> FWIW, that is arguable. More and more tests are added which do add such splats,
>> and I don't see any hesitance by developers to adding more. So far I counted
>> two alone in this commit window, and that does not include new splats from
>> tests which I had already disabled. I simply disable those tests or don't
>> enable them in the first place if they are new. I did the same with the drm
>> unit tests due to the splats generated by the scaling unit tests, so any
>> additional drm unit test splats don't make a difference for me since the
>> tests are already disabled.
>
> What's the point of having unit tests that CI systems routinely have to
> filter out of test runs? Or filter warnings generated by the tests,
> potentially missing new warnings. Who is going to run the tests if the
> existing CI systems choose to ignore them?
>
> Automation on a massive scale is key here, and making that harder is
> counter-productive.
>

As I have said elsewhere, not being able to handle backtraces generated on
purpose is a limitation of my testbed. Other testbeds with active (and paid)
maintainers may not have that limitation. I deal with it by disabling
affected tests. Others may deal with it by maintaining exception lists.
Ultimately it is up to developers and testbed maintainers to decide the
level of test coverage they are looking for or able to support.

Thanks,
Guenter
On Tue, Sep 24, 2024 at 06:56:26PM GMT, Jani Nikula wrote:
> On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
> >>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> >>>>> Yeah I think long-term we might want a kunit framework so that we can
> >>>>> catch dmesg warnings we expect and test for those, without those warnings
> >>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
> >>>>> locking validation, so that the expected positive tests don't wreck
> >>>>> lockdep for real.
> >>>>>
> >>>>> But until that exists, we can't have tests that splat in dmesg when they
> >>>>> work as intended.
> >
> > FWIW, that is arguable. More and more tests are added which do add such splats,
> > and I don't see any hesitance by developers to adding more. So far I counted
> > two alone in this commit window, and that does not include new splats from
> > tests which I had already disabled. I simply disable those tests or don't
> > enable them in the first place if they are new. I did the same with the drm
> > unit tests due to the splats generated by the scaling unit tests, so any
> > additional drm unit test splats don't make a difference for me since the
> > tests are already disabled.
>
> What's the point of having unit tests that CI systems routinely have to
> filter out of test runs? Or filter warnings generated by the tests,
> potentially missing new warnings. Who is going to run the tests if the
> existing CI systems choose to ignore them?

If we turn this argument around, that means we can't write unit tests for
code that will create a warning.

IMO, this creates a bad incentive, and saying that any capable CI system
should reject them is certainly opinionated.

Maxime
On 9/24/24 09:57, Maxime Ripard wrote:
> On Tue, Sep 24, 2024 at 06:56:26PM GMT, Jani Nikula wrote:
>> On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
>>>>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>>>>>>> Yeah I think long-term we might want a kunit framework so that we can
>>>>>>> catch dmesg warnings we expect and test for those, without those warnings
>>>>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
>>>>>>> locking validation, so that the expected positive tests don't wreck
>>>>>>> lockdep for real.
>>>>>>>
>>>>>>> But until that exists, we can't have tests that splat in dmesg when they
>>>>>>> work as intended.
>>>
>>> FWIW, that is arguable. More and more tests are added which do add such splats,
>>> and I don't see any hesitance by developers to adding more. So far I counted
>>> two alone in this commit window, and that does not include new splats from
>>> tests which I had already disabled. I simply disable those tests or don't
>>> enable them in the first place if they are new. I did the same with the drm
>>> unit tests due to the splats generated by the scaling unit tests, so any
>>> additional drm unit test splats don't make a difference for me since the
>>> tests are already disabled.
>>
>> What's the point of having unit tests that CI systems routinely have to
>> filter out of test runs? Or filter warnings generated by the tests,
>> potentially missing new warnings. Who is going to run the tests if the
>> existing CI systems choose to ignore them?
>
> If we turn this argument around, that means we can't write unit tests for
> code that will create a warning.
>
> IMO, this creates a bad incentive, and saying that any capable CI system
> should reject them is certainly opinionated.
>

Agreed. All I am saying is that _I_ am rejecting them, but it is up to each
individual testbed (or, rather, testbed maintainer) to decide how to handle
the situation.

Guenter
On Tue, 24 Sep 2024, Maxime Ripard <mripard@kernel.org> wrote:
> On Tue, Sep 24, 2024 at 06:56:26PM GMT, Jani Nikula wrote:
>> On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
>> >>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
>> >>>>> Yeah I think long-term we might want a kunit framework so that we can
>> >>>>> catch dmesg warnings we expect and test for those, without those warnings
>> >>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
>> >>>>> locking validation, so that the expected positive tests don't wreck
>> >>>>> lockdep for real.
>> >>>>>
>> >>>>> But until that exists, we can't have tests that splat in dmesg when they
>> >>>>> work as intended.
>> >
>> > FWIW, that is arguable. More and more tests are added which do add such splats,
>> > and I don't see any hesitance by developers to adding more. So far I counted
>> > two alone in this commit window, and that does not include new splats from
>> > tests which I had already disabled. I simply disable those tests or don't
>> > enable them in the first place if they are new. I did the same with the drm
>> > unit tests due to the splats generated by the scaling unit tests, so any
>> > additional drm unit test splats don't make a difference for me since the
>> > tests are already disabled.
>>
>> What's the point of having unit tests that CI systems routinely have to
>> filter out of test runs? Or filter warnings generated by the tests,
>> potentially missing new warnings. Who is going to run the tests if the
>> existing CI systems choose to ignore them?
>
> If we turn this argument around, that means we can't write unit tests for
> code that will create a warning.

The question really is, which warnings do we get because of the
functionality being tested, and which warnings are due to an unexpected
and unrelated bug? Is it a pass or fail if the test triggers an
unrelated warning?

If the developer runs the tests, are they expected to look at dmesg and
inspect every warning to decide?

> IMO, this creates a bad incentive, and saying that any capable CI system
> should reject them is certainly opinionated.

All I'm saying is it generates an extra burden for current real world CI
systems that run tests on a massive scale and even have paid
maintainers. It's not trivial to sort out expected and unexpected
warnings, and keep the filters updated every time the warnings
change. It's not without a cost. And the end result might just be that
the unit tests won't get run at all. Or warnings just get completely
ignored for kunit tests.

I don't completely disagree with you either. It's just that we don't
have that automation on the kernel side, and in the mean time, perhaps
we should be really conservative about the warnings we create in tests?

If we can't filter out the warnings kernel side, perhaps we could figure
out a way to annotate the kunit tests that generate warnings on passing
runs?

BR,
Jani.
On Tue, Sep 24, 2024 at 08:09:38AM -0700, Guenter Roeck wrote:
> On 9/24/24 06:56, Maxime Ripard wrote:
> > On Tue, Sep 24, 2024 at 06:37:59AM GMT, Guenter Roeck wrote:
> > > On 9/24/24 04:54, Maxime Ripard wrote:
> > > > +Guenter
> > > >
> > > > On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> > > > > On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
> > > > > > The tests consistently trigger WARNs in drm_framebuffer code. I'm not
> > > > > > sure what the point is with this type of belts and suspenders tests. The
> > > > > > warnings *are* the way to flag erroneous API usage.
> > > > > >
> > > > > > Warnings in turn trigger failures in CI. Filtering the warnings is
> > > > > > error prone, and, crucially, would also filter actual errors in case the
> > > > > > kunit tests are not run.
> > > > > >
> > > > > > I acknowledge there may be complex test cases where you'd end up
> > > > > > triggering warnings somewhere deep, but these are not it. These are
> > > > > > simple.
> > > > > >
> > > > > > Revert the tests, back to the drawing board.
> > > > >
> > > > > Yeah I think long-term we might want a kunit framework so that we can
> > > > > catch dmesg warnings we expect and test for those, without those warnings
> > > > > actually going to dmesg. Similar to how the lockdep tests also reroute
> > > > > locking validation, so that the expected positive tests don't wreck
> > > > > lockdep for real.
> > > > >
> > > > > But until that exists, we can't have tests that splat in dmesg when they
> > > > > work as intended.
>
> FWIW, that is arguable. More and more tests are added which do add such splats,
> and I don't see any hesitance by developers to adding more. So far I counted
> two alone in this commit window, and that does not include new splats from
> tests which I had already disabled. I simply disable those tests or don't
> enable them in the first place if they are new. I did the same with the drm
> unit tests due to the splats generated by the scaling unit tests, so any
> additional drm unit test splats don't make a difference for me since the
> tests are already disabled.

I think for at least drm the consensus is clear, we won't have kunit tests
that splat.

Personally I really don't see the point of unit tests that are somewhere
between unnecessarily hard or outright too much pain to deploy in a test
rig: Either you don't run them (not great), or you filter splats and might
filter too much (not great either), or you filter as little as possible and
fight false positives (also kinda suboptimal). Especially for unit tests
this stuff must be totally rock solid.

We've had similar discussions in the past around unit tests that wasted
too much cpu time with randomized combinatorial testing, and we've thrown
those out too from drm. Test hygiene matters.

Also this means if you see this stuff in drm unit tests (or related things
like dma-buf/fence), please report or send a revert.

> > > > It should be pretty soon:
> > > > https://lore.kernel.org/dri-devel/20240403131936.787234-1-linux@roeck-us.net/
> > > >
> > > > I'm not sure what happened to that series, but it looks like everybody
> > > > was mostly happy with it?
> > >
> > > Major subsystem maintainers did not provide any feedback at all, and then
> > > there came the "it is not perfect enough" feedback, so I gave up on it.
> >
> > Well, if that means anything, we're interested even in something
> > imperfect if it solves the above case :)
>
> Maybe someone else is interested in picking it up (and no need for credits).
> I am buried in work and don't have the time (nor, frankly, much interest)
> to revive it. Also, just for reference: The patch series touches a total
> of 8 architectures, and unless I missed some I only got feedback from two
> architecture maintainers. That doesn't include arm - I don't recall if it
> doesn't need changes or if I never got there.

This is great, somehow I missed it and much appreciated for the initial
version even if you can't push it further. Hopefully one of the folks
working on drm kunit tests will pick it up, it will be needed here.

Cheers, Sima
On Wed, Sep 25, 2024 at 12:41:40PM GMT, Jani Nikula wrote:
> On Tue, 24 Sep 2024, Maxime Ripard <mripard@kernel.org> wrote:
> > On Tue, Sep 24, 2024 at 06:56:26PM GMT, Jani Nikula wrote:
> >> On Tue, 24 Sep 2024, Guenter Roeck <linux@roeck-us.net> wrote:
> >> >>>> On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> >> >>>>> Yeah I think long-term we might want a kunit framework so that we can
> >> >>>>> catch dmesg warnings we expect and test for those, without those warnings
> >> >>>>> actually going to dmesg. Similar to how the lockdep tests also reroute
> >> >>>>> locking validation, so that the expected positive tests don't wreck
> >> >>>>> lockdep for real.
> >> >>>>>
> >> >>>>> But until that exists, we can't have tests that splat in dmesg when they
> >> >>>>> work as intended.
> >> >
> >> > FWIW, that is arguable. More and more tests are added which do add such splats,
> >> > and I don't see any hesitance by developers to adding more. So far I counted
> >> > two alone in this commit window, and that does not include new splats from
> >> > tests which I had already disabled. I simply disable those tests or don't
> >> > enable them in the first place if they are new. I did the same with the drm
> >> > unit tests due to the splats generated by the scaling unit tests, so any
> >> > additional drm unit test splats don't make a difference for me since the
> >> > tests are already disabled.
> >>
> >> What's the point of having unit tests that CI systems routinely have to
> >> filter out of test runs? Or filter warnings generated by the tests,
> >> potentially missing new warnings. Who is going to run the tests if the
> >> existing CI systems choose to ignore them?
> >
> > If we turn this argument around, that means we can't write unit tests for
> > code that will create a warning.
>
> The question really is, which warnings do we get because of the
> functionality being tested, and which warnings are due to an unexpected
> and unrelated bug? Is it a pass or fail if the test triggers an
> unrelated warning?
>
> If the developer runs the tests, are they expected to look at dmesg and
> inspect every warning to decide?

No, there's no such expectation. Yet Intel's CI chose to do so, and chose
to treat any warning as a failure, despite the fact that kunit doesn't
have the same policy.

> > IMO, this creates a bad incentive, and saying that any capable CI system
> > should reject them is certainly opinionated.
>
> All I'm saying is it generates an extra burden for current real world CI
> systems that run tests on a massive scale and even have paid
> maintainers. It's not trivial to sort out expected and unexpected
> warnings, and keep the filters updated every time the warnings
> change. It's not without a cost. And the end result might just be that
> the unit tests won't get run at all. Or warnings just get completely
> ignored for kunit tests.

I realise it must take a significant amount of resources, but it's also
self inflicted. You could also stop looking for warnings when running
kunit.

> I don't completely disagree with you either. It's just that we don't
> have that automation on the kernel side, and in the mean time, perhaps
> we should be really conservative about the warnings we create in tests?
>
> If we can't filter out the warnings kernel side, perhaps we could figure
> out a way to annotate the kunit tests that generate warnings on passing
> runs?

Yeah, I think that would be the best solution. That's what Guenter's
series was about :/

Maxime
On Wed, Sep 25, 2024 at 01:52:02PM GMT, Simona Vetter wrote:
> On Tue, Sep 24, 2024 at 08:09:38AM -0700, Guenter Roeck wrote:
> > On 9/24/24 06:56, Maxime Ripard wrote:
> > > On Tue, Sep 24, 2024 at 06:37:59AM GMT, Guenter Roeck wrote:
> > > > On 9/24/24 04:54, Maxime Ripard wrote:
> > > > > +Guenter
> > > > >
> > > > > On Tue, Sep 24, 2024 at 12:06:28PM GMT, Simona Vetter wrote:
> > > > > > On Tue, Sep 17, 2024 at 08:43:50PM +0300, Jani Nikula wrote:
> > > > > > > The tests consistently trigger WARNs in drm_framebuffer code. I'm not
> > > > > > > sure what the point is with this type of belts and suspenders tests. The
> > > > > > > warnings *are* the way to flag erroneous API usage.
> > > > > > >
> > > > > > > Warnings in turn trigger failures in CI. Filtering the warnings is
> > > > > > > error prone, and, crucially, would also filter actual errors in case the
> > > > > > > kunit tests are not run.
> > > > > > >
> > > > > > > I acknowledge there may be complex test cases where you'd end up
> > > > > > > triggering warnings somewhere deep, but these are not it. These are
> > > > > > > simple.
> > > > > > >
> > > > > > > Revert the tests, back to the drawing board.
> > > > > >
> > > > > > Yeah I think long-term we might want a kunit framework so that we can
> > > > > > catch dmesg warnings we expect and test for those, without those warnings
> > > > > > actually going to dmesg. Similar to how the lockdep tests also reroute
> > > > > > locking validation, so that the expected positive tests don't wreck
> > > > > > lockdep for real.
> > > > > >
> > > > > > But until that exists, we can't have tests that splat in dmesg when they
> > > > > > work as intended.
> >
> > FWIW, that is arguable. More and more tests are added which do add such splats,
> > and I don't see any hesitance by developers to adding more. So far I counted
> > two alone in this commit window, and that does not include new splats from
> > tests which I had already disabled. I simply disable those tests or don't
> > enable them in the first place if they are new. I did the same with the drm
> > unit tests due to the splats generated by the scaling unit tests, so any
> > additional drm unit test splats don't make a difference for me since the
> > tests are already disabled.
>
> I think for at least drm the consensus is clear, we won't have kunit tests
> that splat.

Where was that ever discussed?

> Personally I really don't see the point of unit tests that are somewhere
> between unnecessarily hard or outright too much pain to deploy in a test
> rig: Either you don't run them (not great), or you filter splats and might
> filter too much (not great either), or you filter as little as possible and
> fight false positives (also kinda suboptimal).

Or you don't do any of that, and just rely on the canonical way to run
kunit tests and trust it's going to pass tests that do indeed pass, and
fail / warn on those that don't.

> Especially for unit tests this stuff must be totally rock solid.

Is there any evidence that those tests were not rock solid?

> We've had similar discussions in the past around unit tests that wasted
> too much cpu time with randomized combinatorial testing, and we've thrown
> those out too from drm. Test hygiene matters.

We had that discussion because those tests could run for up to around a
minute on certain platforms. It's not the same issue at all?

Maxime
On 9/25/24 05:59, Maxime Ripard wrote:
...
>> All I'm saying is it generates an extra burden for current real world CI
>> systems that run tests on a massive scale and even have paid
>> maintainers. It's not trivial to sort out expected and unexpected
>> warnings, and keep the filters updated every time the warnings
>> change. It's not without a cost. And the end result might just be that
>> the unit tests won't get run at all. Or warnings just get completely
>> ignored for kunit tests.
>
> I realise it must take a significant amount of resources, but it's also
> self inflicted. You could also stop looking for warnings when running
> kunit.
>

FWIW, that doesn't work if kunit tests are built into the kernel and run
when booting the system. It also doesn't work well if kunit tests trigger
a WARN_ONCE() because that causes any real subsequent warning to be
dropped. Also, ignoring warnings while running a kunit test means that
_real_ unexpected warnings triggered by those tests will be ignored as
well.

Guenter
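To make the WARN_ONCE() point concrete, here is a minimal sketch; the helper and its check are hypothetical, not code from the patches under discussion. WARN_ONCE() latches a static per-call-site flag, so whichever caller trips it first consumes the only backtrace that site will ever print.

```c
#include <linux/bug.h>
#include <linux/errno.h>

/*
 * Hypothetical helper illustrating the masking problem described above.
 * WARN_ONCE() keeps a static "already warned" flag per call site, so if
 * a kunit test deliberately trips this warning first, a later *real*
 * misuse of the same helper produces no backtrace at all.
 */
static int hypothetical_set_scaling(int scale)
{
	/* The splat is printed only for the first bad caller. */
	if (WARN_ONCE(scale <= 0, "invalid scaling factor %d\n", scale))
		return -EINVAL;

	return 0;
}
```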
On 9/25/24 06:05, Maxime Ripard wrote:
[ ... ]
>> We've had similar discussions in the past around unit tests that wasted
>> too much cpu time with randomized combinatorial testing, and we've thrown
>> those out too from drm. Test hygiene matters.
>
> We had that discussion because those tests could run for up to around a
> minute on certain platforms. It's not the same issue at all?
>

I added "kunit.stats_enabled=2 kunit.filter=speed>slow" to the kernel command
line in my tests to avoid long-running tests. That works as long as slow tests
are marked accordingly using KUNIT_CASE_SLOW(), KUNIT_SPEED_SLOW, or
KUNIT_SPEED_VERY_SLOW.

Guenter
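For reference, marking a test as slow so that a `speed>slow` filter skips it looks roughly like the sketch below. KUNIT_CASE_SLOW() is the real KUnit macro for tagging a test case's speed attribute; the suite name and test bodies here are made-up placeholders.

```c
#include <kunit/test.h>
#include <linux/module.h>

static void example_fast_test(struct kunit *test)
{
	KUNIT_EXPECT_EQ(test, 2 + 2, 4);
}

static void example_slow_test(struct kunit *test)
{
	/* Stand-in for something that walks a large parameter space. */
	KUNIT_EXPECT_TRUE(test, true);
}

static struct kunit_case example_test_cases[] = {
	KUNIT_CASE(example_fast_test),
	/* Tagged slow, so "kunit.filter=speed>slow" will skip it. */
	KUNIT_CASE_SLOW(example_slow_test),
	{ }
};

static struct kunit_suite example_test_suite = {
	.name = "example-speed",
	.test_cases = example_test_cases,
};
kunit_test_suite(example_test_suite);

MODULE_DESCRIPTION("Illustrative KUNIT_CASE_SLOW() usage");
MODULE_LICENSE("GPL");
```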
On Wed, Sep 25, 2024 at 09:04:39AM GMT, Guenter Roeck wrote:
> On 9/25/24 06:05, Maxime Ripard wrote:
> [ ... ]
> > > We've had similar discussions in the past around unit tests that wasted
> > > too much cpu time with randomized combinatorial testing, and we've thrown
> > > those out too from drm. Test hygiene matters.
> >
> > We had that discussion because those tests could run for up to around a
> > minute on certain platforms. It's not the same issue at all?
> >
>
> I added "kunit.stats_enabled=2 kunit.filter=speed>slow" to the kernel command
> line in my tests to avoid long-running tests. That works as long as slow tests
> are marked accordingly using KUNIT_CASE_SLOW(), KUNIT_SPEED_SLOW, or
> KUNIT_SPEED_VERY_SLOW.

Yeah, and I've added a warning some time ago if a test runs for too long
but isn't flagged as slow.

Maxime
On Wed, 25 Sep 2024, Maxime Ripard <mripard@kernel.org> wrote:
> On Wed, Sep 25, 2024 at 01:52:02PM GMT, Simona Vetter wrote:
>> I think for at least drm the consensus is clear, we won't have kunit tests
>> that splat.
>
> Where was that ever discussed?

Well, where was it ever agreed that it's okay for drm kunit tests to
emit warnings? :p

>> Personally I really don't see the point of unit tests that are somewhere
>> between unnecessarily hard or outright too much pain to deploy in a test
>> rig: Either you don't run them (not great), or you filter splats and might
>> filter too much (not great either), or you filter as little as possible and
>> fight false positives (also kinda suboptimal).
>
> Or you don't do any of that, and just rely on the canonical way to run
> kunit tests and trust it's going to pass tests that do indeed pass, and
> fail / warn on those that don't.

That still doesn't address code being tested emitting *unexpected*
warnings.

BR,
Jani.
On Wed, Oct 02, 2024 at 03:10:14PM GMT, Jani Nikula wrote:
> On Wed, 25 Sep 2024, Maxime Ripard <mripard@kernel.org> wrote:
> > On Wed, Sep 25, 2024 at 01:52:02PM GMT, Simona Vetter wrote:
> >> I think for at least drm the consensus is clear, we won't have kunit tests
> >> that splat.
> >
> > Where was that ever discussed?
>
> Well, where was it ever agreed that it's okay for drm kunit tests to
> emit warnings? :p

One policy is upstream, the other isn't. And it does seem like the
consensus towards downstream patches and policies is pretty well
established. And I wasn't the one claiming that there's a consensus to
begin with, so I don't see why I should prove anything?

> >> Personally I really don't see the point of unit tests that are somewhere
> >> between unnecessarily hard or outright too much pain to deploy in a test
> >> rig: Either you don't run them (not great), or you filter splats and might
> >> filter too much (not great either), or you filter as little as possible and
> >> fight false positives (also kinda suboptimal).
> >
> > Or you don't do any of that, and just rely on the canonical way to run
> > kunit tests and trust it's going to pass tests that do indeed pass, and
> > fail / warn on those that don't.
>
> That still doesn't address code being tested emitting *unexpected*
> warnings.

Then make kunit report a warning / failure when there's an unexpected
warning splat.

At the end of the day, the discussion is that we already require each
committer to:

- Apply patches
- Check for checkpatch issues
- Solve potential conflicts
- Make sure it compiles on three different architectures, with huge
  defconfigs
- Make sure the unit tests still pass

I'm not going to add "and grep through the output of those 1000-ish tests
for a warning" to that list.

Maxime