
[v2,12/17] kunit: tool: add Python wrappers for running KUnit tests

Message ID 20190501230126.229218-13-brendanhiggins@google.com (mailing list archive)
State New, archived
Series kunit: introduce KUnit, the Linux kernel unit testing framework

Commit Message

Brendan Higgins May 1, 2019, 11:01 p.m. UTC
From: Felix Guo <felixguoxiuping@gmail.com>

The ultimate goal is to create minimal isolated test binaries; in the
meantime we are using UML to provide the infrastructure to run tests, so
define an abstract way to configure and run tests that allows us to
change the context in which tests are built without affecting the user.
This also makes pretty and dynamic error reporting and a lot of other
nice features easier.

kunit_config.py:
  - parse .config and Kconfig files.

kunit_kernel.py: provides helper functions to:
  - configure the kernel using kunitconfig.
  - build the kernel with the appropriate configuration.
  - invoke the kernel and stream the output back.

Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
---
 tools/testing/kunit/.gitignore      |   3 +
 tools/testing/kunit/kunit.py        |  78 +++++++++++++++
 tools/testing/kunit/kunit_config.py |  66 +++++++++++++
 tools/testing/kunit/kunit_kernel.py | 148 ++++++++++++++++++++++++++++
 tools/testing/kunit/kunit_parser.py | 119 ++++++++++++++++++++++
 5 files changed, 414 insertions(+)
 create mode 100644 tools/testing/kunit/.gitignore
 create mode 100755 tools/testing/kunit/kunit.py
 create mode 100644 tools/testing/kunit/kunit_config.py
 create mode 100644 tools/testing/kunit/kunit_kernel.py
 create mode 100644 tools/testing/kunit/kunit_parser.py
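
For a sense of the kind of logic the commit message describes, here is a
minimal sketch; the function names are invented for illustration, and the
ARCH=um build plus the ./linux binary reflect how UML kernels are normally
built and booted, not code taken from this patch:

  import re
  import subprocess

  CONFIG_SET = re.compile(r'^CONFIG_(\w+)=(\S+)$')
  CONFIG_NOT_SET = re.compile(r'^# CONFIG_(\w+) is not set$')

  def parse_config(lines):
      """Parse .config/kunitconfig-style lines into an {option: value} dict."""
      entries = {}
      for line in lines:
          line = line.strip()
          m = CONFIG_SET.match(line)
          if m:
              entries[m.group(1)] = m.group(2)
              continue
          m = CONFIG_NOT_SET.match(line)
          if m:
              entries[m.group(1)] = 'n'
      return entries

  def config_is_subset(kunitconfig, dotconfig):
      """Check that every option requested in the kunitconfig survived
      Kconfig dependency resolution into the generated .config."""
      return all(dotconfig.get(k) == v for k, v in kunitconfig.items())

  def build_kernel(jobs=8):
      """Build the UML kernel that hosts the tests."""
      subprocess.check_call(['make', 'ARCH=um', '-j', str(jobs)])

  def run_kernel():
      """Boot the UML binary and stream its console output back."""
      proc = subprocess.Popen(['./linux'], stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True)
      yield from proc.stdout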

Comments

Greg Kroah-Hartman May 2, 2019, 11:02 a.m. UTC | #1
On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> From: Felix Guo <felixguoxiuping@gmail.com>
> 
> The ultimate goal is to create minimal isolated test binaries; in the
> meantime we are using UML to provide the infrastructure to run tests, so
> define an abstract way to configure and run tests that allow us to
> change the context in which tests are built without affecting the user.
> This also makes pretty and dynamic error reporting, and a lot of other
> nice features easier.
> 
> kunit_config.py:
>   - parse .config and Kconfig files.
> 
> kunit_kernel.py: provides helper functions to:
>   - configure the kernel using kunitconfig.
>   - build the kernel with the appropriate configuration.
>   - provide function to invoke the kernel and stream the output back.
> 
> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>

Ah, here's probably my answer to my previous logging format question,
right?  What's the chance that these wrappers output stuff in a standard
format that test-framework-tools can already parse?  :)

thanks,

greg k-h
Brendan Higgins May 2, 2019, 6:07 p.m. UTC | #2
On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> > From: Felix Guo <felixguoxiuping@gmail.com>
> >
> > The ultimate goal is to create minimal isolated test binaries; in the
> > meantime we are using UML to provide the infrastructure to run tests, so
> > define an abstract way to configure and run tests that allow us to
> > change the context in which tests are built without affecting the user.
> > This also makes pretty and dynamic error reporting, and a lot of other
> > nice features easier.
> >
> > kunit_config.py:
> >   - parse .config and Kconfig files.
> >
> > kunit_kernel.py: provides helper functions to:
> >   - configure the kernel using kunitconfig.
> >   - build the kernel with the appropriate configuration.
> >   - provide function to invoke the kernel and stream the output back.
> >
> > Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> > Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
>
> Ah, here's probably my answer to my previous logging format question,
> right?  What's the chance that these wrappers output stuff in a standard
> format that test-framework-tools can already parse?  :)

It should be pretty easy to do. I had some patches that pack up the
results into a serialized format for a presubmit service; it should be
pretty straightforward to take the same logic and just change the
output format.
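
As an illustration of that separation, results can be packed into a neutral
structure with the serializer swapped in at the very end (all names here are
hypothetical, not from the patches Brendan mentions):

  import json
  from dataclasses import dataclass, asdict
  from typing import List

  @dataclass
  class TestResult:
      name: str
      passed: bool
      log: List[str]

  def to_json(results: List[TestResult]) -> str:
      """One possible serializer, e.g. for a presubmit service."""
      return json.dumps([asdict(r) for r in results], indent=2)

  # Changing the output format then means swapping only the serializer
  # (say, to_json -> to_tap13); the collection logic stays the same.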

Cheers
Frank Rowand May 2, 2019, 9:16 p.m. UTC | #3
On 5/2/19 11:07 AM, Brendan Higgins wrote:
> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>>
>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
>>> From: Felix Guo <felixguoxiuping@gmail.com>
>>>
>>> The ultimate goal is to create minimal isolated test binaries; in the
>>> meantime we are using UML to provide the infrastructure to run tests, so
>>> define an abstract way to configure and run tests that allow us to
>>> change the context in which tests are built without affecting the user.
>>> This also makes pretty and dynamic error reporting, and a lot of other
>>> nice features easier.
>>>
>>> kunit_config.py:
>>>   - parse .config and Kconfig files.
>>>
>>> kunit_kernel.py: provides helper functions to:
>>>   - configure the kernel using kunitconfig.
>>>   - build the kernel with the appropriate configuration.
>>>   - provide function to invoke the kernel and stream the output back.
>>>
>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
>>
>> Ah, here's probably my answer to my previous logging format question,
>> right?  What's the chance that these wrappers output stuff in a standard
>> format that test-framework-tools can already parse?  :)
> 
> It should be pretty easy to do. I had some patches that pack up the
> results into a serialized format for a presubmit service; it should be
> pretty straightforward to take the same logic and just change the
> output format.

When examining and trying out the previous versions of the patch I found
the wrappers useful to provide information about how to control and use
the tests, but I had no interest in using the scripts as they do not
fit in with my personal environment and workflow.

In the previous versions of the patch, these helper scripts are optional,
which is good for my use case.  If the helper scripts are required to
get the data into the proper format then the scripts are not quite so
optional, they become the expected environment.  I think the proper
format should exist without the helper scripts.

-Frank
Brendan Higgins May 2, 2019, 11:45 p.m. UTC | #4
On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 5/2/19 11:07 AM, Brendan Higgins wrote:
> > On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >>
> >> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> >>> From: Felix Guo <felixguoxiuping@gmail.com>
> >>>
> >>> The ultimate goal is to create minimal isolated test binaries; in the
> >>> meantime we are using UML to provide the infrastructure to run tests, so
> >>> define an abstract way to configure and run tests that allow us to
> >>> change the context in which tests are built without affecting the user.
> >>> This also makes pretty and dynamic error reporting, and a lot of other
> >>> nice features easier.
> >>>
> >>> kunit_config.py:
> >>>   - parse .config and Kconfig files.
> >>>
> >>> kunit_kernel.py: provides helper functions to:
> >>>   - configure the kernel using kunitconfig.
> >>>   - build the kernel with the appropriate configuration.
> >>>   - provide function to invoke the kernel and stream the output back.
> >>>
> >>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> >>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
> >>
> >> Ah, here's probably my answer to my previous logging format question,
> >> right?  What's the chance that these wrappers output stuff in a standard
> >> format that test-framework-tools can already parse?  :)

To be clear, the test-framework-tools format we are talking about is
TAP13[1], correct?

My understanding is that this is what kselftest is being converted to use.

> >
> > It should be pretty easy to do. I had some patches that pack up the
> > results into a serialized format for a presubmit service; it should be
> > pretty straightforward to take the same logic and just change the
> > output format.
>
> When examining and trying out the previous versions of the patch I found
> the wrappers useful to provide information about how to control and use
> the tests, but I had no interest in using the scripts as they do not
> fit in with my personal environment and workflow.
>
> In the previous versions of the patch, these helper scripts are optional,
> which is good for my use case.  If the helper scripts are required to

They are still optional.

> get the data into the proper format then the scripts are not quite so
> optional, they become the expected environment.  I think the proper
> format should exist without the helper scripts.

That's a good point. A couple things,

First off, supporting TAP13, either in the kernel or the wrapper
script is not hard, but I don't think that is the real issue that you
raise.

If your only concern is that you will always be able to have human
readable KUnit results printed to the kernel log, that is a guarantee
I feel comfortable making. Beyond that, I think it is going to take a
long while before I would feel comfortable guaranteeing anything about
how KUnit will work, what kind of data it will want to expose, and how
it will be organized. I think the wrapper script provides a nice
facade that I can maintain, can mediate between the implementation
details and the user, and can mediate between the implementation
details and other pieces of software that might want to consume
results.

[1] https://testanything.org/tap-version-13-specification.html
Frank Rowand May 3, 2019, 1:45 a.m. UTC | #5
On 5/2/19 4:45 PM, Brendan Higgins wrote:
> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 5/2/19 11:07 AM, Brendan Higgins wrote:
>>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>>>>
>>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
>>>>> From: Felix Guo <felixguoxiuping@gmail.com>
>>>>>
>>>>> The ultimate goal is to create minimal isolated test binaries; in the
>>>>> meantime we are using UML to provide the infrastructure to run tests, so
>>>>> define an abstract way to configure and run tests that allow us to
>>>>> change the context in which tests are built without affecting the user.
>>>>> This also makes pretty and dynamic error reporting, and a lot of other
>>>>> nice features easier.
>>>>>
>>>>> kunit_config.py:
>>>>>   - parse .config and Kconfig files.
>>>>>
>>>>> kunit_kernel.py: provides helper functions to:
>>>>>   - configure the kernel using kunitconfig.
>>>>>   - build the kernel with the appropriate configuration.
>>>>>   - provide function to invoke the kernel and stream the output back.
>>>>>
>>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
>>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
>>>>
>>>> Ah, here's probably my answer to my previous logging format question,
>>>> right?  What's the chance that these wrappers output stuff in a standard
>>>> format that test-framework-tools can already parse?  :)
> 
> To be clear, the test-framework-tools format we are talking about is
> TAP13[1], correct?

I'm not sure what the test community prefers for a format.  I'll let them
jump in and debate that question.


> 
> My understanding is that is what kselftest is being converted to use.
> 
>>>
>>> It should be pretty easy to do. I had some patches that pack up the
>>> results into a serialized format for a presubmit service; it should be
>>> pretty straightforward to take the same logic and just change the
>>> output format.
>>
>> When examining and trying out the previous versions of the patch I found
>> the wrappers useful to provide information about how to control and use
>> the tests, but I had no interest in using the scripts as they do not
>> fit in with my personal environment and workflow.
>>
>> In the previous versions of the patch, these helper scripts are optional,
>> which is good for my use case.  If the helper scripts are required to
> 
> They are still optional.
> 
>> get the data into the proper format then the scripts are not quite so
>> optional, they become the expected environment.  I think the proper
>> format should exist without the helper scripts.
> 
> That's a good point. A couple things,
> 
> First off, supporting TAP13, either in the kernel or the wrapper
> script is not hard, but I don't think that is the real issue that you
> raise.
> 
> If your only concern is that you will always be able to have human
> readable KUnit results printed to the kernel log, that is a guarantee
> I feel comfortable making. Beyond that, I think it is going to take a
> long while before I would feel comfortable guaranteeing anything about
> how will KUnit work, what kind of data it will want to expose, and how
> it will be organized. I think the wrapper script provides a nice
> facade that I can maintain, can mediate between the implementation
> details and the user, and can mediate between the implementation
> details and other pieces of software that might want to consume
> results.
> 
> [1] https://testanything.org/tap-version-13-specification.html

My concern is based on a focus on my little part of the world
(which in _previous_ versions of the patch series was the devicetree
unittest.c tests being converted to use the kunit infrastructure).
If I step back and think of the entire kernel globally I may end
up with a different conclusion - but I'm going to remain myopic
for this email.

I want the test results to be usable by me and my fellow
developers.  I prefer that the test results be easily accessible
(current printk() implementation means that kunit messages are
just as accessible as the current unittest.c printk() output).
If the printk() output needs to be filtered through a script
to generate the actual test results then that is sub-optimal
to me.  It is one more step added to my workflow.  And
potentially with an embedded target a major pain to get a
data file (the kernel log file) transferred from a target
to my development host.

I want a reported test failure to be easy to trace back to the
point in the source where the failure is reported.  With printk()
the search is a simple grep for the failure message.  If the
failure message has been processed by a script, and then the
failure reported to me in an email, then I may have to look
at the script to reverse engineer how the original failure
message was transformed into the message that was reported
to me in the email.  Then I search for the point in the
source where the failure is reported.  So a basic task has
just become more difficult and time consuming.

-Frank
Brendan Higgins May 3, 2019, 5:36 a.m. UTC | #6
On Thu, May 2, 2019 at 6:45 PM Frank Rowand <frowand.list@gmail.com> wrote:
>
> On 5/2/19 4:45 PM, Brendan Higgins wrote:
> > On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 5/2/19 11:07 AM, Brendan Higgins wrote:
> >>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >>>>
> >>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> >>>>> From: Felix Guo <felixguoxiuping@gmail.com>
> >>>>>
> >>>>> The ultimate goal is to create minimal isolated test binaries; in the
> >>>>> meantime we are using UML to provide the infrastructure to run tests, so
> >>>>> define an abstract way to configure and run tests that allow us to
> >>>>> change the context in which tests are built without affecting the user.
> >>>>> This also makes pretty and dynamic error reporting, and a lot of other
> >>>>> nice features easier.
> >>>>>
> >>>>> kunit_config.py:
> >>>>>   - parse .config and Kconfig files.
> >>>>>
> >>>>> kunit_kernel.py: provides helper functions to:
> >>>>>   - configure the kernel using kunitconfig.
> >>>>>   - build the kernel with the appropriate configuration.
> >>>>>   - provide function to invoke the kernel and stream the output back.
> >>>>>
> >>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> >>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
> >>>>
> >>>> Ah, here's probably my answer to my previous logging format question,
> >>>> right?  What's the chance that these wrappers output stuff in a standard
> >>>> format that test-framework-tools can already parse?  :)
> >
> > To be clear, the test-framework-tools format we are talking about is
> > TAP13[1], correct?
>
> I'm not sure what the test community prefers for a format.  I'll let them
> jump in and debate that question.
>
>
> >
> > My understanding is that is what kselftest is being converted to use.
> >
> >>>
> >>> It should be pretty easy to do. I had some patches that pack up the
> >>> results into a serialized format for a presubmit service; it should be
> >>> pretty straightforward to take the same logic and just change the
> >>> output format.
> >>
> >> When examining and trying out the previous versions of the patch I found
> >> the wrappers useful to provide information about how to control and use
> >> the tests, but I had no interest in using the scripts as they do not
> >> fit in with my personal environment and workflow.
> >>
> >> In the previous versions of the patch, these helper scripts are optional,
> >> which is good for my use case.  If the helper scripts are required to
> >
> > They are still optional.
> >
> >> get the data into the proper format then the scripts are not quite so
> >> optional, they become the expected environment.  I think the proper
> >> format should exist without the helper scripts.
> >
> > That's a good point. A couple things,
> >
> > First off, supporting TAP13, either in the kernel or the wrapper
> > script is not hard, but I don't think that is the real issue that you
> > raise.
> >
> > If your only concern is that you will always be able to have human
> > readable KUnit results printed to the kernel log, that is a guarantee
> > I feel comfortable making. Beyond that, I think it is going to take a
> > long while before I would feel comfortable guaranteeing anything about
> > how will KUnit work, what kind of data it will want to expose, and how
> > it will be organized. I think the wrapper script provides a nice
> > facade that I can maintain, can mediate between the implementation
> > details and the user, and can mediate between the implementation
> > details and other pieces of software that might want to consume
> > results.
> >
> > [1] https://testanything.org/tap-version-13-specification.html
>
> My concern is based on a focus on my little part of the world
> (which in _previous_ versions of the patch series was the devicetree
> unittest.c tests being converted to use the kunit infrastructure).
> If I step back and think of the entire kernel globally I may end
> up with a different conclusion - but I'm going to remain myopic
> for this email.
>
> I want the test results to be usable by me and my fellow
> developers.  I prefer that the test results be easily accessible
> (current printk() implementation means that kunit messages are
> just as accessible as the current unittest.c printk() output).
> If the printk() output needs to be filtered through a script
> to generate the actual test results then that is sub-optimal
> to me.  It is one more step added to my workflow.  And
> potentially with an embedded target a major pain to get a
> data file (the kernel log file) transferred from a target
> to my development host.

That's fair. If that is indeed your only concern, then I don't think
the wrapper script will ever be an issue for you. You will always be
able to execute a given test the old fashioned/manual way, and the
wrapper script only summarizes results, it does not change the
contents.

>
> I want a reported test failure to be easy to trace back to the
> point in the source where the failure is reported.  With printk()
> the search is a simple grep for the failure message.  If the
> failure message has been processed by a script, and then the
> failure reported to me in an email, then I may have to look
> at the script to reverse engineer how the original failure
> message was transformed into the message that was reported
> to me in the email.  Then I search for the point in the
> source where the failure is reported.  So a basic task has
> just become more difficult and time consuming.

That seems to be a valid concern. I would reiterate that you shouldn't
be concerned by any processing done by the wrapper script itself, but
the reality is that depending on what happens with automated
testing/presubmit/CI other people might end up parsing and
transforming test results - it might happen, it might not. I currently
have a CI system set up for KUnit on my public repo that I don't think
you would be offended by, but I don't know what we are going to do
when it comes time to integrate with existing upstream CI systems.

In any case, I don't think that either sticking with or doing away with
the wrapper script is going to have any long term bearing on what
happens in this regard.

Cheers
Greg Kroah-Hartman May 3, 2019, 6:41 a.m. UTC | #7
On Thu, May 02, 2019 at 04:45:29PM -0700, Brendan Higgins wrote:
> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >
> > On 5/2/19 11:07 AM, Brendan Higgins wrote:
> > > On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >>
> > >> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> > >>> From: Felix Guo <felixguoxiuping@gmail.com>
> > >>>
> > >>> The ultimate goal is to create minimal isolated test binaries; in the
> > >>> meantime we are using UML to provide the infrastructure to run tests, so
> > >>> define an abstract way to configure and run tests that allow us to
> > >>> change the context in which tests are built without affecting the user.
> > >>> This also makes pretty and dynamic error reporting, and a lot of other
> > >>> nice features easier.
> > >>>
> > >>> kunit_config.py:
> > >>>   - parse .config and Kconfig files.
> > >>>
> > >>> kunit_kernel.py: provides helper functions to:
> > >>>   - configure the kernel using kunitconfig.
> > >>>   - build the kernel with the appropriate configuration.
> > >>>   - provide function to invoke the kernel and stream the output back.
> > >>>
> > >>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> > >>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
> > >>
> > >> Ah, here's probably my answer to my previous logging format question,
> > >> right?  What's the chance that these wrappers output stuff in a standard
> > >> format that test-framework-tools can already parse?  :)
> 
> To be clear, the test-framework-tools format we are talking about is
> TAP13[1], correct?

Yes.

> My understanding is that is what kselftest is being converted to use.

Yes, and I think it's almost done.  The core of kselftest provides
functions that all tests can use to log messages in the correct format.

The core of kunit should also log the messages in this format as well,
and not rely on the helper scripts as Frank points out, not everyone
will use/want them.  Might as well make it easy for everyone to always
do the right thing and not force it to always be added in later.

thanks,

greg k-h
Frank Rowand May 3, 2019, 6:59 p.m. UTC | #8
On 5/2/19 10:36 PM, Brendan Higgins wrote:
> On Thu, May 2, 2019 at 6:45 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>
>> On 5/2/19 4:45 PM, Brendan Higgins wrote:
>>> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>>>
>>>> On 5/2/19 11:07 AM, Brendan Higgins wrote:
>>>>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>>>>>>
>>>>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
>>>>>>> From: Felix Guo <felixguoxiuping@gmail.com>
>>>>>>>
>>>>>>> The ultimate goal is to create minimal isolated test binaries; in the
>>>>>>> meantime we are using UML to provide the infrastructure to run tests, so
>>>>>>> define an abstract way to configure and run tests that allow us to
>>>>>>> change the context in which tests are built without affecting the user.
>>>>>>> This also makes pretty and dynamic error reporting, and a lot of other
>>>>>>> nice features easier.
>>>>>>>
>>>>>>> kunit_config.py:
>>>>>>>   - parse .config and Kconfig files.
>>>>>>>
>>>>>>> kunit_kernel.py: provides helper functions to:
>>>>>>>   - configure the kernel using kunitconfig.
>>>>>>>   - build the kernel with the appropriate configuration.
>>>>>>>   - provide function to invoke the kernel and stream the output back.
>>>>>>>
>>>>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
>>>>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
>>>>>>
>>>>>> Ah, here's probably my answer to my previous logging format question,
>>>>>> right?  What's the chance that these wrappers output stuff in a standard
>>>>>> format that test-framework-tools can already parse?  :)
>>>
>>> To be clear, the test-framework-tools format we are talking about is
>>> TAP13[1], correct?
>>
>> I'm not sure what the test community prefers for a format.  I'll let them
>> jump in and debate that question.
>>
>>
>>>
>>> My understanding is that is what kselftest is being converted to use.
>>>
>>>>>
>>>>> It should be pretty easy to do. I had some patches that pack up the
>>>>> results into a serialized format for a presubmit service; it should be
>>>>> pretty straightforward to take the same logic and just change the
>>>>> output format.
>>>>
>>>> When examining and trying out the previous versions of the patch I found
>>>> the wrappers useful to provide information about how to control and use
>>>> the tests, but I had no interest in using the scripts as they do not
>>>> fit in with my personal environment and workflow.
>>>>
>>>> In the previous versions of the patch, these helper scripts are optional,
>>>> which is good for my use case.  If the helper scripts are required to
>>>
>>> They are still optional.
>>>
>>>> get the data into the proper format then the scripts are not quite so
>>>> optional, they become the expected environment.  I think the proper
>>>> format should exist without the helper scripts.
>>>
>>> That's a good point. A couple things,
>>>
>>> First off, supporting TAP13, either in the kernel or the wrapper
>>> script is not hard, but I don't think that is the real issue that you
>>> raise.
>>>
>>> If your only concern is that you will always be able to have human
>>> readable KUnit results printed to the kernel log, that is a guarantee
>>> I feel comfortable making. Beyond that, I think it is going to take a
>>> long while before I would feel comfortable guaranteeing anything about
>>> how will KUnit work, what kind of data it will want to expose, and how
>>> it will be organized. I think the wrapper script provides a nice
>>> facade that I can maintain, can mediate between the implementation
>>> details and the user, and can mediate between the implementation
>>> details and other pieces of software that might want to consume
>>> results.
>>>
>>> [1] https://testanything.org/tap-version-13-specification.html
>>
>> My concern is based on a focus on my little part of the world
>> (which in _previous_ versions of the patch series was the devicetree
>> unittest.c tests being converted to use the kunit infrastructure).
>> If I step back and think of the entire kernel globally I may end
>> up with a different conclusion - but I'm going to remain myopic
>> for this email.
>>
>> I want the test results to be usable by me and my fellow
>> developers.  I prefer that the test results be easily accessible
>> (current printk() implementation means that kunit messages are
>> just as accessible as the current unittest.c printk() output).
>> If the printk() output needs to be filtered through a script
>> to generate the actual test results then that is sub-optimal
>> to me.  It is one more step added to my workflow.  And
>> potentially with an embedded target a major pain to get a
>> data file (the kernel log file) transferred from a target
>> to my development host.
> 
> That's fair. If that is indeed your only concern, then I don't think
> the wrapper script will ever be an issue for you. You will always be
> able to execute a given test the old fashioned/manual way, and the
> wrapper script only summarizes results, it does not change the
> contents.
> 
>>
>> I want a reported test failure to be easy to trace back to the
>> point in the source where the failure is reported.  With printk()
>> the search is a simple grep for the failure message.  If the
>> failure message has been processed by a script, and then the
>> failure reported to me in an email, then I may have to look
>> at the script to reverse engineer how the original failure
>> message was transformed into the message that was reported
>> to me in the email.  Then I search for the point in the
>> source where the failure is reported.  So a basic task has
>> just become more difficult and time consuming.
> 
> That seems to be a valid concern. I would reiterate that you shouldn't
> be concerned by any processing done by the wrapper script itself, but
> the reality is that depending on what happens with automated
> testing/presubmit/CI other people might end up parsing and
> transforming test results - it might happen, it might not.

You seem to be missing my point.

Greg asked that the output be in a standard format.

You replied that the standard format could be created by the wrapper script.

Now you say that "it might happen, it might not".  In other words the output
may or may not end up in the standard format.

As Greg points out in comments to patch 12:

  "The core of kunit should also log the messages in this format as well,
  and not rely on the helper scripts as Frank points out, not everyone
  will use/want them.  Might as well make it easy for everyone to always
  do the right thing and not force it to always be added in later."

I am requesting that the original message be in the standard format.  Of
course anyone is free to transform the messages in later processing, no
big deal.


> I currently
> have a CI system set up for KUnit on my public repo that I don't think
> you would be offended by, but I don't know what we are going to do
> when it comes time to integrate with existing upstream CI systems.
> 
> In anycase, I don't think that either sticking with or doing away with
> the wrapper script is going to have any long term bearing on what
> happens in this regard.
> 
> Cheers
>
Brendan Higgins May 3, 2019, 11:14 p.m. UTC | #9
> On 5/2/19 10:36 PM, Brendan Higgins wrote:
> > On Thu, May 2, 2019 at 6:45 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>
> >> On 5/2/19 4:45 PM, Brendan Higgins wrote:
> >>> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>
> >>>> On 5/2/19 11:07 AM, Brendan Higgins wrote:
> >>>>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >>>>>>
> >>>>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> >>>>>>> From: Felix Guo <felixguoxiuping@gmail.com>
> >>>>>>>
> >>>>>>> The ultimate goal is to create minimal isolated test binaries; in the
> >>>>>>> meantime we are using UML to provide the infrastructure to run tests, so
> >>>>>>> define an abstract way to configure and run tests that allow us to
> >>>>>>> change the context in which tests are built without affecting the user.
> >>>>>>> This also makes pretty and dynamic error reporting, and a lot of other
> >>>>>>> nice features easier.
> >>>>>>>
> >>>>>>> kunit_config.py:
> >>>>>>>   - parse .config and Kconfig files.
> >>>>>>>
> >>>>>>> kunit_kernel.py: provides helper functions to:
> >>>>>>>   - configure the kernel using kunitconfig.
> >>>>>>>   - build the kernel with the appropriate configuration.
> >>>>>>>   - provide function to invoke the kernel and stream the output back.
> >>>>>>>
> >>>>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> >>>>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
> >>>>>>
> >>>>>> Ah, here's probably my answer to my previous logging format question,
> >>>>>> right?  What's the chance that these wrappers output stuff in a standard
> >>>>>> format that test-framework-tools can already parse?  :)
> >>>
> >>> To be clear, the test-framework-tools format we are talking about is
> >>> TAP13[1], correct?
> >>
> >> I'm not sure what the test community prefers for a format.  I'll let them
> >> jump in and debate that question.
> >>
> >>
> >>>
> >>> My understanding is that is what kselftest is being converted to use.
> >>>
> >>>>>
> >>>>> It should be pretty easy to do. I had some patches that pack up the
> >>>>> results into a serialized format for a presubmit service; it should be
> >>>>> pretty straightforward to take the same logic and just change the
> >>>>> output format.
> >>>>
> >>>> When examining and trying out the previous versions of the patch I found
> >>>> the wrappers useful to provide information about how to control and use
> >>>> the tests, but I had no interest in using the scripts as they do not
> >>>> fit in with my personal environment and workflow.
> >>>>
> >>>> In the previous versions of the patch, these helper scripts are optional,
> >>>> which is good for my use case.  If the helper scripts are required to
> >>>
> >>> They are still optional.
> >>>
> >>>> get the data into the proper format then the scripts are not quite so
> >>>> optional, they become the expected environment.  I think the proper
> >>>> format should exist without the helper scripts.
> >>>
> >>> That's a good point. A couple things,
> >>>
> >>> First off, supporting TAP13, either in the kernel or the wrapper
> >>> script is not hard, but I don't think that is the real issue that you
> >>> raise.
> >>>
> >>> If your only concern is that you will always be able to have human
> >>> readable KUnit results printed to the kernel log, that is a guarantee
> >>> I feel comfortable making. Beyond that, I think it is going to take a
> >>> long while before I would feel comfortable guaranteeing anything about
> >>> how will KUnit work, what kind of data it will want to expose, and how
> >>> it will be organized. I think the wrapper script provides a nice
> >>> facade that I can maintain, can mediate between the implementation
> >>> details and the user, and can mediate between the implementation
> >>> details and other pieces of software that might want to consume
> >>> results.
> >>>
> >>> [1] https://testanything.org/tap-version-13-specification.html
> >>
> >> My concern is based on a focus on my little part of the world
> >> (which in _previous_ versions of the patch series was the devicetree
> >> unittest.c tests being converted to use the kunit infrastructure).
> >> If I step back and think of the entire kernel globally I may end
> >> up with a different conclusion - but I'm going to remain myopic
> >> for this email.
> >>
> >> I want the test results to be usable by me and my fellow
> >> developers.  I prefer that the test results be easily accessible
> >> (current printk() implementation means that kunit messages are
> >> just as accessible as the current unittest.c printk() output).
> >> If the printk() output needs to be filtered through a script
> >> to generate the actual test results then that is sub-optimal
> >> to me.  It is one more step added to my workflow.  And
> >> potentially with an embedded target a major pain to get a
> >> data file (the kernel log file) transferred from a target
> >> to my development host.
> >
> > That's fair. If that is indeed your only concern, then I don't think
> > the wrapper script will ever be an issue for you. You will always be
> > able to execute a given test the old fashioned/manual way, and the
> > wrapper script only summarizes results, it does not change the
> > contents.
> >
> >>
> >> I want a reported test failure to be easy to trace back to the
> >> point in the source where the failure is reported.  With printk()
> >> the search is a simple grep for the failure message.  If the
> >> failure message has been processed by a script, and then the
> >> failure reported to me in an email, then I may have to look
> >> at the script to reverse engineer how the original failure
> >> message was transformed into the message that was reported
> >> to me in the email.  Then I search for the point in the
> >> source where the failure is reported.  So a basic task has
> >> just become more difficult and time consuming.
> >
> > That seems to be a valid concern. I would reiterate that you shouldn't
> > be concerned by any processing done by the wrapper script itself, but
> > the reality is that depending on what happens with automated
> > testing/presubmit/CI other people might end up parsing and
> > transforming test results - it might happen, it might not.
>
> You seem to be missing my point.
>
> Greg asked that the output be in a standard format.
>
> You replied that the standard format could be created by the wrapper script.

I thought Greg originally meant that that is how it could be done when
he first commented on this patch, so I was agreeing and elaborating.
Nevertheless, it seems you and Greg are now in agreement on this
point, so I won't argue it further.

>
> Now you say that "it might happen, it might not".  In other words the output
> may or may not end up in the standard format.

Sorry, that was in reference to your concern about getting an email in
a different format than what the tool that you use generates. It
wasn't a statement about what I was or wasn't going to do in regards
to supporting a standard format.

>
> As Greg points out in comments to patch 12:
>
>   "The core of kunit should also log the messages in this format as well,
>   and not rely on the helper scripts as Frank points out, not everyone
>   will use/want them.  Might as well make it easy for everyone to always
>   do the right thing and not force it to always be added in later."
>
> I am requesting that the original message be in the standard format.  Of
> course anyone is free to transform the messages in later processing, no
> big deal.

My mistake, I thought that was a concern of yours.

In any case, it sounds like you and Greg are in agreement on the core
libraries generating the output in TAP13, so I won't argue that point
further.

## Analysis of using TAP13

One of my earlier concerns was that TAP13 is a bit overconstrained
for what I would like to output from the KUnit core. It only allows
data to be output as one of:
 - test number
 - ok/not ok with single line description
 - directive
 - diagnostics
 - YAML block
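
Concretely, a minimal TAP13 stream exercising each of these elements might
look like this (test names invented):

  TAP version 13
  1..3
  ok 1 - example_parse_test
  ok 2 - example_uml_test # SKIP UML-only test
  # example_merge_test: merging duplicate entries
  not ok 3 - example_merge_test
    ---
    message: merge produced an unexpected entry
    ...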

The test number must come before a set of ok/not ok lines, and does
not contain any additional information. One annoying thing about this
is it doesn't provide any kind of nesting or grouping.

There is one ok/not ok line per test and it may have a short
description of the test immediately after 'ok' or 'not ok'; this is
problematic because it wants the first thing you say about a test to
be after you know whether it passes or not.

Directives are just a way to specify skipped tests and TODOs.

Diagnostics seem useful; it looks like you can put whatever
information in them and print them out at any time. It looks like a lot
of kselftests emit a lot of data this way.

The YAML block seems to be the way that they prefer users to emit data
beyond number of tests run and whether a test passed or failed. I
could express most things I want to express in terms of YAML, but it
is not the nicest format for displaying a lot of data like
expectations, missed function calls, and other things which have a
natural, concise representation. Nevertheless, YAML readability is
mostly a problem for those who won't be using the wrapper scripts. My biggest
problem with the YAML block is that you can only have one, and TAP
specifies that it must come after the corresponding ok/not ok line,
which again has the issue that you have to hold on to a lot of
diagnostic data longer than you ideally would. Another downside is
that I now have to write a YAML serializer for the kernel.

## Here is what I propose for this patchset:

 - Print out the test number range at the beginning of each test suite.
 - Print out log lines as soon as they happen as diagnostics.
 - Print out the lines that state whether a test passes or fails as an
ok/not ok line.

This would be technically conforming with TAP13 and is consistent with
what some kselftests have done.
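
Sketched as code, that layout would come out roughly as follows (a
hypothetical emitter, not the patch's actual implementation):

  def report_suite(results):
      """Emit one suite: plan line first, log lines streamed as TAP
      diagnostics as they happen, then one verdict line per test."""
      print('TAP version 13')
      print(f'1..{len(results)}')            # test number range up front
      for num, (name, passed, log) in enumerate(results, start=1):
          for line in log:
              print(f'# {name}: {line}')     # diagnostics, not stashed
          verdict = 'ok' if passed else 'not ok'
          print(f'{verdict} {num} - {name}')

  # report_suite([('parse-test', True, ['parsed 12 entries'])]) prints:
  #   TAP version 13
  #   1..1
  #   # parse-test: parsed 12 entries
  #   ok 1 - parse-test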

## To be done in a future patchset:

Add a YAML serializer and print out some logs containing structured
data (like expectation failures, unexpected function calls, etc.) in
YAML blocks.

Does this sound reasonable? I will go ahead and start working on this,
but feel free to give me feedback on the overall idea in the meantime.

Cheers
Greg Kroah-Hartman May 4, 2019, 10:42 a.m. UTC | #10
On Fri, May 03, 2019 at 04:14:49PM -0700, Brendan Higgins wrote:
> In any case, it sounds like you and Greg are in agreement on the core
> libraries generating the output in TAP13, so I won't argue that point
> further.

Great!

> ## Analysis of using TAP13
> 
> One of my earlier concerns was that TAP13 is a bit over constrained
> for what I would like to output from the KUnit core. It only allows
> data to be output as either:
>  - test number
>  - ok/not ok with single line description
>  - directive
>  - diagnostics
>  - YAML block
> 
> The test number must become before a set of ok/not ok lines, and does
> not contain any additional information. One annoying thing about this
> is it doesn't provide any kind of nesting or grouping.

It should handle nesting just fine, I think we do that already today.

> There is one ok/not ok line per test and it may have a short
> description of the test immediately after 'ok' or 'not ok'; this is
> problematic because it wants the first thing you say about a test to
> be after you know whether it passes or not.

Take a look at the output of our current tests, I think you might find
it to be a bit more flexible than you think.

Also, this isn't our standard, we picked it because we needed a standard
that the tools of today already understand.  It might have issues and
other problems, but we are not in the business of writing test output
parsing tools, and we don't want to force everyone out there to write
custom parsers.  We want them to be able to use the tools they already
have so they can test the kernel, and to do so as easily as possible.

thanks,

greg k-h
Frank Rowand May 6, 2019, 12:19 a.m. UTC | #11
On 5/3/19 4:14 PM, Brendan Higgins wrote:
>> On 5/2/19 10:36 PM, Brendan Higgins wrote:
>>> On Thu, May 2, 2019 at 6:45 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>>>
>>>> On 5/2/19 4:45 PM, Brendan Higgins wrote:
>>>>> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
>>>>>>
>>>>>> On 5/2/19 11:07 AM, Brendan Higgins wrote:
>>>>>>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>>>>>>>>
>>>>>>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
>>>>>>>>> From: Felix Guo <felixguoxiuping@gmail.com>
>>>>>>>>>
>>>>>>>>> The ultimate goal is to create minimal isolated test binaries; in the
>>>>>>>>> meantime we are using UML to provide the infrastructure to run tests, so
>>>>>>>>> define an abstract way to configure and run tests that allow us to
>>>>>>>>> change the context in which tests are built without affecting the user.
>>>>>>>>> This also makes pretty and dynamic error reporting, and a lot of other
>>>>>>>>> nice features easier.
>>>>>>>>>
>>>>>>>>> kunit_config.py:
>>>>>>>>>   - parse .config and Kconfig files.
>>>>>>>>>
>>>>>>>>> kunit_kernel.py: provides helper functions to:
>>>>>>>>>   - configure the kernel using kunitconfig.
>>>>>>>>>   - build the kernel with the appropriate configuration.
>>>>>>>>>   - provide function to invoke the kernel and stream the output back.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
>>>>>>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
>>>>>>>>
>>>>>>>> Ah, here's probably my answer to my previous logging format question,
>>>>>>>> right?  What's the chance that these wrappers output stuff in a standard
>>>>>>>> format that test-framework-tools can already parse?  :)
>>>>>
>>>>> To be clear, the test-framework-tools format we are talking about is
>>>>> TAP13[1], correct?
>>>>
>>>> I'm not sure what the test community prefers for a format.  I'll let them
>>>> jump in and debate that question.
>>>>
>>>>
>>>>>
>>>>> My understanding is that is what kselftest is being converted to use.
>>>>>
>>>>>>>
>>>>>>> It should be pretty easy to do. I had some patches that pack up the
>>>>>>> results into a serialized format for a presubmit service; it should be
>>>>>>> pretty straightforward to take the same logic and just change the
>>>>>>> output format.
>>>>>>
>>>>>> When examining and trying out the previous versions of the patch I found
>>>>>> the wrappers useful to provide information about how to control and use
>>>>>> the tests, but I had no interest in using the scripts as they do not
>>>>>> fit in with my personal environment and workflow.
>>>>>>
>>>>>> In the previous versions of the patch, these helper scripts are optional,
>>>>>> which is good for my use case.  If the helper scripts are required to
>>>>>
>>>>> They are still optional.
>>>>>
>>>>>> get the data into the proper format then the scripts are not quite so
>>>>>> optional, they become the expected environment.  I think the proper
>>>>>> format should exist without the helper scripts.
>>>>>
>>>>> That's a good point. A couple things,
>>>>>
>>>>> First off, supporting TAP13, either in the kernel or the wrapper
>>>>> script is not hard, but I don't think that is the real issue that you
>>>>> raise.
>>>>>
>>>>> If your only concern is that you will always be able to have human
>>>>> readable KUnit results printed to the kernel log, that is a guarantee
>>>>> I feel comfortable making. Beyond that, I think it is going to take a
>>>>> long while before I would feel comfortable guaranteeing anything about
>>>>> how will KUnit work, what kind of data it will want to expose, and how
>>>>> it will be organized. I think the wrapper script provides a nice
>>>>> facade that I can maintain, can mediate between the implementation
>>>>> details and the user, and can mediate between the implementation
>>>>> details and other pieces of software that might want to consume
>>>>> results.
>>>>>
>>>>> [1] https://testanything.org/tap-version-13-specification.html
>>>>
>>>> My concern is based on a focus on my little part of the world
>>>> (which in _previous_ versions of the patch series was the devicetree
>>>> unittest.c tests being converted to use the kunit infrastructure).
>>>> If I step back and think of the entire kernel globally I may end
>>>> up with a different conclusion - but I'm going to remain myopic
>>>> for this email.
>>>>
>>>> I want the test results to be usable by me and my fellow
>>>> developers.  I prefer that the test results be easily accessible
>>>> (current printk() implementation means that kunit messages are
>>>> just as accessible as the current unittest.c printk() output).
>>>> If the printk() output needs to be filtered through a script
>>>> to generate the actual test results then that is sub-optimal
>>>> to me.  It is one more step added to my workflow.  And
>>>> potentially with an embedded target a major pain to get a
>>>> data file (the kernel log file) transferred from a target
>>>> to my development host.
>>>
>>> That's fair. If that is indeed your only concern, then I don't think
>>> the wrapper script will ever be an issue for you. You will always be
>>> able to execute a given test the old fashioned/manual way, and the
>>> wrapper script only summarizes results, it does not change the
>>> contents.
>>>
>>>>
>>>> I want a reported test failure to be easy to trace back to the
>>>> point in the source where the failure is reported.  With printk()
>>>> the search is a simple grep for the failure message.  If the
>>>> failure message has been processed by a script, and then the
>>>> failure reported to me in an email, then I may have to look
>>>> at the script to reverse engineer how the original failure
>>>> message was transformed into the message that was reported
>>>> to me in the email.  Then I search for the point in the
>>>> source where the failure is reported.  So a basic task has
>>>> just become more difficult and time consuming.
>>>
>>> That seems to be a valid concern. I would reiterate that you shouldn't
>>> be concerned by any processing done by the wrapper script itself, but
>>> the reality is that depending on what happens with automated
>>> testing/presubmit/CI other people might end up parsing and
>>> transforming test results - it might happen, it might not.
>>
>> You seem to be missing my point.
>>
>> Greg asked that the output be in a standard format.
>>
>> You replied that the standard format could be created by the wrapper script.
> 
> I thought Greg originally meant that that is how it could be done when
> he first commented on this patch, so I was agreeing and elaborating.
> Nevertheless, it seems you and Greg are now in agreement on this
> point, so I won't argue it further.
> 
>>
>> Now you say that "it might happen, it might not".  In other words the output
>> may or may not end up in the standard format.
> 
> Sorry, that was in reference to your concern about getting an email in
> a different format than what the tool that you use generates. It
> wasn't a statement about what I was or wasn't going to do in regards
> to supporting a standard format.
> 
>>
>> As Greg points out in comments to patch 12:
>>
>>   "The core of kunit should also log the messages in this format as well,
>>   and not rely on the helper scripts as Frank points out, not everyone
>>   will use/want them.  Might as well make it easy for everyone to always
>>   do the right thing and not force it to always be added in later."
>>
>> I am requesting that the original message be in the standard format.  Of
>> course anyone is free to transform the messages in later processing, no
>> big deal.
> 
> My mistake, I thought that was a concern of yours.
> 
> In any case, it sounds like you and Greg are in agreement on the core
> libraries generating the output in TAP13, so I won't argue that point
> further.
> 
> ## Analysis of using TAP13

I have never looked at TAP version 13 in any depth at all, so do not consider
me to be any sort of expert.

My entire TAP knowledge is based on:

  https://testanything.org/tap-version-13-specification.html

and the pull request to create the TAP version 14 specification:

   https://github.com/TestAnything/testanything.github.io/pull/36/files

You can see the full version 14 document in the submitter's repo:

  $ git clone https://github.com/isaacs/testanything.github.io.git
  $ cd testanything.github.io
  $ git checkout tap14
  $ ls tap-version-14-specification.md

My understanding is that the version 14 specification is not trying to
add new features, but instead capture what is already implemented in
the wild.


> One of my earlier concerns was that TAP13 is a bit over constrained
> for what I would like to output from the KUnit core. It only allows
> data to be output as either:
>  - test number
>  - ok/not ok with single line description
>  - directive
>  - diagnostics
>  - YAML block
> 
> The test number must become before a set of ok/not ok lines, and does
> not contain any additional information. One annoying thing about this
> is it doesn't provide any kind of nesting or grouping.

Greg's response mentions ktest (?) already does nesting.

Version 14 allows nesting through subtests.  I have not looked at what
ktest does, so I do not know if it uses subtest, or something else.
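
For reference, the draft v14 text nests a subtest as an indented child
stream reported under a parent test point, roughly like this (suite and
test names invented):

  TAP version 14
  1..1
  # Subtest: config-suite
      1..2
      ok 1 - parses entry
      ok 2 - rejects malformed entry
  ok 1 - config-suite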


> There is one ok/not ok line per test and it may have a short
> description of the test immediately after 'ok' or 'not ok'; this is
> problematic because it wants the first thing you say about a test to
> be after you know whether it passes or not.

I think you could output a diagnostic line that says a test is starting.
This is important to me because printk() errors and warnings that are
related to a test can be output by a subsystem other than the subsystem
that I am testing.  If there is no marker at the start of the test
then there is no way to attribute the printk()s to the test.


> Directives are just a way to specify skipped tests and TODOs.
> 
> Diagnostics seem useful, it looks like you can put whatever
> information in them and print them out at anytime. It looks like a lot
> of kselftests emit a lot of data this way.
> 
> The YAML block seems to be the way that they prefer users to emit data
> beyond number of tests run and whether a test passed or failed. I
> could express most things I want to express in terms of YAML, but it
> is not the nicest format for displaying a lot of data like
> expectations, missed function calls, and other things which have a
> natural concise representation. Nevertheless, YAML readability is
> mostly a problem who won't be using the wrapper scripts.

The examples in specification V13 and V14 look very simple and very
readable to me.  (And I am not a fan of YAML.)


> My biggest
> problem with the YAML block is that you can only have one, and TAP
> specifies that it must come after the corresponding ok/not ok line,
> which again has the issue that you have to hold on to a lot of
> diagnostic data longer than you ideally would. Another downside is
> that I now have to write a YAML serializer for the kernel.

If a test generates diagnostic data, then I would expect that to be
the direct result of a test failure.  So the test can output the
"not ok" line, then immediately output the YAML block.  I do not
see a need for stashing YAML output ahead of time.

If diagnostic data is generated before the test can determine
success or failure, then it can be output as diagnostic data
instead of stashing it for later.
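
Both points fit TAP13 without stashing anything. As a sketch (a hypothetical
driver, not KUnit code), a per-test runner could emit a start-of-test
diagnostic, then the verdict, then any YAML block immediately after a
failure:

  def run_and_report(num, name, test_fn):
      print(f'# {name}: test started')     # start marker: later kernel log
                                           # output can be tied to this test
      passed, details = test_fn()          # details: dict of failure data
      if passed:
          print(f'ok {num} - {name}')
          return
      print(f'not ok {num} - {name}')
      print('  ---')                       # YAML block opens directly after
      for key, value in details.items():   # the "not ok" line; nothing is
          print(f'  {key}: {value}')       # held back
      print('  ...')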


> ## Here is what I propose for this patchset:
> 
>  - Print out test number range at the beginning of each test suite.
>  - Print out log lines as soon as they happen as diagnostics.
>  - Print out the lines that state whether a test passes or fails as a
> ok/not ok line.
> 
> This would be technically conforming with TAP13 and is consistent with
> what some kselftests have done.
> 
> ## To be done in a future patchset:
> 
> Add a YAML serializer and print out some logs containing structured
> data (like expectation failures, unexpected function calls, etc) in
> YAML blocks.

A YAML serializer sounds like unneeded complexity.

> 
> Does this sound reasonable? I will go ahead and start working on this,
> but feel free to give me feedback on the overall idea in the meantime.
> 
> Cheers
>
Kees Cook May 6, 2019, 5:43 p.m. UTC | #12
On Sun, May 5, 2019 at 5:19 PM Frank Rowand <frowand.list@gmail.com> wrote:
> You can see the full version 14 document in the submitter's repo:
>
>   $ git clone https://github.com/isaacs/testanything.github.io.git
>   $ cd testanything.github.io
>   $ git checkout tap14
>   $ ls tap-version-14-specification.md
>
> My understanding is the the version 14 specification is not trying to
> add new features, but instead capture what is already implemented in
> the wild.

Oh! I didn't know about the work on TAP 14. I'll go read through this.

> > ## Here is what I propose for this patchset:
> >
> >  - Print out test number range at the beginning of each test suite.
> >  - Print out log lines as soon as they happen as diagnostics.
> >  - Print out the lines that state whether a test passes or fails as a
> > ok/not ok line.
> >
> > This would be technically conforming with TAP13 and is consistent with
> > what some kselftests have done.

This is what I fixed kselftest to actually do (it wasn't doing correct
TAP13), and Shuah is testing the series now:
https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/log/?h=ksft-tap-refactor

I'll go read TAP 14 now...
Brendan Higgins May 6, 2019, 9:39 p.m. UTC | #13
> On 5/3/19 4:14 PM, Brendan Higgins wrote:
> >> On 5/2/19 10:36 PM, Brendan Higgins wrote:
> >>> On Thu, May 2, 2019 at 6:45 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>
> >>>> On 5/2/19 4:45 PM, Brendan Higgins wrote:
> >>>>> On Thu, May 2, 2019 at 2:16 PM Frank Rowand <frowand.list@gmail.com> wrote:
> >>>>>>
> >>>>>> On 5/2/19 11:07 AM, Brendan Higgins wrote:
> >>>>>>> On Thu, May 2, 2019 at 4:02 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> >>>>>>>>
> >>>>>>>> On Wed, May 01, 2019 at 04:01:21PM -0700, Brendan Higgins wrote:
> >>>>>>>>> From: Felix Guo <felixguoxiuping@gmail.com>
> >>>>>>>>>
> >>>>>>>>> The ultimate goal is to create minimal isolated test binaries; in the
> >>>>>>>>> meantime we are using UML to provide the infrastructure to run tests, so
> >>>>>>>>> define an abstract way to configure and run tests that allow us to
> >>>>>>>>> change the context in which tests are built without affecting the user.
> >>>>>>>>> This also makes pretty and dynamic error reporting, and a lot of other
> >>>>>>>>> nice features easier.
> >>>>>>>>>
> >>>>>>>>> kunit_config.py:
> >>>>>>>>>   - parse .config and Kconfig files.
> >>>>>>>>>
> >>>>>>>>> kunit_kernel.py: provides helper functions to:
> >>>>>>>>>   - configure the kernel using kunitconfig.
> >>>>>>>>>   - build the kernel with the appropriate configuration.
> >>>>>>>>>   - provide function to invoke the kernel and stream the output back.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Felix Guo <felixguoxiuping@gmail.com>
> >>>>>>>>> Signed-off-by: Brendan Higgins <brendanhiggins@google.com>
> >>>>>>>>
> >>>>>>>> Ah, here's probably my answer to my previous logging format question,
> >>>>>>>> right?  What's the chance that these wrappers output stuff in a standard
> >>>>>>>> format that test-framework-tools can already parse?  :)
> >>>>>
> >>>>> To be clear, the test-framework-tools format we are talking about is
> >>>>> TAP13[1], correct?
> >>>>
> >>>> I'm not sure what the test community prefers for a format.  I'll let them
> >>>> jump in and debate that question.
> >>>>
> >>>>
> >>>>>
> >>>>> My understanding is that is what kselftest is being converted to use.
> >>>>>
> >>>>>>>
> >>>>>>> It should be pretty easy to do. I had some patches that pack up the
> >>>>>>> results into a serialized format for a presubmit service; it should be
> >>>>>>> pretty straightforward to take the same logic and just change the
> >>>>>>> output format.
> >>>>>>
> >>>>>> When examining and trying out the previous versions of the patch I found
> >>>>>> the wrappers useful to provide information about how to control and use
> >>>>>> the tests, but I had no interest in using the scripts as they do not
> >>>>>> fit in with my personal environment and workflow.
> >>>>>>
> >>>>>> In the previous versions of the patch, these helper scripts are optional,
> >>>>>> which is good for my use case.  If the helper scripts are required to
> >>>>>
> >>>>> They are still optional.
> >>>>>
> >>>>>> get the data into the proper format then the scripts are not quite so
> >>>>>> optional, they become the expected environment.  I think the proper
> >>>>>> format should exist without the helper scripts.
> >>>>>
> >>>>> That's a good point. A couple things,
> >>>>>
> >>>>> First off, supporting TAP13, either in the kernel or the wrapper
> >>>>> script is not hard, but I don't think that is the real issue that you
> >>>>> raise.
> >>>>>
> >>>>> If your only concern is that you will always be able to have human
> >>>>> readable KUnit results printed to the kernel log, that is a guarantee
> >>>>> I feel comfortable making. Beyond that, I think it is going to take a
> >>>>> long while before I would feel comfortable guaranteeing anything about
> >>>>> how KUnit will work, what kind of data it will want to expose, and how
> >>>>> it will be organized. I think the wrapper script provides a nice
> >>>>> facade that I can maintain, can mediate between the implementation
> >>>>> details and the user, and can mediate between the implementation
> >>>>> details and other pieces of software that might want to consume
> >>>>> results.
> >>>>>
> >>>>> [1] https://testanything.org/tap-version-13-specification.html
> >>>>
> >>>> My concern is based on a focus on my little part of the world
> >>>> (which in _previous_ versions of the patch series was the devicetree
> >>>> unittest.c tests being converted to use the kunit infrastructure).
> >>>> If I step back and think of the entire kernel globally I may end
> >>>> up with a different conclusion - but I'm going to remain myopic
> >>>> for this email.
> >>>>
> >>>> I want the test results to be usable by me and my fellow
> >>>> developers.  I prefer that the test results be easily accessible
> >>>> (current printk() implementation means that kunit messages are
> >>>> just as accessible as the current unittest.c printk() output).
> >>>> If the printk() output needs to be filtered through a script
> >>>> to generate the actual test results then that is sub-optimal
> >>>> to me.  It is one more step added to my workflow.  And
> >>>> potentially with an embedded target a major pain to get a
> >>>> data file (the kernel log file) transferred from a target
> >>>> to my development host.
> >>>
> >>> That's fair. If that is indeed your only concern, then I don't think
> >>> the wrapper script will ever be an issue for you. You will always be
> >>> able to execute a given test the old fashioned/manual way, and the
> >>> wrapper script only summarizes results, it does not change the
> >>> contents.
> >>>
> >>>>
> >>>> I want a reported test failure to be easy to trace back to the
> >>>> point in the source where the failure is reported.  With printk()
> >>>> the search is a simple grep for the failure message.  If the
> >>>> failure message has been processed by a script, and then the
> >>>> failure reported to me in an email, then I may have to look
> >>>> at the script to reverse engineer how the original failure
> >>>> message was transformed into the message that was reported
> >>>> to me in the email.  Then I search for the point in the
> >>>> source where the failure is reported.  So a basic task has
> >>>> just become more difficult and time consuming.
> >>>
> >>> That seems to be a valid concern. I would reiterate that you shouldn't
> >>> be concerned by any processing done by the wrapper script itself, but
> >>> the reality is that depending on what happens with automated
> >>> testing/presubmit/CI other people might end up parsing and
> >>> transforming test results - it might happen, it might not.
> >>
> >> You seem to be missing my point.
> >>
> >> Greg asked that the output be in a standard format.
> >>
> >> You replied that the standard format could be created by the wrapper script.
> >
> > I thought Greg originally meant that that is how it could be done when
> > he first commented on this patch, so I was agreeing and elaborating.
> > Nevertheless, it seems you and Greg are now in agreement on this
> > point, so I won't argue it further.
> >
> >>
> >> Now you say that "it might happen, it might not".  In other words the output
> >> may or may not end up in the standard format.
> >
> > Sorry, that was in reference to your concern about getting an email in
> > a different format than what the tool that you use generates. It
> > wasn't a statement about what I was or wasn't going to do in regards
> > to supporting a standard format.
> >
> >>
> >> As Greg points out in comments to patch 12:
> >>
> >>   "The core of kunit should also log the messages in this format as well,
> >>   and not rely on the helper scripts as Frank points out, not everyone
> >>   will use/want them.  Might as well make it easy for everyone to always
> >>   do the right thing and not force it to always be added in later."
> >>
> >> I am requesting that the original message be in the standard format.  Of
> >> course anyone is free to transform the messages in later processing, no
> >> big deal.
> >
> > My mistake, I thought that was a concern of yours.
> >
> > In any case, it sounds like you and Greg are in agreement on the core
> > libraries generating the output in TAP13, so I won't argue that point
> > further.
> >
> > ## Analysis of using TAP13
>
> I have never looked at TAP version 13 in any depth at all, so do not consider
> me to be any sort of expert.
>
> My entire TAP knowledge is based on:
>
>   https://testanything.org/tap-version-13-specification.html
>
> and the pull request to create the TAP version 14 specification:
>
>    https://github.com/TestAnything/testanything.github.io/pull/36/files
>
> You can see the full version 14 document in the submitter's repo:
>
>   $ git clone https://github.com/isaacs/testanything.github.io.git
>   $ cd testanything.github.io
>   $ git checkout tap14
>   $ ls tap-version-14-specification.md
>
> My understanding is that the version 14 specification is not trying to
> add new features, but instead capture what is already implemented in
> the wild.
>
>
> > One of my earlier concerns was that TAP13 is a bit over constrained
> > for what I would like to output from the KUnit core. It only allows
> > data to be output as either:
> >  - test number
> >  - ok/not ok with single line description
> >  - directive
> >  - diagnostics
> >  - YAML block
> >
> > The test number must come before a set of ok/not ok lines, and does
> > not contain any additional information. One annoying thing about this
> > is it doesn't provide any kind of nesting or grouping.
>
> Greg's response mentions ktest (?) already does nesting.

I think we are talking about kselftest.

> Version 14 allows nesting through subtests.  I have not looked at what
> ktest does, so I do not know if it uses subtests, or something else.

Oh nice! That is new in version 14. I can use that.

> > There is one ok/not ok line per test and it may have a short
> > description of the test immediately after 'ok' or 'not ok'; this is
> > problematic because it wants the first thing you say about a test to
> > be after you know whether it passes or not.
>
> I think you could output a diagnostic line that says a test is starting.
> This is important to me because printk() errors and warnings that are
> related to a test can be output by a subsystem other than the subsystem
> that I am testing.  If there is no marker at the start of the test
> then there is no way to attribute the printk()s to the test.

I agree.

A start-of-test diagnostic line technically conforms with the spec,
and kselftest does emit one, but such a marker is not itself part of
the spec. Well, it *is* specified if you use subtests. I think the
right approach is to make each "kunit_module/test suite" a test, and
all the test cases will be subtests.
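
Roughly, that might look like this (a hand-written sketch based on
the TAP 14 subtest draft, with made-up suite and case names):

  TAP version 14
  1..1
  # Subtest: example-test-suite
      1..2
      ok 1 - test_case_foo
      ok 2 - test_case_bar
  ok 1 - example-test-suite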

> > Directives are just a way to specify skipped tests and TODOs.
> >
> > Diagnostics seem useful, it looks like you can put whatever
> > information in them and print them out at anytime. It looks like a lot
> > of kselftests emit a lot of data this way.
> >
> > The YAML block seems to be the way that they prefer users to emit data
> > beyond number of tests run and whether a test passed or failed. I
> > could express most things I want to express in terms of YAML, but it
> > is not the nicest format for displaying a lot of data like
> > expectations, missed function calls, and other things which have a
> > natural concise representation. Nevertheless, YAML readability is
> > mostly a problem for those who won't be using the wrapper scripts.
>
> The examples in specification V13 and V14 look very simple and very
> readable to me.  (And I am not a fan of YAML.)
>
>
> > My biggest
> > problem with the YAML block is that you can only have one, and TAP
> > specifies that it must come after the corresponding ok/not ok line,
> > which again has the issue that you have to hold on to a lot of
> > diagnostic data longer than you ideally would. Another downside is
> > that I now have to write a YAML serializer for the kernel.
>
> If a test generates diagnostic data, then I would expect that to be
> the direct result of a test failure.  So the test can output the
> "not ok" line, then immediately output the YAML block.  I do not
> see a need for stashing YAML output ahead of time.
>
> If diagnostic data is generated before the test can determine
> success or failure, then it can be output as diagnostic data
> instead of stashing it for later.

Cool, that's what I am thinking I am going to do - I just wanted to
make sure people were okay with this approach. I mean, I think that is
what kselftest does.

We can hold off on the YAML stuff for now then.

> > ## Here is what I propose for this patchset:
> >
> >  - Print out test number range at the beginning of each test suite.
> >  - Print out log lines as soon as they happen as diagnostics.
> >  - Print out the lines that state whether a test passes or fails as a
> > ok/not ok line.
> >
> > This would be technically conforming with TAP13 and is consistent with
> > what some kselftests have done.
> >
> > ## To be done in a future patchset:
> >
> > Add a YAML serializer and print out some logs containing structured
> > data (like expectation failures, unexpected function calls, etc) in
> > YAML blocks.
>
> A YAML serializer sounds like unneeded complexity.
>
> >
> > Does this sound reasonable? I will go ahead and start working on this,
> > but feel free to give me feedback on the overall idea in the meantime.
> >
> > Cheers

Thanks!
Brendan Higgins May 6, 2019, 9:42 p.m. UTC | #14
> On Sun, May 5, 2019 at 5:19 PM Frank Rowand <frowand.list@gmail.com> wrote:
> > You can see the full version 14 document in the submitter's repo:
> >
> >   $ git clone https://github.com/isaacs/testanything.github.io.git
> >   $ cd testanything.github.io
> >   $ git checkout tap14
> >   $ ls tap-version-14-specification.md
> >
> > My understanding is that the version 14 specification is not trying to
> > add new features, but instead capture what is already implemented in
> > the wild.
>
> Oh! I didn't know about the work on TAP 14. I'll go read through this.
>
> > > ## Here is what I propose for this patchset:
> > >
> > >  - Print out test number range at the beginning of each test suite.
> > >  - Print out log lines as soon as they happen as diagnostics.
> > >  - Print out the lines that state whether a test passes or fails as a
> > > ok/not ok line.
> > >
> > > This would be technically conforming with TAP13 and is consistent with
> > > what some kselftests have done.
>
> This is what I fixed kselftest to actually do (it wasn't doing correct
> TAP13), and Shuah is testing the series now:
> https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git/log/?h=ksft-tap-refactor

Oh, cool! I guess this is an okay approach then.

Thanks!
Bird, Tim May 7, 2019, 7:13 p.m. UTC | #15
Here is a bit of inline commentary on the TAP13/TAP14 discussion.

> -----Original Message-----
> From: Brendan Higgins 
> 
> > On 5/3/19 4:14 PM, Brendan Higgins wrote:
> > >> On 5/2/19 10:36 PM, Brendan Higgins wrote:
> > > In any case, it sounds like you and Greg are in agreement on the core
> > > libraries generating the output in TAP13, so I won't argue that point
> > > further.
> > >
> > > ## Analysis of using TAP13
> >
> > I have never looked at TAP version 13 in any depth at all, so do not consider
> > me to be any sort of expert.
> >
> > My entire TAP knowledge is based on:
> >
> >   https://testanything.org/tap-version-13-specification.html
> >
> > and the pull request to create the TAP version 14 specification:
> >
> >    https://github.com/TestAnything/testanything.github.io/pull/36/files
> >
> > You can see the full version 14 document in the submitter's repo:
> >
> >   $ git clone https://github.com/isaacs/testanything.github.io.git
> >   $ cd testanything.github.io
> >   $ git checkout tap14
> >   $ ls tap-version-14-specification.md
> >
> > My understanding is that the version 14 specification is not trying to
> > add new features, but instead capture what is already implemented in
> > the wild.
> >
> >
> > > One of my earlier concerns was that TAP13 is a bit over constrained
> > > for what I would like to output from the KUnit core. It only allows
> > > data to be output as either:
> > >  - test number
> > >  - ok/not ok with single line description
> > >  - directive
> > >  - diagnostics
> > >  - YAML block
> > >
> > > The test number must come before a set of ok/not ok lines, and does
> > > not contain any additional information. One annoying thing about this
> > > is it doesn't provide any kind of nesting or grouping.
> >
> > Greg's response mentions ktest (?) already does nesting.
> 
> I think we are talking about kselftest.
> 
> > Version 14 allows nesting through subtests.  I have not looked at what
> > ktest does, so I do not know if it uses subtests, or something else.
> 
> Oh nice! That is new in version 14. I can use that.

We have run into the problem of subtests (or nested tests, both using
TAP13) in Fuego.  I recall that this issue came up in kselftest, and I believe
we discussed a solution, but I don't recall what it was.

Can someone remind me what kselftest does to handle nested tests
(in terms of TAP13 output)?

> 
> > > There is one ok/not ok line per test and it may have a short
> > > description of the test immediately after 'ok' or 'not ok'; this is
> > > problematic because it wants the first thing you say about a test to
> > > be after you know whether it passes or not.
> >
> > I think you could output a diagnostic line that says a test is starting.
> > This is important to me because printk() errors and warnings that are
> > related to a test can be output by a subsystem other than the subsystem
> > that I am testing.  If there is no marker at the start of the test
> > then there is no way to attribute the printk()s to the test.
> 
> I agree.

This is a significant problem.  In Fuego we output each line with a test id prefix,
which goes against the spec, but helps solve this.  Test output should be
kept separate from system output, but if I understand correctly, there are no
channels in printk to use to keep different data streams separate.

How does kselftest deal with this now?

> 
> A start-of-test diagnostic line technically conforms with the spec,
> and kselftest does emit one, but such a marker is not itself part of
> the spec. Well, it *is* specified if you use subtests. I think the
> right approach is to make each "kunit_module/test suite" a test, and
> all the test cases will be subtests.
> 
> > > Directives are just a way to specify skipped tests and TODOs.
> > >
> > > Diagnostics seem useful, it looks like you can put whatever
> > > information in them and print them out at anytime. It looks like a lot
> > > of kselftests emit a lot of data this way.
> > >
> > > The YAML block seems to be the way that they prefer users to emit data
> > > beyond number of tests run and whether a test passed or failed. I
> > > could express most things I want to express in terms of YAML, but it
> > > is not the nicest format for displaying a lot of data like
> > > expectations, missed function calls, and other things which have a
> > > natural concise representation. Nevertheless, YAML readability is
> > > mostly a problem for those who won't be using the wrapper scripts.
> >
> > The examples in specification V13 and V14 look very simple and very
> > readable to me.  (And I am not a fan of YAML.)
> >
> >
> > > My biggest
> > > problem with the YAML block is that you can only have one, and TAP
> > > specifies that it must come after the corresponding ok/not ok line,
> > > which again has the issue that you have to hold on to a lot of
> > > diagnostic data longer than you ideally would. Another downside is
> > > that I now have to write a YAML serializer for the kernel.
> >
> > If a test generates diagnostic data, then I would expect that to be
> > the direct result of a test failure.  So the test can output the
> > "not ok" line, then immediately output the YAML block.  I do not
> > see a need for stashing YAML output ahead of time.
> >
> > If diagnostic data is generated before the test can determine
> > success or failure, then it can be output as diagnostic data
> > instead of stashing it for later.
> 
> Cool, that's what I am thinking I am going to do - I just wanted to
> make sure people were okay with this approach. I mean, I think that is
> what kselftest does.

IMHO the diagnostic data does not have to be in YAML.  That's only
if there's a well-known schema for the diagnostic data, to make the
data machine-readable.   TAP13 specifically avoided defining such a
schema.  I need to look at TAP14 and see if they have defined something.
(Thanks for bringing that to my attention.)

The important part, since there are no start and end delimiters for each
testcase, is to structure output (including from unrelated sub-systems
affected by the test) to either occur all before or all after the test line.
Otherwise it's impossible to sensibly parse the diagnostic data and associate it
with a test.  (That is, the TAP lines become the delimiters between each testcase's
output and data).  This is a pretty big weakness of TAP13.  Since the TAP line
has the test result, it usually means that the subsystem output for the test
is emitted *before* the TAP line.  It's preferable, in order to keep the
data together, that the diagnostic data also be emitted before the TAP
line.
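
For instance (an illustrative sketch with made-up test names, not
real kselftest output), keeping each test's diagnostics ahead of its
result line makes the TAP lines act as clean delimiters:

  # test_alpha: setup complete
  # test_alpha: buffer mismatch at offset 8
  not ok 1 test_alpha
  # test_beta: setup complete
  ok 2 test_beta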

> 
> We can hold off on the YAML stuff for now then.
> 
> > > ## Here is what I propose for this patchset:
> > >
> > >  - Print out test number range at the beginning of each test suite.
> > >  - Print out log lines as soon as they happen as diagnostics.
> > >  - Print out the lines that state whether a test passes or fails as a
> > > ok/not ok line.
> > >
> > > This would be technically conforming with TAP13 and is consistent with
> > > what some kselftests have done.
> > >
> > > ## To be done in a future patchset:
> > >
> > > Add a YAML serializer and print out some logs containing structured
> > > data (like expectation failures, unexpected function calls, etc) in
> > > YAML blocks.
> >
> > A YAML serializer sounds like unneeded complexity.
I agree, for now.

I think if we start to see some patterns for some data that many tests
output, we might want (as a kernel community) to define a YAML
schema for the kselftest output.  But I think that's biting off too much
right now.  IMHO we would want any YAML schema we define to
cover more than just unit tests, so the job of defining that would be
pretty big.

This would be a good discussion to have at a testing micro-conference
or summit. :-)

> >
> > >
> > > Does this sound reasonable? I will go ahead and start working on this,
> > > but feel free to give me feedback on the overall idea in the meantime.

Sounds good.  Thanks for working on this.
 -- Tim

Patch

diff --git a/tools/testing/kunit/.gitignore b/tools/testing/kunit/.gitignore
new file mode 100644
index 0000000000000..c791ff59a37a9
--- /dev/null
+++ b/tools/testing/kunit/.gitignore
@@ -0,0 +1,3 @@ 
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
\ No newline at end of file
diff --git a/tools/testing/kunit/kunit.py b/tools/testing/kunit/kunit.py
new file mode 100755
index 0000000000000..7413ec7351a20
--- /dev/null
+++ b/tools/testing/kunit/kunit.py
@@ -0,0 +1,78 @@ 
+#!/usr/bin/python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# A thin wrapper on top of the KUnit Kernel
+#
+# Copyright (C) 2019, Google LLC.
+# Author: Felix Guo <felixguoxiuping@gmail.com>
+# Author: Brendan Higgins <brendanhiggins@google.com>
+
+import argparse
+import sys
+import os
+import time
+
+import kunit_config
+import kunit_kernel
+import kunit_parser
+
+parser = argparse.ArgumentParser(description='Runs KUnit tests.')
+
+parser.add_argument('--raw_output', help='don\'t format output from kernel',
+		    action='store_true')
+
+parser.add_argument('--timeout', help='maximum number of seconds to allow for '
+		    'all tests to run. This does not include time taken to '
+		    'build the tests.', type=int, default=300,
+		    metavar='timeout')
+
+parser.add_argument('--jobs',
+		    help='As in the make command, "Specifies the number of '
+		    'jobs (commands) to run simultaneously."',
+		    type=int, default=8, metavar='jobs')
+
+parser.add_argument('--build_dir',
+		    help='As in the make command, it specifies the build '
+		    'directory.',
+		    type=str, default=None, metavar='build_dir')
+
+cli_args = parser.parse_args()
+
+linux = kunit_kernel.LinuxSourceTree()
+
+build_dir = None
+if cli_args.build_dir:
+	build_dir = cli_args.build_dir
+
+config_start = time.time()
+success = linux.build_reconfig(build_dir)
+config_end = time.time()
+if not success:
+	sys.exit(1)
+
+kunit_parser.print_with_timestamp('Building KUnit Kernel ...')
+
+build_start = time.time()
+
+success = linux.build_um_kernel(jobs=cli_args.jobs, build_dir=build_dir)
+build_end = time.time()
+if not success:
+	sys.exit(1)
+
+kunit_parser.print_with_timestamp('Starting KUnit Kernel ...')
+test_start = time.time()
+
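+# Either stream the raw kernel output straight through or parse it into
+# pretty pass/fail results.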
+if cli_args.raw_output:
+	kunit_parser.raw_output(linux.run_kernel(timeout=cli_args.timeout,
+						 build_dir=build_dir))
+else:
+	kunit_parser.parse_run_tests(linux.run_kernel(timeout=cli_args.timeout,
+						      build_dir=build_dir))
+
+test_end = time.time()
+
+kunit_parser.print_with_timestamp((
+	"Elapsed time: %.3fs total, %.3fs configuring, %.3fs " +
+	"building, %.3fs running.\n") % (test_end - config_start,
+	config_end - config_start, build_end - build_start,
+	test_end - test_start))
diff --git a/tools/testing/kunit/kunit_config.py b/tools/testing/kunit/kunit_config.py
new file mode 100644
index 0000000000000..167f47d9ab8e4
--- /dev/null
+++ b/tools/testing/kunit/kunit_config.py
@@ -0,0 +1,66 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Builds a .config from a kunitconfig.
+#
+# Copyright (C) 2019, Google LLC.
+# Author: Felix Guo <felixguoxiuping@gmail.com>
+# Author: Brendan Higgins <brendanhiggins@google.com>
+
+import collections
+import re
+
+CONFIG_IS_NOT_SET_PATTERN = r'^# CONFIG_\w+ is not set$'
+CONFIG_PATTERN = r'^CONFIG_\w+=\S+$'
+
+KconfigEntryBase = collections.namedtuple('KconfigEntry', ['raw_entry'])
+
+
+class KconfigEntry(KconfigEntryBase):
+
+	def __str__(self) -> str:
+		return self.raw_entry
+
+
+class KconfigParseError(Exception):
+	"""Error parsing Kconfig defconfig or .config."""
+
+
+class Kconfig(object):
+	"""Represents defconfig or .config specified using the Kconfig language."""
+
+	def __init__(self):
+		self._entries = []
+
+	def entries(self):
+		return set(self._entries)
+
+	def add_entry(self, entry: KconfigEntry) -> None:
+		self._entries.append(entry)
+
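+	# Used after running make to verify that every kunitconfig entry
+	# survived into the generated .config.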
+	def is_subset_of(self, other: "Kconfig") -> bool:
+		return self.entries().issubset(other.entries())
+
+	def write_to_file(self, path: str) -> None:
+		with open(path, 'w') as f:
+			for entry in self.entries():
+				f.write(str(entry) + '\n')
+
+	def parse_from_string(self, blob: str) -> None:
+		"""Parses a string containing KconfigEntrys and populates this Kconfig."""
+		self._entries = []
+		is_not_set_matcher = re.compile(CONFIG_IS_NOT_SET_PATTERN)
+		config_matcher = re.compile(CONFIG_PATTERN)
+		for line in blob.split('\n'):
+			line = line.strip()
+			if not line:
+				continue
+			elif config_matcher.match(line) or is_not_set_matcher.match(line):
+				self._entries.append(KconfigEntry(line))
+			elif line[0] == '#':
+				continue
+			else:
+				raise KconfigParseError('Failed to parse: ' + line)
+
+	def read_from_file(self, path: str) -> None:
+		with open(path, 'r') as f:
+			self.parse_from_string(f.read())
diff --git a/tools/testing/kunit/kunit_kernel.py b/tools/testing/kunit/kunit_kernel.py
new file mode 100644
index 0000000000000..07c0abf2f47df
--- /dev/null
+++ b/tools/testing/kunit/kunit_kernel.py
@@ -0,0 +1,148 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Runs UML kernel, collects output, and handles errors.
+#
+# Copyright (C) 2019, Google LLC.
+# Author: Felix Guo <felixguoxiuping@gmail.com>
+# Author: Brendan Higgins <brendanhiggins@google.com>
+
+
+import logging
+import subprocess
+import os
+
+import kunit_config
+
+KCONFIG_PATH = '.config'
+
+class ConfigError(Exception):
+	"""Represents an error trying to configure the Linux kernel."""
+
+
+class BuildError(Exception):
+	"""Represents an error trying to build the Linux kernel."""
+
+
+class LinuxSourceTreeOperations(object):
+	"""An abstraction over command line operations performed on a source tree."""
+
+	def make_mrproper(self):
+		try:
+			subprocess.check_output(['make', 'mrproper'])
+		except OSError as e:
+			raise ConfigError('Could not call make command: ' + str(e))
+		except subprocess.CalledProcessError as e:
+			raise ConfigError(e.output)
+
+	def make_olddefconfig(self, build_dir):
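+	# olddefconfig expands the minimal kunitconfig into a full .config by
+	# giving every unspecified option its default value.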
+		command = ['make', 'ARCH=um', 'olddefconfig']
+		if build_dir:
+			command += ['O=' + build_dir]
+		try:
+			subprocess.check_output(command)
+		except OSError as e:
+			raise ConfigError('Could not call make command: ' + str(e))
+		except subprocess.CalledProcessError as e:
+			raise ConfigError(e.output)
+
+	def make(self, jobs, build_dir):
+		command = ['make', 'ARCH=um', '--jobs=' + str(jobs)]
+		if build_dir:
+			command += ['O=' + build_dir]
+		try:
+			subprocess.check_output(command)
+		except OSError as e:
+			raise BuildError('Could not execute make: ' + str(e))
+		except subprocess.CalledProcessError as e:
+			raise BuildError(e.output)
+
+	def linux_bin(self, params, timeout, build_dir):
+		"""Runs the Linux UML binary. Must be named 'linux'."""
+		linux_bin = './linux'
+		if build_dir:
+			linux_bin = os.path.join(build_dir, 'linux')
+		process = subprocess.Popen(
+			[linux_bin] + params,
+			stdin=subprocess.PIPE,
+			stdout=subprocess.PIPE,
+			stderr=subprocess.PIPE)
+		process.wait(timeout=timeout)
+		return process
+
+
+def get_kconfig_path(build_dir):
+	kconfig_path = KCONFIG_PATH
+	if build_dir:
+		kconfig_path = os.path.join(build_dir, KCONFIG_PATH)
+	return kconfig_path
+
+class LinuxSourceTree(object):
+	"""Represents a Linux kernel source tree with KUnit tests."""
+
+	def __init__(self):
+		self._kconfig = kunit_config.Kconfig()
+		self._kconfig.read_from_file('kunitconfig')
+		self._ops = LinuxSourceTreeOperations()
+
+	def clean(self):
+		try:
+			self._ops.make_mrproper()
+		except ConfigError as e:
+			logging.error(e)
+			return False
+		return True
+
+	def build_config(self, build_dir):
+		kconfig_path = get_kconfig_path(build_dir)
+		if build_dir and not os.path.exists(build_dir):
+			os.mkdir(build_dir)
+		self._kconfig.write_to_file(kconfig_path)
+		try:
+			self._ops.make_olddefconfig(build_dir)
+		except ConfigError as e:
+			logging.error(e)
+			return False
+		validated_kconfig = kunit_config.Kconfig()
+		validated_kconfig.read_from_file(kconfig_path)
+		if not self._kconfig.is_subset_of(validated_kconfig):
+			logging.error('Provided Kconfig is not contained in validated .config!')
+			return False
+		return True
+
+	def build_reconfig(self, build_dir):
+		"""Creates a new .config if it is not a subset of the kunitconfig."""
+		kconfig_path = get_kconfig_path(build_dir)
+		if os.path.exists(kconfig_path):
+			existing_kconfig = kunit_config.Kconfig()
+			existing_kconfig.read_from_file(kconfig_path)
+			if not self._kconfig.is_subset_of(existing_kconfig):
+				print('Regenerating .config ...')
+				os.remove(kconfig_path)
+				return self.build_config(build_dir)
+			else:
+				return True
+		else:
+			print('Generating .config ...')
+			return self.build_config(build_dir)
+
+	def build_um_kernel(self, jobs, build_dir):
+		try:
+			self._ops.make_olddefconfig(build_dir)
+			self._ops.make(jobs, build_dir)
+		except (ConfigError, BuildError) as e:
+			logging.error(e)
+			return False
+		used_kconfig = kunit_config.Kconfig()
+		used_kconfig.read_from_file(get_kconfig_path(build_dir))
+		if not self._kconfig.is_subset_of(used_kconfig):
+			logging.error('Provided Kconfig is not contained in final config!')
+			return False
+		return True
+
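+	# Boots the UML kernel, copies each console line to test.log, and
+	# yields the lines to the caller as they arrive.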
+	def run_kernel(self, args=None, timeout=None, build_dir=None):
+		args = list(args) if args else []
+		args.extend(['mem=256M'])
+		process = self._ops.linux_bin(args, timeout, build_dir)
+		with open('test.log', 'w') as f:
+			for line in process.stdout:
+				decoded = line.rstrip().decode('ascii')
+				f.write(decoded + '\n')
+				yield decoded
diff --git a/tools/testing/kunit/kunit_parser.py b/tools/testing/kunit/kunit_parser.py
new file mode 100644
index 0000000000000..6c81d4dcfabb5
--- /dev/null
+++ b/tools/testing/kunit/kunit_parser.py
@@ -0,0 +1,119 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Parses test results from a kernel dmesg log.
+#
+# Copyright (C) 2019, Google LLC.
+# Author: Felix Guo <felixguoxiuping@gmail.com>
+# Author: Brendan Higgins <brendanhiggins@google.com>
+
+import re
+from datetime import datetime
+
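+# Heuristic delimiters for KUnit output: the console banner appears just
+# before the tests start, and the partition list is printed once the
+# kernel fails to find a root filesystem after the tests have run.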
+kunit_start_re = re.compile('printk: console .* enabled')
+kunit_end_re = re.compile('List of all partitions:')
+
+def isolate_kunit_output(kernel_output):
+	started = False
+	for line in kernel_output:
+		if kunit_start_re.match(line):
+			started = True
+		elif kunit_end_re.match(line):
+			break
+		elif started:
+			yield line
+
+def raw_output(kernel_output):
+	for line in kernel_output:
+		print(line)
+
+DIVIDER = "=" * 30
+
+RESET = '\033[0;0m'
+
+def red(text):
+	return '\033[1;31m' + text + RESET
+
+def yellow(text):
+	return '\033[1;33m' + text + RESET
+
+def green(text):
+	return '\033[1;32m' + text + RESET
+
+def print_with_timestamp(message):
+	print('[%s] %s' % (datetime.now().strftime('%H:%M:%S'), message))
+
+def print_log(log):
+	for m in log:
+		print_with_timestamp(m)
+
+def parse_run_tests(kernel_output):
+	test_case_output = re.compile('^kunit .*?: (.*)$')
+
+	test_module_success = re.compile('^kunit .*: all tests passed')
+	test_module_fail = re.compile('^kunit .*: one or more tests failed')
+
+	test_case_success = re.compile('^kunit (.*): (.*) passed')
+	test_case_fail = re.compile('^kunit (.*): (.*) failed')
+	test_case_crash = re.compile('^kunit (.*): (.*) crashed')
+
+	total_tests = set()
+	failed_tests = set()
+	crashed_tests = set()
+
+	def get_test_name(match):
+		return match.group(1) + ":" + match.group(2)
+
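+	# Buffer the log lines for the current test case so they can be
+	# replayed if the case fails or crashes.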
+	current_case_log = []
+	def end_one_test(match, log):
+		log.clear()
+		total_tests.add(get_test_name(match))
+
+	print_with_timestamp(DIVIDER)
+	for line in isolate_kunit_output(kernel_output):
+		# Ignore module output:
+		if (test_module_success.match(line) or
+		    test_module_fail.match(line)):
+			print_with_timestamp(DIVIDER)
+			continue
+
+		match = re.match(test_case_success, line)
+		if match:
+			print_with_timestamp(green("[PASSED] ") +
+					     get_test_name(match))
+			end_one_test(match, current_case_log)
+			continue
+
+		match = re.match(test_case_fail, line)
+		# Crashed tests will report as both failed and crashed. We only
+		# want to show and count it once, so swallow the duplicate line.
+		if match:
+			if get_test_name(match) not in crashed_tests:
+				failed_tests.add(get_test_name(match))
+				print_with_timestamp(red("[FAILED] " +
+							 get_test_name(match)))
+				print_log(map(yellow, current_case_log))
+				print_with_timestamp("")
+				end_one_test(match, current_case_log)
+			continue
+
+		match = re.match(test_case_crash, line)
+		if match:
+			crashed_tests.add(get_test_name(match))
+			print_with_timestamp(yellow("[CRASH] " +
+						    get_test_name(match)))
+			print_log(current_case_log)
+			print_with_timestamp("")
+			end_one_test(match, current_case_log)
+			continue
+
+		# Strip off the `kunit module-name:` prefix
+		match = re.match(test_case_output, line)
+		if match:
+			current_case_log.append(match.group(1))
+		else:
+			current_case_log.append(line)
+
+	fmt = green if (len(failed_tests) + len(crashed_tests) == 0) else red
+	print_with_timestamp(
+		fmt("Testing complete. %d tests run. %d failed. %d crashed." %
+		    (len(total_tests), len(failed_tests), len(crashed_tests))))
+