[RFC,0/1] KVM selftests runner for running more than just default

Message ID 20240821223012.3757828-1-vipinsh@google.com (mailing list archive)

Message

Vipin Sharma Aug. 21, 2024, 10:30 p.m. UTC
This series introduces a KVM selftests runner to make it easier to
run selftests with interesting configurations and to provide some
enhancements over the existing kselftests runner.

I would like to get early feedback from the community and see whether
this is useful for improving KVM selftests coverage and worth
investing time in. Some specific questions:

1. Should this be done?
2. Which features are a must?
3. Is there a better way to write test configurations than what is done here?

Note: the Python code written for the runner is not optimized, but it
shows how the runner can be useful.

What are the goals?
- Run tests with more than just the default settings of KVM module
  parameters and of the tests themselves.
- Catch issues which only show up when certain combinations of module
  parameters and test options are used.
- Provide a minimum set of tests which can be standardized for KVM patches.
- Run tests in parallel.
- Dump output in a hierarchical folder structure for easier tracking of
  failure/success output.
- Feel free to add yours :)

Why not use/extend kselftests?
- Goals of other subsystems might not align, and it would be difficult
  to capture a broader set of requirements.
- Instead of a test configuration, we would need separate shell scripts
  acting as tests for each combination of test arguments and module
  parameters. This would easily pollute the KVM selftests directory.
- It is easier to enhance features using Python packages than shell scripts.

What does this runner do?
- Reads a test configuration file (tests.json in patch 1). The JSON
  configuration is written as a hierarchy in which multiple suites
  exist and each suite contains multiple tests.
- Provides a way to execute tests inside a suite in parallel.
- Provides a way to dump output to a folder in a hierarchical manner.
- Allows running selected suites, or specific tests within a suite.
- Allows running setup and teardown steps for test suites and tests.
- Accepts a timeout to limit test execution duration.
- Allows restricting test suites or tests to a specific architecture.

The runner is written in Python and the goal is to only use standard
library constructs. It will work on Python 3.6 and up.
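
To illustrate the standard-library-only approach (a hypothetical sketch,
not the code in patch 1), running the tests of one suite in parallel
could look like:

import concurrent.futures
import subprocess

def run_command(name, command):
    # Run one test command through a shell and map its exit code to a status.
    result = subprocess.run(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode == 0:
        return "PASSED: " + name
    if result.returncode == 4:  # KSFT_SKIP
        return "SKIPPED: " + name
    return "FAILED: " + name

def run_suite(tests, jobs):
    # "tests" is the "tests" array of one suite from tests.json.
    with concurrent.futures.ThreadPoolExecutor(max_workers=jobs) as pool:
        futures = [pool.submit(run_command, t["name"], t["command"]) for t in tests]
        for future in concurrent.futures.as_completed(futures):
            print(future.result())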

What does a test configuration file look like?
Test configurations are written in JSON because it is easy to read and
Python has built-in support for parsing it. The root level is a JSON
array denoting suites, and each suite can contain multiple tests in a
nested JSON array.

[
  {
    "suite": "dirty_log_perf_tests",
    "timeout_s": 300,
    "arch": "x86_64",
    "setup": "echo Setting up suite",
    "teardown": "echo tearing down suite",
    "tests": [
      {
        "name": "dirty_log_perf_test_max_vcpu_no_manual_protect",
        "command": "./dirty_log_perf_test -v $(grep -c ^processor /proc/cpuinfo) -g",
        "arch": "x86_64",
	"setup": "echo Setting up test",
	"teardown": "echo tearing down test",
        "timeout_s": 5
      }
    ]
  }
]
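
For illustration only (a hypothetical helper, not the code in the patch),
the hierarchy can be consumed with just the standard json module, with
per-test fields falling back to suite-level values:

import json

def load_suites(path):
    # The top level of tests.json is a list of suite objects.
    with open(path) as f:
        suites = json.load(f)
    for suite in suites:
        for test in suite.get("tests", []):
            # Per-test fields default to the suite-level value when absent.
            test.setdefault("timeout_s", suite.get("timeout_s"))
            test.setdefault("arch", suite.get("arch"))
    return suites

# Example: suites = load_suites("tests.json")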

Usage:
The runner "runner.py" and the test configuration "tests.json" live in
the tools/testing/selftests/kvm directory.

To run serially:
./runner.py tests.json

To run specific test suites:
./runner.py tests.json dirty_log_perf_tests x86_sanity_tests

To run a specific test in a suite:
./runner.py tests.json x86_sanity_tests/vmx_msrs_test

To run everything in parallel (runs tests inside a suite in parallel):
./runner.py -j 10 tests.json

To dump output to disk:
./runner.py -j 10 tests.json -o sample_run

Sample output (after removing timestamp, process ID, and logging
level columns):

  ./runner.py tests.json  -j 10 -o sample_run
  PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_no_manual_protect
  PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect
  PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_vcpu_manual_protect_random_access
  PASSED: dirty_log_perf_tests/dirty_log_perf_test_max_10_vcpu_hugetlb
  PASSED: x86_sanity_tests/vmx_msrs_test
  SKIPPED: x86_sanity_tests/private_mem_conversions_test
  FAILED: x86_sanity_tests/apic_bus_clock_test
  PASSED: x86_sanity_tests/dirty_log_page_splitting_test
  --------------------------------------------------------------------------
  Test runner result:
  1) dirty_log_perf_tests:
     1) PASSED: dirty_log_perf_test_max_vcpu_no_manual_protect
     2) PASSED: dirty_log_perf_test_max_vcpu_manual_protect
     3) PASSED: dirty_log_perf_test_max_vcpu_manual_protect_random_access
     4) PASSED: dirty_log_perf_test_max_10_vcpu_hugetlb
  2) x86_sanity_tests:
     1) PASSED: vmx_msrs_test
     2) SKIPPED: private_mem_conversions_test
     3) FAILED: apic_bus_clock_test
     4) PASSED: dirty_log_page_splitting_test
  --------------------------------------------------------------------------

Directory structure created:

sample_run/
|-- dirty_log_perf_tests
|   |-- dirty_log_perf_test_max_10_vcpu_hugetlb
|   |   |-- command.stderr
|   |   |-- command.stdout
|   |   |-- setup.stderr
|   |   |-- setup.stdout
|   |   |-- teardown.stderr
|   |   `-- teardown.stdout
|   |-- dirty_log_perf_test_max_vcpu_manual_protect
|   |   |-- command.stderr
|   |   `-- command.stdout
|   |-- dirty_log_perf_test_max_vcpu_manual_protect_random_access
|   |   |-- command.stderr
|   |   `-- command.stdout
|   `-- dirty_log_perf_test_max_vcpu_no_manual_protect
|       |-- command.stderr
|       `-- command.stdout
`-- x86_sanity_tests
    |-- apic_bus_clock_test
    |   |-- command.stderr
    |   `-- command.stdout
    |-- dirty_log_page_splitting_test
    |   |-- command.stderr
    |   |-- command.stdout
    |   |-- setup.stderr
    |   |-- setup.stdout
    |   |-- teardown.stderr
    |   `-- teardown.stdout
    |-- private_mem_conversions_test
    |   |-- command.stderr
    |   `-- command.stdout
    `-- vmx_msrs_test
        |-- command.stderr
        `-- command.stdout
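
The layout above could be produced by a small helper along these lines
(an illustrative sketch, not the code in the patch, assuming the file
names shown above):

import pathlib
import subprocess

def run_step(out_dir, step, command):
    # Write <out_dir>/<step>.stdout and <step>.stderr, e.g.
    # sample_run/x86_sanity_tests/vmx_msrs_test/command.stdout.
    out_dir = pathlib.Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (out_dir / (step + ".stdout")).write_bytes(result.stdout)
    (out_dir / (step + ".stderr")).write_bytes(result.stderr)
    return result.returncode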


Some other features for the future:
- Provide a "precheck" command option in JSON, which can filter/skip tests
  if certain conditions are not met.
- An iteration option in the runner, to allow the same test suites to be
  run multiple times.
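
For the "precheck" idea, one possible shape (purely illustrative,
nothing is implemented yet) would be to run the precheck command first
and skip the test when it fails:

import subprocess

def should_run(test):
    # Skip the test when its optional "precheck" command exits non-zero.
    precheck = test.get("precheck")
    if not precheck:
        return True
    result = subprocess.run(precheck, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0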

Vipin Sharma (1):
  KVM: selftests: Create KVM selftests runner to run interesting tests

 tools/testing/selftests/kvm/runner.py  | 282 +++++++++++++++++++++++++
 tools/testing/selftests/kvm/tests.json |  60 ++++++
 2 files changed, 342 insertions(+)
 create mode 100755 tools/testing/selftests/kvm/runner.py
 create mode 100644 tools/testing/selftests/kvm/tests.json


base-commit: de9c2c66ad8e787abec7c9d7eff4f8c3cdd28aed

Comments

Vipin Sharma Aug. 22, 2024, 8:55 p.m. UTC | #1
Oops! Adding arch mailing lists and maintainers which have an arch
folder in tools/testing/selftests/kvm.

On Wed, Aug 21, 2024 at 3:30 PM Vipin Sharma <vipinsh@google.com> wrote:
> [ full cover letter quoted; snipped ]

Vipin Sharma Nov. 1, 2024, 10:13 p.m. UTC | #2
On Thu, Aug 22, 2024 at 1:55 PM Vipin Sharma <vipinsh@google.com> wrote:
> [ quoted text snipped ]

I had an offline discussion with Sean; here is a summary of what we
discussed (Sean, correct me if something is not aligned with our
discussion):

We need to have a roadmap for the runner in terms of features we support.


Phase 1: Have a basic selftest runner which can:

- Run tests in parallel
- Provide a summary of what passed and failed, or report only failures.
- Dump output which can be easily accessed and parsed.
- Allow running with different command line parameters.

The current patch does more than this and can be simplified.


Phase 2: Environment setup via runner

The current patch allows writing "setup" commands at the test suite and
test level in the JSON config file to set up the environment needed by
a test to run. This might not be ideal, as some settings are exposed
differently on different platforms.

For example,
To enable TDP:
- Intel needs ept=Y
- AMD needs npt=Y
- ARM always on.

To enable APIC virtualization
- Intel needs enable_apicv=Y
- AMD needs avic=Y

To enable/disable nested virtualization, both have the same file name
"nested" in their module params directory, and that is what should be
changed.

These kinds of settings become verbose and are unnecessary on other
platforms. Instead, the runner should have some programming constructs
(API, command line options, defaults) to enable these options in a
generic way. For example, enabling/disabling nested can be exposed as a
command line option --enable_nested; based on the platform, the runner
can then update the corresponding module param or ignore the option.
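
As a rough illustration (hypothetical code, not part of this patch), the
runner could translate --enable_nested into whatever the platform it
runs on needs; reloading the module is one way to apply the param and is
simplified here:

import os
import subprocess

def enable_nested():
    # Find which vendor module is loaded and reload it with nested=1;
    # on platforms without this knob, silently ignore the option.
    for module in ("kvm_intel", "kvm_amd"):
        if os.path.isdir("/sys/module/" + module):
            subprocess.run(["modprobe", "-r", module], check=True)
            subprocess.run(["modprobe", module, "nested=1"], check=True)
            return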

This will easily extend to providing sane configurations on the
corresponding platforms without lots of hardcoding in JSON. These
individual constructs will provide a generic view/option to exercise a
KVM feature, and under the hood will do things differently based on the
platform it is running on, like arm, x86-intel, x86-amd, s390, etc.


Phase 3: Provide a collection of interesting configurations

Specific individual constructs can be combined in a meaningful way to
provide interesting configurations to run on a platform. For example,
the user doesn't need to specify each individual setting; instead, some
prebuilt configurations can be exposed, like --stress_test_shadow_mmu
or --test_basic_nested.

Tests need to handle the environment in which they are running
gracefully, which many tests already do, but not exhaustively. If some
setting is not provided or set up properly for their execution, then
they should fail/skip accordingly.

The runner will not be responsible for prechecking things on the
tests' behalf.


Next steps:
1. Consensus on above phases and features.
2. Start development.

Thanks,
Vipin
Sean Christopherson Nov. 6, 2024, 5:06 p.m. UTC | #3
On Fri, Nov 01, 2024, Vipin Sharma wrote:
> I had an offline discussion with Sean; here is a summary of what we
> discussed (Sean, correct me if something is not aligned with our
> discussion):
> 
> We need to have a roadmap for the runner in terms of features we support.
> 
> Phase 1: Have a basic selftest runner which can:
> 
> - Run tests in parallel

Maybe with a (very conservative) per-test timeout?  Selftests generally don't have
the same problems as KVM-Unit-Tests (KUT), as selftests are a little better at
guarding against waiting indefinitely, i.e. I don't think we need a configurable
timeout.  But a 120 second timeout or so would be helpful.

E.g. I recently was testing a patch (of mine) that had a "minor" bug where it
caused KVM to do a remote TLB flush on *every* SPTE update in the shadow MMU,
which manifested as hilariously long runtimes for max_guest_memory_test.  I was
_this_ close to not catching the bug (which would have been quite embarrassing),
because my hack-a-scripts don't use timeouts (I only noticed because a completely
unrelated bug was causing failures).
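
Something along these lines (a rough sketch, not actual runner code)
would be plenty:

import subprocess

def run_with_timeout(name, command, timeout_s=120):
    # Kill the test and report it distinctly if it blows past the timeout.
    try:
        result = subprocess.run(command, shell=True,
                                stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                                timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return "TIMED_OUT: " + name
    return ("PASSED: " if result.returncode == 0 else "FAILED: ") + name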

> - Provide a summary of what passed and failed, or report only failures.

I think a summary is always warranted.  And for failures, it would be helpful to
spit out _what_ test failed, versus the annoying KUT runner's behavior of stating
only the number of passes/failures, which forces the user to go spelunking just
to find out what (sub)test failed.

I also think the runner should have a "heartbeat" mechanism, i.e. something that
communicates to the user that forward progress is being made.  And IMO, that
mechanism should also spit out skips and failures (this could be optional though).
One of the flaws with the KUT runner is that it's either super noisy or
super quiet.

E.g. my mess of bash outputs this when running selftests in parallel (trimmed for
brevity):

        Running selftests with npt_disabled
        Waiting for 'access_tracking_perf_test', PID '92317'
        Waiting for 'amx_test', PID '92318'
        SKIPPED amx_test
        Waiting for 'apic_bus_clock_test', PID '92319'
        Waiting for 'coalesced_io_test', PID '92321'
        Waiting for 'cpuid_test', PID '92324'
        
        ...
        
        Waiting for 'hyperv_svm_test', PID '92552'
        SKIPPED hyperv_svm_test
        Waiting for 'hyperv_tlb_flush', PID '92563'
        FAILED hyperv_tlb_flush : ret ='254'
        Random seed: 0x6b8b4567
        ==== Test Assertion Failure ====
          x86_64/hyperv_tlb_flush.c:117: val == expected
          pid=92731 tid=93548 errno=4 - Interrupted system call
             1	0x0000000000411566: assert_on_unhandled_exception at processor.c:627
             2	0x000000000040889a: _vcpu_run at kvm_util.c:1649
             3	 (inlined by) vcpu_run at kvm_util.c:1660
             4	0x00000000004041a1: vcpu_thread at hyperv_tlb_flush.c:548
             5	0x000000000043a305: start_thread at pthread_create.o:?
             6	0x000000000045f857: __clone3 at ??:?
          val == expected
        Waiting for 'kvm_binary_stats_test', PID '92579'
        
        ...
        
        SKIPPED vmx_preemption_timer_test
        Waiting for 'vmx_set_nested_state_test', PID '93316'
        SKIPPED vmx_set_nested_state_test
        Waiting for 'vmx_tsc_adjust_test', PID '93329'
        SKIPPED vmx_tsc_adjust_test
        Waiting for 'xapic_ipi_test', PID '93350'
        Waiting for 'xapic_state_test', PID '93360'
        Waiting for 'xcr0_cpuid_test', PID '93374'
        Waiting for 'xen_shinfo_test', PID '93391'
        Waiting for 'xen_vmcall_test', PID '93405'
        Waiting for 'xss_msr_test', PID '93420'

It's far from perfect, e.g. it just waits in alphabetical order, but it gives me
easy-to-read feedback and a signal that tests are indeed running and completing.
        
> - Dump output which can be easily accessed and parsed.

And persist the output/logs somewhere, e.g. so that the user can triage failures
after the fact.

> - Allow running with different command line parameters.

Command line parameters for tests?  If so, I would put this in phase 3.  I.e. make
the goal of Phase 1 purely about running tests in parallel.

> The current patch does more than this and can be simplified.
> 
> Phase 2: Environment setup via runner
> 
> The current patch allows writing "setup" commands at the test suite and
> test level in the JSON config file to set up the environment needed by
> a test to run. This might not be ideal, as some settings are exposed
> differently on different platforms.
> 
> For example,
> To enable TDP:
> - Intel needs ept=Y
> - AMD needs npt=Y
> - ARM always on.
> 
> To enable APIC virtualization
> - Intel needs enable_apicv=Y
> - AMD needs avic=Y
> 
> To enable/disable nested virtualization, both have the same file name
> "nested" in their module params directory, and that is what should be
> changed.
> 
> These kinds of settings become verbose and are unnecessary on other
> platforms. Instead, the runner should have some programming constructs
> (API, command line options, defaults) to enable these options in a
> generic way. For example, enabling/disabling nested can be exposed as a
> command line option --enable_nested; based on the platform, the runner
> can then update the corresponding module param or ignore the option.
> 
> This will easily extend to providing sane configurations on the
> corresponding platforms without lots of hardcoding in JSON. These
> individual constructs will provide a generic view/option to exercise a
> KVM feature, and under the hood will do things differently based on the
> platform it is running on, like arm, x86-intel, x86-amd, s390, etc.

My main input on this front is that the runner needs to configure module params
(and other environment settings) _on behalf of the user_, i.e. in response to a
command line option (to the runner), not in response to per-test configurations.

One of my complaints with our internal infrastructure is that the testcases
themselves can dictate environment settings.  There are certainly benefits to
that approach, but it really only makes sense at scale where there are many
machines available, i.e. where the runner can achieve parallelism by running
tests on multiple machines, and where the complexity of managing the environment
on a per-test basis is worth the payout.

For the upstream runner, I want to cater to developers, i.e. to people that are
running tests on one or two machines.  And I want the runner to rip through tests
as fast as possible, i.e. I don't want tests to get serialized because each one
insists on being a special snowflake and doesn't play nice with other children.
Organizations that have a fleet of systems can pony up the resources to develop
their own support (on top?).

Selftests can and do check for module params, and should and do use TEST_REQUIRE()
to skip when a module param isn't set as needed.  Extending that to arbitrary
sysfs knobs should be trivial.  I.e. if we get _failures_ because of an incompatible
environment, then it's a test bug.

> Phase 3: Provide a collection of interesting configurations
> 
> Specific individual constructs can be combined in a meaningful way to
> provide interesting configurations to run on a platform. For example,
> the user doesn't need to specify each individual setting; instead, some
> prebuilt configurations can be exposed, like --stress_test_shadow_mmu
> or --test_basic_nested.

IMO, this shouldn't be baked into the runner, i.e. should not surface as dedicated
command line options.  Users shouldn't need to modify the runner just to bring
their own configuration.  I also think configurations should be discoverable,
e.g. not hardcoded like KUT's unittest.cfg.  A very real problem with KUT's
approach is that testing different combinations is frustratingly difficult,
because running a testcase with different configuration requires modifying a file
that is tracked by git.

There are underlying issues with KUT that essentially necessitate that approach,
e.g. x86 has several testcases that fail if run without the exact right config.
But that's just another reason to NOT follow KUT's pattern, e.g. to force us to
write robust tests.

E.g. instead of per-config command line options, let the user specify a file,
and/or a directory (using a well known filename pattern to detect configs).
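
E.g. something like this (sketch only; the "*.tests.json" pattern is
made up):

import glob
import json

def discover_configs(directory):
    # Pick up any user-provided config matching a well known filename pattern.
    configs = {}
    for path in sorted(glob.glob(directory + "/*.tests.json")):
        with open(path) as f:
            configs[path] = json.load(f)
    return configs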

> Tests need to handle the environment in which they are running
> gracefully, which many tests already do, but not exhaustively. If some
> setting is not provided or set up properly for their execution, then
> they should fail/skip accordingly.

This belongs in phase 2.

> The runner will not be responsible for prechecking things on the
> tests' behalf.
> 
> 
> Next steps:
> 1. Consensus on above phases and features.
> 2. Start development.
> 
> Thanks,
> Vipin