[RFC,kvm-unit-tests,0/4] add generic stress test

Message ID 20201223010850.111882-1-pbonzini@redhat.com (mailing list archive)

Message

Paolo Bonzini Dec. 23, 2020, 1:08 a.m. UTC
This short series adds a generic stress test to KVM unit tests that runs a
series of

The test could grow a lot more features, including:

- wrapping the stress test with a VMX or SVM veneer which would forward
  or inject interrupts periodically

- test perf events

- do some work in the MSI handler, so that they have a chance
  of overlapping

- use PV EOI

- play with TPR and self IPIs, similar to Windows DPCs.

The configuration of the test is set individually for each VCPU on
the command line, for example:

   ./x86/run x86/chaos.flat -smp 2 \
      -append 'invtlb=1,mem=12,hz=100  hz=250,edu=1,edu_hz=53,hlt' -device edu

runs a continuous INVLPG+write test on 1<<12 pages on CPU 0, interrupted
by a 100 Hz timer tick; and keeps CPU 1 mostly idle except for 250 timer
ticks and 53 edu device interrupts per second.
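
A minimal sketch of how one such per-VCPU group could be parsed, assuming each whitespace-separated group arrives as its own string. The struct, its field comments, and both function names are hypothetical and only mirror the option names in the example above; this is not the code in x86/chaos.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-VCPU settings; fields mirror the options in the example. */
struct chaos_cfg {
    int invtlb;   /* 1 = run the INVLPG+write loop */
    int mem;      /* log2 of the number of pages to touch */
    int hz;       /* timer ticks per second */
    int edu;      /* 1 = take edu device interrupts */
    int edu_hz;   /* edu interrupts per second */
    int hlt;      /* 1 = keep the CPU idle between interrupts (assumed meaning) */
};

/* Store one key/value pair; returns -1 for an unknown key. */
int set_option(struct chaos_cfg *cfg, const char *key, int val)
{
    if (!strcmp(key, "invtlb"))
        cfg->invtlb = val;
    else if (!strcmp(key, "mem"))
        cfg->mem = val;
    else if (!strcmp(key, "hz"))
        cfg->hz = val;
    else if (!strcmp(key, "edu"))
        cfg->edu = val;
    else if (!strcmp(key, "edu_hz"))
        cfg->edu_hz = val;
    else if (!strcmp(key, "hlt"))
        cfg->hlt = val;
    else
        return -1;
    return 0;
}

/* Parse one comma-separated group such as "invtlb=1,mem=12,hz=100". */
int parse_group(const char *group, struct chaos_cfg *cfg)
{
    memset(cfg, 0, sizeof(*cfg));
    while (*group) {
        size_t len = strcspn(group, ",");  /* length of this key[=value] item */
        char item[32];
        char *eq;
        int val = 1;                       /* a bare key such as "hlt" means 1 */

        if (len == 0 || len >= sizeof(item))
            return -1;
        memcpy(item, group, len);
        item[len] = '\0';

        eq = strchr(item, '=');
        if (eq) {
            *eq = '\0';
            val = atoi(eq + 1);
        }
        if (set_option(cfg, item, val))
            return -1;

        group += len;
        if (*group == ',')
            group++;
    }
    return 0;
}
```

With the example above, the group for CPU 0 would yield invtlb=1, mem=12 and hz=100 with everything else left at zero, and the group for CPU 1 would yield hz=250, edu=1, edu_hz=53 and hlt=1.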

For now, the test runs for an infinite time so it's not included in
unittests.cfg.  Do you think this is worth including in kvm-unit-tests,
and if so are you interested in non-x86 versions of it?  Or should the
code be as pluggable as possible to make it easier to port it?

Thanks,

Paolo

Paolo Bonzini (4):
  libcflat: add a few more runtime functions
  chaos: add generic stress test
  chaos: add timer interrupt to the workload
  chaos: add edu device interrupt to the workload

 lib/alloc.c         |   9 +-
 lib/alloc.h         |   1 +
 lib/libcflat.h      |   4 +-
 lib/string.c        |  59 +++++++++-
 lib/string.h        |   4 +
 lib/x86/processor.h |   2 +-
 x86/Makefile.x86_64 |   1 +
 x86/chaos.c         | 263 ++++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 337 insertions(+), 6 deletions(-)
 create mode 100644 x86/chaos.c

Comments

Sean Christopherson Dec. 28, 2020, 10:25 p.m. UTC | #1
On Wed, Dec 23, 2020, Paolo Bonzini wrote:
> This short series adds a generic stress test to KVM unit tests that runs a
> series of

Unintentional cliffhanger?

> The test could grow a lot more features, including:
> 
> - wrapping the stress test with a VMX or SVM veneer which would forward
>   or inject interrupts periodically
> 
> - test perf events
> 
> - do some work in the MSI handler, so that they have a chance
>   of overlapping
> 
> - use PV EOI
> 
> - play with TPR and self IPIs, similar to Windows DPCs.
> 
> The configuration of the test is set individually for each VCPU on
> the command line, for example:
> 
>    ./x86/run x86/chaos.flat -smp 2 \
>       -append 'invtlb=1,mem=12,hz=100  hz=250,edu=1,edu_hz=53,hlt' -device edu
> 
> runs a continuous INVLPG+write test on 1<<12 pages on CPU 0, interrupted
> by a 100 Hz timer tick; and keeps CPU 1 mostly idle except for 250 timer
> ticks and 53 edu device interrupts per second.

Maybe take the target cpu as part of the command line instead of implicitly
defining it via group position?  The "duplicate" hz=??? is confusing.  E.g.

    ./x86/run x86/chaos.flat -smp 2 \
      -append 'cpu=0,invtlb=1,mem=12,hz=100 cpu=1,hz=250,edu=1,edu_hz=53,hlt' -device edu
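
As a rough sketch of what the explicit form would need, the target VCPU could be looked up from a "cpu=N" item before the rest of the group is parsed. The helper name and the nr_cpus parameter are hypothetical, and "cpu" is only the key proposed here, not an option the posted series implements.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return the VCPU index named by a "cpu=N" item in a comma-separated group,
 * or -1 if the item is missing, malformed or out of range.
 * Hypothetical helper for the proposed syntax.
 */
int group_target_cpu(const char *group, int nr_cpus)
{
    while (*group) {
        size_t len = strcspn(group, ",");  /* length of this key[=value] item */

        if (len > 4 && !strncmp(group, "cpu=", 4)) {
            char *end;
            long cpu = strtol(group + 4, &end, 10);

            /* The number must end exactly at the item boundary. */
            if (end != group + len || cpu < 0 || cpu >= nr_cpus)
                return -1;
            return (int)cpu;
        }

        group += len;
        if (*group == ',')
            group++;
    }
    return -1;
}
```

A group without a "cpu=" item returns -1 here, so a caller could either reject it or fall back to the positional rule.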

> For now, the test runs for an infinite time so it's not included in
> unittests.cfg.  Do you think this is worth including in kvm-unit-tests,

What's the motivation for this type of test?  What class of bugs can it find
that won't be found by existing kvm-unit-tests or simple boot tests?

> and if so are you interested in non-x86 versions of it?  Or should the
> code be as pluggable as possible to make it easier to port it?

Paolo Bonzini Jan. 2, 2021, 8:46 a.m. UTC | #2
On 28/12/20 23:25, Sean Christopherson wrote:
> On Wed, Dec 23, 2020, Paolo Bonzini wrote:
>> This short series adds a generic stress test to KVM unit tests that runs a
>> series of
> 
> Unintentional cliffhanger?

... event injections, timer cycles, memory updates and TLB invalidations.

>> The configuration of the test is set individually for each VCPU on
>> the command line, for example:
>>
>>     ./x86/run x86/chaos.flat -smp 2 \
>>        -append 'invtlb=1,mem=12,hz=100  hz=250,edu=1,edu_hz=53,hlt' -device edu
>>
>> runs a continuous INVLPG+write test on 1<<12 pages on CPU 0, interrupted
>> by a 100 Hz timer tick; and keeps CPU 1 mostly idle except for 250 timer
>> ticks and 53 edu device interrupts per second.
> 
> Maybe take the target cpu as part of the command line instead of implicitly
> defining it via group position?

Sure, the command line syntax can be adjusted.

> The "duplicate" hz=??? is confusing.  E.g.
> 
>      ./x86/run x86/chaos.flat -smp 2 \
>        -append 'cpu=0,invtlb=1,mem=12,hz=100 cpu=1,hz=250,edu=1,edu_hz=53,hlt' -device edu
> 
>> For now, the test runs for an infinite time so it's not included in
>> unittests.cfg.  Do you think this is worth including in kvm-unit-tests,
> 
> What's the motivation for this type of test?  What class of bugs can it find
> that won't be found by existing kvm-unit-tests or simple boot tests?

Mostly live migration tests.  For example, Maxim found a corner case in 
KVM_GET_VCPU_EVENTS that affects both nVMX and nSVM live migration 
(patches coming), and it is quite hard to turn it into a selftest 
because it requires the ioctl to be invoked exactly when 
nested_run_pending==1.  Such a test would allow stress-testing live 
migration without having to set up L1 and L2 virtual machine images.

Paolo

Sean Christopherson Jan. 12, 2021, 10:28 p.m. UTC | #3
On Sat, Jan 02, 2021, Paolo Bonzini wrote:
> On 28/12/20 23:25, Sean Christopherson wrote:
> > On Wed, Dec 23, 2020, Paolo Bonzini wrote:
> > > This short series adds a generic stress test to KVM unit tests that runs a
> > > series of
> > 
> > Unintentional cliffhanger?
> 
> ... event injections, timer cycles, memory updates and TLB invalidations.
> 
> > > The configuration of the test is set individually for each VCPU on
> > > the command line, for example:
> > > 
> > >     ./x86/run x86/chaos.flat -smp 2 \
> > >        -append 'invtlb=1,mem=12,hz=100  hz=250,edu=1,edu_hz=53,hlt' -device edu
> > > 
> > > runs a continuous INVLPG+write test on 1<<12 pages on CPU 0, interrupted
> > > by a 100 Hz timer tick; and keeps CPU 1 mostly idle except for 250 timer
> > > ticks and 53 edu device interrupts per second.
> > 
> > Maybe take the target cpu as part of the command line instead of implicitly
> > defining it via group position?
> 
> Sure, the command line syntax can be adjusted.
> 
> > The "duplicate" hz=??? is confusing.  E.g.
> > 
> >      ./x86/run x86/chaos.flat -smp 2 \
> >        -append 'cpu=0,invtlb=1,mem=12,hz=100 cpu=1,hz=250,edu=1,edu_hz=53,hlt' -device edu
> > 
> > > For now, the test runs for an infinite time so it's not included in
> > > unittests.cfg.  Do you think this is worth including in kvm-unit-tests,
> > 
> > What's the motivation for this type of test?  What class of bugs can it find
> > that won't be found by existing kvm-unit-tests or simple boot tests?
> 
> Mostly live migration tests.  For example, Maxim found a corner case in
> KVM_GET_VCPU_EVENTS that affects both nVMX and nSVM live migration (patches
> coming), and it is quite hard to turn it into a selftest because it requires
> the ioctl to be invoked exactly when nested_run_pending==1.  Such a test
> would allow stress-testing live migration without having to set up L1 and L2
> virtual machine images.

Ah, so you run the stress test in L1 and then migrate L1?

What's the biggest hurdle for doing this completely within the unit test
framework?  Is teaching the framework to migrate a unit test the biggest pain?
Writing a "unit test" that puts an L2 guest into a busy loop doesn't seem _that_
bad.

Paolo Bonzini Jan. 13, 2021, 12:13 p.m. UTC | #4
On 12/01/21 23:28, Sean Christopherson wrote:
>>> What's the motivation for this type of test?  What class of bugs can it find
>>> that won't be found by existing kvm-unit-tests or simple boot tests?
>>
>> Mostly live migration tests.  For example, Maxim found a corner case in
>> KVM_GET_VCPU_EVENTS that affects both nVMX and nSVM live migration (patches
>> coming), and it is quite hard to turn it into a selftest because it requires
>> the ioctl to be invoked exactly when nested_run_pending==1.  Such a test
>> would allow stress-testing live migration without having to set up L1 and L2
>> virtual machine images.
> 
> Ah, so you run the stress test in L1 and then migrate L1?

Yes.  I can't exclude that it would find bugs without migration, but I 
hope we'd have stomped them by now.

> What's the biggest hurdle for doing this completely within the unit test
> framework?  Is teaching the framework to migrate a unit test the biggest pain?

Yes, pretty much.  The shell script framework would show its limits.

That said, I've always treated run_tests.sh as a utility more than an 
integral part of kvm-unit-tests.  There's nothing that prevents a more 
capable framework from parsing unittests.cfg.

Paolo

Sean Christopherson Jan. 14, 2021, 8:13 p.m. UTC | #5
On Wed, Jan 13, 2021, Paolo Bonzini wrote:
> On 12/01/21 23:28, Sean Christopherson wrote:
> > What's the biggest hurdle for doing this completely within the unit test
> > framework?  Is teaching the framework to migrate a unit test the biggest pain?
> 
> Yes, pretty much.  The shell script framework would show its limits.
> 
> That said, I've always treated run_tests.sh as a utility more than an
> integral part of kvm-unit-tests.  There's nothing that prevents a more
> capable framework from parsing unittests.cfg.

Heh, got anyone you can "volunteer" to create a new framework?  One-button
migration testing would be very nice to have.  I suspect I'm not the only
contributor that doesn't do migration testing as part of their standard workflow.

Paolo Bonzini Jan. 14, 2021, 9:12 p.m. UTC | #6
On 14/01/21 21:13, Sean Christopherson wrote:
> On Wed, Jan 13, 2021, Paolo Bonzini wrote:
>> On 12/01/21 23:28, Sean Christopherson wrote:
>>> What's the biggest hurdle for doing this completely within the unit test
>>> framework?  Is teaching the framework to migrate a unit test the biggest pain?
>>
>> Yes, pretty much.  The shell script framework would show its limits.
>>
>> That said, I've always treated run_tests.sh as a utility more than an
>> integral part of kvm-unit-tests.  There's nothing that prevents a more
>> capable framework from parsing unittests.cfg.
> 
> Heh, got anyone you can "volunteer" to create a new framework?  One-button
> migration testing would be very nice to have.  I suspect I'm not the only
> contributor that doesn't do migration testing as part of their standard workflow.

avocado-vt is the one I use for installation tests.  It can do a lot 
more, including migration, but it is a bit hard to set up.

avocado-qemu (python/qemu and tests/acceptance in the QEMU tree) is a 
lot simpler, but it does not have a lot of tests and in particular it is 
not integrated with kvm-unit-tests.

Maxim also wrote a script to automate his tests which has quite a few 
features, but I've never used it myself.

Paolo

Sean Christopherson Jan. 14, 2021, 10:13 p.m. UTC | #7
On Thu, Jan 14, 2021, Paolo Bonzini wrote:
> On 14/01/21 21:13, Sean Christopherson wrote:
> > On Wed, Jan 13, 2021, Paolo Bonzini wrote:
> > > On 12/01/21 23:28, Sean Christopherson wrote:
> > > > What's the biggest hurdle for doing this completely within the unit test
> > > > framework?  Is teaching the framework to migrate a unit test the biggest pain?
> > > 
> > > Yes, pretty much.  The shell script framework would show its limits.
> > > 
> > > That said, I've always treated run_tests.sh as a utility more than an
> > > integral part of kvm-unit-tests.  There's nothing that prevents a more
> > > capable framework from parsing unittests.cfg.
> > 
> > Heh, got anyone you can "volunteer" to create a new framework?  One-button
> > migration testing would be very nice to have.  I suspect I'm not the only
> > contributor that doesn't do migration testing as part of their standard workflow.
> 
> avocado-vt is the one I use for installation tests.  It can do a lot more,
> including migration, but it is a bit hard to set up.

Is avocado-vt the test stuff you were talking about at the KVM Forum BoF?

Paolo Bonzini Jan. 15, 2021, 1:15 p.m. UTC | #8
On 14/01/21 23:13, Sean Christopherson wrote:
> On Thu, Jan 14, 2021, Paolo Bonzini wrote:
>> On 14/01/21 21:13, Sean Christopherson wrote:
>>> On Wed, Jan 13, 2021, Paolo Bonzini wrote:
>>>> On 12/01/21 23:28, Sean Christopherson wrote:
>>>>> What's the biggest hurdle for doing this completely within the unit test
>>>>> framework?  Is teaching the framework to migrate a unit test the biggest pain?
>>>>
>>>> Yes, pretty much.  The shell script framework would show its limits.
>>>>
>>>> That said, I've always treated run_tests.sh as a utility more than an
>>>> integral part of kvm-unit-tests.  There's nothing that prevents a more
>>>> capable framework from parsing unittests.cfg.
>>>
>>> Heh, got anyone you can "volunteer" to create a new framework?  One-button
>>> migration testing would be very nice to have.  I suspect I'm not the only
>>> contributor that doesn't do migration testing as part of their standard workflow.
>>
>> avocado-vt is the one I use for installation tests.  It can do a lot more,
>> including migration, but it is a bit hard to set up.
> 
> Is avocado-vt the test stuff you were talking about at the KVM Forum BoF?

Yes, it is.

Paolo

Andrew Jones Jan. 18, 2021, 11:09 a.m. UTC | #9
On Thu, Jan 14, 2021 at 12:13:22PM -0800, Sean Christopherson wrote:
> On Wed, Jan 13, 2021, Paolo Bonzini wrote:
> > On 12/01/21 23:28, Sean Christopherson wrote:
> > > What's the biggest hurdle for doing this completely within the unit test
> > > framework?  Is teaching the framework to migrate a unit test the biggest pain?
> > 
> > Yes, pretty much.  The shell script framework would show its limits.
> > 
> > That said, I've always treated run_tests.sh as a utility more than an
> > integral part of kvm-unit-tests.  There's nothing that prevents a more
> > capable framework from parsing unittests.cfg.
> 
> Heh, got anyone you can "volunteer" to create a new framework?  One-button
> migration testing would be very nice to have.  I suspect I'm not the only
> contributor that doesn't do migration testing as part of their standard workflow.
>

We have one-button migration tests already with kvm-unit-tests. Just
compile the tests that use the migration framework as standalone
tests and then run them directly.

I agree, though, that Bash is a pain for some of the stuff we're trying
to do. However, we do have requests to keep the framework written in Bash,
because KVM testing is regularly done with simulators and even in embedded
environments. It's not desirable, or even possible, to have e.g. Python
everywhere we want kvm-unit-tests.

Thanks,
drew

Sean Christopherson Jan. 19, 2021, 5:37 p.m. UTC | #10
On Mon, Jan 18, 2021, Andrew Jones wrote:
> On Thu, Jan 14, 2021 at 12:13:22PM -0800, Sean Christopherson wrote:
> > On Wed, Jan 13, 2021, Paolo Bonzini wrote:
> > > On 12/01/21 23:28, Sean Christopherson wrote:
> > > > What's the biggest hurdle for doing this completely within the unit test
> > > > framework?  Is teaching the framework to migrate a unit test the biggest pain?
> > > 
> > > Yes, pretty much.  The shell script framework would show its limits.
> > > 
> > > That said, I've always treated run_tests.sh as a utility more than an
> > > integral part of kvm-unit-tests.  There's nothing that prevents a more
> > > capable framework from parsing unittests.cfg.
> > 
> > Heh, got anyone you can "volunteer" to create a new framework?  One-button
> > migration testing would be very nice to have.  I suspect I'm not the only
> > contributor that doesn't do migration testing as part of their standard workflow.
> >
> 
> We have one-button migration tests already with kvm-unit-tests. Just
> compile the tests that use the migration framework as standalone
> tests and then run them directly.

Do those exist/work for x86?  I see migration stuff for Arm and PPC, but nothing
for x86.

> I agree, though, that Bash is a pain for some of the stuff we're trying
> to do. However, we do have requests to keep the framework written in Bash,
> because KVM testing is regularly done with simulators and even in embedded
> environments. It's not desirable, or even possible, to have e.g. Python
> everywhere we want kvm-unit-tests.

True, I would probably be one of the people complaining if the tests started
requiring some newfangled language :-)

Andrew Jones Jan. 19, 2021, 6:40 p.m. UTC | #11
On Tue, Jan 19, 2021 at 09:37:19AM -0800, Sean Christopherson wrote:
> On Mon, Jan 18, 2021, Andrew Jones wrote:
> > On Thu, Jan 14, 2021 at 12:13:22PM -0800, Sean Christopherson wrote:
> > > On Wed, Jan 13, 2021, Paolo Bonzini wrote:
> > > > On 12/01/21 23:28, Sean Christopherson wrote:
> > > > > What's the biggest hurdle for doing this completely within the unit test
> > > > > framework?  Is teaching the framework to migrate a unit test the biggest pain?
> > > > 
> > > > Yes, pretty much.  The shell script framework would show its limits.
> > > > 
> > > > That said, I've always treated run_tests.sh as a utility more than an
> > > > integral part of kvm-unit-tests.  There's nothing that prevents a more
> > > > capable framework from parsing unittests.cfg.
> > > 
> > > Heh, got anyone you can "volunteer" to create a new framework?  One-button
> > > migration testing would be very nice to have.  I suspect I'm not the only
> > > contributor that doesn't do migration testing as part of their standard workflow.
> > >
> > 
> > We have one-button migration tests already with kvm-unit-tests. Just
> > compile the tests that use the migration framework as standalone
> > tests and then run them directly.
> 
> Do those exist/work for x86?  I see migration stuff for Arm and PPC, but nothing
> for x86.

Right, we don't have migration tests yet for x86. Of course that's just a
matter of programming... We'll also need to add an x86 __getchar() first.

Thanks,
drew