
[v5,untested] kvm: better MWAIT emulation for guests

Message ID 20170316160157.GN14081@potion (mailing list archive)
State New, archived

Commit Message

Radim Krčmář March 16, 2017, 4:01 p.m. UTC
2017-03-16 16:35+0100, Radim Krčmář:
> 2017-03-16 10:58-0400, Gabriel L. Somlo:
>> The intel manual said the same thing back in 2010 as well. However,
>> regardless of how any flags were set, interrupt-window exiting or not,
>> "normal" L1 MWAIT behavior was that it woke up immediately regardless.
>> Remember, never going to sleep is still correct ("normal" ?) behavior
>> per the ISA definition of MWAIT :)
> 
> I'll write a simple kvm-unit-test to better understand why it is broken
> for you ...

Please get git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git

and try this, thanks!

---8<---
x86/mwait: crappy test

`./configure && make` to build it, then follow the comment in the code to
try a few cases.

---
 x86/Makefile.common |  1 +
 x86/mwait.c         | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)
 create mode 100644 x86/mwait.c

Comments

Gabriel L. Somlo March 16, 2017, 4:47 p.m. UTC | #1
On Thu, Mar 16, 2017 at 05:01:58PM +0100, Radim Krčmář wrote:
> 2017-03-16 16:35+0100, Radim Krčmář:
> > 2017-03-16 10:58-0400, Gabriel L. Somlo:
> >> The intel manual said the same thing back in 2010 as well. However,
> >> regardless of how any flags were set, interrupt-window exiting or not,
> >> "normal" L1 MWAIT behavior was that it woke up immediately regardless.
> >> Remember, never going to sleep is still correct ("normal" ?) behavior
> >> per the ISA definition of MWAIT :)
> > 
> > I'll write a simple kvm-unit-test to better understand why it is broken
> > for you ...
> 
> Please get git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
> 
> and try this, thanks!
> 
> ---8<---
> x86/mwait: crappy test
> 
> `./configure && make` to build it, then follow the comment in code to
> try few cases.

kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 1'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 1
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m10.564s
user    0m10.339s
sys     0m0.225s


and

kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 0'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 0
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m0.746s
user    0m0.555s
sys     0m0.200s

Both of these with Michael's v5 patch applied, on the MacPro1,1.

Similar behavior (0 1 1 takes 10 seconds, 0 1 0 returns immediately)
on the macbook air.

If I revert to the original (nop-emulated MWAIT) kvm source, I get
both versions to return immediately.

HTH,
--Gabriel



> 
> ---
>  x86/Makefile.common |  1 +
>  x86/mwait.c         | 41 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 42 insertions(+)
>  create mode 100644 x86/mwait.c
> 
> diff --git a/x86/Makefile.common b/x86/Makefile.common
> index 1dad18ba26e1..1e708a6acd39 100644
> --- a/x86/Makefile.common
> +++ b/x86/Makefile.common
> @@ -46,6 +46,7 @@ tests-common = $(TEST_DIR)/vmexit.flat $(TEST_DIR)/tsc.flat \
>                 $(TEST_DIR)/tsc_adjust.flat $(TEST_DIR)/asyncpf.flat \
>                 $(TEST_DIR)/init.flat $(TEST_DIR)/smap.flat \
>                 $(TEST_DIR)/hyperv_synic.flat $(TEST_DIR)/hyperv_stimer.flat \
> +               $(TEST_DIR)/mwait.flat \
>  
>  ifdef API
>  tests-common += api/api-sample
> diff --git a/x86/mwait.c b/x86/mwait.c
> new file mode 100644
> index 000000000000..c21dab5cc97d
> --- /dev/null
> +++ b/x86/mwait.c
> @@ -0,0 +1,41 @@
> +#include "vm.h"
> +
> +#define TARGET_RESUMES 10000
> +volatile unsigned page[4096 / 4];
> +
> +/*
> + * Execute
> + *   time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 1'
> + * (first two arguments are eax and ecx for MWAIT, the third is FLAGS.IF bit)
> + * I assume you have 1000 Hz scheduler, so the test should take about 10
> + * seconds to run if mwait works (host timer interrupts will kick mwait).
> + *
> + * If you get far less, then mwait is just nop, as in the case of
> + *
> + *   time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 0'
> + *
> + * All other combinations of arguments should take 10 seconds.
> + * Getting killed by the TIMEOUT most likely means that you have different HZ,
> + * but could also be a bug ...
> + */
> +int main(int argc, char **argv)
> +{
> +	uint32_t eax = atol(argv[1]);
> +	uint32_t ecx = atol(argv[2]);
> +	bool sti = atol(argv[3]);
> +	unsigned resumes = 0;
> +
> +	if (sti)
> +		asm volatile ("sti");
> +	else
> +		asm volatile ("cli");
> +
> +	while (resumes < TARGET_RESUMES) {
> +		asm volatile("monitor" :: "a" (page), "c" (0), "d" (0));
> +		asm volatile("mwait" :: "a" (eax), "c" (ecx));
> +		resumes++;
> +	}
> +
> +	report("resumed from mwait %u times", resumes == TARGET_RESUMES, resumes);
> +	return report_summary();
> +}
> -- 
> 2.11.0
>
Michael S. Tsirkin March 16, 2017, 5:27 p.m. UTC | #2
On Thu, Mar 16, 2017 at 12:47:50PM -0400, Gabriel L. Somlo wrote:
> On Thu, Mar 16, 2017 at 05:01:58PM +0100, Radim Krčmář wrote:
> > 2017-03-16 16:35+0100, Radim Krčmář:
> > > 2017-03-16 10:58-0400, Gabriel L. Somlo:
> > >> The intel manual said the same thing back in 2010 as well. However,
> > >> regardless of how any flags were set, interrupt-window exiting or not,
> > >> "normal" L1 MWAIT behavior was that it woke up immediately regardless.
> > >> Remember, never going to sleep is still correct ("normal" ?) behavior
> > >> per the ISA definition of MWAIT :)
> > > 
> > > I'll write a simple kvm-unit-test to better understand why it is broken
> > > for you ...
> > 
> > Please get git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
> > 
> > and try this, thanks!
> > 
> > ---8<---
> > x86/mwait: crappy test
> > 
> > `./configure && make` to build it, then follow the comment in code to
> > try few cases.
> 
> kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 1'
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 1
> enabling apic
> PASS: resumed from mwait 10000 times
> SUMMARY: 1 tests
> 
> real    0m10.564s
> user    0m10.339s
> sys     0m0.225s
> 
> 
> and
> 
> kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 0'
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 0
> enabling apic
> PASS: resumed from mwait 10000 times
> SUMMARY: 1 tests
> 
> real    0m0.746s
> user    0m0.555s
> sys     0m0.200s
> 
> Both of these with Michael's v5 patch applied, on the MacPro1,1.

Would it make sense to try to set ECX to 0? 0 0 1 and 0 0 0.


> Similar behavior (0 1 1 takes 10 seconds, 0 1 0 returns immediately)
> on the macbook air.
> 
> If I revert to the original (nop-emulated MWAIT) kvm source, I get
> both versions to return immediately.
> 
> HTH,
> --Gabriel
> 
> 
> 
Gabriel L. Somlo March 16, 2017, 5:41 p.m. UTC | #3
On Thu, Mar 16, 2017 at 07:27:34PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 16, 2017 at 12:47:50PM -0400, Gabriel L. Somlo wrote:
> > On Thu, Mar 16, 2017 at 05:01:58PM +0100, Radim Krčmář wrote:
> > > 2017-03-16 16:35+0100, Radim Krčmář:
> > > > 2017-03-16 10:58-0400, Gabriel L. Somlo:
> > > >> The intel manual said the same thing back in 2010 as well. However,
> > > >> regardless of how any flags were set, interrupt-window exiting or not,
> > > >> "normal" L1 MWAIT behavior was that it woke up immediately regardless.
> > > >> Remember, never going to sleep is still correct ("normal" ?) behavior
> > > >> per the ISA definition of MWAIT :)
> > > > 
> > > > I'll write a simple kvm-unit-test to better understand why it is broken
> > > > for you ...
> > > 
> > > Please get git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
> > > 
> > > and try this, thanks!
> > > 
> > > ---8<---
> > > x86/mwait: crappy test
> > > 
> > > `./configure && make` to build it, then follow the comment in code to
> > > try few cases.
> > 
> > kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 1'
> > timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 1
> > enabling apic
> > PASS: resumed from mwait 10000 times
> > SUMMARY: 1 tests
> > 
> > real    0m10.564s
> > user    0m10.339s
> > sys     0m0.225s
> > 
> > 
> > and
> > 
> > kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 0'
> > timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 1 0
> > enabling apic
> > PASS: resumed from mwait 10000 times
> > SUMMARY: 1 tests
> > 
> > real    0m0.746s
> > user    0m0.555s
> > sys     0m0.200s
> > 
> > Both of these with Michael's v5 patch applied, on the MacPro1,1.
> 
> Would it make sense to try to set ECX to 0? 0 0 1 and 0 0 0.

$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 0 1'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 0 1
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m10.567s
user    0m10.367s
sys     0m0.210s


$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 0 0'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 0 0 0
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m10.549s
user    0m10.352s
sys     0m0.206s

Both took 10 seconds.
 
> 
> > Similar behavior (0 1 1 takes 10 seconds, 0 1 0 returns immediately)
> > on the macbook air.
> > 
> > If I revert to the original (nop-emulated MWAIT) kvm source, I get
> > both versions to return immediately.
> > 
> > HTH,
> > --Gabriel
> > 
> > 
> > 
Michael S. Tsirkin March 16, 2017, 6:29 p.m. UTC | #4
Let's take a step back and try to figure out how mwait is
called. How about dumping the code the VCPUs execute
around mwait?  gdb's disassemble command will do this.
Gabriel L. Somlo March 16, 2017, 7:24 p.m. UTC | #5
On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> Let's take a step back and try to figure out how is
> mwait called. How about dumping code of VCPUs
> around mwait?  gdb disa command will do this.

Started guest with '-s', tried to attach from gdb with
"target remote localhost:1234", got
"remote 'g' packet reply is too long: <lengthy string of numbers>"

Tried typing 'cont' in the qemu monitor, got os x to crash:

panic (cpu 1 caller 0xffffff7f813ff488): pmLock: waited too long, held
by 0xffffff7f813eff65

Hmm, maybe that's where it keeps its monitor/mwait idle loop.
Restarted the guest, tried this from monitor:

	dump-guest-memory foobar 0xffffff7f813e0000 0x20000

Got "'dump-guest-memory' has failed: integer is for 32-bit values"

Hmmm... I have no idea what I'm doing anymore at this point... :)

--G
Michael S. Tsirkin March 16, 2017, 7:27 p.m. UTC | #6
On Thu, Mar 16, 2017 at 03:24:41PM -0400, Gabriel L. Somlo wrote:
> On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> > Let's take a step back and try to figure out how is
> > mwait called. How about dumping code of VCPUs
> > around mwait?  gdb disa command will do this.
> 
> Started guest with '-s', tried to attach from gdb with
> "target remote localhost:1234", got
> "remote 'g' packet reply is too long: <lengthy string of numbers>"

Try

set arch x86-64:x86-64


> Tried typing 'cont' in the qemu monitor, got os x to crash:
> 
> panic (cpu 1 caller 0xffffff7f813ff488): pmLock: waited too long, held
> by 0xffffff7f813eff65
> 
> Hmm, maybe that's where it keeps its monitor/mwait idle loop.
> Restarted the guest, tried this from monitor:
> 
> 	dump-guest-memory foobar 0xffffff7f813e0000 0x20000
> 
> Got "'dump-guest-memory' has failed: integer is for 32-bit values"
> 
> Hmmm... I have no idea what I'm doing anymore at this point... :)
> 
> --G

I think 0xffffff7f813ff488 is a PC.
Gabriel L. Somlo March 16, 2017, 8:17 p.m. UTC | #7
On Thu, Mar 16, 2017 at 09:27:56PM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 16, 2017 at 03:24:41PM -0400, Gabriel L. Somlo wrote:
> > On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> > > Let's take a step back and try to figure out how is
> > > mwait called. How about dumping code of VCPUs
> > > around mwait?  gdb disa command will do this.
> > 
> > Started guest with '-s', tried to attach from gdb with
> > "target remote localhost:1234", got
> > "remote 'g' packet reply is too long: <lengthy string of numbers>"
> 
> Try
> 
> set arch x86-64:x86-64

'set architecture i386:x86-64:intel' is what worked for me;

Been rooting around for a while, can't find mwait or monitor :(

Guess I'll have to recompile KVM to actually issue an invalid opcode,
so OS X will print a panic message with the exact address :)

Stay tuned...
 
> 
> > Tried typing 'cont' in the qemu monitor, got os x to crash:
> > 
> > panic (cpu 1 caller 0xffffff7f813ff488): pmLock: waited too long, held
> > by 0xffffff7f813eff65
> > 
> > Hmm, maybe that's where it keeps its monitor/mwait idle loop.
> > Restarted the guest, tried this from monitor:
> > 
> > 	dump-guest-memory foobar 0xffffff7f813e0000 0x20000
> > 
> > Got "'dump-guest-memory' has failed: integer is for 32-bit values"
> > 
> > Hmmm... I have no idea what I'm doing anymore at this point... :)
> > 
> > --G
> 
> I think 0xffffff7f813ff488 is a PC.
> 
> -- 
> MST
Gabriel L. Somlo March 16, 2017, 9:14 p.m. UTC | #8
On Thu, Mar 16, 2017 at 04:17:11PM -0400, Gabriel L. Somlo wrote:
> On Thu, Mar 16, 2017 at 09:27:56PM +0200, Michael S. Tsirkin wrote:
> > On Thu, Mar 16, 2017 at 03:24:41PM -0400, Gabriel L. Somlo wrote:
> > > On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> > > > Let's take a step back and try to figure out how is
> > > > mwait called. How about dumping code of VCPUs
> > > > around mwait?  gdb disa command will do this.
> > > 
> > > Started guest with '-s', tried to attach from gdb with
> > > "target remote localhost:1234", got
> > > "remote 'g' packet reply is too long: <lengthy string of numbers>"
> > 
> > Try
> > 
> > set arch x86-64:x86-64
> 
> 'set architecture i386:x86-64:intel' is what worked for me;
> 
> Been rooting around for a while, can't find mwait or monitor :(
> 
> Guess I'll have to recompile KVM to actually issue an invalid opcode,
> so OS X will print a panic message with the exact address :)
> 
> Stay tuned...

OK, so I found a few instances. The one closest to where a random
interrupt from gdb landed was this one:

...
   0xffffff7f813ff379:  mov    0x90(%r15),%rax
   0xffffff7f813ff380:  mov    0x18(%rax),%rsi
   0xffffff7f813ff384:  xor    %ecx,%ecx
   0xffffff7f813ff386:  mov    %rsi,%rax
   0xffffff7f813ff389:  xor    %edx,%edx
   0xffffff7f813ff38b:  monitor %rax,%rcx,%rdx
   0xffffff7f813ff38e:  test   %r14,%r14
   0xffffff7f813ff391:  je     0xffffff7f813ff3ad
   0xffffff7f813ff393:  movq   $0x0,0x8(%r14)
   0xffffff7f813ff39b:  movl   $0x0,(%r14)
   0xffffff7f813ff3a2:  test   %ebx,%ebx
   0xffffff7f813ff3a4:  je     0xffffff7f813ff3b2
   0xffffff7f813ff3a6:  mfence 
   0xffffff7f813ff3a9:  wbinvd
   0xffffff7f813ff3ab:  jmp    0xffffff7f813ff3b2
   0xffffff7f813ff3ad:  cmpl   $0x0,(%rsi)
   0xffffff7f813ff3b0:  jne    0xffffff7f813ff3d6
   0xffffff7f813ff3b2:  mov    %r12d,%eax
   0xffffff7f813ff3b5:  imul   $0x148,%rax,%rax
   0xffffff7f813ff3bc:  lea    0x153bd(%rip),%rcx        # 0xffffff7f81414780
   0xffffff7f813ff3c3:  mov    (%rcx),%rcx 
   0xffffff7f813ff3c6:  mov    0x20(%rcx),%rcx
   0xffffff7f813ff3ca:  mov    0xc(%rcx,%rax,1),%eax
   0xffffff7f813ff3ce:  mov    $0x1,%ecx
   0xffffff7f813ff3d3:  mwait  %rax,%rcx
=> 0xffffff7f813ff3d6:  lfence
   0xffffff7f813ff3d9:  rdtsc  
   0xffffff7f813ff3db:  lfence 
   0xffffff7f813ff3de:  mov    %rax,%rbx
   0xffffff7f813ff3e1:  mov    %rdx,%r15
...

Also, there were a few more within the range occupied by
AppleIntelCPUPowerManagement.kext (which provides the "smart"
idle loop used by OS X):


...
   0xffffff7f813f799a:  mov    0x90(%r15),%rax
   0xffffff7f813f79a1:  mov    0x18(%rax),%r15
   0xffffff7f813f79a5:  xor    %ecx,%ecx
   0xffffff7f813f79a7:  mov    %r15,%rax
   0xffffff7f813f79aa:  xor    %edx,%edx
   0xffffff7f813f79ac:  monitor %rax,%rcx,%rdx
   0xffffff7f813f79af:  mov    %r12d,%r12d
   0xffffff7f813f79b2:  imul   $0x148,%r12,%r13
   0xffffff7f813f79b9:  lea    0x1cdc0(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f813f79c0:  mov    (%rax),%rax
   0xffffff7f813f79c3:  mov    0x20(%rax),%rcx
   0xffffff7f813f79c7:  testb  $0x10,0x2(%rcx,%r13,1)
   0xffffff7f813f79cd:  je     0xffffff7f813f79d5
   0xffffff7f813f79cf:  callq  *0x80(%rax)
   0xffffff7f813f79d5:  test   %r14,%r14
   0xffffff7f813f79d8:  je     0xffffff7f813f79f4
   0xffffff7f813f79da:  movq   $0x0,0x8(%r14)
   0xffffff7f813f79e2:  movl   $0x0,(%r14)
   0xffffff7f813f79e9:  test   %ebx,%ebx
   0xffffff7f813f79eb:  je     0xffffff7f813f79fa
   0xffffff7f813f79ed:  mfence  
   0xffffff7f813f79f0:  wbinvd 
   0xffffff7f813f79f2:  jmp    0xffffff7f813f79fa
   0xffffff7f813f79f4:  cmpl   $0x0,(%r15)
   0xffffff7f813f79f8:  jne    0xffffff7f813f7a15
   0xffffff7f813f79fa:  lea    0x1cd7f(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f813f7a01:  mov    (%rax),%rax
   0xffffff7f813f7a04:  mov    0x20(%rax),%rax
   0xffffff7f813f7a08:  mov    0xc(%rax,%r13,1),%eax
   0xffffff7f813f7a0d:  mov    $0x1,%ecx
   0xffffff7f813f7a12:  mwait  %rax,%rcx
   0xffffff7f813f7a15:  lfence 
   0xffffff7f813f7a18:  rdtsc  
   0xffffff7f813f7a1a:  lfence 
   0xffffff7f813f7a1d:  mov    %rax,%rbx
   0xffffff7f813f7a20:  mov    %rdx,%r15
...

...
   0xffffff7f813f89c9:  xor    %ecx,%ecx
   0xffffff7f813f89cb:  mov    %r13,%rax
   0xffffff7f813f89ce:  xor    %edx,%edx
   0xffffff7f813f89d0:  monitor %rax,%rcx,%rdx
   0xffffff7f813f89d3:  mov    %r12d,%r15d
   0xffffff7f813f89d6:  imul   $0x148,%r15,%r12
   0xffffff7f813f89dd:  lea    0x1bd9c(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f813f89e4:  mov    (%rax),%rax
   0xffffff7f813f89e7:  mov    0x20(%rax),%rcx
   0xffffff7f813f89eb:  testb  $0x10,0x2(%rcx,%r12,1)
   0xffffff7f813f89f1:  je     0xffffff7f813f89f9
   0xffffff7f813f89f3:  callq  *0x80(%rax)
   0xffffff7f813f89f9:  test   %r14,%r14
   0xffffff7f813f89fc:  je     0xffffff7f813f8a18
   0xffffff7f813f89fe:  movq   $0x0,0x8(%r14)
   0xffffff7f813f8a06:  movl   $0x0,(%r14)
   0xffffff7f813f8a0d:  test   %ebx,%ebx
   0xffffff7f813f8a0f:  je     0xffffff7f813f8a1f
   0xffffff7f813f8a11:  mfence 
   0xffffff7f813f8a14:  wbinvd 
   0xffffff7f813f8a16:  jmp    0xffffff7f813f8a1f
   0xffffff7f813f8a18:  cmpl   $0x0,0x0(%r13)
   0xffffff7f813f8a1d:  jne    0xffffff7f813f8a3a
   0xffffff7f813f8a1f:  lea    0x1bd5a(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f813f8a26:  mov    (%rax),%rax
   0xffffff7f813f8a29:  mov    0x20(%rax),%rax
   0xffffff7f813f8a2d:  mov    0xc(%rax,%r12,1),%eax
   0xffffff7f813f8a32:  mov    $0x1,%ecx
   0xffffff7f813f8a37:  mwait  %rax,%rcx
   0xffffff7f813f8a3a:  lfence 
   0xffffff7f813f8a3d:  rdtsc  
   0xffffff7f813f8a3f:  lfence  
   0xffffff7f813f8a42:  mov    %rax,%rbx
   0xffffff7f813f8a45:  mov    %rdx,%r12
   0xffffff7f813f8a48:  shl    $0x20,%r12
...

...
   0xffffff7f81401c10:  mov    %r13,%rax
   0xffffff7f81401c13:  xor    %edx,%edx
   0xffffff7f81401c15:  monitor %rax,%rcx,%rdx
   0xffffff7f81401c18:  mov    %r12d,%r15d
   0xffffff7f81401c1b:  imul   $0x148,%r15,%r12
   0xffffff7f81401c22:  lea    0x12b57(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f81401c29:  mov    (%rax),%rax
   0xffffff7f81401c2c:  mov    0x20(%rax),%rcx
   0xffffff7f81401c30:  testb  $0x10,0x2(%rcx,%r12,1)
   0xffffff7f81401c36:  je     0xffffff7f81401c3e
   0xffffff7f81401c38:  callq  *0x80(%rax)
   0xffffff7f81401c3e:  test   %r14,%r14
   0xffffff7f81401c41:  je     0xffffff7f81401c5d
   0xffffff7f81401c43:  movq   $0x0,0x8(%r14)
   0xffffff7f81401c4b:  movl   $0x0,(%r14)
   0xffffff7f81401c52:  test   %ebx,%ebx
   0xffffff7f81401c54:  je     0xffffff7f81401c64
   0xffffff7f81401c56:  mfence 
   0xffffff7f81401c59:  wbinvd  
   0xffffff7f81401c5b:  jmp    0xffffff7f81401c64
   0xffffff7f81401c5d:  cmpl   $0x0,0x0(%r13)
   0xffffff7f81401c62:  jne    0xffffff7f81401c7f
   0xffffff7f81401c64:  lea    0x12b15(%rip),%rax        # 0xffffff7f81414780
   0xffffff7f81401c6b:  mov    (%rax),%rax
   0xffffff7f81401c6e:  mov    0x20(%rax),%rax
   0xffffff7f81401c72:  mov    0xc(%rax,%r12,1),%eax
   0xffffff7f81401c77:  mov    $0x1,%ecx
   0xffffff7f81401c7c:  mwait  %rax,%rcx
   0xffffff7f81401c7f:  lfence 
   0xffffff7f81401c82:  rdtsc  
   0xffffff7f81401c84:  lfence 
   0xffffff7f81401c87:  mov    %rax,%rbx
   0xffffff7f81401c8a:  mov    %rdx,%r12
   0xffffff7f81401c8d:  shl    $0x20,%r12
   0xffffff7f81401c91:  lea    0xaf1c(%rip),%rax        # 0xffffff7f8140cbb4
   0xffffff7f81401c98:  testb  $0x1,(%rax)
...

If that's not enough context, I can email you the whole 'script'
output I collected...

HTH,
--Gabriel
Michael S. Tsirkin March 17, 2017, 2:03 a.m. UTC | #9
On Thu, Mar 16, 2017 at 05:14:15PM -0400, Gabriel L. Somlo wrote:
> On Thu, Mar 16, 2017 at 04:17:11PM -0400, Gabriel L. Somlo wrote:
> > On Thu, Mar 16, 2017 at 09:27:56PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, Mar 16, 2017 at 03:24:41PM -0400, Gabriel L. Somlo wrote:
> > > > On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> > > > > Let's take a step back and try to figure out how is
> > > > > mwait called. How about dumping code of VCPUs
> > > > > around mwait?  gdb disa command will do this.
> > > > 
> > > > Started guest with '-s', tried to attach from gdb with
> > > > "target remote localhost:1234", got
> > > > "remote 'g' packet reply is too long: <lengthy string of numbers>"
> > > 
> > > Try
> > > 
> > > set arch x86-64:x86-64
> > 
> > 'set architecture i386:x86-64:intel' is what worked for me;
> > 
> > Been rooting around for a while, can't find mwait or monitor :(
> > 
> > Guess I'll have to recompile KVM to actually issue an invalid opcode,
> > so OS X will print a panic message with the exact address :)
> > 
> > Stay tuned...
> 
> OK, so I found a few instances. The one closest to where a random
> interrupt from gdb landed, was this one:
> 
> ...
>    0xffffff7f813ff379:  mov    0x90(%r15),%rax
>    0xffffff7f813ff380:  mov    0x18(%rax),%rsi
>    0xffffff7f813ff384:  xor    %ecx,%ecx
>    0xffffff7f813ff386:  mov    %rsi,%rax
>    0xffffff7f813ff389:  xor    %edx,%edx
>    0xffffff7f813ff38b:  monitor %rax,%rcx,%rdx
>    0xffffff7f813ff38e:  test   %r14,%r14
>    0xffffff7f813ff391:  je     0xffffff7f813ff3ad
>    0xffffff7f813ff393:  movq   $0x0,0x8(%r14)
>    0xffffff7f813ff39b:  movl   $0x0,(%r14)
>    0xffffff7f813ff3a2:  test   %ebx,%ebx
>    0xffffff7f813ff3a4:  je     0xffffff7f813ff3b2
>    0xffffff7f813ff3a6:  mfence 
>    0xffffff7f813ff3a9:  wbinvd
>    0xffffff7f813ff3ab:  jmp    0xffffff7f813ff3b2
>    0xffffff7f813ff3ad:  cmpl   $0x0,(%rsi)

Seems to do cmpl - could indicate it uses different bytes
for signalling? Radim's test monitors and
modifies the same byte...

>    0xffffff7f813ff3b0:  jne    0xffffff7f813ff3d6
>    0xffffff7f813ff3b2:  mov    %r12d,%eax
>    0xffffff7f813ff3b5:  imul   $0x148,%rax,%rax
>    0xffffff7f813ff3bc:  lea    0x153bd(%rip),%rcx        # 0xffffff7f81414780
>    0xffffff7f813ff3c3:  mov    (%rcx),%rcx 
>    0xffffff7f813ff3c6:  mov    0x20(%rcx),%rcx
>    0xffffff7f813ff3ca:  mov    0xc(%rcx,%rax,1),%eax
>    0xffffff7f813ff3ce:  mov    $0x1,%ecx
>    0xffffff7f813ff3d3:  mwait  %rax,%rcx
> => 0xffffff7f813ff3d6:  lfence
>    0xffffff7f813ff3d9:  rdtsc  
>    0xffffff7f813ff3db:  lfence 
>    0xffffff7f813ff3de:  mov    %rax,%rbx
>    0xffffff7f813ff3e1:  mov    %rdx,%r15
> ...

OK nice, so it's actually using 1 for ECX. Now what's rax?
Can you check that with gdb pls, then try that value with
Radim's test?
Gabriel L. Somlo March 17, 2017, 1:23 p.m. UTC | #10
On Fri, Mar 17, 2017 at 04:03:59AM +0200, Michael S. Tsirkin wrote:
> On Thu, Mar 16, 2017 at 05:14:15PM -0400, Gabriel L. Somlo wrote:
> > On Thu, Mar 16, 2017 at 04:17:11PM -0400, Gabriel L. Somlo wrote:
> > > On Thu, Mar 16, 2017 at 09:27:56PM +0200, Michael S. Tsirkin wrote:
> > > > On Thu, Mar 16, 2017 at 03:24:41PM -0400, Gabriel L. Somlo wrote:
> > > > > On Thu, Mar 16, 2017 at 08:29:32PM +0200, Michael S. Tsirkin wrote:
> > > > > > Let's take a step back and try to figure out how is
> > > > > > mwait called. How about dumping code of VCPUs
> > > > > > around mwait?  gdb disa command will do this.
> > > > > 
> > > > > Started guest with '-s', tried to attach from gdb with
> > > > > "target remote localhost:1234", got
> > > > > "remote 'g' packet reply is too long: <lengthy string of numbers>"
> > > > 
> > > > Try
> > > > 
> > > > set arch x86-64:x86-64
> > > 
> > > 'set architecture i386:x86-64:intel' is what worked for me;
> > > 
> > > Been rooting around for a while, can't find mwait or monitor :(
> > > 
> > > Guess I'll have to recompile KVM to actually issue an invalid opcode,
> > > so OS X will print a panic message with the exact address :)
> > > 
> > > Stay tuned...
> > 
> > OK, so I found a few instances. The one closest to where a random
> > interrupt from gdb landed, was this one:
> > 
> > ...
> >    0xffffff7f813ff379:  mov    0x90(%r15),%rax
> >    0xffffff7f813ff380:  mov    0x18(%rax),%rsi
> >    0xffffff7f813ff384:  xor    %ecx,%ecx
> >    0xffffff7f813ff386:  mov    %rsi,%rax
> >    0xffffff7f813ff389:  xor    %edx,%edx
> >    0xffffff7f813ff38b:  monitor %rax,%rcx,%rdx
> >    0xffffff7f813ff38e:  test   %r14,%r14
> >    0xffffff7f813ff391:  je     0xffffff7f813ff3ad
> >    0xffffff7f813ff393:  movq   $0x0,0x8(%r14)
> >    0xffffff7f813ff39b:  movl   $0x0,(%r14)
> >    0xffffff7f813ff3a2:  test   %ebx,%ebx
> >    0xffffff7f813ff3a4:  je     0xffffff7f813ff3b2
> >    0xffffff7f813ff3a6:  mfence 
> >    0xffffff7f813ff3a9:  wbinvd
> >    0xffffff7f813ff3ab:  jmp    0xffffff7f813ff3b2
> >    0xffffff7f813ff3ad:  cmpl   $0x0,(%rsi)
> 
> Seems to do cmpl - could indicate it uses different bytes
> for signalling? Radim's test monitors and
> modifies the same byte...
> 
> >    0xffffff7f813ff3b0:  jne    0xffffff7f813ff3d6
> >    0xffffff7f813ff3b2:  mov    %r12d,%eax
> >    0xffffff7f813ff3b5:  imul   $0x148,%rax,%rax
> >    0xffffff7f813ff3bc:  lea    0x153bd(%rip),%rcx        # 0xffffff7f81414780
> >    0xffffff7f813ff3c3:  mov    (%rcx),%rcx 
> >    0xffffff7f813ff3c6:  mov    0x20(%rcx),%rcx
> >    0xffffff7f813ff3ca:  mov    0xc(%rcx,%rax,1),%eax
> >    0xffffff7f813ff3ce:  mov    $0x1,%ecx
> >    0xffffff7f813ff3d3:  mwait  %rax,%rcx
> > => 0xffffff7f813ff3d6:  lfence
> >    0xffffff7f813ff3d9:  rdtsc  
> >    0xffffff7f813ff3db:  lfence 
> >    0xffffff7f813ff3de:  mov    %rax,%rbx
> >    0xffffff7f813ff3e1:  mov    %rdx,%r15
> > ...
> 
> OK nice, so it's actually using 1 for ECX. Now what's rax?
> Can you check that with gdb pls, then try that value with
> Radim's test?

Thread 1 received signal SIGINT, Interrupt.
0xffffff80002c8991 in ?? ()
(gdb) break *0xffffff7f813ff3ce
Breakpoint 1 at 0xffffff7f813ff3ce
(gdb) continue
Continuing.

Thread 3 hit Breakpoint 1, 0xffffff7f813ff3ce in ?? ()
(gdb) p $rax
$1 = 240
(gdb) cont
Continuing.
[Switching to Thread 1]

Thread 1 hit Breakpoint 1, 0xffffff7f813ff3ce in ?? ()
(gdb) p $rax
$2 = 240
(gdb) cont
Continuing.
[Switching to Thread 4]

Thread 4 hit Breakpoint 1, 0xffffff7f813ff3ce in ?? ()
(gdb) p $rax
$3 = 240
(gdb) cont 
Continuing.

Thread 4 hit Breakpoint 1, 0xffffff7f813ff3ce in ?? ()
(gdb) p $rax
$4 = 240
(gdb) 

So, 240 or 0xf0
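
For reference, a minimal sketch of how a hint value like 0xf0 splits into
its fields. It assumes the EAX layout from the Intel SDM's MWAIT
description (bits 7:4 select the target C-state, bits 3:0 the sub C-state,
and a C-state field of 0xf means C0) and ECX bit 0 meaning "treat interrupts
as break events even if masked". The struct and function names are made up
for illustration; nothing here comes from the test or from OS X.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; field layout assumed from the Intel SDM. */
struct mwait_hint {
	unsigned cstate_field;	/* EAX[7:4] as encoded */
	unsigned sub_cstate;	/* EAX[3:0] */
	int target_cstate;	/* 0 for C0, n for Cn */
};

static struct mwait_hint decode_mwait_hint(uint32_t eax)
{
	struct mwait_hint h;

	h.cstate_field = (eax >> 4) & 0xf;
	h.sub_cstate = eax & 0xf;
	/* Field value 0 means C1, 1 means C2, ...; 0xf means C0. */
	h.target_cstate = (h.cstate_field == 0xf) ? 0 : (int)h.cstate_field + 1;
	return h;
}

/* ECX bit 0: interrupts are break events even when EFLAGS.IF = 0. */
static int mwait_breaks_on_masked_irq(uint32_t ecx)
{
	return ecx & 1;
}

/* Self-check for the values seen in this thread (eax = 240, ecx = 1). */
static void check_hint_240(void)
{
	struct mwait_hint h = decode_mwait_hint(240);

	assert(h.cstate_field == 0xf);
	assert(h.sub_cstate == 0);
	assert(h.target_cstate == 0);
	assert(mwait_breaks_on_masked_irq(1));
}
```

Under these assumptions, 240 decodes to a C-state field of 0xf with
sub-state 0, i.e. a request for C0.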

OK, now on to Radim's test, on the MacPro1,1:

[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m0.746s
user    0m0.542s
sys     0m0.215s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0'
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0
enabling apic
PASS: resumed from mwait 10000 times
SUMMARY: 1 tests

real    0m0.743s
user    0m0.528s
sys     0m0.226s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1 -smp 2
enabling apic
enabling apic
FAIL: resumed from mwait 10150 times
SUMMARY: 1 tests, 1 unexpected failures

real    0m0.745s
user    0m0.545s
sys     0m0.214s
[kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0' -smp 2
timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0 -smp 2
enabling apic
enabling apic
FAIL: resumed from mwait 10116 times
SUMMARY: 1 tests, 1 unexpected failures

real    0m0.744s
user    0m0.541s
sys     0m0.217s

HTH,
--Gabriel
Michael S. Tsirkin March 21, 2017, 3:22 a.m. UTC | #11
On Fri, Mar 17, 2017 at 09:23:56AM -0400, Gabriel L. Somlo wrote:
> OK, now on to Radim's test, on the MacPro1,1:
> 
> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1'
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1
> enabling apic
> PASS: resumed from mwait 10000 times
> SUMMARY: 1 tests
> 
> real    0m0.746s
> user    0m0.542s
> sys     0m0.215s
> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0'
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0
> enabling apic
> PASS: resumed from mwait 10000 times
> SUMMARY: 1 tests
> 
> real    0m0.743s
> user    0m0.528s
> sys     0m0.226s
> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1' -smp 2
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1 -smp 2
> enabling apic
> enabling apic
> FAIL: resumed from mwait 10150 times
> SUMMARY: 1 tests, 1 unexpected failures
> 
> real    0m0.745s
> user    0m0.545s
> sys     0m0.214s
> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0' -smp 2
> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0 -smp 2
> enabling apic
> enabling apic
> FAIL: resumed from mwait 10116 times
> SUMMARY: 1 tests, 1 unexpected failures
> 
> real    0m0.744s
> user    0m0.541s
> sys     0m0.217s
> 
> HTH,
> --Gabriel

Weird. How can it go above 10000? Radim - any idea?
Radim Krčmář March 21, 2017, 4:58 p.m. UTC | #12
2017-03-21 05:22+0200, Michael S. Tsirkin:
> On Fri, Mar 17, 2017 at 09:23:56AM -0400, Gabriel L. Somlo wrote:
>> OK, now on to Radim's test, on the MacPro1,1:
>> 
>> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1'
>> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1
>> enabling apic
>> PASS: resumed from mwait 10000 times
>> SUMMARY: 1 tests
>> 
>> real    0m0.746s
>> user    0m0.542s
>> sys     0m0.215s
>> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0'
>> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0
>> enabling apic
>> PASS: resumed from mwait 10000 times
>> SUMMARY: 1 tests
>> 
>> real    0m0.743s
>> user    0m0.528s
>> sys     0m0.226s
>> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 1' -smp 2
>> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 1 -smp 2
>> enabling apic
>> enabling apic
>> FAIL: resumed from mwait 10150 times
>> SUMMARY: 1 tests, 1 unexpected failures
>> 
>> real    0m0.745s
>> user    0m0.545s
>> sys     0m0.214s
>> [kvm-unit-tests]$ time TIMEOUT=20 ./x86-run x86/mwait.flat -append '240 1 0' -smp 2
>> timeout -k 1s --foreground 20 qemu-kvm -nodefaults -enable-kvm -device pc-testdev -device isa-debug-exit,iobase=0xf4,iosize=0x4 -vnc none -serial stdio -device pci-testdev -kernel x86/mwait.flat -append 240 1 0 -smp 2
>> enabling apic
>> enabling apic
>> FAIL: resumed from mwait 10116 times
>> SUMMARY: 1 tests, 1 unexpected failures
>> 
>> real    0m0.744s
>> user    0m0.541s
>> sys     0m0.217s
>> 
>> HTH,
>> --Gabriel
> 
> Weird. How can it go above 10000? Radim - any idea?

In '-smp 2', the writing VCPU always does 10000 wakeups by writing into
monitored memory, but the mwaiting VCPU can also be woken up by host
interrupts, which might add a few exits depending on timing.

I didn't spend much time on making PASS/FAIL meaningful, or on ensuring
that we only get 10000 wakeups ... it is nothing to be worried about.
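
If one wanted the verdict to tolerate those extra wakeups, a minimal sketch
could accept any count at or above the target. This is not what the posted
test does; `resumes_ok` is a hypothetical helper, and treating "too many"
as fine is an assumption about what the test should mean.

```c
#include <assert.h>
#include <stdbool.h>

#define TARGET_RESUMES 10000

/*
 * Hypothetical helper, not part of the posted test: accept extra wakeups
 * (e.g. from host interrupts) and only fail when there were too few.
 */
static bool resumes_ok(unsigned resumes, unsigned target)
{
	assert(target > 0);
	return resumes >= target;
}
```

With such a check, the 10150 and 10116 counts from the '-smp 2' runs would
pass, while anything below 10000 would still fail.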

Hint 240 behaves as nop even on my system, so I still don't find
anything insane on that machine (if OS X is excluded) ...
Nadav Amit March 21, 2017, 5:29 p.m. UTC | #13
> On Mar 21, 2017, at 9:58 AM, Radim Krčmář <rkrcmar@redhat.com> wrote:

> In '-smp 2', the writing VCPU always does 10000 wakeups by writing into
> monitored memory, but the mwaiting VCPU can also be woken up by host
> interrupts, which might add a few exits depending on timing.
> 
> I didn't spend much time on making PASS/FAIL meaningful, or on ensuring
> that we only get 10000 wakeups ... it is nothing to be worried about.
> 
> Hint 240 behaves as nop even on my system, so I still don't find
> anything insane on that machine (if OS X is excluded) ...

From my days at Intel (10 years ago), I can say that MWAIT wakes for many
microarchitectural events besides interrupts.

Out of curiosity, aren’t you worried that on OS X the wbinvd causes an exit
after the monitor and before the mwait?

Patch

diff --git a/x86/Makefile.common b/x86/Makefile.common
index 1dad18ba26e1..1e708a6acd39 100644
--- a/x86/Makefile.common
+++ b/x86/Makefile.common
@@ -46,6 +46,7 @@  tests-common = $(TEST_DIR)/vmexit.flat $(TEST_DIR)/tsc.flat \
                $(TEST_DIR)/tsc_adjust.flat $(TEST_DIR)/asyncpf.flat \
                $(TEST_DIR)/init.flat $(TEST_DIR)/smap.flat \
                $(TEST_DIR)/hyperv_synic.flat $(TEST_DIR)/hyperv_stimer.flat \
+               $(TEST_DIR)/mwait.flat \
 
 ifdef API
 tests-common += api/api-sample
diff --git a/x86/mwait.c b/x86/mwait.c
new file mode 100644
index 000000000000..c21dab5cc97d
--- /dev/null
+++ b/x86/mwait.c
@@ -0,0 +1,41 @@ 
+#include "vm.h"
+
+#define TARGET_RESUMES 10000
+volatile unsigned page[4096 / 4];
+
+/*
+ * Execute
+ *   time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 1'
+ * (first two arguments are eax and ecx for MWAIT, the third is FLAGS.IF bit)
+ * I assume you have 1000 Hz scheduler, so the test should take about 10
+ * seconds to run if mwait works (host timer interrupts will kick mwait).
+ *
+ * If you get far less, then mwait is just nop, as in the case of
+ *
+ *   time TIMEOUT=20 ./x86-run x86/mwait.flat -append '0 1 0'
+ *
+ * All other combinations of arguments should take 10 seconds.
+ * Getting killed by the TIMEOUT most likely means that you have different HZ,
+ * but could also be a bug ...
+ */
+int main(int argc, char **argv)
+{
+	uint32_t eax = atol(argv[1]);
+	uint32_t ecx = atol(argv[2]);
+	bool sti = atol(argv[3]);
+	unsigned resumes = 0;
+
+	if (sti)
+		asm volatile ("sti");
+	else
+		asm volatile ("cli");
+
+	while (resumes < TARGET_RESUMES) {
+		asm volatile("monitor" :: "a" (page), "c" (0), "d" (0));
+		asm volatile("mwait" :: "a" (eax), "c" (ecx));
+		resumes++;
+	}
+
+	report("resumed from mwait %u times", resumes == TARGET_RESUMES, resumes);
+	return report_summary();
+}