rcutorture: Tweak kvm options
diff mbox series

Message ID 20190424073446.8577-1-bigeasy@linutronix.de
State New
Headers show
Series
  • rcutorture: Tweak kvm options
Related show

Commit Message

Sebastian Andrzej Siewior April 24, 2019, 7:34 a.m. UTC
In one of my rcutorture tests the TSC clocksource got marked unstable
due to a large difference in the TSC value. I'm not sure if the guest
run for a long time with disabled interrupts or if the host was very
busy and didn't schedule the guest for some time.
I took a look on the qemu/KVM options and decided to update the options:
- Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
  for maximum available features but since we don't run any userland I'm
  not sure if it makes any difference.

- Drop the "noapic" option, enable TSC deadline timer. There is no
  history why the APIC was disabled, I see no reason for it. The
  deadline timer is probably "nicer".

- Additional config options. It ensures that the kernel knowns that it
  runs as a kvm guest and can use virt devices like the kvm-clock as
  clocksource. The kvm-clock was the main motivation here.

- I didn't add a random HW device. It would make the random device ready
  earlier (not it doesn't complete the initialisation at all) but I
  doubt that there is any need for this.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++-
 .../selftests/rcutorture/configs/rcu/CFcommon       |  4 ++++
 2 files changed, 16 insertions(+), 1 deletion(-)

Comments

Paul E. McKenney April 24, 2019, 10:38 a.m. UTC | #1
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
> In one of my rcutorture tests the TSC clocksource got marked unstable
> due to a large difference in the TSC value. I'm not sure if the guest
> run for a long time with disabled interrupts or if the host was very
> busy and didn't schedule the guest for some time.
> I took a look on the qemu/KVM options and decided to update the options:
> - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
>   for maximum available features but since we don't run any userland I'm
>   not sure if it makes any difference.
> 
> - Drop the "noapic" option, enable TSC deadline timer. There is no
>   history why the APIC was disabled, I see no reason for it. The
>   deadline timer is probably "nicer".
> 
> - Additional config options. It ensures that the kernel knowns that it
>   runs as a kvm guest and can use virt devices like the kvm-clock as
>   clocksource. The kvm-clock was the main motivation here.
> 
> - I didn't add a random HW device. It would make the random device ready
>   earlier (not it doesn't complete the initialisation at all) but I
>   doubt that there is any need for this.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Thank you, Sebastian!  Queued for review and testing.

							Thanx, Paul

> ---
>  tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++-
>  .../selftests/rcutorture/configs/rcu/CFcommon       |  4 ++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
> index 6bcb8b5b2ff22..be3c5c73d7e79 100644
> --- a/tools/testing/selftests/rcutorture/bin/functions.sh
> +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
> @@ -172,7 +172,7 @@ identify_qemu_append () {
>  	local console=ttyS0
>  	case "$1" in
>  	qemu-system-x86_64|qemu-system-i386)
> -		echo noapic selinux=0 initcall_debug debug
> +		echo selinux=0 initcall_debug debug
>  		;;
>  	qemu-system-aarch64)
>  		console=ttyAMA0
> @@ -191,8 +191,19 @@ identify_qemu_append () {
>  # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC
>  # and TORTURE_QEMU_INTERACTIVE environment variables.
>  identify_qemu_args () {
> +	local KVM_CPU=""
> +	case "$1" in
> +	qemu-system-x86_64)
> +		KVM_CPU=kvm64
> +		;;
> +	qemu-system-i386)
> +		KVM_CPU=kvm32
> +		;;
> +	esac
>  	case "$1" in
>  	qemu-system-x86_64|qemu-system-i386)
> +		echo -machine q35,accel=kvm
> +		echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
>  		;;
>  	qemu-system-aarch64)
>  		echo -machine virt,gic-version=host -cpu host
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> index d2d2a86139db1..322d5d40443cd 100644
> --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> @@ -1,2 +1,6 @@
>  CONFIG_RCU_TORTURE_TEST=y
>  CONFIG_PRINTK_TIME=y
> +CONFIG_HYPERVISOR_GUEST=y
> +CONFIG_PARAVIRT=y
> +CONFIG_PARAVIRT_SPINLOCKS=y
> +CONFIG_KVM_GUEST=y
> -- 
> 2.20.1
>
Paul E. McKenney April 24, 2019, 6:30 p.m. UTC | #2
On Wed, Apr 24, 2019 at 03:38:09AM -0700, Paul E. McKenney wrote:
> On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
> > In one of my rcutorture tests the TSC clocksource got marked unstable
> > due to a large difference in the TSC value. I'm not sure if the guest
> > run for a long time with disabled interrupts or if the host was very
> > busy and didn't schedule the guest for some time.
> > I took a look on the qemu/KVM options and decided to update the options:
> > - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
> >   for maximum available features but since we don't run any userland I'm
> >   not sure if it makes any difference.
> > 
> > - Drop the "noapic" option, enable TSC deadline timer. There is no
> >   history why the APIC was disabled, I see no reason for it. The
> >   deadline timer is probably "nicer".
> > 
> > - Additional config options. It ensures that the kernel knowns that it
> >   runs as a kvm guest and can use virt devices like the kvm-clock as
> >   clocksource. The kvm-clock was the main motivation here.
> > 
> > - I didn't add a random HW device. It would make the random device ready
> >   earlier (not it doesn't complete the initialisation at all) but I
> >   doubt that there is any need for this.
> > 
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> 
> Thank you, Sebastian!  Queued for review and testing.

And it doesn't like my (admittedly ancient) QEMU, complaining about not
knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
If I remove these, it works.  I will be upgrading soon (famous last
words), so what I am going to do is queue the following separate
not-for-upstream patch that makes it work on my setup.

							Thanx, Paul

------------------------------------------------------------------------

commit d33dffe7d1f05adfc798836a9d30b9f6bf69635d
Author: Paul E. McKenney <paulmck@linux.ibm.com>
Date:   Wed Apr 24 05:20:07 2019 -0700

    EXP torture: Hack patch to support ancient QEMU
    
    To be removed before upstreaming or when Paul gets his QEMU act together,
    whichever comes first.  ;-)
    
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index be3c5c73d7e7..c3a49fb4d6f6 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -203,7 +203,7 @@ identify_qemu_args () {
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
 		echo -machine q35,accel=kvm
-		echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
+		echo -cpu ${KVM_CPU}
 		;;
 	qemu-system-aarch64)
 		echo -machine virt,gic-version=host -cpu host
Joel Fernandes April 25, 2019, 4:45 p.m. UTC | #3
On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
> In one of my rcutorture tests the TSC clocksource got marked unstable
> due to a large difference in the TSC value. I'm not sure if the guest
> run for a long time with disabled interrupts or if the host was very
> busy and didn't schedule the guest for some time.
> I took a look on the qemu/KVM options and decided to update the options:
> - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
>   for maximum available features but since we don't run any userland I'm
>   not sure if it makes any difference.
> 
> - Drop the "noapic" option, enable TSC deadline timer. There is no
>   history why the APIC was disabled, I see no reason for it. The
>   deadline timer is probably "nicer".

I was wondering why the tsc deadline timer can't just be the default in the
kernel if it is "nicer" / "better" , and why does it need to be an option.

> 
> - Additional config options. It ensures that the kernel knowns that it
>   runs as a kvm guest and can use virt devices like the kvm-clock as
>   clocksource. The kvm-clock was the main motivation here.
> 
> - I didn't add a random HW device. It would make the random device ready
>   earlier (not it doesn't complete the initialisation at all) but I
>   doubt that there is any need for this.

Didn't follow this point about "random HW device". It looks like there is no
part of the patch that matches this comment. There was no "random HW device"
needed for this change of the clocksource, so could you clarify what this means?

Otherwise lgtm, thanks!

Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>

- Joel

> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
>  tools/testing/selftests/rcutorture/bin/functions.sh | 13 ++++++++++++-
>  .../selftests/rcutorture/configs/rcu/CFcommon       |  4 ++++
>  2 files changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
> index 6bcb8b5b2ff22..be3c5c73d7e79 100644
> --- a/tools/testing/selftests/rcutorture/bin/functions.sh
> +++ b/tools/testing/selftests/rcutorture/bin/functions.sh
> @@ -172,7 +172,7 @@ identify_qemu_append () {
>  	local console=ttyS0
>  	case "$1" in
>  	qemu-system-x86_64|qemu-system-i386)
> -		echo noapic selinux=0 initcall_debug debug
> +		echo selinux=0 initcall_debug debug
>  		;;
>  	qemu-system-aarch64)
>  		console=ttyAMA0
> @@ -191,8 +191,19 @@ identify_qemu_append () {
>  # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC
>  # and TORTURE_QEMU_INTERACTIVE environment variables.
>  identify_qemu_args () {
> +	local KVM_CPU=""
> +	case "$1" in
> +	qemu-system-x86_64)
> +		KVM_CPU=kvm64
> +		;;
> +	qemu-system-i386)
> +		KVM_CPU=kvm32
> +		;;
> +	esac
>  	case "$1" in
>  	qemu-system-x86_64|qemu-system-i386)
> +		echo -machine q35,accel=kvm
> +		echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
>  		;;
>  	qemu-system-aarch64)
>  		echo -machine virt,gic-version=host -cpu host
> diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> index d2d2a86139db1..322d5d40443cd 100644
> --- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> +++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
> @@ -1,2 +1,6 @@
>  CONFIG_RCU_TORTURE_TEST=y
>  CONFIG_PRINTK_TIME=y
> +CONFIG_HYPERVISOR_GUEST=y
> +CONFIG_PARAVIRT=y
> +CONFIG_PARAVIRT_SPINLOCKS=y
> +CONFIG_KVM_GUEST=y
> -- 
> 2.20.1
>
Sebastian Andrzej Siewior April 25, 2019, 5:13 p.m. UTC | #4
On 2019-04-25 12:45:58 [-0400], Joel Fernandes wrote:
> On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
> > In one of my rcutorture tests the TSC clocksource got marked unstable
> > due to a large difference in the TSC value. I'm not sure if the guest
> > run for a long time with disabled interrupts or if the host was very
> > busy and didn't schedule the guest for some time.
> > I took a look on the qemu/KVM options and decided to update the options:
> > - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
> >   for maximum available features but since we don't run any userland I'm
> >   not sure if it makes any difference.
> > 
> > - Drop the "noapic" option, enable TSC deadline timer. There is no
> >   history why the APIC was disabled, I see no reason for it. The
> >   deadline timer is probably "nicer".
> 
> I was wondering why the tsc deadline timer can't just be the default in the
> kernel if it is "nicer" / "better" , and why does it need to be an option.

The tsc-deadline=on part tells qemu to expose it. Otherwise the kernel
can't use HW that isn't there.
I added q35 as the machine which should pass enough sane default
options. If this tsc-deadline timer is a problem we could probably drop
it. The local-apic should work.

> > - I didn't add a random HW device. It would make the random device ready
> >   earlier (not it doesn't complete the initialisation at all) but I
> >   doubt that there is any need for this.
> 
> Didn't follow this point about "random HW device". It looks like there is no
> part of the patch that matches this comment. There was no "random HW device"
> needed for this change of the clocksource, so could you clarify what this means?

I wanted to "upgrade" the kvm options and as part of it also add:
         -object rng-random,filename=/dev/urandom,id=rng0
         -device virtio-rng-pci,rng=rng0

With that change you would see
|random: crng init done

during boot. Now you should end up with "only" 
|random: fast init done

I mentioned it because I didn't see a reason why to do so. If someone
has an idea why it would make sense for rcutorture to use it, I can add
it.

> Otherwise lgtm, thanks!
> 
> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> - Joel

Sebastian
Paul E. McKenney April 25, 2019, 7:46 p.m. UTC | #5
On Wed, Apr 24, 2019 at 11:30:39AM -0700, Paul E. McKenney wrote:
> On Wed, Apr 24, 2019 at 03:38:09AM -0700, Paul E. McKenney wrote:
> > On Wed, Apr 24, 2019 at 09:34:46AM +0200, Sebastian Andrzej Siewior wrote:
> > > In one of my rcutorture tests the TSC clocksource got marked unstable
> > > due to a large difference in the TSC value. I'm not sure if the guest
> > > run for a long time with disabled interrupts or if the host was very
> > > busy and didn't schedule the guest for some time.
> > > I took a look on the qemu/KVM options and decided to update the options:
> > > - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
> > >   for maximum available features but since we don't run any userland I'm
> > >   not sure if it makes any difference.
> > > 
> > > - Drop the "noapic" option, enable TSC deadline timer. There is no
> > >   history why the APIC was disabled, I see no reason for it. The
> > >   deadline timer is probably "nicer".
> > > 
> > > - Additional config options. It ensures that the kernel knowns that it
> > >   runs as a kvm guest and can use virt devices like the kvm-clock as
> > >   clocksource. The kvm-clock was the main motivation here.
> > > 
> > > - I didn't add a random HW device. It would make the random device ready
> > >   earlier (not it doesn't complete the initialisation at all) but I
> > >   doubt that there is any need for this.
> > > 
> > > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > 
> > Thank you, Sebastian!  Queued for review and testing.
> 
> And it doesn't like my (admittedly ancient) QEMU, complaining about not
> knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
> If I remove these, it works.  I will be upgrading soon (famous last
> words), so what I am going to do is queue the following separate
> not-for-upstream patch that makes it work on my setup.

Also, the !SMP scenarios get this:

:CONFIG_PARAVIRT_SPINLOCKS=y: improperly set

Would it make sense to only set this on CONFIG_SMP=y runs?  The easy
way to do this is to move it from CFcommon to the scenario files not
having CONFIG_SMP=n.  Or would something else work better?

Or am I doing something wrong?

							Thanx, Paul
Sebastian Andrzej Siewior April 26, 2019, 10:54 a.m. UTC | #6
On 2019-04-25 12:46:38 [-0700], Paul E. McKenney wrote:
> > And it doesn't like my (admittedly ancient) QEMU, complaining about not
> > knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
> > If I remove these, it works.  I will be upgrading soon (famous last
> > words), so what I am going to do is queue the following separate
> > not-for-upstream patch that makes it work on my setup.
> 
> Also, the !SMP scenarios get this:
> 
> :CONFIG_PARAVIRT_SPINLOCKS=y: improperly set

I ignored that (because there is a CHECK option to ensure that certain
options are set).

> Would it make sense to only set this on CONFIG_SMP=y runs?  The easy
> way to do this is to move it from CFcommon to the scenario files not
> having CONFIG_SMP=n.  Or would something else work better?
> 
> Or am I doing something wrong?

You should also get that warning on ARM for instance since those
PARAVIRT SPINLOCKS are only implemented on x86.
I have no numbers to compare the performance with and without that
option. So we could drop that option or you tell me where to look after
a run with/without that option so I can tell if it is worth the effort.

> 							Thanx, Paul

Sebastian
Paul E. McKenney April 26, 2019, 1:50 p.m. UTC | #7
On Fri, Apr 26, 2019 at 12:54:14PM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-04-25 12:46:38 [-0700], Paul E. McKenney wrote:
> > > And it doesn't like my (admittedly ancient) QEMU, complaining about not
> > > knowing about "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
> > > If I remove these, it works.  I will be upgrading soon (famous last
> > > words), so what I am going to do is queue the following separate
> > > not-for-upstream patch that makes it work on my setup.
> > 
> > Also, the !SMP scenarios get this:
> > 
> > :CONFIG_PARAVIRT_SPINLOCKS=y: improperly set
> 
> I ignored that (because there is a CHECK option to ensure that certain
> options are set).

True enough, but these things are a bit like compiler warnings in that
it is good to have a clean run.

> > Would it make sense to only set this on CONFIG_SMP=y runs?  The easy
> > way to do this is to move it from CFcommon to the scenario files not
> > having CONFIG_SMP=n.

(Which is how I am currently running, I should have added.)

> > having CONFIG_SMP=n.  Or would something else work better?
> > 
> > Or am I doing something wrong?
> 
> You should also get that warning on ARM for instance since those
> PARAVIRT SPINLOCKS are only implemented on x86.

OK, so we would want this to only be added for x86 runs.

> I have no numbers to compare the performance with and without that
> option. So we could drop that option or you tell me where to look after
> a run with/without that option so I can tell if it is worth the effort.

One place to look is in the summary output:

TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]

The "58.4667/s" is the number of grace periods per second.  I would be
surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in
grace-period rate (given the natural variation), but you never know.

							Thanx, Paul
Sebastian Andrzej Siewior April 29, 2019, 8:19 a.m. UTC | #8
On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
> One place to look is in the summary output:
> 
> TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
> 
> The "58.4667/s" is the number of grace periods per second.  I would be
> surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in
> grace-period rate (given the natural variation), but you never know.

I did four runs of the different parts of the patch:
- 5.1-rc6
- #1 + kvm64 CPU + some config options
- #2 + tsc-deadline=on and so on (the whole line)
- #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)

the test command was
	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G

and the results:
| HEAD is now at 085b7755808a... Linux 5.1-rc6
| (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16
| 27.738775
| 
| HEAD is now at 36a12aa9761a... tune #1
| (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16
| 27.6368375
| 
| HEAD is now at af5cd7196436... tune #2
| (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16
| 27.74371875
| 
| HEAD is now at 3701f64943f5... tune #3
| (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16
| 27.8187

This 28.… is the number of GP/s. Based on the results in looks like
noise to me. Also I have no idea why you have more than twice as many
GP/s as I do.
Different box doing:
	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G
and got this:

| HEAD is now at 085b7755808aa... Linux 5.1-rc6
| (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 +
| 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 +
| 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 +
| 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 +
| 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 +
| 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) /
| 42
|	~15.89452142857142857143
|
| HEAD is now at 3701f64943f5a... tune #3
| ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 +
| 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 +
| 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 +
| 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 +
| 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 +
| 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) /
| 42
|	~15.93586904761904761905

Noise.

So from RCUtorture point of view there is no improvement right? In that
case I would suggest to drop the problematic parts…

> 							Thanx, Paul

Sebastian
Paul E. McKenney April 29, 2019, 2:49 p.m. UTC | #9
On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
> On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
> > One place to look is in the summary output:
> > 
> > TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
> > 
> > The "58.4667/s" is the number of grace periods per second.  I would be
> > surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in
> > grace-period rate (given the natural variation), but you never know.
> 
> I did four runs of the different parts of the patch:
> - 5.1-rc6
> - #1 + kvm64 CPU + some config options
> - #2 + tsc-deadline=on and so on (the whole line)
> - #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
> 
> the test command was
> 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
> 
> and the results:
> | HEAD is now at 085b7755808a... Linux 5.1-rc6
> | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16
> | 27.738775
> | 
> | HEAD is now at 36a12aa9761a... tune #1
> | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16
> | 27.6368375
> | 
> | HEAD is now at af5cd7196436... tune #2
> | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16
> | 27.74371875
> | 
> | HEAD is now at 3701f64943f5... tune #3
> | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16
> | 27.8187
> 
> This 28.… is the number of GP/s. Based on the results in looks like
> noise to me. Also I have no idea why you have more than twice as many
> GP/s as I do.

My guess is that because you have more CPUs, the for_each_online_cpu()
loop takes longer on your system.

> Different box doing:
> 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G
> and got this:
> 
> | HEAD is now at 085b7755808aa... Linux 5.1-rc6
> | (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 +
> | 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 +
> | 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 +
> | 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 +
> | 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 +
> | 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) /
> | 42
> |	~15.89452142857142857143
> |
> | HEAD is now at 3701f64943f5a... tune #3
> | ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 +
> | 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 +
> | 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 +
> | 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 +
> | 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 +
> | 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) /
> | 42
> |	~15.93586904761904761905
> 
> Noise.
> 
> So from RCUtorture point of view there is no improvement right? In that
> case I would suggest to drop the problematic parts…

Thank you for testing this!  I will adjust accordingly.

							Thanx, Paul
Paul E. McKenney April 29, 2019, 3:06 p.m. UTC | #10
On Mon, Apr 29, 2019 at 07:49:24AM -0700, Paul E. McKenney wrote:
> On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
> > > One place to look is in the summary output:
> > > 
> > > TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
> > > 
> > > The "58.4667/s" is the number of grace periods per second.  I would be
> > > surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in
> > > grace-period rate (given the natural variation), but you never know.
> > 
> > I did four runs of the different parts of the patch:
> > - 5.1-rc6
> > - #1 + kvm64 CPU + some config options
> > - #2 + tsc-deadline=on and so on (the whole line)
> > - #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
> > 
> > the test command was
> > 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
> > 
> > and the results:
> > | HEAD is now at 085b7755808a... Linux 5.1-rc6
> > | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16
> > | 27.738775
> > | 
> > | HEAD is now at 36a12aa9761a... tune #1
> > | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16
> > | 27.6368375
> > | 
> > | HEAD is now at af5cd7196436... tune #2
> > | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16
> > | 27.74371875
> > | 
> > | HEAD is now at 3701f64943f5... tune #3
> > | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16
> > | 27.8187
> > 
> > This 28.… is the number of GP/s. Based on the results in looks like
> > noise to me. Also I have no idea why you have more than twice as many
> > GP/s as I do.
> 
> My guess is that because you have more CPUs, the for_each_online_cpu()
> loop takes longer on your system.

OK, that is rather oversimplified, to say the least.  A better way to
put this is that the probability of some CPU holding things up is larger
the more CPUs you have.  RCU doe take explicit steps to slow down grace
periods, but that doesn't start kicking in until 256 CPUs.

							Thanx, Paul

> > Different box doing:
> > 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "42*TREE01" --memory 4G
> > and got this:
> > 
> > | HEAD is now at 085b7755808aa... Linux 5.1-rc6
> > | (15.2878 + 15.8664 + 15.6369 + 15.6714 + 15.3667 + 16.4706 + 15.7844 +
> > | 16.2119 + 15.6108 + 15.84 + 16.0003 + 16.2886 + 15.8728 + 15.5347 +
> > | 15.6753 + 15.6628 + 15.8628 + 15.8158 + 15.8419 + 16.0053 + 15.7878 +
> > | 16.465 + 16.2267 + 16.6881 + 16.3186 + 16.1778 + 15.7069 + 16.0178 +
> > | 15.7156 + 16.0083 + 15.7181 + 15.8961 + 15.5253 + 16.1569 + 15.7692 +
> > | 16.2622 + 16.2931 + 15.9531 + 15.6697 + 15.4539 + 15.6478 + 15.8047) /
> > | 42
> > |	~15.89452142857142857143
> > |
> > | HEAD is now at 3701f64943f5a... tune #3
> > | ; (15.8461 + 15.8653 + 16.0392 + 15.8906 + 15.7606 + 15.6169 + 16.1425 +
> > | 15.9089 + 16.2169 + 16.1694 + 16.2122 + 15.6417 + 15.8022 + 16.1178 +
> > | 15.1689 + 16.1303 + 16.0181 + 16.3797 + 16.0614 + 16.2839 + 15.4583 +
> > | 15.9178 + 16.0589 + 16.3428 + 15.5486 + 16.0839 + 15.9931 + 15.8417 +
> > | 16.0981 + 15.8075 + 15.9925 + 15.7311 + 15.9172 + 16.1164 + 15.6303 +
> > | 15.9383 + 16.0714 + 16.2786 + 15.8394 + 15.9597 + 16.0175 + 15.3908) /
> > | 42
> > |	~15.93586904761904761905
> > 
> > Noise.
> > 
> > So from RCUtorture point of view there is no improvement right? In that
> > case I would suggest to drop the problematic parts…
> 
> Thank you for testing this!  I will adjust accordingly.
> 
> 							Thanx, Paul
Paul E. McKenney May 2, 2019, 6:41 p.m. UTC | #11
On Mon, Apr 29, 2019 at 08:06:00AM -0700, Paul E. McKenney wrote:
> On Mon, Apr 29, 2019 at 07:49:24AM -0700, Paul E. McKenney wrote:
> > On Mon, Apr 29, 2019 at 10:19:44AM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2019-04-26 06:50:58 [-0700], Paul E. McKenney wrote:
> > > > One place to look is in the summary output:
> > > > 
> > > > TREE01 ------- 17540 GPs (58.4667/s) [rcu: g130629 f0x0 ]
> > > > 
> > > > The "58.4667/s" is the number of grace periods per second.  I would be
> > > > surprised if CONFIG_PARAVIRT_SPINLOCKS made a noticeable difference in
> > > > grace-period rate (given the natural variation), but you never know.
> > > 
> > > I did four runs of the different parts of the patch:
> > > - 5.1-rc6
> > > - #1 + kvm64 CPU + some config options
> > > - #2 + tsc-deadline=on and so on (the whole line)
> > > - #3 + CONFIG_PARAVIRT_SPINLOCKS (now everything)
> > > 
> > > the test command was
> > > 	tools/testing/selftests/rcutorture/bin/kvm.sh --cpus 112 --duration 60 --configs "16*TREE08" --memory 4G
> > > 
> > > and the results:
> > > | HEAD is now at 085b7755808a... Linux 5.1-rc6
> > > | (28.5942 +27.4658 +28.0203 +27.2061 +28.0731 +26.9078 +27.8494 + 27.3392 +26.4339 +28.025 +27.4797 +27.6775 +28.0653 +28.0742 +27.9581 +28.6508)/ 16
> > > | 27.738775
> > > | 
> > > | HEAD is now at 36a12aa9761a... tune #1
> > > | (28.5761 +26.6514 +26.6989 +27.4375 +27.3442 +28.3228 +26.6353 +27.5461+28.5531 +27.7006 +27.8078 +27.9753 +27.4269 +28.0464 +27.6314 +27.8356) / 16
> > > | 27.6368375
> > > | 
> > > | HEAD is now at af5cd7196436... tune #2
> > > | (28.4867 +26.3675 +27.6364 +28.3344 +27.4153 +27.9306 +27.1703 +26.8461+27.3194 +28.5486 +27.8975 +27.4356 +28.12 +28.4397 +29.0186 +26.9328 )/ 16
> > > | 27.74371875
> > > | 
> > > | HEAD is now at 3701f64943f5... tune #3
> > > | (28.2431 +27.7831 +28.39 +28.2586 +27.7408 +27.9258 +26.6236 +26.7817+29.1178 +26.9564 +29.0525 +27.4258 +27.4931 +27.8928 +26.9308 +28.4833)/ 16
> > > | 27.8187
> > > 
> > > This 28.… is the number of GP/s. Based on the results in looks like
> > > noise to me. Also I have no idea why you have more than twice as many
> > > GP/s as I do.
> > 
> > My guess is that because you have more CPUs, the for_each_online_cpu()
> > loop takes longer on your system.
> 
> OK, that is rather oversimplified, to say the least.  A better way to
> put this is that the probability of some CPU holding things up is larger
> the more CPUs you have.  RCU doe take explicit steps to slow down grace
> periods, but that doesn't start kicking in until 256 CPUs.

And I ended up with the following variant of your patch.  If I don't
hear otherwise from you, I will assume that you are OK with it.  So if
something bothers you about it, please don't suffer in silence!

							Thanx, Paul

------------------------------------------------------------------------

commit 4a04229cf73ac9bc1ae15357beb32a0b37be1480
Author: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Date:   Wed Apr 24 09:34:46 2019 +0200

    rcutorture: Tweak kvm options
    
    In one of my rcutorture tests the TSC clocksource got marked unstable
    due to a large difference in the TSC value. I'm not sure if the guest
    run for a long time with disabled interrupts or if the host was very
    busy and didn't schedule the guest for some time.
    
    I took a look on the qemu/KVM options and decided to update the options:
    
    - Use kvm{32|64} as CPU. We could probably use `host' (like ARM does)
      for maximum available features but since we don't run any userland I'm
      not sure if it makes any difference.
    
    - Drop the "noapic" option. There is no history why the APIC was disabled,
      I see no reason for it.  Once old qemu versions fade away, we can add
      "x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on".
    
    - Additional config options. It ensures that the kernel knowns that it
      runs as a kvm guest and can use virt devices like the kvm-clock as
      clocksource. The kvm-clock was the main motivation here.
    
    - I didn't add a random HW device. It would make the random device ready
      earlier (not it doesn't complete the initialisation at all) but I
      doubt that there is any need for this.
    
    Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
    [ paulmck: The world is not quite ready for CONFIG_PARAVIRT_SPINLOCKS=y
      and x2apic, so they are omitted for the time being. ]
    Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 6bcb8b5b2ff2..c3a49fb4d6f6 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -172,7 +172,7 @@ identify_qemu_append () {
 	local console=ttyS0
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
-		echo noapic selinux=0 initcall_debug debug
+		echo selinux=0 initcall_debug debug
 		;;
 	qemu-system-aarch64)
 		console=ttyAMA0
@@ -191,8 +191,19 @@ identify_qemu_append () {
 # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC
 # and TORTURE_QEMU_INTERACTIVE environment variables.
 identify_qemu_args () {
+	local KVM_CPU=""
+	case "$1" in
+	qemu-system-x86_64)
+		KVM_CPU=kvm64
+		;;
+	qemu-system-i386)
+		KVM_CPU=kvm32
+		;;
+	esac
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
+		echo -machine q35,accel=kvm
+		echo -cpu ${KVM_CPU}
 		;;
 	qemu-system-aarch64)
 		echo -machine virt,gic-version=host -cpu host
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
index d2d2a86139db..e19a444a0684 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
@@ -1,2 +1,5 @@
 CONFIG_RCU_TORTURE_TEST=y
 CONFIG_PRINTK_TIME=y
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_PARAVIRT=y
+CONFIG_KVM_GUEST=y
Sebastian Andrzej Siewior May 3, 2019, 7:26 a.m. UTC | #12
On 2019-05-02 11:41:16 [-0700], Paul E. McKenney wrote:
> And I ended up with the following variant of your patch.  If I don't
> hear otherwise from you, I will assume that you are OK with it.  So if
> something bothers you about it, please don't suffer in silence!

perfect, thank you!

> 							Thanx, Paul

Sebastian

Patch
diff mbox series

diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 6bcb8b5b2ff22..be3c5c73d7e79 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -172,7 +172,7 @@  identify_qemu_append () {
 	local console=ttyS0
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
-		echo noapic selinux=0 initcall_debug debug
+		echo selinux=0 initcall_debug debug
 		;;
 	qemu-system-aarch64)
 		console=ttyAMA0
@@ -191,8 +191,19 @@  identify_qemu_append () {
 # Output arguments for qemu arguments based on the TORTURE_QEMU_MAC
 # and TORTURE_QEMU_INTERACTIVE environment variables.
 identify_qemu_args () {
+	local KVM_CPU=""
+	case "$1" in
+	qemu-system-x86_64)
+		KVM_CPU=kvm64
+		;;
+	qemu-system-i386)
+		KVM_CPU=kvm32
+		;;
+	esac
 	case "$1" in
 	qemu-system-x86_64|qemu-system-i386)
+		echo -machine q35,accel=kvm
+		echo -cpu ${KVM_CPU},x2apic=on,tsc-deadline=on,hypervisor=on,tsc_adjust=on
 		;;
 	qemu-system-aarch64)
 		echo -machine virt,gic-version=host -cpu host
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
index d2d2a86139db1..322d5d40443cd 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
+++ b/tools/testing/selftests/rcutorture/configs/rcu/CFcommon
@@ -1,2 +1,6 @@ 
 CONFIG_RCU_TORTURE_TEST=y
 CONFIG_PRINTK_TIME=y
+CONFIG_HYPERVISOR_GUEST=y
+CONFIG_PARAVIRT=y
+CONFIG_PARAVIRT_SPINLOCKS=y
+CONFIG_KVM_GUEST=y