[RFC,V2] kvm: x86: reduce rtc 0x70 access vm-exit time

Message ID 1502653881-116299-1-git-send-email-peng.hao2@zte.com.cn (mailing list archive)
State New, archived

Commit Message

Peng Hao Aug. 13, 2017, 7:51 p.m. UTC
Some versions of Windows guests access the RTC frequently because they
use it as the system tick. The guest accesses the RTC like this: it writes
the register index to port 0x70, then reads or writes data at port 0x71.
The write to port 0x70 only sets the index and does nothing else, so we
can use coalesced mmio to handle this case and reduce VM-exit time.

Without my patch, the VM-exit time for accessing RTC port 0x70, measured
with perf tools (guest OS: Windows 7 64-bit):
IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
0x70:POUT       86       30.99%    74.59%  9us       29us      10.75us (+- 3.41%)

With my patch:
IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
0x70:POUT       106      32.02%    29.47%  0us       10us      1.57us (+- 7.38%)

This patch is one part of optimizing RTC port 0x70 access. The other part
is in QEMU.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
---
 Documentation/virtual/kvm/00-INDEX          |  2 ++
 Documentation/virtual/kvm/coalesced-pio.txt | 36 +++++++++++++++++++++++++++++
 include/uapi/linux/kvm.h                    |  5 ++--
 virt/kvm/coalesced_mmio.c                   | 16 ++++++++++---
 virt/kvm/kvm_main.c                         |  2 ++
 5 files changed, 56 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/virtual/kvm/coalesced-pio.txt

Comments

Paolo Bonzini Aug. 14, 2017, 7:12 a.m. UTC | #1
On 13/08/2017 21:51, Peng Hao wrote:
> Some versions of Windows guests access the RTC frequently because they
> use it as the system tick. The guest accesses the RTC like this: it writes
> the register index to port 0x70, then reads or writes data at port 0x71.
> The write to port 0x70 only sets the index and does nothing else, so we
> can use coalesced mmio to handle this case and reduce VM-exit time.
>
> Without my patch, the VM-exit time for accessing RTC port 0x70, measured
> with perf tools (guest OS: Windows 7 64-bit):
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT       86       30.99%    74.59%  9us       29us      10.75us (+- 3.41%)
>
> With my patch:
> IO Port Access  Samples  Samples%  Time%   Min Time  Max Time  Avg time
> 0x70:POUT       106      32.02%    29.47%  0us       10us      1.57us (+- 7.38%)
>
> This patch is one part of optimizing RTC port 0x70 access. The other part
> is in QEMU.
> 
> Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>

Very nice, thanks.

The new documentation file can be used later as a base for documentation
of coalesced MMIO ioctls.  Here is an edited version:

----
Coalesced MMIO and coalesced PIO can be used to optimize writes to
simple device registers.  Writes to a coalesced-I/O region are not
reported to userspace until the next non-coalesced I/O is issued, in a
similar fashion to write combining hardware.  In KVM, coalesced writes
are handled in the kernel without exits to userspace, and are thus
several times faster.

Examples of devices that can benefit from coalesced I/O include:

- devices whose memory is accessed with many consecutive writes, for
example the EGA/VGA video RAM.

- windowed I/O, such as the real-time clock.  The address register (port
0x70 in the RTC case) can use coalesced I/O, cutting the number of
userspace exits by half when reading or writing the RTC.
----

Paolo
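
The record-then-drain behaviour described in the edited text above can be sketched as a small single-threaded model. This is not the kernel implementation: `RING_MAX`, `struct entry`, `struct ring`, `ring_record`, `ring_drain`, and `apply_rtc_index` are hypothetical names chosen for illustration, and the model omits the locking and memory barriers that a ring shared between kernel and userspace needs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RING_MAX 8	/* hypothetical; stands in for KVM_COALESCED_MMIO_MAX */

/* One recorded write; the fields mirror struct kvm_coalesced_mmio in the patch. */
struct entry {
	uint64_t addr;
	uint32_t len;
	uint32_t pio;		/* 1 = port I/O, 0 = MMIO */
	uint8_t  data[8];
};

struct ring {
	uint32_t first;		/* next entry the consumer will replay */
	uint32_t last;		/* next slot the producer will fill */
	struct entry e[RING_MAX];
};

/*
 * Producer side (the kernel's role): record a write without notifying
 * the consumer. Returns 0 on success, or -1 when the write does not fit,
 * in which case a real hypervisor would fall back to a normal exit.
 */
static int ring_record(struct ring *r, uint64_t addr, uint32_t pio,
		       const void *val, uint32_t len)
{
	uint32_t next;
	struct entry *e;

	assert(r != NULL && val != NULL);
	next = (r->last + 1) % RING_MAX;
	e = &r->e[r->last];

	if (next == r->first || len > sizeof(e->data))
		return -1;

	e->addr = addr;
	e->len = len;
	e->pio = pio;
	memcpy(e->data, val, len);
	r->last = next;		/* publish only after the entry is complete */
	return 0;
}

/*
 * Consumer side (userspace's role): replay pending writes, oldest first,
 * before handling the I/O that caused the exit. Returns the number replayed.
 */
static unsigned ring_drain(struct ring *r,
			   void (*apply)(const struct entry *, void *),
			   void *opaque)
{
	unsigned n = 0;

	while (r->first != r->last) {
		apply(&r->e[r->first], opaque);
		r->first = (r->first + 1) % RING_MAX;
		n++;
	}
	return n;
}

/* Example callback: latch a coalesced write to port 0x70 as the RTC index. */
static void apply_rtc_index(const struct entry *e, void *opaque)
{
	if (e->pio && e->addr == 0x70 && e->len == 1)
		*(uint8_t *)opaque = e->data[0];
}
```

In this model a caller would invoke `ring_record()` for each coalesced write and call `ring_drain()` with `apply_rtc_index` at the start of the next non-coalesced exit, so the device model sees the index write before it services the data port.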
Patch

diff --git a/Documentation/virtual/kvm/00-INDEX b/Documentation/virtual/kvm/00-INDEX
index 69fe1a8..3e4a49b 100644
--- a/Documentation/virtual/kvm/00-INDEX
+++ b/Documentation/virtual/kvm/00-INDEX
@@ -4,6 +4,8 @@  api.txt
 	- KVM userspace API.
 cpuid.txt
 	- KVM-specific cpuid leaves (x86).
+coalesced-pio.txt
+	- KVM_CAP_COALESCED_PIO
 devices/
 	- KVM_CAP_DEVICE_CTRL userspace API.
 halt-polling.txt
diff --git a/Documentation/virtual/kvm/coalesced-pio.txt b/Documentation/virtual/kvm/coalesced-pio.txt
new file mode 100644
index 0000000..34bf50a
--- /dev/null
+++ b/Documentation/virtual/kvm/coalesced-pio.txt
@@ -0,0 +1,36 @@ 
+Linux KVM Coalesced PIO:
+========================
+Coalesced pio is based on coalesced mmio. When a write to a port of a
+device does only simple work or can be deferred, and is followed by another
+access to the same device, the write can use the coalesced pio optimization.
+
+With coalesced pio, a write to port A of a device does not exit to
+userspace; the kernel just records the value in the shared coalesced ring.
+A following access to port B of the same device, or a read of port A,
+exits to userspace, which first applies the recorded write to port A
+from the shared coalesced ring.
+
+Currently coalesced pio is used only for the rtc 0x70 port; other suitable
+ports may be identified in the future.
+
+For example, with rtc port 0x70:
+Without coalesced pio:
+  guest write port 0x70
+    vm_exit to userspace
+      userspace set register index: regindex
+    vm_entry
+  guest read or write 0x71
+    vm_exit to userspace
+      userspace get/set rtc registers[regindex]
+    vm_entry
+
+With coalesced pio:
+  guest write port 0x70
+    vm_exit to kernel
+    vm_entry
+  guest read or write 0x71
+    vm_exit to userspace
+      userspace set register index: regindex
+      userspace get/set rtc registers[regindex]
+    vm_entry
+
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6cd63c1..3e681f9 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -415,13 +415,13 @@  struct kvm_run {
 struct kvm_coalesced_mmio_zone {
 	__u64 addr;
 	__u32 size;
-	__u32 pad;
+	__u32 pio;
 };
 
 struct kvm_coalesced_mmio {
 	__u64 phys_addr;
 	__u32 len;
-	__u32 pad;
+	__u32 pio;
 	__u8  data[8];
 };
 
@@ -929,6 +929,7 @@  struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_SMT_POSSIBLE 147
 #define KVM_CAP_HYPERV_SYNIC2 148
 #define KVM_CAP_HYPERV_VP_INDEX 149
+#define KVM_CAP_COALESCED_PIO 150
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 571c1ce..0811907 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -82,6 +82,7 @@  static int coalesced_mmio_write(struct kvm_vcpu *vcpu,
 	ring->coalesced_mmio[ring->last].phys_addr = addr;
 	ring->coalesced_mmio[ring->last].len = len;
 	memcpy(ring->coalesced_mmio[ring->last].data, val, len);
+	ring->coalesced_mmio[ring->last].pio = dev->zone.pio;
 	smp_wmb();
 	ring->last = (ring->last + 1) % KVM_COALESCED_MMIO_MAX;
 	spin_unlock(&dev->kvm->ring_lock);
@@ -148,8 +149,12 @@  int kvm_vm_ioctl_register_coalesced_mmio(struct kvm *kvm,
 	dev->zone = *zone;
 
 	mutex_lock(&kvm->slots_lock);
-	ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
-				      zone->size, &dev->dev);
+	if (zone->pio == 0)
+		ret = kvm_io_bus_register_dev(kvm, KVM_MMIO_BUS, zone->addr,
+						zone->size, &dev->dev);
+	else
+		ret = kvm_io_bus_register_dev(kvm, KVM_PIO_BUS, zone->addr,
+						zone->size, &dev->dev);
 	if (ret < 0)
 		goto out_free_dev;
 	list_add_tail(&dev->list, &kvm->coalesced_zones);
@@ -173,7 +178,12 @@  int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
 
 	list_for_each_entry_safe(dev, tmp, &kvm->coalesced_zones, list)
 		if (coalesced_mmio_in_range(dev, zone->addr, zone->size)) {
-			kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS, &dev->dev);
+			if (zone->pio == 0)
+				kvm_io_bus_unregister_dev(kvm, KVM_MMIO_BUS,
+							&dev->dev);
+			else
+				kvm_io_bus_unregister_dev(kvm, KVM_PIO_BUS,
+							&dev->dev);
 			kvm_iodevice_destructor(&dev->dev);
 		}
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 15252d7..94307a9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2950,6 +2950,8 @@  static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_MMIO
 	case KVM_CAP_COALESCED_MMIO:
 		return KVM_COALESCED_MMIO_PAGE_OFFSET;
+	case KVM_CAP_COALESCED_PIO:
+		return 1;
 #endif
 #ifdef CONFIG_HAVE_KVM_IRQ_ROUTING
 	case KVM_CAP_IRQ_ROUTING:
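
The two flows listed in coalesced-pio.txt can be compared with a toy model that counts exits to userspace. This is only a sketch under stated assumptions: `struct rtc_model`, `guest_out_0x70`, `guest_in_0x71`, and `exits_for_reads` are hypothetical names, a single pending slot stands in for the shared ring (only the most recent index write matters for the RTC), and no real KVM interface is called.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy RTC device model plus an exit counter; all names are illustrative. */
struct rtc_model {
	uint8_t  index;			/* register selected through port 0x70 */
	uint8_t  regs[128];		/* register file, addressed by a 7-bit index */
	bool     coalesce_0x70;		/* is port 0x70 treated as coalesced? */
	bool     pending;		/* an index write is waiting to be replayed */
	uint8_t  pending_index;
	unsigned userspace_exits;
};

/* Guest writes the index port. */
static void guest_out_0x70(struct rtc_model *m, uint8_t idx)
{
	assert(m != NULL);
	if (m->coalesce_0x70) {
		/* Handled in the kernel: record the value, no userspace exit. */
		m->pending = true;
		m->pending_index = idx;
		return;
	}
	m->userspace_exits++;
	m->index = idx & 0x7f;		/* bit 7 is not part of the index */
}

/* Guest reads the data port; this always exits to userspace. */
static uint8_t guest_in_0x71(struct rtc_model *m)
{
	assert(m != NULL);
	m->userspace_exits++;
	if (m->pending) {
		/* Userspace replays the recorded index write first. */
		m->index = m->pending_index & 0x7f;
		m->pending = false;
	}
	return m->regs[m->index];
}

/* Perform n "select register, read data" pairs and report userspace exits. */
static unsigned exits_for_reads(bool coalesce, unsigned n)
{
	struct rtc_model m = { .coalesce_0x70 = coalesce };

	for (unsigned i = 0; i < n; i++) {
		guest_out_0x70(&m, 0x00);
		(void)guest_in_0x71(&m);
	}
	return m.userspace_exits;
}
```

Under these assumptions, `exits_for_reads(false, n)` counts two userspace exits per register access and `exits_for_reads(true, n)` counts one, which is the halving described in the review comment. The guest still takes a VM exit to the kernel on each write to 0x70; only the round trip to userspace is avoided.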