Message ID | 20190426233846.3675-1-sean.j.christopherson@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: Whitelist port 0x7e for pre-incrementing %rip | expand |
----- Original Message ----- > From: "Sean Christopherson" <sean.j.christopherson@intel.com> > To: "Paolo Bonzini" <pbonzini@redhat.com>, "Radim Krčmář" <rkrcmar@redhat.com> > Cc: kvm@vger.kernel.org, "Simon Becherer" <simon@becherer.de>, "Iakov Karpov" <srid@rkmail.ru>, "Gabriele Balducci" > <balducci@units.it>, "Antti Antinoja" <reader@fennosys.fi>, "Takashi Iwai" <tiwai@suse.com>, "Jiri Slaby" > <jslaby@suse.com>, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org> > Sent: Saturday, April 27, 2019 1:38:46 AM > Subject: [PATCH] KVM: x86: Whitelist port 0x7e for pre-incrementing %rip > > KVM's recent bug fix to update %rip after emulating I/O broke userspace > that relied on the previous behavior of incrementing %rip prior to > exiting to userspace. When running a Windows XP guest on AMD hardware, > Qemu may patch "OUT 0x7E" instructions in reaction to the OUT itself. > Because KVM's old behavior was to increment %rip before exiting to > userspace to handle the I/O, Qemu manually adjusted %rip to account for > the OUT instruction. > > Arguably this is a userspace bug as KVM requires userspace to re-enter > the kernel to complete instruction emulation before taking any other > actions. That being said, this is a bit of a grey area and breaking > userspace that has worked for many years is bad. > > Pre-increment %rip on OUT to port 0x7e before exiting to userspace to > hack around the issue. > > Fixes: 45def77ebf79e ("KVM: x86: update %rip after emulating IO") The patch should probably be tweaked to use the quirks mechanism. I'll post an adjusted version next Monday. Paolo
On Sat, Apr 27, 2019 at 12:19:23AM -0400, Paolo Bonzini wrote: > The patch should probably be tweaked to use the quirks mechanism. I'll post > an adjusted version next Monday. I took the liberty of posting v2 since it's a trivial change.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9482cb36b92a..5cc22a593e6d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6539,6 +6539,12 @@ int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu, } EXPORT_SYMBOL_GPL(kvm_emulate_instruction_from_buffer); +static int complete_fast_pio_out_port_0x7e(struct kvm_vcpu *vcpu) +{ + vcpu->arch.pio.count = 0; + return 1; +} + static int complete_fast_pio_out(struct kvm_vcpu *vcpu) { vcpu->arch.pio.count = 0; @@ -6555,12 +6561,22 @@ static int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned long val = kvm_register_read(vcpu, VCPU_REGS_RAX); int ret = emulator_pio_out_emulated(&vcpu->arch.emulate_ctxt, size, port, &val, 1); + if (ret) + return ret; - if (!ret) { + /* + * Workaround userspace that relies on old KVM behavior of %rip being + * incremented prior to exiting to userspace to handle "OUT 0x7e". + */ + if (port == 0x7e) { + vcpu->arch.complete_userspace_io = + complete_fast_pio_out_port_0x7e; + kvm_skip_emulated_instruction(vcpu); + } else { vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu); vcpu->arch.complete_userspace_io = complete_fast_pio_out; } - return ret; + return 0; } static int complete_fast_pio_in(struct kvm_vcpu *vcpu)
KVM's recent bug fix to update %rip after emulating I/O broke userspace that relied on the previous behavior of incrementing %rip prior to exiting to userspace. When running a Windows XP guest on AMD hardware, Qemu may patch "OUT 0x7E" instructions in reaction to the OUT itself. Because KVM's old behavior was to increment %rip before exiting to userspace to handle the I/O, Qemu manually adjusted %rip to account for the OUT instruction. Arguably this is a userspace bug as KVM requires userspace to re-enter the kernel to complete instruction emulation before taking any other actions. That being said, this is a bit of a grey area and breaking userspace that has worked for many years is bad. Pre-increment %rip on OUT to port 0x7e before exiting to userspace to hack around the issue. Fixes: 45def77ebf79e ("KVM: x86: update %rip after emulating IO") Reported-by: Simon Becherer <simon@becherer.de> Reported-and-tested-by: Iakov Karpov <srid@rkmail.ru> Reported-by: Gabriele Balducci <balducci@units.it> Reported-by: Antti Antinoja <reader@fennosys.fi> Cc: stable@vger.kernel.org Cc: Takashi Iwai <tiwai@suse.com> Cc: Jiri Slaby <jslaby@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/kvm/x86.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)