[v2] x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012

Message ID 20190423131537.9298-1-vkuznets@redhat.com (mailing list archive)
State New, archived

Commit Message

Vitaly Kuznetsov April 23, 2019, 1:15 p.m. UTC
It was reported that with a special Multi Processor Group configuration,
e.g.:
 bcdedit.exe /set groupsize 1
 bcdedit.exe /set maxgroup on
 bcdedit.exe /set groupaware on
a 16-vCPU WS2012 guest shows a BSOD on boot when the PV TLB flush
mechanism is in use.

Tracing kvm_hv_flush_tlb immediately reveals the issue:

 kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2

The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
processor_mask is 0x0 and HV_FLUSH_ALL_PROCESSORS is not specified.
We don't flush anything, and apparently that is not what Windows expects.

TLFS doesn't say anything about such requests, and newer Windows versions
seem to be unaffected. This all feels like a WS2012 bug which is, however,
easy to work around in KVM: let's flush everything when we see an empty
flush request. Over-flushing doesn't hurt.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
Changes from v1:
- Re-worded the comment [Paolo Bonzini, Sean Christopherson]
- "(flush.processor_mask == 0)" => "!flush.processor_mask" [Sean Christopherson]
---
 arch/x86/kvm/hyperv.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)
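
The decision changed by this patch can be sketched outside the kernel as
follows. This is only an illustrative model, not the code in hyperv.c: the
struct flush_request type and the all_cpus_before(), all_cpus_after() and
check_examples() helpers are hypothetical names introduced here. The flag
values are assumed to be bit 0 for HV_FLUSH_ALL_PROCESSORS and bit 1 for
HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, the latter matching the "flags 0x2"
trace above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag bits; 0x2 matches the "flags 0x2" trace in the commit message. */
#define HV_FLUSH_ALL_PROCESSORS              (1ULL << 0)
#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES  (1ULL << 1)

/* Hypothetical stand-in for the three fields shown in the trace. */
struct flush_request {
	uint64_t address_space;
	uint64_t flags;
	uint64_t processor_mask;
};

/* Pre-patch decision: only the explicit flag selects all vCPUs. */
static bool all_cpus_before(const struct flush_request *req)
{
	return (req->flags & HV_FLUSH_ALL_PROCESSORS) != 0;
}

/* Post-patch decision: an empty processor_mask is treated like the flag. */
static bool all_cpus_after(const struct flush_request *req)
{
	return (req->flags & HV_FLUSH_ALL_PROCESSORS) || !req->processor_mask;
}

/* Small self-check using the request from the trace. */
static void check_examples(void)
{
	struct flush_request ws2012 = {
		.address_space = 0x0,
		.flags = HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES,
		.processor_mask = 0x0,
	};

	assert(!all_cpus_before(&ws2012)); /* nothing was flushed before */
	assert(all_cpus_after(&ws2012));   /* everything is flushed now */
}
```

A request with a non-empty processor_mask and HV_FLUSH_ALL_PROCESSORS clear
gives the same answer from both helpers, so only the empty-mask case changes
behaviour in this model.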

Comments

Sean Christopherson April 23, 2019, 1:59 p.m. UTC | #1
On Tue, Apr 23, 2019 at 03:15:37PM +0200, Vitaly Kuznetsov wrote:
> It was reported that with a special Multi Processor Group configuration,
> e.g.:
>  bcdedit.exe /set groupsize 1
>  bcdedit.exe /set maxgroup on
>  bcdedit.exe /set groupaware on
> a 16-vCPU WS2012 guest shows a BSOD on boot when the PV TLB flush
> mechanism is in use.
> 
> Tracing kvm_hv_flush_tlb immediately reveals the issue:
> 
>  kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
> 
> The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES;
> processor_mask is 0x0 and HV_FLUSH_ALL_PROCESSORS is not specified.
> We don't flush anything, and apparently that is not what Windows expects.
> 
> TLFS doesn't say anything about such requests, and newer Windows versions
> seem to be unaffected. This all feels like a WS2012 bug which is, however,
> easy to work around in KVM: let's flush everything when we see an empty
> flush request. Over-flushing doesn't hurt.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---

Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>

Patch

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 421899f6ad7b..537be5ab030a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -1371,7 +1371,16 @@  static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
 
 		valid_bank_mask = BIT_ULL(0);
 		sparse_banks[0] = flush.processor_mask;
-		all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
+
+		/*
+		 * Work around possible WS2012 bug: it sends hypercalls
+		 * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear,
+		 * while also expecting us to flush something and crashing if
+		 * we don't. Let's treat processor_mask == 0 same as
+		 * HV_FLUSH_ALL_PROCESSORS.
+		 */
+		all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) ||
+			!flush.processor_mask;
 	} else {
 		if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex,
 					    sizeof(flush_ex))))