Message ID | 20180305173327.1252-1-sean.j.christopherson@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
2018-03-05 09:33-0800, Sean Christopherson: > Segment registers must be synchronized prior to any code that may > trigger a call to emulation_required()/guest_state_valid(), e.g. > vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields > directly, i.e. doesn't use vmx_set_segment(), emulation_required > will not be re-evaluated when synchronizing the segment registers, > which can result in L0 incorrectly starting emulation of L2. > > Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > --- > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne > return 0; > } > > -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, > - bool from_vmentry) Nice catch! Paolo, was there a reason to that prevented prepare_vmcs02_full at the beginning of prepare_vmcs02? Thanks.
On 09/03/2018 19:37, Radim Krčmář wrote: > 2018-03-05 09:33-0800, Sean Christopherson: >> Segment registers must be synchronized prior to any code that may >> trigger a call to emulation_required()/guest_state_valid(), e.g. >> vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields >> directly, i.e. doesn't use vmx_set_segment(), emulation_required >> will not be re-evaluated when synchronizing the segment registers, >> which can result in L0 incorrectly starting emulation of L2. >> >> Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") >> >> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >> --- >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne >> return 0; >> } >> >> -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, >> - bool from_vmentry) > > Nice catch! > > Paolo, was there a reason to that prevented prepare_vmcs02_full at the > beginning of prepare_vmcs02? No, I placed it there because initially I wanted to move more work to prepare_vmcs02_full (such as entry/exit controls, CR0/CR4/EFER, etc.), but that didn't quite work out. Sean, do you have a test case for this bug? (And if it can be fixed by moving prepare_vmcs02_full at the beginning, as suggested by Radim, I'm all for it). Paolo
On Wed, 2018-03-14 at 14:14 +0100, Paolo Bonzini wrote: > On 09/03/2018 19:37, Radim Krčmář wrote: > > > > 2018-03-05 09:33-0800, Sean Christopherson: > > > > > > Segment registers must be synchronized prior to any code that may > > > trigger a call to emulation_required()/guest_state_valid(), e.g. > > > vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields > > > directly, i.e. doesn't use vmx_set_segment(), emulation_required > > > will not be re-evaluated when synchronizing the segment registers, > > > which can result in L0 incorrectly starting emulation of L2. > > > > > > Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") > > > > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > > --- > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > > @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne > > > return 0; > > > } > > > > > > -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, > > > - bool from_vmentry) > > Nice catch! > > > > Paolo, was there a reason to that prevented prepare_vmcs02_full at the > > beginning of prepare_vmcs02? > No, I placed it there because initially I wanted to move more work to > prepare_vmcs02_full (such as entry/exit controls, CR0/CR4/EFER, etc.), > but that didn't quite work out. > > Sean, do you have a test case for this bug? (And if it can be fixed by > moving prepare_vmcs02_full at the beginning, as suggested by Radim, I'm > all for it). I don't have a formal unit test, but I hit the bug 100% of the time simply by booting L2 with stock OVMF and unrestricted_guest=0 in L0, i.e. it's trivially easy for me to reproduce. I think I could create a unit test relatively easily. I hit the bug early in BIOS (OVMF) in L2, e.g. less than 10 instructions after reset. In theory it should be straightforward to extract/port OVMF's reset code to a unit test. Moving prepare_vmcs02_full to the beginning fixes the bug and doesn't introduce any new issues as far as I can tell, but moving that much code is a bit scary, especially this late in a release cycle. I think it would be ok since prepare_vmcs02_full doesn't appear to consume dynamic state or have side effects, but I'll definitely defer to your judgment. > Paolo
On Wed, 2018-03-14, Sean Christopherson wrote: > On Wed, 2018-03-14 at 14:14 +0100, Paolo Bonzini wrote: > > On 09/03/2018 19:37, Radim Krčmář wrote: > > > > > > 2018-03-05 09:33-0800, Sean Christopherson: > > > > > > > > Segment registers must be synchronized prior to any code that may > > > > trigger a call to emulation_required()/guest_state_valid(), e.g. > > > > vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields > > > > directly, i.e. doesn't use vmx_set_segment(), emulation_required > > > > will not be re-evaluated when synchronizing the segment registers, > > > > which can result in L0 incorrectly starting emulation of L2. > > > > > > > > Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") > > > > > > > > Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> > > > > --- > > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > > > @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne > > > > return 0; > > > > } > > > > > > > > -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, > > > > - bool from_vmentry) > > > Nice catch! > > > > > > Paolo, was there a reason to that prevented prepare_vmcs02_full at the > > > beginning of prepare_vmcs02? > > No, I placed it there because initially I wanted to move more work to > > prepare_vmcs02_full (such as entry/exit controls, CR0/CR4/EFER, etc.), > > but that didn't quite work out. > > > > Sean, do you have a test case for this bug? (And if it can be fixed by > > moving prepare_vmcs02_full at the beginning, as suggested by Radim, I'm > > all for it). > > I don't have a formal unit test, but I hit the bug 100% of the time > simply by booting L2 with stock OVMF and unrestricted_guest=0 in L0, > i.e. it's trivially easy for me to reproduce. > > I think I could create a unit test relatively easily. I hit the bug > early in BIOS (OVMF) in L2, e.g. less than 10 instructions after reset. > In theory it should be straightforward to extract/port OVMF's reset > code to a unit test. The existing kvm-unit-tests tests/realmode hits this when setting CR0.PM in exec_in_big_real_mode(), just need to load kvm_intel with unrestricted_guest=0 in L0. I thought I had tried tests/realmode before sending the previous email, but I think I forgot to load kvm_intel in L1 and so it ran with accel=tcg, doh. > Moving prepare_vmcs02_full to the beginning fixes the bug and doesn't > introduce any new issues as far as I can tell, but moving that much > code is a bit scary, especially this late in a release cycle. I think > it would be ok since prepare_vmcs02_full doesn't appear to consume > dynamic state or have side effects, but I'll definitely defer to your > judgment. > > > Paolo
On 14/03/2018 18:30, Christopherson, Sean J wrote: > On Wed, 2018-03-14, Sean Christopherson wrote: >> On Wed, 2018-03-14 at 14:14 +0100, Paolo Bonzini wrote: >>> On 09/03/2018 19:37, Radim Krčmář wrote: >>>> >>>> 2018-03-05 09:33-0800, Sean Christopherson: >>>>> >>>>> Segment registers must be synchronized prior to any code that may >>>>> trigger a call to emulation_required()/guest_state_valid(), e.g. >>>>> vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields >>>>> directly, i.e. doesn't use vmx_set_segment(), emulation_required >>>>> will not be re-evaluated when synchronizing the segment registers, >>>>> which can result in L0 incorrectly starting emulation of L2. >>>>> >>>>> Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") >>>>> >>>>> Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> >>>>> --- >>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>>>> @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne >>>>> return 0; >>>>> } >>>>> >>>>> -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, >>>>> - bool from_vmentry) >>>> Nice catch! >>>> >>>> Paolo, was there a reason to that prevented prepare_vmcs02_full at the >>>> beginning of prepare_vmcs02? >>> No, I placed it there because initially I wanted to move more work to >>> prepare_vmcs02_full (such as entry/exit controls, CR0/CR4/EFER, etc.), >>> but that didn't quite work out. >>> >>> Sean, do you have a test case for this bug? (And if it can be fixed by >>> moving prepare_vmcs02_full at the beginning, as suggested by Radim, I'm >>> all for it). >> >> I don't have a formal unit test, but I hit the bug 100% of the time >> simply by booting L2 with stock OVMF and unrestricted_guest=0 in L0, >> i.e. it's trivially easy for me to reproduce. >> >> I think I could create a unit test relatively easily. I hit the bug >> early in BIOS (OVMF) in L2, e.g. less than 10 instructions after reset. >> In theory it should be straightforward to extract/port OVMF's reset >> code to a unit test. > > The existing kvm-unit-tests tests/realmode hits this when setting > CR0.PM in exec_in_big_real_mode(), just need to load kvm_intel with > unrestricted_guest=0 in L0. I thought I had tried tests/realmode before > sending the previous email, but I think I forgot to load kvm_intel in L1 > and so it ran with accel=tcg, doh. Cool, I'll try to reproduce both ways and use it to test the move of prepare_vmcs02_full at the beginning. Paolo
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 591214843046..3d73d9ed15c8 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10563,11 +10563,8 @@ static int nested_vmx_load_cr3(struct kvm_vcpu *vcpu, unsigned long cr3, bool ne return 0; } -static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, - bool from_vmentry) +static void prepare_vmcs02_segmentation(struct vmcs12 *vmcs12) { - struct vcpu_vmx *vmx = to_vmx(vcpu); - vmcs_write16(GUEST_ES_SELECTOR, vmcs12->guest_es_selector); vmcs_write16(GUEST_SS_SELECTOR, vmcs12->guest_ss_selector); vmcs_write16(GUEST_DS_SELECTOR, vmcs12->guest_ds_selector); @@ -10599,6 +10596,12 @@ static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vmcs_writel(GUEST_TR_BASE, vmcs12->guest_tr_base); vmcs_writel(GUEST_GDTR_BASE, vmcs12->guest_gdtr_base); vmcs_writel(GUEST_IDTR_BASE, vmcs12->guest_idtr_base); +} + +static void prepare_vmcs02_full(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, + bool from_vmentry) +{ + struct vcpu_vmx *vmx = to_vmx(vcpu); vmcs_write32(GUEST_SYSENTER_CS, vmcs12->guest_sysenter_cs); vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS, @@ -10714,6 +10717,14 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12, vmcs_writel(GUEST_ES_BASE, vmcs12->guest_es_base); vmcs_writel(GUEST_CS_BASE, vmcs12->guest_cs_base); + /* + * Segment registers must be synchronized prior to any code that may + * trigger a call to emulation_required()/guest_state_valid(), e.g. + * vmx_set_cr0(). + */ + if (vmx->nested.dirty_vmcs12) + prepare_vmcs02_segmentation(vmcs12); + /* * Not in vmcs02: GUEST_PML_INDEX, HOST_FS_SELECTOR, HOST_GS_SELECTOR, * HOST_FS_BASE, HOST_GS_BASE.
Segment registers must be synchronized prior to any code that may trigger a call to emulation_required()/guest_state_valid(), e.g. vmx_set_cr0(). Because preparing vmcs02 writes segmentation fields directly, i.e. doesn't use vmx_set_segment(), emulation_required will not be re-evaluated when synchronizing the segment registers, which can result in L0 incorrectly starting emulation of L2. Fixes: 8665c3f97320 ("KVM: nVMX: initialize descriptor cache fields in prepare_vmcs02_full") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> --- arch/x86/kvm/vmx.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)