Message ID | 20090415221042.GA20127@amt.cnet (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Marcelo Tosatti wrote: > Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity > checking for the new cr3 value: > > "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to > the kernel. This will trigger a NULL pointer access in gfn_to_rmap() > when userspace next tries to call KVM_RUN on the affected VCPU and kvm > attempts to activate the new non-existent page table root. > > This happens since kvm only validates that cr3 points to a valid guest > physical memory page when code *inside* the guest sets cr3. However, kvm > currently trusts the userspace caller (e.g. QEMU) on the host machine to > always supply a valid page table root, rather than properly validating > it along with the rest of the reloaded guest state." > > http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599 > > Follow Avi's suggestion to use kvm_set_cr3, and do the same for > assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu > context, as long as cr4 is valid. > > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 148cde2..89fb3c7 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, > kvm_x86_ops->set_gdt(vcpu, &dt); > > vcpu->arch.cr2 = sregs->cr2; > - mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3; > - vcpu->arch.cr3 = sregs->cr3; > > + kvm_set_cr3(vcpu, sregs->cr3); > kvm_set_cr8(vcpu, sregs->cr8); > > mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer; > kvm_x86_ops->set_efer(vcpu, sregs->efer); > kvm_set_apic_base(vcpu, sregs->apic_base); > > - kvm_x86_ops->decache_cr4_guest_bits(vcpu); > - > mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0; > kvm_x86_ops->set_cr0(vcpu, sregs->cr0); > vcpu->arch.cr0 = sregs->cr0; > > - mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4; > - kvm_x86_ops->set_cr4(vcpu, sregs->cr4); > - if (!is_long_mode(vcpu) && is_pae(vcpu)) > - load_pdptrs(vcpu, vcpu->arch.cr3); > + kvm_set_cr4(vcpu, sregs->cr4); > > if (mmu_reset_needed) > kvm_mmu_reset_context(vcpu); > Consider the following: current state: cr3 = 0 cr4.pae = 0 new state: cr3 = 0x800 cr4.pae = 1 When you call kvm_set_cr3(), it will inject a #GP into the guest because we are setting bit 11 when cr4.pae=0, which is illegal. However the new cr4.pae=1, so the new state was in fact legal! There are a few ways out, one is to first go back to real mode and set eveything up carefully in the right order (including EFER.LMA and EFER.LME, and CS.L). The other is to refactor kvm_set_* so that we have internal setters which won't trigger these faults (but do need to check at the end that the state is legal). This first method is probably better since that's what the guest does when booting anyway.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 148cde2..89fb3c7 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3985,25 +3985,19 @@ int kvm_arch_vcpu_ioctl_set_sregs(struct kvm_vcpu *vcpu, kvm_x86_ops->set_gdt(vcpu, &dt); vcpu->arch.cr2 = sregs->cr2; - mmu_reset_needed |= vcpu->arch.cr3 != sregs->cr3; - vcpu->arch.cr3 = sregs->cr3; + kvm_set_cr3(vcpu, sregs->cr3); kvm_set_cr8(vcpu, sregs->cr8); mmu_reset_needed |= vcpu->arch.shadow_efer != sregs->efer; kvm_x86_ops->set_efer(vcpu, sregs->efer); kvm_set_apic_base(vcpu, sregs->apic_base); - kvm_x86_ops->decache_cr4_guest_bits(vcpu); - mmu_reset_needed |= vcpu->arch.cr0 != sregs->cr0; kvm_x86_ops->set_cr0(vcpu, sregs->cr0); vcpu->arch.cr0 = sregs->cr0; - mmu_reset_needed |= vcpu->arch.cr4 != sregs->cr4; - kvm_x86_ops->set_cr4(vcpu, sregs->cr4); - if (!is_long_mode(vcpu) && is_pae(vcpu)) - load_pdptrs(vcpu, vcpu->arch.cr3); + kvm_set_cr4(vcpu, sregs->cr4); if (mmu_reset_needed) kvm_mmu_reset_context(vcpu);
Matt T. Yourst notes that kvm_arch_vcpu_ioctl_set_sregs lacks validity checking for the new cr3 value: "Userspace callers of KVM_SET_SREGS can pass a bogus value of cr3 to the kernel. This will trigger a NULL pointer access in gfn_to_rmap() when userspace next tries to call KVM_RUN on the affected VCPU and kvm attempts to activate the new non-existent page table root. This happens since kvm only validates that cr3 points to a valid guest physical memory page when code *inside* the guest sets cr3. However, kvm currently trusts the userspace caller (e.g. QEMU) on the host machine to always supply a valid page table root, rather than properly validating it along with the rest of the reloaded guest state." http://sourceforge.net/tracker/?func=detail&atid=893831&aid=2687641&group_id=180599 Follow Avi's suggestion to use kvm_set_cr3, and do the same for assigment of cr4. Note kvm_set_cr4 unconditionally resets the mmu context, as long as cr4 is valid. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html