Message ID | 20221013211234.1318131-4-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: gfn_to_pfn_cache fixes and cleanups | expand |
On 10/13/22 23:12, Sean Christopherson wrote: > Always use the size of Xen's non-compat vcpu_runstate_info struct when > checking that the GPA+size doesn't cross a page boundary. Conceptually, > using the current mode is more correct, but KVM isn't consistent with > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > when activating the cache. More importantly, prior to the introduction > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > break KVM's historical ABI. I'd rather not introduce additional restrictions in KVM, mostly because it's actually easy to avoid this patch by instead enforcing that attributes are set in a sensible order: - long mode cannot be changed after the shared info page is enabled (which makes sense because the shared info page also has a compat version) - the caches must be activated after the shared info page (which enforces that the vCPU attributes are set after the VM attributes) This is technically a userspace API break, but nobody is really using this API outside Amazon so... Patches coming after I finish testing. Paolo > Always using the non-compat size will allow for future gfn_to_pfn_cache > clenups as this is (was) the only case where KVM uses a different size at > check()+refresh() than at activate(). E.g. the length/size of the cache > can be made immutable and dropped from check()+refresh(), which yields a > cleaner set of APIs and avoids potential bugs that could occur if check() > where invoked with a different size than refresh(). > > Fixes: a795cd43c5b5 ("KVM: x86/xen: Use gfn_to_pfn_cache for runstate area") > Cc: David Woodhouse <dwmw2@infradead.org> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/xen.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > index b2be60c6efa4..9e79ef2cca99 100644 > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) > if (!vx->runstate_cache.active) > return; > > - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) > - user_len = sizeof(struct vcpu_runstate_info); > - else > - user_len = sizeof(struct compat_vcpu_runstate_info); > + user_len = sizeof(struct vcpu_runstate_info); > > read_lock_irqsave(&gpc->lock, flags); > while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
On Thu, Oct 27, 2022, Paolo Bonzini wrote: > On 10/13/22 23:12, Sean Christopherson wrote: > > Always use the size of Xen's non-compat vcpu_runstate_info struct when > > checking that the GPA+size doesn't cross a page boundary. Conceptually, > > using the current mode is more correct, but KVM isn't consistent with > > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > > when activating the cache. More importantly, prior to the introduction > > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > > break KVM's historical ABI. > > I'd rather not introduce additional restrictions in KVM, But KVM already has this restriction. "struct vcpu_info" is always checked for the non-compat size, and as above, "struct vcpu_runstate_info" is checked for the non-compat size during its initialization. > actually easy to avoid this patch by instead enforcing that attributes are > set in a sensible order: I don't care about fixing XEN support, I care about forcing "length" to be immutable in order to simplify the gfn_to_pfn_cache implementation. Avoiding this patch prevents doing that later in this series. > - long mode cannot be changed after the shared info page is enabled (which > makes sense because the shared info page also has a compat version) How is this not introducing an additional restriction? This seems way more onerous than what is effectively a revert. > - the caches must be activated after the shared info page (which enforces > that the vCPU attributes are set after the VM attributes) > > This is technically a userspace API break, but nobody is really using this > API outside Amazon so... Patches coming after I finish testing. It's not just userspace break, it affects the guest ABI as well. arch.xen.long_mode isn't set just by userspace, it's also snapshot when the guest changes the hypercall page. Maybe there's something in the XEN ABI that says the hypercall page can never be changed, but barring that I don't see how to prevent ending up with a misaligned cache due to the guest enabling long mode. int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data) { struct kvm *kvm = vcpu->kvm; u32 page_num = data & ~PAGE_MASK; u64 page_addr = data & PAGE_MASK; bool lm = is_long_mode(vcpu); /* Latch long_mode for shared_info pages etc. */ vcpu->kvm->arch.xen.long_mode = lm;
On 10/27/22 16:44, Sean Christopherson wrote: >> - long mode cannot be changed after the shared info page is enabled (which >> makes sense because the shared info page also has a compat version) > > How is this not introducing an additional restriction? This seems way more > onerous than what is effectively a revert. > >> - the caches must be activated after the shared info page (which enforces >> that the vCPU attributes are set after the VM attributes) >> >> This is technically a userspace API break, but nobody is really using this >> API outside Amazon so... Patches coming after I finish testing. > > It's not just userspace break, it affects the guest ABI as well. Yes, I was talking of the VMM here; additional restrictions are fine there. The guests however should be compatible with Xen, so you also need to re-activate the cache after the hypercall page is written, but that's two lines of code. Paolo
On Thu, Oct 27, 2022, Paolo Bonzini wrote: > On 10/27/22 16:44, Sean Christopherson wrote: > > > - long mode cannot be changed after the shared info page is enabled (which > > > makes sense because the shared info page also has a compat version) > > > > How is this not introducing an additional restriction? This seems way more > > onerous than what is effectively a revert. > > > > > - the caches must be activated after the shared info page (which enforces > > > that the vCPU attributes are set after the VM attributes) > > > > > > This is technically a userspace API break, but nobody is really using this > > > API outside Amazon so... Patches coming after I finish testing. > > > > It's not just userspace break, it affects the guest ABI as well. > > Yes, I was talking of the VMM here; additional restrictions are fine there. Additional restrictions are fine where? > The guests however should be compatible with Xen, so you also need to > re-activate the cache after the hypercall page is written, but that's two > lines of code. And do what if the guest transitions from 32-bit => 64-bit and the cache isn't aligned for 64-bit? E.g. kvm_xen_set_evtchn() will silently drop events no matter what KVM does. In other words, I don't see how KVM can provide a same ABI without forcing the cached pages to be aligned for the largets possible size.
On Thu, Oct 27, 2022, Sean Christopherson wrote: > On Thu, Oct 27, 2022, Paolo Bonzini wrote: > > On 10/13/22 23:12, Sean Christopherson wrote: > > > Always use the size of Xen's non-compat vcpu_runstate_info struct when > > > checking that the GPA+size doesn't cross a page boundary. Conceptually, > > > using the current mode is more correct, but KVM isn't consistent with > > > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > > > when activating the cache. More importantly, prior to the introduction > > > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > > > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > > > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > > > break KVM's historical ABI. > > > > I'd rather not introduce additional restrictions in KVM, > > But KVM already has this restriction. "struct vcpu_info" is always checked for > the non-compat size, and as above, "struct vcpu_runstate_info" is checked for the > non-compat size during its initialization. Ah, I forgot those are the same size: BUILD_BUG_ON(sizeof(struct vcpu_info) != sizeof(struct compat_vcpu_info));
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index b2be60c6efa4..9e79ef2cca99 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) if (!vx->runstate_cache.active) return; - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) - user_len = sizeof(struct vcpu_runstate_info); - else - user_len = sizeof(struct compat_vcpu_runstate_info); + user_len = sizeof(struct vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
Always use the size of Xen's non-compat vcpu_runstate_info struct when checking that the GPA+size doesn't cross a page boundary. Conceptually, using the current mode is more correct, but KVM isn't consistent with itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size when activating the cache. More importantly, prior to the introduction of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not 64-bit mode is more of a bug than a feature, and fixing the bug doesn't break KVM's historical ABI. Always using the non-compat size will allow for future gfn_to_pfn_cache clenups as this is (was) the only case where KVM uses a different size at check()+refresh() than at activate(). E.g. the length/size of the cache can be made immutable and dropped from check()+refresh(), which yields a cleaner set of APIs and avoids potential bugs that could occur if check() where invoked with a different size than refresh(). Fixes: a795cd43c5b5 ("KVM: x86/xen: Use gfn_to_pfn_cache for runstate area") Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Sean Christopherson <seanjc@google.com> --- arch/x86/kvm/xen.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)