Message ID | 1372858868-24755-1-git-send-email-yzt356@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On 03/07/2013 15:41, Arthur Chunqi Li wrote:
> Fix read/write to IA32_FEATURE_CONTROL MSR in nested environment.
> Simply return 0x5 when read and generate #GP(0) when write.
> Delete handling codes in vmx_set_vmx_msr() and generate #GP(0) in
> handle_wrmsr().
>
> Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
> ---
>  arch/x86/kvm/vmx.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 260a919..e125f94 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2277,7 +2277,7 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
>
>  	switch (msr_index) {
>  	case MSR_IA32_FEATURE_CONTROL:
> -		*pdata = 0;
> +		*pdata = 0x5;
>  		break;

This is not in the MSR_IA32_VMX_BASIC..MSR_IA32_VMX_TRUE_ENTRY_CTLS
range, so you must check nested_vmx_allowed and return 0 if it is false.

Otherwise looks good.

Paolo

>  	case MSR_IA32_VMX_BASIC:
>  		/*
> @@ -2356,9 +2356,6 @@ static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
>  	if (!nested_vmx_allowed(vcpu))
>  		return 0;
>
> -	if (msr_index == MSR_IA32_FEATURE_CONTROL)
> -		/* TODO: the right thing. */
> -		return 1;
>  	/*
>  	 * No need to treat VMX capability MSRs specially: If we don't handle
>  	 * them, handle_wrmsr will #GP(0), which is correct (they are readonly)
>

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
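Editorial sketch of the review comment above, as a standalone userspace model rather than kernel code. It assumes that returning 0 from the read helper means "not handled", in the same way the quoted vmx_set_vmx_msr() returns 0 when nested_vmx_allowed() is false. The names toy_vcpu and toy_get_vmx_msr are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSR_IA32_FEATURE_CONTROL 0x0000003a	/* architectural MSR index */

/* Hypothetical stand-in for struct kvm_vcpu; only the flag we need. */
struct toy_vcpu {
	bool nested_vmx_allowed;
};

/*
 * Returns 1 if the read was handled and *pdata is valid, 0 otherwise.
 * Assumption: a return of 0 leaves the MSR unhandled for the caller.
 */
int toy_get_vmx_msr(const struct toy_vcpu *vcpu, uint32_t msr_index,
		    uint64_t *pdata)
{
	assert(vcpu != NULL && pdata != NULL);

	switch (msr_index) {
	case MSR_IA32_FEATURE_CONTROL:
		if (!vcpu->nested_vmx_allowed)
			return 0;	/* MSR is not exposed to this guest */
		*pdata = 0x5;		/* value proposed by the patch */
		return 1;
	default:
		return 0;		/* other MSRs are not modeled here */
	}
}
```

With nested VMX disabled the helper leaves *pdata untouched and reports the MSR as unhandled, which is the behaviour the review asks for.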
On Thu, Jul 04, 2013 at 09:00:09AM +0200, Paolo Bonzini wrote:
> On 03/07/2013 15:41, Arthur Chunqi Li wrote:
> > [...]
> > @@ -2277,7 +2277,7 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
> >
> >  	switch (msr_index) {
> >  	case MSR_IA32_FEATURE_CONTROL:
> > -		*pdata = 0;
> > +		*pdata = 0x5;
> >  		break;
>
> This is not in the MSR_IA32_VMX_BASIC..MSR_IA32_VMX_TRUE_ENTRY_CTLS
> range, so you must check nested_vmx_allowed and return 0 if it is false.
>
Or 1?

> Otherwise looks good.
>
> Paolo
>
> >  	case MSR_IA32_VMX_BASIC:
> >  		/*
> > @@ -2356,9 +2356,6 @@ static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
Also this function is no longer needed. You can drop it.

And what about Nadav's patch Bandan pointed to? It is not entirely
correct, but it is close to real HW.

> >  	if (!nested_vmx_allowed(vcpu))
> >  		return 0;
> >
> > -	if (msr_index == MSR_IA32_FEATURE_CONTROL)
> > -		/* TODO: the right thing. */
> > -		return 1;
> >  	/*
> >  	 * No need to treat VMX capability MSRs specially: If we don't handle
> >  	 * them, handle_wrmsr will #GP(0), which is correct (they are readonly)
> >

--
Gleb.
On Thu, Jul 4, 2013 at 3:10 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Thu, Jul 04, 2013 at 09:00:09AM +0200, Paolo Bonzini wrote:
>> [...]
>> This is not in the MSR_IA32_VMX_BASIC..MSR_IA32_VMX_TRUE_ENTRY_CTLS
>> range, so you must check nested_vmx_allowed and return 0 if it is false.
>>
> Or 1?
I think 1 is better here, because a query may then report the lock bit
and tell the OS not to write (if the OS does such a check).

> [...]
> Also this function is no longer needed. You can drop it.
>
> And what about Nadav's patch Bandan pointed to? It is not entirely
> correct, but it is close to real HW.
I think Nadav's patch is much closer to the HW scenario. However, I
think we don't need to make things complex, since KVM doesn't support
SMX now and this MSR is always set to 0x5.

Arthur
On Thu, Jul 04, 2013 at 03:21:15PM +0800, Arthur Chunqi Li wrote:
> On Thu, Jul 4, 2013 at 3:10 PM, Gleb Natapov <gleb@redhat.com> wrote:
> [...]
> > And what about Nadav's patch Bandan pointed to? It is not entirely
> > correct, but it is close to real HW.
> I think Nadav's patch is much closer to the HW scenario. However, I
> think we don't need to make things complex, since KVM doesn't support
> SMX now and this MSR is always set to 0x5.
>
Set to 0x5 by BIOS on real HW. This way BIOS can control if VMX is
exposed to an OS.

--
Gleb.
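Editorial sketch of what the value 0x5 encodes. The three bit names below match the FEATURE_CONTROL_* definitions in the Linux x86 MSR headers; the helper toy_vmxon_permitted is hypothetical and only restates the architectural rule that VMXON needs the lock bit plus the enable bit for the current mode (inside or outside SMX).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FEATURE_CONTROL_LOCKED				(1ULL << 0)
#define FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX	(1ULL << 1)
#define FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX	(1ULL << 2)

/* 0x5 is "locked" plus "VMXON enabled outside SMX". */
static_assert((FEATURE_CONTROL_LOCKED |
	       FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX) == 0x5,
	      "0x5 must decode to LOCKED | VMXON_ENABLED_OUTSIDE_SMX");

/* Hypothetical helper: would VMXON be permitted with this MSR value? */
bool toy_vmxon_permitted(uint64_t feature_control, bool in_smx)
{
	uint64_t need = FEATURE_CONTROL_LOCKED;

	need |= in_smx ? FEATURE_CONTROL_VMXON_ENABLED_INSIDE_SMX
		       : FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;

	/* Every required bit must be set, not just any of them. */
	return (feature_control & need) == need;
}
```

Under this model a firmware that leaves the MSR at 0, or sets bit 2 without locking, keeps VMX hidden from the OS, which is the control Gleb describes.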
On 2013-7-4, at 15:24, Gleb Natapov <gleb@redhat.com> wrote:
> On Thu, Jul 04, 2013 at 03:21:15PM +0800, Arthur Chunqi Li wrote:
>> [...]
>> I think Nadav's patch is much closer to the HW scenario. However, I
>> think we don't need to make things complex, since KVM doesn't support
>> SMX now and this MSR is always set to 0x5.
>>
> Set to 0x5 by BIOS on real HW. This way BIOS can control if VMX is
> exposed to an OS.
I know. So if we don't use a solution like Nadav's patch, some
third-party BIOSes or emulators (if there are any) may get an error,
since we simply generate #GP(0) on writes to this MSR. We can correct
the SIPI reset in Nadav's patch and add initialization code to SeaBIOS;
then the entire logic will fit real HW.

Arthur
On Thu, Jul 04, 2013 at 04:16:25PM +0800, Gmail wrote:
> On 2013-7-4, at 15:24, Gleb Natapov <gleb@redhat.com> wrote:
> [...]
> > Set to 0x5 by BIOS on real HW. This way BIOS can control if VMX is
> > exposed to an OS.
> I know. So if we don't use a solution like Nadav's patch, some
> third-party BIOSes or emulators (if there are any) may get an error,
> since we simply generate #GP(0) on writes to this MSR. We can correct
> the SIPI reset in Nadav's patch and add initialization code to SeaBIOS;
> then the entire logic will fit real HW.
>
We do not support third-party BIOSes, we just try to be as close to real
HW as possible. Fixing Nadav's code sounds best.

--
Gleb.
On 04/07/2013 09:10, Gleb Natapov wrote:
> On Thu, Jul 04, 2013 at 09:00:09AM +0200, Paolo Bonzini wrote:
>> [...]
>> This is not in the MSR_IA32_VMX_BASIC..MSR_IA32_VMX_TRUE_ENTRY_CTLS
>> range, so you must check nested_vmx_allowed and return 0 if it is false.
>
> Or 1?

"Return 0 from the whole function" and hence #GP(0) on reads. The MSR
doesn't exist if VMX=SMX=0.

> [...]
> Also this function is no longer needed. You can drop it.
>
> And what about Nadav's patch Bandan pointed to? It is not entirely
> correct, but it is close to real HW.

I don't like that it requires a firmware change in order to use nested
VMX (at least for hypervisors that read the MSR). "Worse emulation" and
"better emulation + new firmware" are indistinguishable from the point of
view of anyone except the firmware.

IMO there is no reason for a better emulation that no one would care
about _and_ could look like a regression when updating to a newer kernel.

Paolo
On Thu, Jul 04, 2013 at 01:01:15PM +0200, Paolo Bonzini wrote:
> On 04/07/2013 09:10, Gleb Natapov wrote:
> > [...]
> > Or 1?
>
> "Return 0 from the whole function" and hence #GP(0) on reads. The MSR
> doesn't exist if VMX=SMX=0.
>
Right.

> [...]
> I don't like that it requires a firmware change in order to use nested
> VMX (at least for hypervisors that read the MSR). "Worse emulation" and
> "better emulation + new firmware" are indistinguishable from the point of
> view of anyone except the firmware.
>
> IMO there is no reason for a better emulation that no one would care
> about _and_ could look like a regression when updating to a newer kernel.
>
That is why now is a good time to do that, since nested VMX is not
widely used. Once it is widely used, the change will be impossible
to make for the reason you are giving. So it is now or never.

--
Gleb.
On 04/07/2013 13:12, Gleb Natapov wrote:
> > I don't like that it requires a firmware change in order to use nested
> > VMX (at least for hypervisors that read the MSR). "Worse emulation" and
> > "better emulation + new firmware" are indistinguishable from the point of
> > view of anyone except the firmware.
> >
> > IMO there is no reason for a better emulation that no one would care
> > about _and_ could look like a regression when updating to a newer kernel.
>
> That is why now is a good time to do that, since nested VMX is not
> widely used. Once it is widely used, the change will be impossible
> to make for the reason you are giving. So it is now or never.

I think it is a can of worms. For example, should this be made
conditional on running under QEMU? Under UEFI, TianoCore should be
doing it, not SeaBIOS. And for CoreBoot, should it be done by CoreBoot
or SeaBIOS? (How do people use KVM together with CoreBoot?)

So I still prefer never... :)

Paolo
On Thu, Jul 04, 2013 at 01:21:51PM +0200, Paolo Bonzini wrote:
> On 04/07/2013 13:12, Gleb Natapov wrote:
> > [...]
> > That is why now is a good time to do that, since nested VMX is not
> > widely used. Once it is widely used, the change will be impossible
> > to make for the reason you are giving. So it is now or never.
>
> I think it is a can of worms. For example, should this be made
> conditional on running under QEMU? Under UEFI, TianoCore should be
> doing it, not SeaBIOS. And for CoreBoot, should it be done by CoreBoot
> or SeaBIOS? (How do people use KVM together with CoreBoot?)
>
This is not the first thing that firmware needs to initialize. I'll let
the firmware guys fight over who is doing it; we just model HW. FWIW,
for SeaBIOS the patch would be trivial.

> So I still prefer never... :)
>
This is a "can of worms" IMO. What will we decide to init in KVM next to
relieve firmware of its duty? This is the "other hypervisor" way; in KVM
we just model HW.

--
Gleb.
On 04/07/2013 13:31, Gleb Natapov wrote:
> On Thu, Jul 04, 2013 at 01:21:51PM +0200, Paolo Bonzini wrote:
>> [...]
>> I think it is a can of worms. For example, should this be made
>> conditional on running under QEMU? Under UEFI, TianoCore should be
>> doing it, not SeaBIOS. And for CoreBoot, should it be done by CoreBoot
>> or SeaBIOS? (How do people use KVM together with CoreBoot?)
>>
> This is not the first thing that firmware needs to initialize. I'll let
> the firmware guys fight over who is doing it; we just model HW. FWIW,
> for SeaBIOS the patch would be trivial.

Trivial but still depending on the question "who is doing it". If
CoreBoot should (also) be doing it, the SeaBIOS patch should be
conditional on CONFIG_QEMU. Also, should it be unconditional or depend
on some external configuration knob (as on a bare-metal firmware)?

Actually KVM probes MSR_IA32_FEATURE_CONTROL itself and sets the bits,
so we can sidestep the whole firmware thing, and go with a fixed version
of Nadav's patch.

>> So I still prefer never... :)
>
> This is a "can of worms" IMO. What will we decide to init in KVM next to
> relieve firmware of its duty? This is the "other hypervisor" way; in KVM
> we just model HW.

(FWIW, I now checked Xen nested VMX and it just returns 5, but this has
nothing to do with paravirtualization.)

Paolo
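Editorial sketch of the "closer to real HW" behaviour the thread converges on. Nadav's patch is not shown in this thread, so this is not that patch. It is a minimal userspace model under one stated assumption: the MSR is freely writable until the lock bit (bit 0) is set, and any later write faults. Reserved-bit checks and reset handling are deliberately left out, and toy_feature_control and toy_feature_control_write are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FEATURE_CONTROL_LOCKED	(1ULL << 0)

/* Hypothetical per-vCPU state holding the guest-visible MSR value. */
struct toy_feature_control {
	uint64_t value;
};

/*
 * Returns 0 if the write is accepted, 1 if it should raise #GP(0).
 * Assumption: once the lock bit is set, every further write is refused.
 */
int toy_feature_control_write(struct toy_feature_control *fc, uint64_t data)
{
	assert(fc != NULL);

	if (fc->value & FEATURE_CONTROL_LOCKED)
		return 1;	/* locked: the value can no longer change */

	fc->value = data;	/* unlocked: firmware may still configure it */
	return 0;
}
```

In this model the guest firmware (or whoever writes first) decides whether VMX is enabled and then locks the value, so a later write from the OS faults instead of silently succeeding.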
On Thu, Jul 04, 2013 at 02:34:11PM +0200, Paolo Bonzini wrote:
> On 04/07/2013 13:31, Gleb Natapov wrote:
> > [...]
> > This is not the first thing that firmware needs to initialize. I'll let
> > the firmware guys fight over who is doing it; we just model HW. FWIW,
> > for SeaBIOS the patch would be trivial.
>
> Trivial but still depending on the question "who is doing it". If
> CoreBoot should (also) be doing it, the SeaBIOS patch should be
> conditional on CONFIG_QEMU. Also, should it be unconditional or depend
> on some external configuration knob (as on a bare-metal firmware)?
>
Let firmware developers solve firmware problems. They have all the same
problems when running on real HW and they will have to figure out a
solution regardless. Making things different on virt will only cause
people to treat virt differently (remember irqbalance?).

> Actually KVM probes MSR_IA32_FEATURE_CONTROL itself and sets the bits,
> so we can sidestep the whole firmware thing, and go with a fixed version
> of Nadav's patch.
>
Indeed, so no regression will be seen, even temporarily.

> [...]

--
Gleb.
On Thu, Jul 4, 2013 at 8:43 PM, Gleb Natapov <gleb@redhat.com> wrote:
> On Thu, Jul 04, 2013 at 02:34:11PM +0200, Paolo Bonzini wrote:
>> [...]
>> Trivial but still depending on the question "who is doing it". If
>> CoreBoot should (also) be doing it, the SeaBIOS patch should be
>> conditional on CONFIG_QEMU. Also, should it be unconditional or depend
>> on some external configuration knob (as on a bare-metal firmware)?
>>
> Let firmware developers solve firmware problems. They have all the same
> problems when running on real HW and they will have to figure out a
> solution regardless. Making things different on virt will only cause
> people to treat virt differently (remember irqbalance?).
I prefer Gleb's idea. As nested virt is trying to provide a framework
that is the same as the HW, we just need to model HW. If the
initialization work is done in KVM, the responsibilities will be
confused and some features will be hard to extend in the future. E.g.,
if we want to add SMX support and let the BIOS configure
MSR_IA32_FEATURE_CONTROL, it may be hard to handle at that time.

Arthur
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 260a919..e125f94 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2277,7 +2277,7 @@ static int vmx_get_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 
 	switch (msr_index) {
 	case MSR_IA32_FEATURE_CONTROL:
-		*pdata = 0;
+		*pdata = 0x5;
 		break;
 	case MSR_IA32_VMX_BASIC:
 		/*
@@ -2356,9 +2356,6 @@ static int vmx_set_vmx_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 data)
 	if (!nested_vmx_allowed(vcpu))
 		return 0;
 
-	if (msr_index == MSR_IA32_FEATURE_CONTROL)
-		/* TODO: the right thing. */
-		return 1;
 	/*
 	 * No need to treat VMX capability MSRs specially: If we don't handle
 	 * them, handle_wrmsr will #GP(0), which is correct (they are readonly)
Fix reads and writes of the IA32_FEATURE_CONTROL MSR in a nested
environment. Simply return 0x5 on read and generate #GP(0) on write.
Delete the handling code in vmx_set_vmx_msr() so that handle_wrmsr()
generates the #GP(0).

Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
---
 arch/x86/kvm/vmx.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)