Message ID | 20190102192522.13158-1-liran.alon@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: x86: Do not raise #GP on write to MSR_IA32_MCG_CTL which is not 0 or all 1s | expand |
On Wed, Jan 2, 2019 at 11:25 AM Liran Alon <liran.alon@oracle.com> wrote: > > Only 0 or all 1s can be written to IA32_MCG_CTL. > SDM specifies other values as undefined and/or implementation specific. > > However, some guest kernels write different values. > One such example is WinNT 4 SP6 which uses a value of 0xffffffff. > > Prefer to silently accept these writes to avoid an uncatched #GP in the guest. > We will define our implementation specific behaviour as any value other than 0 > to be treated as all 1s. > > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Signed-off-by: Liran Alon <liran.alon@oracle.com> > --- > arch/x86/kvm/x86.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 02c8e095a239..a06e4e892a9d 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2287,9 +2287,21 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > if (!(mcg_cap & MCG_CTL_P) && > (data || !msr_info->host_initiated)) > return 1; > - if (data != 0 && data != ~(u64)0) > - return 1; > - vcpu->arch.mcg_ctl = data; > + /* > + * Only 0 or all 1s can be written to IA32_MCG_CTL. > + * SDM specifies other values as undefined and/or > + * implementation specific. > + * > + * However, some guest kernels write different values. > + * One such example is WinNT 4 SP6 which uses a value > + * of 0xffffffff. > + * > + * Prefer to silently accept these writes to avoid an > + * uncatched #GP in the guest. We will define our > + * implementation specific behaviour as any value > + * other than 0 to be treated as all 1s. > + */ Isn't an "implementation" essentially a "family, model, and stepping"? I don't think you can define your own implementation-specific behavior unless you have F/M/S different from any actual hardware produced by Intel. > + vcpu->arch.mcg_ctl = data ? ~(u64)0 : 0; > break; > default: > if (msr >= MSR_IA32_MC0_CTL && > -- > 2.16.1 >
On Wed, Jan 02, 2019 at 09:25:22PM +0200, Liran Alon wrote: > Only 0 or all 1s can be written to IA32_MCG_CTL. > SDM specifies other values as undefined and/or implementation specific. > > However, some guest kernels write different values. > One such example is WinNT 4 SP6 which uses a value of 0xffffffff. > > Prefer to silently accept these writes to avoid an uncatched #GP in the guest. > We will define our implementation specific behaviour as any value other than 0 > to be treated as all 1s. I'm assuming this is a 32-bit guest, so what about going with a more precise hackaround and explicitly allowing 0xffffffff for 32-bit guests, e.g. sign-extending bit 31 when the value isn't already 0 or -1? It's worth keeping the #GP behavior for modern kernels, e.g. for testing and debug. MSRs 0x0 and 0x1 are aliased to MSRs 0x400 and 0x401 for historical reasons, i.e. WRMSR without setting ECX can easily write MSR_IA32_MC0_CTL. > Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Signed-off-by: Liran Alon <liran.alon@oracle.com> > --- > arch/x86/kvm/x86.c | 18 +++++++++++++++--- > 1 file changed, 15 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 02c8e095a239..a06e4e892a9d 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2287,9 +2287,21 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > if (!(mcg_cap & MCG_CTL_P) && > (data || !msr_info->host_initiated)) > return 1; > - if (data != 0 && data != ~(u64)0) > - return 1; > - vcpu->arch.mcg_ctl = data; > + /* > + * Only 0 or all 1s can be written to IA32_MCG_CTL. > + * SDM specifies other values as undefined and/or > + * implementation specific. > + * > + * However, some guest kernels write different values. > + * One such example is WinNT 4 SP6 which uses a value > + * of 0xffffffff. > + * > + * Prefer to silently accept these writes to avoid an > + * uncatched #GP in the guest. We will define our > + * implementation specific behaviour as any value > + * other than 0 to be treated as all 1s. > + */ > + vcpu->arch.mcg_ctl = data ? ~(u64)0 : 0; > break; > default: > if (msr >= MSR_IA32_MC0_CTL && > -- > 2.16.1 >
On Wed, Jan 02, 2019 at 03:26:30PM -0800, Sean Christopherson wrote: > On Wed, Jan 02, 2019 at 09:25:22PM +0200, Liran Alon wrote: > > Only 0 or all 1s can be written to IA32_MCG_CTL. > > SDM specifies other values as undefined and/or implementation specific. > > > > However, some guest kernels write different values. > > One such example is WinNT 4 SP6 which uses a value of 0xffffffff. > > > > Prefer to silently accept these writes to avoid an uncatched #GP in the guest. > > We will define our implementation specific behaviour as any value other than 0 > > to be treated as all 1s. > > I'm assuming this is a 32-bit guest, so what about going with a more > precise hackaround and explicitly allowing 0xffffffff for 32-bit guests, > e.g. sign-extending bit 31 when the value isn't already 0 or -1? > > It's worth keeping the #GP behavior for modern kernels, e.g. for testing > and debug. MSRs 0x0 and 0x1 are aliased to MSRs 0x400 and 0x401 for > historical reasons, i.e. WRMSR without setting ECX can easily write > MSR_IA32_MC0_CTL. Of course KVM probably doesn't emulate the aliasing so it'd likely #GP anyways, but I still think we should go with a more surgical hack :)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 02c8e095a239..a06e4e892a9d 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2287,9 +2287,21 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if (!(mcg_cap & MCG_CTL_P) && (data || !msr_info->host_initiated)) return 1; - if (data != 0 && data != ~(u64)0) - return 1; - vcpu->arch.mcg_ctl = data; + /* + * Only 0 or all 1s can be written to IA32_MCG_CTL. + * SDM specifies other values as undefined and/or + * implementation specific. + * + * However, some guest kernels write different values. + * One such example is WinNT 4 SP6 which uses a value + * of 0xffffffff. + * + * Prefer to silently accept these writes to avoid an + * uncatched #GP in the guest. We will define our + * implementation specific behaviour as any value + * other than 0 to be treated as all 1s. + */ + vcpu->arch.mcg_ctl = data ? ~(u64)0 : 0; break; default: if (msr >= MSR_IA32_MC0_CTL &&