[04/19] qemu-kvm: x86: Drop MSR reset

Message ID c9487b8c3de659b4a34ab00c03fbc180985f4119.1304538230.git.jan.kiszka@web.de (mailing list archive)
State New, archived

Commit Message

Jan Kiszka May 4, 2011, 7:43 p.m. UTC
From: Jan Kiszka <jan.kiszka@siemens.com>

Paravirtual MSRs are properly cleared on reset now, and blindly clearing
the rest is questionable anyway (better address those one by one,
re-initializing their backing CPU state fields).

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 qemu-kvm-x86.c |   69 --------------------------------------------------------
 1 files changed, 0 insertions(+), 69 deletions(-)

Comments

Avi Kivity May 5, 2011, 8:08 a.m. UTC | #1
On 05/04/2011 10:43 PM, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
>
> Paravirtual MSRs are properly cleared on reset now, and blindly clearing
> the rest is questionable anyway (better address those one by one,
> re-initializing their backing CPU state fields).
>

This can introduce a regression when new paravirtual MSRs are added.  So 
we either need to port this, or query the reset state from the kernel 
immediately after creating the vcpu and save it.
Jan Kiszka May 5, 2011, 8:11 a.m. UTC | #2
On 2011-05-05 10:08, Avi Kivity wrote:
> On 05/04/2011 10:43 PM, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> Paravirtual MSRs are properly cleared on reset now, and blindly clearing
>> the rest is questionable anyway (better address those one by one,
>> re-initializing their backing CPU state fields).
>>
> 
> This can introduce a regression when new paravirtual MSRs are added.

You mean MSRs already included or future ones?

>  So
> we either need to port this, or query the reset state from the kernel
> immediately after creating the vcpu and save it.

Can't completely follow what you mean.

My general point remains: Every MSR requires individual care, not blind
overwriting like qemu-kvm does. So the person contributing a new MSR,
real or pv, has to tackle this aspect, and we need to review the code in
this regard.

Jan
Avi Kivity May 5, 2011, 8:16 a.m. UTC | #3
On 05/05/2011 11:11 AM, Jan Kiszka wrote:
> On 2011-05-05 10:08, Avi Kivity wrote:
> >  On 05/04/2011 10:43 PM, Jan Kiszka wrote:
> >>  From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >>  Paravirtual MSRs are properly cleared on reset now, and blindly clearing
> >>  the rest is questionable anyway (better address those one by one,
> >>  re-initializing their backing CPU state fields).
> >>
> >
> >  This can introduce a regression when new paravirtual MSRs are added.
>
> You mean MSRs already included or future ones?

Future ones.

> >   So
> >  we either need to port this, or query the reset state from the kernel
> >  immediately after creating the vcpu and save it.
>
> Can't completely follow what you mean.
>
> My general point remains: Every MSR requires individual care, not blind
> overwriting like qemu-kvm does. So the person contributing a new MSR,
> real or pv, has to tackle this aspect, and we need to review the code in
> this regard.

It's a trick to avoid needing individual care.

1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to 
their power-on reset values.
2. Issue KVM_GET_MSR_INDEX_LIST, and then KVM_GET_MSRS to read all MSRs.  
Stash them all in safe places - the ones known to qemu but also the 
unknown ones.  Qemu may use its own values for the MSRs it knows about 
(for example, if different cpu models have different power-on values).
3. On reset, issue KVM_SET_MSRS with the MSR values obtained in step 2.

The result is forward and backwards compatibility without lockstepping 
qemu and kvm.
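
For illustration, here is a minimal sketch of those three steps against
the raw KVM ioctl interface. The helper names and the global stash are
hypothetical, error handling is trimmed, and real code would keep one
stash per vcpu:

#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

static struct kvm_msrs *poweron_msrs;  /* hypothetical power-on stash */

/* Steps 1+2: right after KVM_CREATE_VCPU, enumerate all MSRs the kernel
 * knows about and read their fresh power-on values, known and unknown
 * ones alike. */
static int stash_poweron_msrs(int kvm_fd, int vcpu_fd)
{
    struct kvm_msr_list sizer, *list;
    int i, r;

    sizer.nmsrs = 0;
    r = ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, &sizer);
    if (r < 0 && errno != E2BIG) {  /* E2BIG just reports the size */
        return -1;
    }
    list = calloc(1, sizeof(*list) +
                  sizer.nmsrs * sizeof(list->indices[0]));
    list->nmsrs = sizer.nmsrs;
    if (ioctl(kvm_fd, KVM_GET_MSR_INDEX_LIST, list) < 0) {
        free(list);
        return -1;
    }

    poweron_msrs = calloc(1, sizeof(*poweron_msrs) +
                          list->nmsrs * sizeof(struct kvm_msr_entry));
    poweron_msrs->nmsrs = list->nmsrs;
    for (i = 0; i < list->nmsrs; i++) {
        poweron_msrs->entries[i].index = list->indices[i];
    }
    free(list);

    /* KVM_GET_MSRS returns the number of MSRs actually read. */
    r = ioctl(vcpu_fd, KVM_GET_MSRS, poweron_msrs);
    if (r < 0) {
        return -1;
    }
    poweron_msrs->nmsrs = r;
    return 0;
}

/* Step 3: on system reset, write the stashed values back. Qemu can
 * first overwrite entries it knows better, e.g. cpu-model-specific
 * power-on defaults. */
static int reset_msrs_to_poweron(int vcpu_fd)
{
    return ioctl(vcpu_fd, KVM_SET_MSRS, poweron_msrs);
}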
Jan Kiszka May 5, 2011, 8:27 a.m. UTC | #4
On 2011-05-05 10:16, Avi Kivity wrote:
> On 05/05/2011 11:11 AM, Jan Kiszka wrote:
>> On 2011-05-05 10:08, Avi Kivity wrote:
>>>  On 05/04/2011 10:43 PM, Jan Kiszka wrote:
>>>>  From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>
>>>>  Paravirtual MSRs are properly cleared on reset now, and blindly clearing
>>>>  the rest is questionable anyway (better address those one by one,
>>>>  re-initializing their backing CPU state fields).
>>>>
>>>
>>>  This can introduce a regression when new paravirtual MSRs are added.
>>
>> You mean MSRs already included or future ones?
> 
> Future ones.
> 
>>>   So
>>>  we either need to port this, or query the reset state from the kernel
>>>  immediately after creating the vcpu and save it.
>>
>> Can't completely follow what you mean.
>>
>> My general point remains: Every MSR requires individual care, not blind
>> overwriting like qemu-kvm does. So the person contributing a new MSR,
>> real or pv, has to tackle this aspect, and we need to review the code in
>> this regard.
> 
> It's a trick to avoid needing individual care.
> 
> 1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to 
> their power-on reset values.

Almost all: Which ones are CPU specific like the APIC_BASE?

> 2. Issue KVM_GET_MSR_INDEX_LIST, and then KVM_GET_MSRS to read all MSRs.  
> Stash them all in safe places - the ones known to qemu but also the 
> unknown ones.  Qemu may use its own values for the MSRs it knows about 
> (for example, if different cpu models have different power-on values).
> 3. On reset, issue KVM_SET_MSRS with the MSR values obtained in step 2.
> 
> The result is forward and backwards compatibility without lockstepping 
> qemu and kvm.

OK, sounds good. I will work out a patch for uq/master.

Nevertheless, the qemu-kvm code is already unneeded today and can safely
be removed IMHO.

Jan
Avi Kivity May 5, 2011, 8:33 a.m. UTC | #5
On 05/05/2011 11:27 AM, Jan Kiszka wrote:
> >
> >  1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to
> >  their power-on reset values.
>
> Almost all: Which ones are CPU specific like the APIC_BASE?

Do you mean cpu specific as in smp or cpu specific as in cpu model?

If the former, we simply do the reset operation per-cpu.  It's the 
natural thing anyway.

> >  2. Issue KVM_GET_MSR_INDEX_LIST, and then KVM_GET_MSRS to read all MSRs.
> >  Stash them all in safe places - the ones known to qemu but also the
> >  unknown ones.  Qemu may use its own values for the MSRs it knows about
> >  (for example, if different cpu models have different power-on values).
> >  3. On reset, issue KVM_SET_MSRS with the MSR values obtained in step 2.
> >
> >  The result is forward and backwards compatibility without lockstepping
> >  qemu and kvm.
>
> OK, sounds good. I will work out a patch for uq/master.

Great, thanks.

> Nevertheless, the qemu-kvm code is already unneeded today and can safely
> be removed IMHO.

I don't follow?  Won't it cause a regression?

Of course, if we get a patch soon no one will ever see the regression so 
we can apply the series.
Jan Kiszka May 5, 2011, 8:44 a.m. UTC | #6
On 2011-05-05 10:33, Avi Kivity wrote:
> On 05/05/2011 11:27 AM, Jan Kiszka wrote:
>>>
>>>  1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to
>>>  their power-on reset values.
>>
>> Almost all: Which ones are CPU specific like the APIC_BASE?
> 
> Do you mean cpu specific as in smp or cpu specific as in cpu model?

Yep.

> 
> If the former, we simply do the reset operation per-cpu.  It's the 
> natural thing anyway.

Quite wasteful wrt memory, given that the majority will be identical.

> 
>>>  2. Issue KVM_GET_MSR_INDEX_LIST, and then KVM_GET_MSRS to read all MSRs.
>>>  Stash them all in safe places - the ones known to qemu but also the
>>>  unknown ones.  Qemu may use its own values for the MSRs it knows about
>>>  (for example, if different cpu models have different power-on values).
>>>  3. On reset, issue KVM_SET_MSRS with the MSR values obtained in step 2.
>>>
>>>  The result is forward and backwards compatibility without lockstepping
>>>  qemu and kvm.
>>
>> OK, sounds good. I will work out a patch for uq/master.
> 
> Great, thanks.
> 
>> Nevertheless, the qemu-kvm code is already unneeded today and can safely
>> be removed IMHO.
> 
> I don't follow?  Won't it cause a regression?

Not at all. We use the "individual care" pattern upstream now,
specifically for those MSRs (kvmclock) for which the qemu-kvm code was
introduced.

> 
> Of course, if we get a patch soon no one will ever see the regression so 
> we can apply the series.

It will still require the usual testing and merging round via upstream
and back. Not sure when I'll be able to work on it, probably not in the
next few days.

Jan
Avi Kivity May 5, 2011, 8:53 a.m. UTC | #7
On 05/05/2011 11:44 AM, Jan Kiszka wrote:
> On 2011-05-05 10:33, Avi Kivity wrote:
> >  On 05/05/2011 11:27 AM, Jan Kiszka wrote:
> >>>
> >>>   1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to
> >>>   their power-on reset values.
> >>
> >>  Almost all: Which ones are CPU specific like the APIC_BASE?
> >
> >  Do you mean cpu specific as in smp or cpu specific as in cpu model?
>
> Yep.

Doh.

> >
> >  If the former, we simply do the reset operation per-cpu.  It's the
> >  natural thing anyway.
>
> Quite wasteful wrt memory, given that the majority will be identical.

We're talking a few hundred bytes per cpu.  If you want to save memory, 
look at the PhysPageDesc array, it takes up 0.4% of guest memory, so 4MB 
for a 1GB guest.

> >>  Nevertheless, the qemu-kvm code is already unneeded today and can safely
> >>  be removed IMHO.
> >
> >  I don't follow?  Won't it cause a regression?
>
> Not at all. We use the "individual care" pattern upstream now,
> specifically for those MSRs (kvmclock) for which the qemu-kvm code was
> introduced.

I mean a future regression with current+patch qemu and a new kernel.

> >
> >  Of course, if we get a patch soon no one will ever see the regression so
> >  we can apply the series.
>
> It will still require the usual testing and merging round via upstream
> and back. Not sure when I'll be able to work on it, probably not in the
> next few days.

If you can do it within a couple of weeks or so that should be fine.
Jan Kiszka May 5, 2011, 9:32 a.m. UTC | #8
On 2011-05-05 10:53, Avi Kivity wrote:
> On 05/05/2011 11:44 AM, Jan Kiszka wrote:
>> On 2011-05-05 10:33, Avi Kivity wrote:
>> >  On 05/05/2011 11:27 AM, Jan Kiszka wrote:
>> >>>
>> >>>   1. Call KVM_CREATE_VCPU.  This causes all MSRs to be initialized to
>> >>>   their power-on reset values.
>> >>
>> >>  Almost all: Which ones are CPU specific like the APIC_BASE?
>> >
>> >  Do you mean cpu specific as in smp or cpu specific as in cpu model?
>>
>> Yep.
> 
> Doh.
> 
>> >
>> >  If the former, we simply do the reset operation per-cpu.  It's the
>> >  natural thing anyway.
>>
>> Quite wasteful wrt memory, given that the majority will be
> 
> We're talking a few hundred bytes per cpu.  If you want to save memory,
> look at the PhysPageDesc array, it takes up 0.4% of guest memory, so 4MB
> for a 1GB guest.

I know (that's fixable, BTW). But that should not excuse needless memory
wasting elsewhere.

> 
>> >>  Nevertheless, the qemu-kvm code is already unneeded today and can
>> safely
>> >>  be removed IMHO.
>> >
>> >  I don't follow?  Won't it cause a regression?
>>
>> Not at all. We use the "individual care" pattern upstream now,
>> specifically for those MSRs (kvmclock) for which the qemu-kvm code was
>> introduced.
> 
> I mean a future regression with current+patch qemu and a new kernel.

For sane scenarios, such a combination should never expose new (ie.
unknown from qemu's POV) MSRs to the guest. Thus not clearing them
cannot cause any harm.

BTW, you also do not know if 0 will be the right reset value for these
to-be-invented MSRs. That could cause a regression as well.

> 
>> >
>> >  Of course, if we get a patch soon no one will ever see the
>> regression so
>> >  we can apply the series.
>>
>> It will still require the usual testing and merging round via upstream
>> and back. Not sure when I'll be able to work on it, probably not in the
>> next few days.
> 
> If you can do it within a couple of weeks or so that should be fine.
> 

We'll see, but I still do not share your concern regarding future
regressions when removing the fragile reset code.

Jan
Avi Kivity May 5, 2011, 10:22 a.m. UTC | #9
On 05/05/2011 12:32 PM, Jan Kiszka wrote:
> >>  >
> >>  >   If the former, we simply do the reset operation per-cpu.  It's the
> >>  >   natural thing anyway.
> >>
> >>  Quite wasteful wrt memory, given that the majority will be identical.
> >
> >  We're talking a few hundred bytes per cpu.  If you want to save memory,
> >  look at the PhysPageDesc array, it takes up 0.4% of guest memory, so 4MB
> >  for a 1GB guest.
>
> I know (that's fixable, BTW). But that should not excuse needless memory
> wasting elsewhere.

IMO a few hundred bytes is worth the correctness here.

> >
> >>  >>   Nevertheless, the qemu-kvm code is already unneeded today and can
> >>  safely
> >>  >>   be removed IMHO.
> >>  >
> >>  >   I don't follow?  Won't it cause a regression?
> >>
> >>  Not at all. We use the "individual care" pattern upstream now,
> >>  specifically for those MSRs (kvmclock) for which the qemu-kvm code was
> >>  introduced.
> >
> >  I mean a future regression with current+patch qemu and a new kernel.
>
> For sane scenarios, such a combination should never expose new (ie.
> unknown from qemu's POV) MSRs to the guest. Thus not clearing them
> cannot cause any harm.

The problem is with hardware MSRs (PV MSRs are protected by cpuid, and 
always disable themselves when zeroed).

> BTW, you also do not know if 0 will be the right reset value for these
> to-be-invented MSRs. That could cause a regression as well.

What I suggested wasn't zeroing them, but writing the value we read just 
after vcpu creation.

We had a regression when we started supporting PAT.  Zeroing it causes 
the cache to be disabled, making everything ridiculously slow.  We now 
special case it; my proposed solution would have taken care of it.
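
For reference, the PAT special case lives in the kvm_reset_msrs() hunk
that this patch removes (see the diff at the bottom); stripped down, the
logic is:

switch (index) {
case MSR_PAT:
    /* Architectural power-on PAT value; an all-zero PAT would make
     * all memory uncacheable. */
    data = 0x0007040600070406ULL;
    break;
default:
    data = 0;  /* blind zeroing - the risk discussed in this thread */
}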
Jan Kiszka May 5, 2011, 10:36 a.m. UTC | #10
On 2011-05-05 12:22, Avi Kivity wrote:
> On 05/05/2011 12:32 PM, Jan Kiszka wrote:
>> >>  >
>> >>  >   If the former, we simply do the reset operation per-cpu.  It's
>> the
>> >>  >   natural thing anyway.
>> >>
>> >>  Quite wasteful wrt memory, given that the majority will be
>> identical.
>> >
>> >  We're talking a few hundred bytes per cpu.  If you want to save
>> memory,
>> >  look at the PhysPageDesc array, it takes up 0.4% of guest memory,
>> so 4MB
>> >  for a 1GB guest.
>>
>> I know (that's fixable, BTW). But that should not excuse needless memory
>> wasting elsewhere.
> 
> IMO a few hundred bytes is worth the correctness here.
> 
>> >
>> >>  >>   Nevertheless, the qemu-kvm code is already unneeded today and
>> can
>> >>  safely
>> >>  >>   be removed IMHO.
>> >>  >
>> >>  >   I don't follow?  Won't it cause a regression?
>> >>
>> >>  Not at all. We use the "individual care" pattern upstream now,
>> >>  specifically for those MSRs (kvmclock) for which the qemu-kvm code
>> was
>> >>  introduced.
>> >
>> >  I mean a future regression with current+patch qemu and a new kernel.
>>
>> For sane scenarios, such a combination should never expose new (ie.
>> unknown from qemu's POV) MSRs to the guest. Thus not clearing them
>> cannot cause any harm.
> 
> The problem is with hardware MSRs (PV MSRs are protected by cpuid, and
> always disable themselves when zeroed).

Well, this doesn't change the problem of the existing code.

> 
>> BTW, you also do not know if 0 will be the right reset value for these
>> to-be-invented MSRs. That could cause a regression as well.
> 
> What I suggested wasn't zeroing them, but writing the value we read just
> after vcpu creation.
> 
> We had a regression when we started supporting PAT.  Zeroing it causes
> the cache to be disabled, making everything ridiculously slow.  We now
> special case it; my proposed solution would have taken care of it.

I'm talking about the current code, not the proper way to do it in the
future. PAT demonstrates why regressions can happen as long as we zero
MSRs, and it's better to stop this now, even without the new code in place.

Jan
Gleb Natapov May 5, 2011, 11:22 a.m. UTC | #11
On Thu, May 05, 2011 at 11:32:26AM +0200, Jan Kiszka wrote:
> >> >
> >> >  Of course, if we get a patch soon no one will ever see the
> >> regression so
> >> >  we can apply the series.
> >>
> >> It will still require the usual testing and merging round via upstream
> >> and back. Not sure when I'll be able to work on it, probably not in the
> >> next few days.
> > 
> > If you can do it within a couple of weeks or so that should be fine.
> > 
> 
> We'll see, but I still do not share your concern regarding future
> regressions when removing the fragile reset code.
> 
Why do we rely on userspace to properly reset kernel components anyway?
We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.

--
			Gleb.
Avi Kivity May 5, 2011, 11:57 a.m. UTC | #12
On 05/05/2011 01:36 PM, Jan Kiszka wrote:
> >
> >  The problem is with hardware MSRs (PV MSRs are protected by cpuid, and
> >  always disable themselves when zeroed).
>
> Well, this doesn't change the problem of the existing code.

Right.

> >
> >  What I suggested wasn't zeroing them, but writing the value we read just
> >  after vcpu creation.
> >
> >  We had a regression when we started supporting PAT.  Zeroing it causes
> >  the cache to be disabled, making everything ridiculously slow.  We now
> >  special case it; my proposed solution would have taken care of it.
>
> I'm talking about the current code, not the proper way to do it in the
> future. PAT demonstrates why regressions can happen as long as we zero
> MSRs, and it's better to stop this now, even without the new code in place.

I'm not sure, but if we do adopt the new reset mechanism, it doesn't matter.
Avi Kivity May 5, 2011, 11:58 a.m. UTC | #13
On 05/05/2011 02:22 PM, Gleb Natapov wrote:
> >
> >  We'll see, but I still do not share your concern regarding future
> >  regressions when removing the fragile reset code.
> >
> Why do we rely on userspace to properly reset kernel components anyway?
> We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.

We should, but we'll always have to deal with kernels that don't have 
reset ioctls.
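
A sketch of the resulting fallback logic, assuming a hypothetical
KVM_CAP_VCPU_RESET capability and KVM_VCPU_RESET ioctl (neither exists
at this point; KVM_CHECK_EXTENSION is the real probing mechanism):

/* KVM_CAP_VCPU_RESET and KVM_VCPU_RESET are made up for illustration;
 * older kernels would take the userspace restore path. */
if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_VCPU_RESET) > 0) {
    r = ioctl(vcpu_fd, KVM_VCPU_RESET, 0);
} else {
    r = reset_msrs_to_poweron(vcpu_fd);  /* sketch from above */
}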
Gleb Natapov May 5, 2011, 12:23 p.m. UTC | #14
On Thu, May 05, 2011 at 02:58:27PM +0300, Avi Kivity wrote:
> On 05/05/2011 02:22 PM, Gleb Natapov wrote:
> >>
> >>  We'll see, but I still do not share your concern regarding future
> >>  regressions when removing the fragile reset code.
> >>
> >Why do we rely on userspace to properly reset kernel components anyway?
> >We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.
> 
> We should, but we'll always have to deal with kernels that don't
> have reset ioctls.
> 
s/always/for quite a while/. Unfortunately yes. Unless we put qemu in the kernel
tree and release them in lockstep, of course :)

--
			Gleb.
Jan Kiszka May 5, 2011, 12:42 p.m. UTC | #15
On 2011-05-05 14:23, Gleb Natapov wrote:
> On Thu, May 05, 2011 at 02:58:27PM +0300, Avi Kivity wrote:
>> On 05/05/2011 02:22 PM, Gleb Natapov wrote:
>>>>
>>>>  We'll see, but I still do not share your concern regarding future
>>>>  regressions when removing the fragile reset code.
>>>>
>>> Why do we rely on userspace to properly reset kernel components anyway?
>>> We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.
>>
>> We should, but we'll always have to deal with kernels that don't
>> have reset ioctls.
>>
> s/always/for quite a while/. Unfortunately yes. Unless we put qemu in the kernel
> tree and release them in lockstep, of course :)

Seriously, I do not see much added value in a reset service. The pattern
we will use for MSRs could just be applied to other in-kernel resources
as well - unless they are already architecturally defined in a way that
leaves no questions regarding the proper future reset state. Except for
the CPU, all other in-kernel devices are not extensible in their current
form.

Jan
Marcelo Tosatti May 5, 2011, 1:33 p.m. UTC | #16
On Thu, May 05, 2011 at 02:22:57PM +0300, Gleb Natapov wrote:
> On Thu, May 05, 2011 at 11:32:26AM +0200, Jan Kiszka wrote:
> > >> >
> > >> >  Of course, if we get a patch soon no one will ever see the
> > >> regression so
> > >> >  we can apply the series.
> > >>
> > >> It will still require the usual testing and merging round via upstream
> > >> and back. Not sure when I'll be able to work on it, probably not in the
> > >> next few days.
> > > 
> > > If you can do it within a couple of weeks or so that should be fine.
> > > 
> > 
> > We'll see, but I still do not share your concern regarding future
> > regressions when removing the fragile reset code.
> > 
> Why do we rely on userspace to properly reset kernel components anyway?

In general it's required that userspace properly sets (some of) the
parameters of emulated hardware, including on reset.

> We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.

That's done through userspace via ioctls; I don't get your point?

Gleb Natapov May 5, 2011, 6:08 p.m. UTC | #17
On Thu, May 05, 2011 at 10:33:01AM -0300, Marcelo Tosatti wrote:
> On Thu, May 05, 2011 at 02:22:57PM +0300, Gleb Natapov wrote:
> > On Thu, May 05, 2011 at 11:32:26AM +0200, Jan Kiszka wrote:
> > > >> >
> > > >> >  Of course, if we get a patch soon no one will ever see the
> > > >> regression so
> > > >> >  we can apply the series.
> > > >>
> > > >> It will still require the usual testing and merging round via upstream
> > > >> and back. Not sure when I'll be able to work on it, probably not in the
> > > >> next few days.
> > > > 
> > > > If you can do it within a couple of weeks or so that should be fine.
> > > > 
> > > 
> > > We'll see, but I still do not share your concern regarding future
> > > regressions when removing the fragile reset code.
> > > 
> > Why do we rely on userspace to properly reset kernel components anyway?
> 
> In general it's required that userspace properly sets (some of) the
> parameters of emulated hardware, including on reset.
That's just how things work now. It doesn't have to be this way.

> 
> > We should introduce cpu/lapic/ioapic/pit/pic resets ASAP.
> 
> That's done through userspace via ioctls; I don't get your point?

Why should userspace have to know such low-level details about an
in-kernel device? We had the same devices emulated in qemu, so it was
easy to reuse that for reset, but think about writing a device model
from scratch. With the current interfaces you either need very low-level
knowledge about the in-kernel device in userspace just to reset it
properly, or you need to save the device state just after creation and
reload the saved state on reset (kind of what Avi proposed for MSRs; see
the sketch below). Neither solution is optimal IMO.

--
			Gleb.
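
A minimal sketch of that save-and-reload pattern for one in-kernel
device, the local APIC, using the existing KVM_GET_LAPIC/KVM_SET_LAPIC
state ioctls (hypothetical helper names; the state is stashed per vcpu,
since e.g. the APIC ID differs between vcpus):

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Right after vcpu creation: capture the power-on lapic state. */
static int stash_lapic_poweron(int vcpu_fd, struct kvm_lapic_state *s)
{
    return ioctl(vcpu_fd, KVM_GET_LAPIC, s);
}

/* On system reset: reload it, without userspace needing any knowledge
 * of the device internals. */
static int reload_lapic_poweron(int vcpu_fd, struct kvm_lapic_state *s)
{
    return ioctl(vcpu_fd, KVM_SET_LAPIC, s);
}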

Patch

diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index 844d345..eb8faf2 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -26,7 +26,6 @@ 
 
 #define MSR_IA32_TSC            0x10
 
-static struct kvm_msr_list *kvm_msr_list;
 extern unsigned int kvm_shadow_memory;
 
 int kvm_set_tss_addr(kvm_context_t kvm, unsigned long addr)
@@ -338,35 +337,6 @@  void kvm_show_code(CPUState *env)
     fprintf(stderr, "code:%s\n", code_str);
 }
 
-
-/*
- * Returns available msr list.  User must free.
- */
-static struct kvm_msr_list *kvm_get_msr_list(void)
-{
-    struct kvm_msr_list sizer, *msrs;
-    int r;
-
-    sizer.nmsrs = 0;
-    r = kvm_ioctl(kvm_state, KVM_GET_MSR_INDEX_LIST, &sizer);
-    if (r < 0 && r != -E2BIG) {
-        return NULL;
-    }
-    /* Old kernel modules had a bug and could write beyond the provided
-       memory. Allocate at least a safe amount of 1K. */
-    msrs = qemu_malloc(MAX(1024, sizeof(*msrs) +
-                           sizer.nmsrs * sizeof(*msrs->indices)));
-
-    msrs->nmsrs = sizer.nmsrs;
-    r = kvm_ioctl(kvm_state, KVM_GET_MSR_INDEX_LIST, msrs);
-    if (r < 0) {
-        free(msrs);
-        errno = r;
-        return NULL;
-    }
-    return msrs;
-}
-
 static void print_seg(FILE *file, const char *name, struct kvm_segment *seg)
 {
     fprintf(stderr,
@@ -496,11 +466,6 @@  int kvm_arch_qemu_create_context(void)
         return r;
     }
 
-    kvm_msr_list = kvm_get_msr_list();
-    if (!kvm_msr_list) {
-        return -1;
-    }
-
     r = kvm_set_boot_cpu_id(0);
     if (r < 0 && r != -ENOSYS) {
         return r;
@@ -653,42 +618,8 @@  void kvm_arch_push_nmi(void *opaque)
 }
 #endif /* KVM_CAP_USER_NMI */
 
-static int kvm_reset_msrs(CPUState *env)
-{
-    struct {
-        struct kvm_msrs info;
-        struct kvm_msr_entry entries[100];
-    } msr_data;
-    int n;
-    struct kvm_msr_entry *msrs = msr_data.entries;
-    uint32_t index;
-    uint64_t data;
-
-    if (!kvm_msr_list) {
-        return -1;
-    }
-
-    for (n = 0; n < kvm_msr_list->nmsrs; n++) {
-        index = kvm_msr_list->indices[n];
-        switch (index) {
-        case MSR_PAT:
-            data = 0x0007040600070406ULL;
-            break;
-        default:
-            data = 0;
-        }
-        kvm_msr_entry_set(&msrs[n], kvm_msr_list->indices[n], data);
-    }
-
-    msr_data.info.nmsrs = n;
-
-    return kvm_vcpu_ioctl(env, KVM_SET_MSRS, &msr_data);
-}
-
-
 void kvm_arch_cpu_reset(CPUState *env)
 {
-    kvm_reset_msrs(env);
     kvm_arch_reset_vcpu(env);
 }