Message ID | 1509891090-8985-1-git-send-email-liran.alon@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 05/11/2017 15:11, Liran Alon wrote: > When guest passes KVM it's pvclock-page GPA via WRMSR to > MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize > pvclock-page to some start-values. It just requests a clock-update which > will happen before entering to guest. > > The clock-update logic will call kvm_setup_pvclock_page() to update the > pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* > assumes that the version-field is initialized to an even number. This is > wrong because at first-time write, field could be any-value. > > Fix simply makes sure that if first-time version-field is odd, increment > it once more to make it even and only then start standard logic. > This follows same logic as done in other pvclock shared-pages (See > kvm_write_wall_clock() and record_steal_time()). > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/kvm/x86.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 03869eb7fcd6..181106080e41 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) > */ > BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != 0); > > + if (guest_hv_clock.version & 1) > + ++guest_hv_clock.version; /* first time write, random junk */ > + > vcpu->hv_clock.version = guest_hv_clock.version + 1; > kvm_write_guest_cached(v->kvm, &vcpu->pv_time, > &vcpu->hv_clock, > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org
2017-11-05 16:11+0200, Liran Alon: > When guest passes KVM it's pvclock-page GPA via WRMSR to > MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize > pvclock-page to some start-values. It just requests a clock-update which > will happen before entering to guest. > > The clock-update logic will call kvm_setup_pvclock_page() to update the > pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* > assumes that the version-field is initialized to an even number. This is > wrong because at first-time write, field could be any-value. > > Fix simply makes sure that if first-time version-field is odd, increment > it once more to make it even and only then start standard logic. > This follows same logic as done in other pvclock shared-pages (See > kvm_write_wall_clock() and record_steal_time()). > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- Applied, thanks.
2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: > When guest passes KVM it's pvclock-page GPA via WRMSR to > MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize > pvclock-page to some start-values. It just requests a clock-update which KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). Regards, Wanpeng Li > will happen before entering to guest. > > The clock-update logic will call kvm_setup_pvclock_page() to update the > pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* > assumes that the version-field is initialized to an even number. This is > wrong because at first-time write, field could be any-value. > > Fix simply makes sure that if first-time version-field is odd, increment > it once more to make it even and only then start standard logic. > This follows same logic as done in other pvclock shared-pages (See > kvm_write_wall_clock() and record_steal_time()). > > Signed-off-by: Liran Alon <liran.alon@oracle.com> > Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> > Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/kvm/x86.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 03869eb7fcd6..181106080e41 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) > */ > BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != 0); > > + if (guest_hv_clock.version & 1) > + ++guest_hv_clock.version; /* first time write, random junk */ > + > vcpu->hv_clock.version = guest_hv_clock.version + 1; > kvm_write_guest_cached(v->kvm, &vcpu->pv_time, > &vcpu->hv_clock, > -- > 1.9.1 >
On 13/11/17 02:40, Wanpeng Li wrote: > 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >> When guest passes KVM it's pvclock-page GPA via WRMSR to >> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >> pvclock-page to some start-values. It just requests a clock-update which > > KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). kvmclock_init() is the code that runs in KVM-guest. I was talking here about the code that handles the WRMSR in KVM hypervisor code. The issue happens when the guest doesn't init pvclock-page as done in kvmclock_init(). Not all guests behave nicely :) -Liran > > Regards, > Wanpeng Li > >> will happen before entering to guest. >> >> The clock-update logic will call kvm_setup_pvclock_page() to update the >> pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* >> assumes that the version-field is initialized to an even number. This is >> wrong because at first-time write, field could be any-value. >> >> Fix simply makes sure that if first-time version-field is odd, increment >> it once more to make it even and only then start standard logic. >> This follows same logic as done in other pvclock shared-pages (See >> kvm_write_wall_clock() and record_steal_time()). >> >> Signed-off-by: Liran Alon <liran.alon@oracle.com> >> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> --- >> arch/x86/kvm/x86.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 03869eb7fcd6..181106080e41 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) >> */ >> BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != 0); >> >> + if (guest_hv_clock.version & 1) >> + ++guest_hv_clock.version; /* first time write, random junk */ >> + >> vcpu->hv_clock.version = guest_hv_clock.version + 1; >> kvm_write_guest_cached(v->kvm, &vcpu->pv_time, >> &vcpu->hv_clock, >> -- >> 1.9.1 >>
2017-11-13 8:44 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: > > > On 13/11/17 02:40, Wanpeng Li wrote: >> >> 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >>> >>> When guest passes KVM it's pvclock-page GPA via WRMSR to >>> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >>> pvclock-page to some start-values. It just requests a clock-update which >> >> >> KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). > > > kvmclock_init() is the code that runs in KVM-guest. I was talking here about > the code that handles the WRMSR in KVM hypervisor code. > > The issue happens when the guest doesn't init pvclock-page as done in > kvmclock_init(). Not all guests behave nicely :) But the codes which you modify just works for kvm guest I think. Regards, Wanpeng Li > > -Liran > >> >> Regards, >> Wanpeng Li >> >>> will happen before entering to guest. >>> >>> The clock-update logic will call kvm_setup_pvclock_page() to update the >>> pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* >>> assumes that the version-field is initialized to an even number. This is >>> wrong because at first-time write, field could be any-value. >>> >>> Fix simply makes sure that if first-time version-field is odd, increment >>> it once more to make it even and only then start standard logic. >>> This follows same logic as done in other pvclock shared-pages (See >>> kvm_write_wall_clock() and record_steal_time()). >>> >>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>> --- >>> arch/x86/kvm/x86.c | 3 +++ >>> 1 file changed, 3 insertions(+) >>> >>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>> index 03869eb7fcd6..181106080e41 100644 >>> --- a/arch/x86/kvm/x86.c >>> +++ b/arch/x86/kvm/x86.c >>> @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu >>> *v) >>> */ >>> BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != >>> 0); >>> >>> + if (guest_hv_clock.version & 1) >>> + ++guest_hv_clock.version; /* first time write, random >>> junk */ >>> + >>> vcpu->hv_clock.version = guest_hv_clock.version + 1; >>> kvm_write_guest_cached(v->kvm, &vcpu->pv_time, >>> &vcpu->hv_clock, >>> -- >>> 1.9.1 >>> >
On 13/11/17 02:52, Wanpeng Li wrote: > 2017-11-13 8:44 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >> >> >> On 13/11/17 02:40, Wanpeng Li wrote: >>> >>> 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >>>> >>>> When guest passes KVM it's pvclock-page GPA via WRMSR to >>>> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >>>> pvclock-page to some start-values. It just requests a clock-update which >>> >>> >>> KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). >> >> >> kvmclock_init() is the code that runs in KVM-guest. I was talking here about >> the code that handles the WRMSR in KVM hypervisor code. >> >> The issue happens when the guest doesn't init pvclock-page as done in >> kvmclock_init(). Not all guests behave nicely :) > > But the codes which you modify just works for kvm guest I think. No, it's the code that runs in KVM hypervisor for any guest that knows how to work with KVM pv-clock PV interface. Linux guest is just one of the possible guests which can use this interface. kvmclock_init() you mentioned is part of the linux-guest. But there are other guests which use KVM pv-clock PV interface. -Liran > > Regards, > Wanpeng Li > >> >> -Liran >> >>> >>> Regards, >>> Wanpeng Li >>> >>>> will happen before entering to guest. >>>> >>>> The clock-update logic will call kvm_setup_pvclock_page() to update the >>>> pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* >>>> assumes that the version-field is initialized to an even number. This is >>>> wrong because at first-time write, field could be any-value. >>>> >>>> Fix simply makes sure that if first-time version-field is odd, increment >>>> it once more to make it even and only then start standard logic. >>>> This follows same logic as done in other pvclock shared-pages (See >>>> kvm_write_wall_clock() and record_steal_time()). >>>> >>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >>>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>> --- >>>> arch/x86/kvm/x86.c | 3 +++ >>>> 1 file changed, 3 insertions(+) >>>> >>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>> index 03869eb7fcd6..181106080e41 100644 >>>> --- a/arch/x86/kvm/x86.c >>>> +++ b/arch/x86/kvm/x86.c >>>> @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu >>>> *v) >>>> */ >>>> BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != >>>> 0); >>>> >>>> + if (guest_hv_clock.version & 1) >>>> + ++guest_hv_clock.version; /* first time write, random >>>> junk */ >>>> + >>>> vcpu->hv_clock.version = guest_hv_clock.version + 1; >>>> kvm_write_guest_cached(v->kvm, &vcpu->pv_time, >>>> &vcpu->hv_clock, >>>> -- >>>> 1.9.1 >>>> >>
2017-11-13 8:55 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: > > > On 13/11/17 02:52, Wanpeng Li wrote: >> >> 2017-11-13 8:44 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >>> >>> >>> >>> On 13/11/17 02:40, Wanpeng Li wrote: >>>> >>>> >>>> 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >>>>> >>>>> >>>>> When guest passes KVM it's pvclock-page GPA via WRMSR to >>>>> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >>>>> pvclock-page to some start-values. It just requests a clock-update >>>>> which >>>> >>>> >>>> >>>> KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). >>> >>> >>> >>> kvmclock_init() is the code that runs in KVM-guest. I was talking here >>> about >>> the code that handles the WRMSR in KVM hypervisor code. >>> >>> The issue happens when the guest doesn't init pvclock-page as done in >>> kvmclock_init(). Not all guests behave nicely :) >> >> >> But the codes which you modify just works for kvm guest I think. > > No, it's the code that runs in KVM hypervisor for any guest that knows how > to work with KVM pv-clock PV interface. > Linux guest is just one of the possible guests which can use this interface. > kvmclock_init() you mentioned is part of the linux-guest. But there are > other guests which use KVM pv-clock PV interface. Good to know it, could you point one? > > -Liran > >> >> Regards, >> Wanpeng Li >> >>> >>> -Liran >>> >>>> >>>> Regards, >>>> Wanpeng Li >>>> >>>>> will happen before entering to guest. >>>>> >>>>> The clock-update logic will call kvm_setup_pvclock_page() to update the >>>>> pvclock-page with info. However, kvm_setup_pvclock_page() *wrongly* >>>>> assumes that the version-field is initialized to an even number. This >>>>> is >>>>> wrong because at first-time write, field could be any-value. >>>>> >>>>> Fix simply makes sure that if first-time version-field is odd, >>>>> increment >>>>> it once more to make it even and only then start standard logic. >>>>> This follows same logic as done in other pvclock shared-pages (See >>>>> kvm_write_wall_clock() and record_steal_time()). >>>>> >>>>> Signed-off-by: Liran Alon <liran.alon@oracle.com> >>>>> Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com> >>>>> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >>>>> --- >>>>> arch/x86/kvm/x86.c | 3 +++ >>>>> 1 file changed, 3 insertions(+) >>>>> >>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >>>>> index 03869eb7fcd6..181106080e41 100644 >>>>> --- a/arch/x86/kvm/x86.c >>>>> +++ b/arch/x86/kvm/x86.c >>>>> @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct >>>>> kvm_vcpu >>>>> *v) >>>>> */ >>>>> BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) >>>>> != >>>>> 0); >>>>> >>>>> + if (guest_hv_clock.version & 1) >>>>> + ++guest_hv_clock.version; /* first time write, random >>>>> junk */ >>>>> + >>>>> vcpu->hv_clock.version = guest_hv_clock.version + 1; >>>>> kvm_write_guest_cached(v->kvm, &vcpu->pv_time, >>>>> &vcpu->hv_clock, >>>>> -- >>>>> 1.9.1 >>>>> >>> >
2017-11-13 9:03 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>: > 2017-11-13 8:55 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >> >> >> On 13/11/17 02:52, Wanpeng Li wrote: >>> >>> 2017-11-13 8:44 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >>>> >>>> >>>> >>>> On 13/11/17 02:40, Wanpeng Li wrote: >>>>> >>>>> >>>>> 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >>>>>> >>>>>> >>>>>> When guest passes KVM it's pvclock-page GPA via WRMSR to >>>>>> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >>>>>> pvclock-page to some start-values. It just requests a clock-update >>>>>> which >>>>> >>>>> >>>>> >>>>> KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). >>>> >>>> >>>> >>>> kvmclock_init() is the code that runs in KVM-guest. I was talking here >>>> about >>>> the code that handles the WRMSR in KVM hypervisor code. >>>> >>>> The issue happens when the guest doesn't init pvclock-page as done in >>>> kvmclock_init(). Not all guests behave nicely :) >>> >>> >>> But the codes which you modify just works for kvm guest I think. >> >> No, it's the code that runs in KVM hypervisor for any guest that knows how >> to work with KVM pv-clock PV interface. >> Linux guest is just one of the possible guests which can use this interface. >> kvmclock_init() you mentioned is part of the linux-guest. But there are >> other guests which use KVM pv-clock PV interface. > > Good to know it, could you point one? In addition, there is a BUG_ON(if version != 0) just before the codes which you added. I will move these avoid random junk logics into msr setup in order that it can avoid '& operation' each time update the pv stuff. I will send a patch later. Regards, Wanpeng Li
2017-11-13 9:37 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>: > 2017-11-13 9:03 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>: >> 2017-11-13 8:55 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >>> >>> >>> On 13/11/17 02:52, Wanpeng Li wrote: >>>> >>>> 2017-11-13 8:44 GMT+08:00 Liran Alon <LIRAN.ALON@oracle.com>: >>>>> >>>>> >>>>> >>>>> On 13/11/17 02:40, Wanpeng Li wrote: >>>>>> >>>>>> >>>>>> 2017-11-05 22:11 GMT+08:00 Liran Alon <liran.alon@oracle.com>: >>>>>>> >>>>>>> >>>>>>> When guest passes KVM it's pvclock-page GPA via WRMSR to >>>>>>> MSR_KVM_SYSTEM_TIME / MSR_KVM_SYSTEM_TIME_NEW, KVM don't initialize >>>>>>> pvclock-page to some start-values. It just requests a clock-update >>>>>>> which >>>>>> >>>>>> >>>>>> >>>>>> KVM does, please refer to memset(hv_clock, 0, size) in kvmclock_init(). >>>>> >>>>> >>>>> >>>>> kvmclock_init() is the code that runs in KVM-guest. I was talking here >>>>> about >>>>> the code that handles the WRMSR in KVM hypervisor code. >>>>> >>>>> The issue happens when the guest doesn't init pvclock-page as done in >>>>> kvmclock_init(). Not all guests behave nicely :) >>>> >>>> >>>> But the codes which you modify just works for kvm guest I think. >>> >>> No, it's the code that runs in KVM hypervisor for any guest that knows how >>> to work with KVM pv-clock PV interface. >>> Linux guest is just one of the possible guests which can use this interface. >>> kvmclock_init() you mentioned is part of the linux-guest. But there are >>> other guests which use KVM pv-clock PV interface. >> >> Good to know it, could you point one? > > In addition, there is a BUG_ON(if version != 0) just before the codes Oh, not mean version != 0, but I think we can move '& operation' to the setup. > which you added. I will move these avoid random junk logics into msr > setup in order that it can avoid '& operation' each time update the pv > stuff. I will send a patch later. > > Regards, > Wanpeng Li
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 03869eb7fcd6..181106080e41 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -1830,6 +1830,9 @@ static void kvm_setup_pvclock_page(struct kvm_vcpu *v) */ BUILD_BUG_ON(offsetof(struct pvclock_vcpu_time_info, version) != 0); + if (guest_hv_clock.version & 1) + ++guest_hv_clock.version; /* first time write, random junk */ + vcpu->hv_clock.version = guest_hv_clock.version + 1; kvm_write_guest_cached(v->kvm, &vcpu->pv_time, &vcpu->hv_clock,