
[v2,0/3] KVM: Properly account for guest CPU time

Message ID 1618298169-3831-1-git-send-email-wanpengli@tencent.com (mailing list archive)

Message

Wanpeng Li April 13, 2021, 7:16 a.m. UTC
The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
reported that the guest time remains 0 when running a while true
loop in the guest.

Commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
belongs") moved guest_exit_irqoff() close to vmexit, which broke
tick-based time accounting: ticks that happen after IRQs are disabled
are incorrectly accounted to host/system time. This is because we exit
the guest state too early.

This patchset splits both the context tracking logic and the time
accounting logic out of guest_enter/exit_irqoff(), keeps context
tracking around the actual vmentry/exit code, and adds virt-time
specific helpers that can be placed at the proper spots in KVM. In
addition, it does not break the world outside of x86.
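
To illustrate the ordering problem described above, here is a minimal
user-space sketch. It is a toy model and not kernel code: struct
cpu_model, deliver_pending_tick() and run_once() are hypothetical names
invented for this example, and the single in_guest flag is only a
stand-in for whatever marks the CPU as running guest code. It assumes a
tick raised while IRQs are off is charged according to that flag at the
moment the tick is finally handled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one host CPU; every name here is hypothetical. */
struct cpu_model {
	bool in_guest;         /* stand-in for the "running guest code" marker */
	bool irqs_enabled;
	bool tick_pending;     /* timer tick raised while IRQs were off */
	unsigned int guest_ticks;
	unsigned int system_ticks;
};

/*
 * Handle a pending tick. Tick-based accounting charges the tick to
 * guest or system time depending on in_guest at handling time.
 */
static void deliver_pending_tick(struct cpu_model *c)
{
	assert(c->irqs_enabled);   /* ticks are only handled with IRQs on */
	if (!c->tick_pending)
		return;
	c->tick_pending = false;
	if (c->in_guest)
		c->guest_ticks++;
	else
		c->system_ticks++;
}

/* One guest run during which a single tick fires with IRQs disabled. */
static void run_once(struct cpu_model *c, bool exit_guest_state_early)
{
	c->irqs_enabled = false;
	c->in_guest = true;        /* guest entry */
	c->tick_pending = true;    /* tick fires while the guest is running */

	/* vmexit happens here */
	if (exit_guest_state_early)
		c->in_guest = false;   /* guest state dropped right at vmexit */

	c->irqs_enabled = true;    /* IRQs back on, pending tick is handled */
	deliver_pending_tick(c);

	c->in_guest = false;       /* late exit: only after the tick was seen */
}
```

With exit_guest_state_early set, the tick lands in system_ticks and
guest_ticks stays at 0, which matches the symptom in the bugzilla
report. With it cleared, the same tick is charged to guest_ticks.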

v1 -> v2:
 * split context_tracking from guest_enter/exit_irqoff
 * provide separate vtime accounting functions for consistency
 * place the virt time specific helpers at the proper spot

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>

Wanpeng Li (3):
  context_tracking: Split guest_enter/exit_irqoff
  context_tracking: Provide separate vtime accounting functions
  x86/kvm: Fix vtime accounting

 arch/x86/kvm/svm/svm.c           |  6 ++-
 arch/x86/kvm/vmx/vmx.c           |  6 ++-
 arch/x86/kvm/x86.c               |  1 +
 include/linux/context_tracking.h | 84 +++++++++++++++++++++++++++++++---------
 4 files changed, 74 insertions(+), 23 deletions(-)

Comments

Christian Borntraeger April 13, 2021, 8:32 a.m. UTC | #1
On 13.04.21 09:16, Wanpeng Li wrote:
> The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> reported that the guest time remains 0 when running a while true
> loop in the guest.
> 
> The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> belongs") moves guest_exit_irqoff() close to vmexit breaks the
> tick-based time accounting when the ticks that happen after IRQs are
> disabled are incorrectly accounted to the host/system time. This is
> because we exit the guest state too early.
> 
> This patchset splits both context tracking logic and the time accounting
> logic from guest_enter/exit_irqoff(), keep context tracking around the
> actual vmentry/exit code, have the virt time specific helpers which
> can be placed at the proper spots in kvm. In addition, it will not
> break the world outside of x86.
> 
> v1 -> v2:
>   * split context_tracking from guest_enter/exit_irqoff
>   * provide separate vtime accounting functions for consistency
>   * place the virt time specific helpers at the proper spot
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Michael Tokarev <mjt@tls.msk.ru>
> 
> Wanpeng Li (3):
>    context_tracking: Split guest_enter/exit_irqoff
>    context_tracking: Provide separate vtime accounting functions
>    x86/kvm: Fix vtime accounting
> 
>   arch/x86/kvm/svm/svm.c           |  6 ++-
>   arch/x86/kvm/vmx/vmx.c           |  6 ++-
>   arch/x86/kvm/x86.c               |  1 +
>   include/linux/context_tracking.h | 84 +++++++++++++++++++++++++++++++---------
>   4 files changed, 74 insertions(+), 23 deletions(-)
> 

The non-CONFIG_VIRT_CPU_ACCOUNTING_GEN parts look good.
Sean Christopherson April 13, 2021, 5:25 p.m. UTC | #2
On Tue, Apr 13, 2021, Wanpeng Li wrote:
> The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> reported that the guest time remains 0 when running a while true
> loop in the guest.
> 
> The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> belongs") moves guest_exit_irqoff() close to vmexit breaks the
> tick-based time accounting when the ticks that happen after IRQs are
> disabled are incorrectly accounted to the host/system time. This is
> because we exit the guest state too early.
> 
> This patchset splits both context tracking logic and the time accounting 
> logic from guest_enter/exit_irqoff(), keep context tracking around the 
> actual vmentry/exit code, have the virt time specific helpers which 
> can be placed at the proper spots in kvm. In addition, it will not 
> break the world outside of x86.

IMO, this is going in the wrong direction.  Rather than separate context tracking,
vtime accounting, and KVM logic, this further intertwines the three.  E.g. the
context tracking code has even more vtime accounting NATIVE vs. GEN vs. TICK
logic baked into it.

Rather than smush everything into context_tracking.h, I think we can cleanly
split the context tracking and vtime accounting code into separate pieces, which
will in turn allow moving the wrapping logic to linux/kvm_host.h.  Once that is
done, splitting the context tracking and time accounting logic for KVM x86
becomes a KVM detail as opposed to requiring dedicated logic in the context
tracking code.

I have untested code that compiles on x86, I'll send an RFC shortly.

> v1 -> v2:
>  * split context_tracking from guest_enter/exit_irqoff
>  * provide separate vtime accounting functions for consistency
>  * place the virt time specific helpers at the proper spot
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Michael Tokarev <mjt@tls.msk.ru>
> 
> Wanpeng Li (3):
>   context_tracking: Split guest_enter/exit_irqoff
>   context_tracking: Provide separate vtime accounting functions
>   x86/kvm: Fix vtime accounting
> 
>  arch/x86/kvm/svm/svm.c           |  6 ++-
>  arch/x86/kvm/vmx/vmx.c           |  6 ++-
>  arch/x86/kvm/x86.c               |  1 +
>  include/linux/context_tracking.h | 84 +++++++++++++++++++++++++++++++---------
>  4 files changed, 74 insertions(+), 23 deletions(-)
> 
> -- 
> 2.7.4
>
Wanpeng Li April 14, 2021, 9:36 a.m. UTC | #3
On Wed, 14 Apr 2021 at 01:25, Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Apr 13, 2021, Wanpeng Li wrote:
> > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > reported that the guest time remains 0 when running a while true
> > loop in the guest.
> >
> > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > tick-based time accounting when the ticks that happen after IRQs are
> > disabled are incorrectly accounted to the host/system time. This is
> > because we exit the guest state too early.
> >
> > This patchset splits both context tracking logic and the time accounting
> > logic from guest_enter/exit_irqoff(), keep context tracking around the
> > actual vmentry/exit code, have the virt time specific helpers which
> > can be placed at the proper spots in kvm. In addition, it will not
> > break the world outside of x86.
>
> IMO, this is going in the wrong direction.  Rather than separate context tracking,
> vtime accounting, and KVM logic, this further intertwines the three.  E.g. the
> context tracking code has even more vtime accounting NATIVE vs. GEN vs. TICK
> logic baked into it.
>
> Rather than smush everything into context_tracking.h, I think we can cleanly
> split the context tracking and vtime accounting code into separate pieces, which
> will in turn allow moving the wrapping logic to linux/kvm_host.h.  Once that is
> done, splitting the context tracking and time accounting logic for KVM x86
> becomes a KVM detail as opposed to requiring dedicated logic in the context
> tracking code.
>
> I have untested code that compiles on x86, I'll send an RFC shortly.

We need an easy to backport fix and then we might have some further
cleanups on top.

    Wanpeng
Sean Christopherson April 15, 2021, 12:49 a.m. UTC | #4
On Wed, Apr 14, 2021, Wanpeng Li wrote:
> On Wed, 14 Apr 2021 at 01:25, Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Apr 13, 2021, Wanpeng Li wrote:
> > > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > > reported that the guest time remains 0 when running a while true
> > > loop in the guest.
> > >
> > > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > > tick-based time accounting when the ticks that happen after IRQs are
> > > disabled are incorrectly accounted to the host/system time. This is
> > > because we exit the guest state too early.
> > >
> > > This patchset splits both context tracking logic and the time accounting
> > > logic from guest_enter/exit_irqoff(), keep context tracking around the
> > > actual vmentry/exit code, have the virt time specific helpers which
> > > can be placed at the proper spots in kvm. In addition, it will not
> > > break the world outside of x86.
> >
> > IMO, this is going in the wrong direction.  Rather than separate context tracking,
> > vtime accounting, and KVM logic, this further intertwines the three.  E.g. the
> > context tracking code has even more vtime accounting NATIVE vs. GEN vs. TICK
> > logic baked into it.
> >
> > Rather than smush everything into context_tracking.h, I think we can cleanly
> > split the context tracking and vtime accounting code into separate pieces, which
> > will in turn allow moving the wrapping logic to linux/kvm_host.h.  Once that is
> > done, splitting the context tracking and time accounting logic for KVM x86
> > becomes a KVM detail as opposed to requiring dedicated logic in the context
> > tracking code.
> >
> > I have untested code that compiles on x86, I'll send an RFC shortly.
> 
> We need an easy to backport fix and then we might have some further
> cleanups on top.

I fiddled with this a bit today, I think I have something workable that will be
a relatively clean and short backport.  With luck, I'll get it posted tomorrow.
Wanpeng Li April 15, 2021, 1:23 a.m. UTC | #5
On Thu, 15 Apr 2021 at 08:49, Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Apr 14, 2021, Wanpeng Li wrote:
> > On Wed, 14 Apr 2021 at 01:25, Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Apr 13, 2021, Wanpeng Li wrote:
> > > > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > > > reported that the guest time remains 0 when running a while true
> > > > loop in the guest.
> > > >
> > > > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > > > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > > > tick-based time accounting when the ticks that happen after IRQs are
> > > > disabled are incorrectly accounted to the host/system time. This is
> > > > because we exit the guest state too early.
> > > >
> > > > This patchset splits both context tracking logic and the time accounting
> > > > logic from guest_enter/exit_irqoff(), keep context tracking around the
> > > > actual vmentry/exit code, have the virt time specific helpers which
> > > > can be placed at the proper spots in kvm. In addition, it will not
> > > > break the world outside of x86.
> > >
> > > IMO, this is going in the wrong direction.  Rather than separate context tracking,
> > > vtime accounting, and KVM logic, this further intertwines the three.  E.g. the
> > > context tracking code has even more vtime accounting NATIVE vs. GEN vs. TICK
> > > logic baked into it.
> > >
> > > Rather than smush everything into context_tracking.h, I think we can cleanly
> > > split the context tracking and vtime accounting code into separate pieces, which
> > > will in turn allow moving the wrapping logic to linux/kvm_host.h.  Once that is
> > > done, splitting the context tracking and time accounting logic for KVM x86
> > > becomes a KVM detail as opposed to requiring dedicated logic in the context
> > > tracking code.
> > >
> > > I have untested code that compiles on x86, I'll send an RFC shortly.
> >
> > We need an easy to backport fix and then we might have some further
> > cleanups on top.
>
> I fiddled with this a bit today, I think I have something workable that will be
> a relatively clean and short backport.  With luck, I'll get it posted tomorrow.

I think we should improve my posted version instead of posting a lot
of alternative versions to save everybody's time.

    Wanpeng
Sean Christopherson April 15, 2021, 7:02 p.m. UTC | #6
On Thu, Apr 15, 2021, Wanpeng Li wrote:
> On Thu, 15 Apr 2021 at 08:49, Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Apr 14, 2021, Wanpeng Li wrote:
> > > On Wed, 14 Apr 2021 at 01:25, Sean Christopherson <seanjc@google.com> wrote:
> > > >
> > > > On Tue, Apr 13, 2021, Wanpeng Li wrote:
> > > > > The bugzilla https://bugzilla.kernel.org/show_bug.cgi?id=209831
> > > > > reported that the guest time remains 0 when running a while true
> > > > > loop in the guest.
> > > > >
> > > > > The commit 87fa7f3e98a131 ("x86/kvm: Move context tracking where it
> > > > > belongs") moves guest_exit_irqoff() close to vmexit breaks the
> > > > > tick-based time accounting when the ticks that happen after IRQs are
> > > > > disabled are incorrectly accounted to the host/system time. This is
> > > > > because we exit the guest state too early.
> > > > >
> > > > > This patchset splits both context tracking logic and the time accounting
> > > > > logic from guest_enter/exit_irqoff(), keep context tracking around the
> > > > > actual vmentry/exit code, have the virt time specific helpers which
> > > > > can be placed at the proper spots in kvm. In addition, it will not
> > > > > break the world outside of x86.
> > > >
> > > > IMO, this is going in the wrong direction.  Rather than separate context tracking,
> > > > vtime accounting, and KVM logic, this further intertwines the three.  E.g. the
> > > > context tracking code has even more vtime accounting NATIVE vs. GEN vs. TICK
> > > > logic baked into it.
> > > >
> > > > Rather than smush everything into context_tracking.h, I think we can cleanly
> > > > split the context tracking and vtime accounting code into separate pieces, which
> > > > will in turn allow moving the wrapping logic to linux/kvm_host.h.  Once that is
> > > > done, splitting the context tracking and time accounting logic for KVM x86
> > > > becomes a KVM detail as opposed to requiring dedicated logic in the context
> > > > tracking code.
> > > >
> > > > I have untested code that compiles on x86, I'll send an RFC shortly.
> > >
> > > We need an easy to backport fix and then we might have some further
> > > cleanups on top.
> >
> > I fiddled with this a bit today, I think I have something workable that will be
> > a relatively clean and short backport.  With luck, I'll get it posted tomorrow.
> 
> I think we should improve my posted version instead of posting a lot
> of alternative versions to save everybody's time.

Ya, definitely not looking to throw out more variants.  I'm trying to stack my
cleanups on your code, while also stripping down your patches to the bare minimum
to minimize both the backports and the churn across the cleanups.  It looks like
it's going to work?  Fingers crossed :-)