[0/5] KVM: nVMX: Skip vmentry checks that are necessary only if VMCS12 is dirty

Message ID 20190707071147.11651-1-krish.sadhukhan@oracle.com

Message

Krish Sadhukhan July 7, 2019, 7:11 a.m. UTC
The following functions,

	nested_vmx_check_controls
	nested_vmx_check_host_state
	nested_vmx_check_guest_state

perform a number of vmentry checks on VMCS12. However, not all of these checks
need to be executed on every vmentry. This patchset makes some of them
conditional on the state of VMCS12: they are executed only if VMCS12 is dirty.
This reduces the overhead of vmentry for nested guests.


[PATCH 1/5] KVM: nVMX: Skip VM-Execution Control vmentry checks that are necessary only if VMCS12 is dirty
[PATCH 2/5] KVM: nVMX: Skip VM-Exit Control vmentry checks that are necessary only if VMCS12 is dirty
[PATCH 3/5] KVM: nVMX: Skip VM-Entry Control checks that are necessary only if VMCS12 is dirty
[PATCH 4/5] KVM: nVMX: Skip Host State Area vmentry checks that are necessary only if VMCS12 is dirty
[PATCH 5/5] KVM: nVMX: Skip Guest State Area vmentry checks that are necessary only if VMCS12 is dirty

 arch/x86/kvm/vmx/nested.c | 149 ++++++++++++++++++++++++++++++++++------------
 1 file changed, 111 insertions(+), 38 deletions(-)

Krish Sadhukhan (5):
      nVMX: Skip VM-Execution Control vmentry checks that are necessary only if VMCS12 is dirty
      nVMX: Skip VM-Exit Control vmentry checks that are necessary only if VMCS12 is dirty
      nVMX: Skip VM-Entry Control checks that are necessary only if VMCS12 is dirty
      nVMX: Skip Host State Area vmentry checks that are necessary only if VMCS12 is dirty
      nVMX: Skip Guest State Area vmentry checks that are necessary only if VMCS12 is dirty

Comments

Sean Christopherson July 8, 2019, 6:17 p.m. UTC | #1
On Sun, Jul 07, 2019 at 03:11:42AM -0400, Krish Sadhukhan wrote:
> The following functions,
> 
> 	nested_vmx_check_controls
> 	nested_vmx_check_host_state
> 	nested_vmx_check_guest_state
> 
> do a number of vmentry checks for VMCS12. However, not all of these checks need
> to be executed on every vmentry. This patchset makes some of these vmentry
> checks optional based on the state of VMCS12 in that if VMCS12 is dirty, only
> then the checks will be executed. This will reduce performance impact on
> vmentry of nested guests.

All of these patches break vmx_set_nested_state(), which sets dirty_vmcs12
only after the aforementioned consistency checks pass.

The new nomenclature for the dirty paths is "rare", not "full".

In general, I dislike directly associating the consistency checks with
dirty_vmcs12.

  - It's difficult to assess the correctness of the resulting code, e.g.
    changing CPU_BASED_VM_EXEC_CONTROL doesn't set dirty_vmcs12, which
    calls into question any and all SECONDARY_VM_EXEC_CONTROL checks since
    an L1 could toggle CPU_BASED_ACTIVATE_SECONDARY_CONTROLS.

  - We lose the existing organization of the consistency checks, e.g.
    similar checks get arbitrarily split into separate flows based on
    the rarity of the field changing.

  - The performance gains are likely minimal since the majority of checks
    can't be skipped due to the coarseness of dirty_vmcs12.

Rather than a quick and dirty (pun intended) change to use dirty_vmcs12,
I think we should have some amount of dedicated infrastructure for
optimizing consistency checks from the get go, e.g. perhaps something
similar to how eVMCS categorizes fields.  The initial usage could be very
coarse grained, e.g. based purely on dirty_vmcs12, but having the
infrastructure would make it easier to reason about the correctness of
the code.  Future patches could then refine the triggering of checks to
achieve better optimization, e.g. skipping the vast majority of checks
when L1 is simply toggling CPU_BASED_VIRTUAL_INTR_PENDING.
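
To make the "dedicated infrastructure" idea concrete, here is a minimal
user-space sketch, not KVM code. Every name in it (check_group,
field_category, check_tracker and the helpers) is hypothetical and exists
only for illustration. It assumes writes to VMCS12 can be attributed to a
coarse field category, and that each category maps to the group of
consistency checks it invalidates. The coarse fallback marks everything,
which corresponds to keying purely off dirty_vmcs12.

```c
#include <stdint.h>

/* Hypothetical groups of consistency checks (illustrative names only). */
enum check_group {
	CHECK_EXEC_CONTROLS  = 1 << 0,
	CHECK_EXIT_CONTROLS  = 1 << 1,
	CHECK_ENTRY_CONTROLS = 1 << 2,
	CHECK_HOST_STATE     = 1 << 3,
	CHECK_GUEST_STATE    = 1 << 4,
	CHECK_ALL            = (1 << 5) - 1,
};

/* Hypothetical categories a VMCS12 write could be attributed to. */
enum field_category {
	FIELD_CAT_EXEC_CTRL,
	FIELD_CAT_EXIT_CTRL,
	FIELD_CAT_ENTRY_CTRL,
	FIELD_CAT_HOST_STATE,
	FIELD_CAT_GUEST_STATE,
	FIELD_CAT_NR,
};

/* Which checks must be redone when a field of a given category changes. */
static const uint32_t checks_for_category[FIELD_CAT_NR] = {
	[FIELD_CAT_EXEC_CTRL]   = CHECK_EXEC_CONTROLS,
	[FIELD_CAT_EXIT_CTRL]   = CHECK_EXIT_CONTROLS,
	[FIELD_CAT_ENTRY_CTRL]  = CHECK_ENTRY_CONTROLS,
	[FIELD_CAT_HOST_STATE]  = CHECK_HOST_STATE,
	[FIELD_CAT_GUEST_STATE] = CHECK_GUEST_STATE,
};

struct check_tracker {
	uint32_t pending;	/* bitmask of check_group values */
};

/* Coarse mode: anything may have changed, so every group is pending. */
void tracker_mark_all(struct check_tracker *t)
{
	t->pending = CHECK_ALL;
}

/* Finer mode: only the groups tied to the written category are pending. */
void tracker_mark_category(struct check_tracker *t, enum field_category cat)
{
	t->pending |= checks_for_category[cat];
}

/* Return the pending groups and clear them, as a vmentry path might. */
uint32_t tracker_take(struct check_tracker *t)
{
	uint32_t pending = t->pending;

	t->pending = 0;
	return pending;
}
```

The mapping table is the single place that records which writes invalidate
which checks, so refining the granularity later would mean editing the
table rather than the checks themselves.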

Krish Sadhukhan July 9, 2019, 10:50 p.m. UTC | #2
On 07/08/2019 11:17 AM, Sean Christopherson wrote:
> On Sun, Jul 07, 2019 at 03:11:42AM -0400, Krish Sadhukhan wrote:
>> The following functions,
>>
>> 	nested_vmx_check_controls
>> 	nested_vmx_check_host_state
>> 	nested_vmx_check_guest_state
>>
>> do a number of vmentry checks for VMCS12. However, not all of these checks need
>> to be executed on every vmentry. This patchset makes some of these vmentry
>> checks optional based on the state of VMCS12 in that if VMCS12 is dirty, only
>> then the checks will be executed. This will reduce performance impact on
>> vmentry of nested guests.
> All of these patches break vmx_set_nested_state(), which sets dirty_vmcs12
> only after the aforementioned consistency checks pass.

Perhaps vmx_set_nested_state() can set dirty_vmcs12 right before the
consistency checks are done? I see no difference in correctness. Also,
it calls set_current_vmptr(), which sets dirty_vmcs12 for valid VMCSs anyway.
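
The ordering question can be illustrated with a toy model. This is a
sketch under stated assumptions, not the real vmx_set_nested_state():
the struct and the toy_* functions are hypothetical, the dirty flag is
assumed to start out clear, and the effect of set_current_vmptr() is
deliberately not modeled.

```c
#include <stdbool.h>

/* Toy stand-in for the nested state; both fields are illustrative. */
struct toy_nested {
	bool dirty_vmcs12;	/* mirrors the role of the real flag */
	bool vmcs12_valid;	/* whether the toy VMCS12 would pass the check */
};

/*
 * A consistency check gated on the dirty flag, as the patchset proposes:
 * when the flag is clear the check is skipped and reports success.
 */
static int toy_check(const struct toy_nested *n)
{
	if (!n->dirty_vmcs12)
		return 0;
	return n->vmcs12_valid ? 0 : -1;
}

/* Flag set only after the checks pass: the gated check never runs. */
int toy_set_state_flag_after(struct toy_nested *n)
{
	int ret = toy_check(n);

	if (ret)
		return ret;
	n->dirty_vmcs12 = true;
	return 0;
}

/* Flag set before the checks: the gated check runs on the new state. */
int toy_set_state_flag_before(struct toy_nested *n)
{
	n->dirty_vmcs12 = true;
	return toy_check(n);
}
```

With an invalid toy VMCS12 and a clear flag, toy_set_state_flag_after()
accepts the state while toy_set_state_flag_before() rejects it.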

>
> The new nomenclature for the dirty paths is "rare", not "full".
OK.
>
> In general, I dislike directly associating the consistency checks with
> dirty_vmcs12.
>
>    - It's difficult to assess the correctness of the resulting code, e.g.
>      changing CPU_BASED_VM_EXEC_CONTROL doesn't set dirty_vmcs12, which
>      calls into question any and all SECONDARY_VM_EXEC_CONTROL checks since
>      an L1 could toggle CPU_BASED_ACTIVATE_SECONDARY_CONTROLS.
>
>    - We lose the existing organization of the consistency checks, e.g.
>      similar checks get arbitrarily split into separate flows based on
>      the rarity of the field changing.

Initially, I was thinking of inserting the check for dirty_vmcs12 right 
in place of each of the checks without having to move them to separate 
functions. That approach saves the separation of the checks but results 
in poor readability. Hence I adopted the current approach.

>
>    - The performance gains are likely minimal since the majority of checks
>      can't be skipped due to the coarseness of dirty_vmcs12.
>
> Rather than a quick and dirty (pun intended) change to use dirty_vmcs12,
> I think we should have some amount of dedicated infrastructure for
> optimizing consistency checks from the get go, e.g. perhaps something
> similar to how eVMCS categorizes fields.
Are you referring to the categorization done in
copy_vmcs12_to_enlightened()? If so, what is the basis for the
categorization there?
We can re-order the checks in
nested_vmx_check_{controls,host_state,guest_state} based on dirty_vmcs12
to create an initial framework for controlling the consistency checks.
The only disadvantage is that such an ordering will be completely
different from how the SDM describes the checks.

>   The initial usage could be very
> coarse grained, e.g. based purely on dirty_vmcs12, but having the
> infrastructure would make it easier to reason about the correctness of
> the code.  Future patches could then refine the triggering of checks to
> achieve better optimization, e.g. skipping the vast majority of checks
> when L1 is simply toggling CPU_BASED_VIRTUAL_INTR_PENDING.
It seems you are suggesting a finer granularity, down to each VMCS field
instead of groups of VMCS fields? Then we would need a per-field flag to
track its modification, and that seems like overkill.

Paolo Bonzini July 10, 2019, 2:35 p.m. UTC | #3
On 08/07/19 20:17, Sean Christopherson wrote:
> On Sun, Jul 07, 2019 at 03:11:42AM -0400, Krish Sadhukhan wrote:
>> The following functions,
>>
>> 	nested_vmx_check_controls
>> 	nested_vmx_check_host_state
>> 	nested_vmx_check_guest_state
>>
>> do a number of vmentry checks for VMCS12. However, not all of these checks need
>> to be executed on every vmentry. This patchset makes some of these vmentry
>> checks optional based on the state of VMCS12 in that if VMCS12 is dirty, only
>> then the checks will be executed. This will reduce performance impact on
>> vmentry of nested guests.
> 
> All of these patches break vmx_set_nested_state(), which sets dirty_vmcs12
> only after the aforementioned consistency checks pass.
> 
> The new nomenclature for the dirty paths is "rare", not "full".
> 
> In general, I dislike directly associating the consistency checks with
> dirty_vmcs12.
> 
>   - It's difficult to assess the correctness of the resulting code, e.g.
>     changing CPU_BASED_VM_EXEC_CONTROL doesn't set dirty_vmcs12, which
>     calls into question any and all SECONDARY_VM_EXEC_CONTROL checks since
>     an L1 could toggle CPU_BASED_ACTIVATE_SECONDARY_CONTROLS.

Yes, CPU-based controls are tricky and should not be changed.  But I
don't see a big issue apart from the CPU-based controls, and the other
checks can also be quite expensive---and the point of dirty_vmcs12 and
shadow VMCS is that we _can_ exclude them most of the time.

This is all 5.4 material anyway, I'll do some testing of Krish's patches
2-5.

Thanks,

Paolo

>   - We lose the existing organization of the consistency checks, e.g.
>     similar checks get arbitrarily split into separate flows based on
>     the rarity of the field changing.
> 
>   - The performance gains are likely minimal since the majority of checks
>     can't be skipped due to the coarseness of dirty_vmcs12.
>
> Rather than a quick and dirty (pun intended) change to use dirty_vmcs12,
> I think we should have some amount of dedicated infrastructure for
> optimizing consistency checks from the get go, e.g. perhaps something
> similar to how eVMCS categorizes fields.  The initial usage could be very
> coarse grained, e.g. based purely on dirty_vmcs12, but having the
> infrastructure would make it easier to reason about the correctness of
> the code.  Future patches could then refine the triggering of checks to
> achieve better optimization, e.g. skipping the vast majority of checks
> when L1 is simply toggling CPU_BASED_VIRTUAL_INTR_PENDING.

Sean Christopherson July 10, 2019, 4:15 p.m. UTC | #4
On Wed, Jul 10, 2019 at 04:35:46PM +0200, Paolo Bonzini wrote:
> On 08/07/19 20:17, Sean Christopherson wrote:
> > On Sun, Jul 07, 2019 at 03:11:42AM -0400, Krish Sadhukhan wrote:
> >> The following functions,
> >>
> >> 	nested_vmx_check_controls
> >> 	nested_vmx_check_host_state
> >> 	nested_vmx_check_guest_state
> >>
> >> do a number of vmentry checks for VMCS12. However, not all of these checks need
> >> to be executed on every vmentry. This patchset makes some of these vmentry
> >> checks optional based on the state of VMCS12 in that if VMCS12 is dirty, only
> >> then the checks will be executed. This will reduce performance impact on
> >> vmentry of nested guests.
> > 
> > All of these patches break vmx_set_nested_state(), which sets dirty_vmcs12
> > only after the aforementioned consistency checks pass.
> > 
> > The new nomenclature for the dirty paths is "rare", not "full".
> > 
> > In general, I dislike directly associating the consistency checks with
> > dirty_vmcs12.
> > 
> >   - It's difficult to assess the correctness of the resulting code, e.g.
> >     changing CPU_BASED_VM_EXEC_CONTROL doesn't set dirty_vmcs12, which
> >     calls into question any and all SECONDARY_VM_EXEC_CONTROL checks since
> >     an L1 could toggle CPU_BASED_ACTIVATE_SECONDARY_CONTROLS.
> 
> Yes, CPU-based controls are tricky and should not be changed.  But I
> don't see a big issue apart from the CPU-based controls, and the other
> checks can also be quite expensive---and the point of dirty_vmcs12 and
> shadow VMCS is that we _can_ exclude them most of the time.

No argument there.  My thought was to do something like the following so that
all of the "which checks should we perform" logic is consolidated in a
single location and not spread piecemeal throughout the checks themselves.

static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
{
	unsigned long dirty_checks;

	...

	if (vmx->nested.dirty_vmcs12)
		dirty_checks = ENTRY_CONTROLS | EXIT_CONTROLS | HOST_STATE |
			       GUEST_STATE;
	else
		dirty_checks = 0;
}
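
For illustration only, the snippet above could be fleshed out roughly as
follows in a user-space toy. The flag names are taken from the snippet,
but their bit values, the toy_vmcs12 struct, and both helper functions are
assumptions made up for this sketch. The point is that the decision about
which checks to run lives in one function, and the check code only tests
bits in the mask.

```c
#include <stdbool.h>

/* Flag names follow the snippet above; the bit values are assumed. */
#define ENTRY_CONTROLS	(1UL << 0)
#define EXIT_CONTROLS	(1UL << 1)
#define HOST_STATE	(1UL << 2)
#define GUEST_STATE	(1UL << 3)

/* Toy VMCS12: one pass/fail bit per check group (illustrative only). */
struct toy_vmcs12 {
	bool entry_ok;
	bool exit_ok;
	bool host_ok;
	bool guest_ok;
};

/* The single place that decides which check groups are needed. */
unsigned long select_dirty_checks(bool dirty_vmcs12)
{
	if (dirty_vmcs12)
		return ENTRY_CONTROLS | EXIT_CONTROLS | HOST_STATE |
		       GUEST_STATE;
	return 0;
}

/* Run only the selected groups; return 0 on success, -1 on failure. */
int run_selected_checks(const struct toy_vmcs12 *v,
			unsigned long dirty_checks)
{
	if ((dirty_checks & ENTRY_CONTROLS) && !v->entry_ok)
		return -1;
	if ((dirty_checks & EXIT_CONTROLS) && !v->exit_ok)
		return -1;
	if ((dirty_checks & HOST_STATE) && !v->host_ok)
		return -1;
	if ((dirty_checks & GUEST_STATE) && !v->guest_ok)
		return -1;
	return 0;
}
```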

Paolo Bonzini July 10, 2019, 4:33 p.m. UTC | #5
On 10/07/19 18:15, Sean Christopherson wrote:
> On Wed, Jul 10, 2019 at 04:35:46PM +0200, Paolo Bonzini wrote:
>> On 08/07/19 20:17, Sean Christopherson wrote:
>>> On Sun, Jul 07, 2019 at 03:11:42AM -0400, Krish Sadhukhan wrote:
>>>> The following functions,
>>>>
>>>> 	nested_vmx_check_controls
>>>> 	nested_vmx_check_host_state
>>>> 	nested_vmx_check_guest_state
>>>>
>>>> do a number of vmentry checks for VMCS12. However, not all of these checks need
>>>> to be executed on every vmentry. This patchset makes some of these vmentry
>>>> checks optional based on the state of VMCS12 in that if VMCS12 is dirty, only
>>>> then the checks will be executed. This will reduce performance impact on
>>>> vmentry of nested guests.
>>>
>>> All of these patches break vmx_set_nested_state(), which sets dirty_vmcs12
>>> only after the aforementioned consistency checks pass.
>>>
>>> The new nomenclature for the dirty paths is "rare", not "full".
>>>
>>> In general, I dislike directly associating the consistency checks with
>>> dirty_vmcs12.
>>>
>>>   - It's difficult to assess the correctness of the resulting code, e.g.
>>>     changing CPU_BASED_VM_EXEC_CONTROL doesn't set dirty_vmcs12, which
>>>     calls into question any and all SECONDARY_VM_EXEC_CONTROL checks since
>>>     an L1 could toggle CPU_BASED_ACTIVATE_SECONDARY_CONTROLS.
>>
>> Yes, CPU-based controls are tricky and should not be changed.  But I
>> don't see a big issue apart from the CPU-based controls, and the other
>> checks can also be quite expensive---and the point of dirty_vmcs12 and
>> shadow VMCS is that we _can_ exclude them most of the time.
> 
> No argument there.  My thought was do something like the following so that
> all of the "which checks should we perform" logic is consolidated in a
> single location and not spread piecemeal throughout the checks themselves.
> 
> static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
> {
> 	unsigned long dirty_checks;
> 
> 	...
> 
> 	if (vmx->nested.dirty_vmcs12)
> 		dirty_checks = ENTRY_CONTROLS | EXIT_CONTROLS | HOST_STATE |
> 			       GUEST_STATE;
> 	else
> 		dirty_checks = 0;
> }

That makes sense, though it would be somewhat awkward:

	dirty_checks = EXEC_CONTROLS | HOST_STATE_FSGS |
		ENTRY_CONTROLS_INTRINFO | GUEST_STATE_EFER;
	if (vmx->nested.dirty_vmcs12)
		dirty_checks |= ENTRY_CONTROLS_FULL | EXIT_CONTROLS |
			HOST_STATE_FULL | GUEST_STATE_FULL;
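
A self-contained version of this variant might look like the sketch
below. The flag names come from the snippet above, while the bit values,
the ALWAYS_CHECKS and DIRTY_ONLY_CHECKS groupings, and the two functions
are assumptions for illustration. It shows the property that matters
here: some groups are selected on every vmentry, and the "full" groups
are added only when VMCS12 is dirty.

```c
#include <stdbool.h>

/* Flag names follow the snippet above; the bit values are assumed. */
#define EXEC_CONTROLS		(1UL << 0)
#define HOST_STATE_FSGS		(1UL << 1)
#define ENTRY_CONTROLS_INTRINFO	(1UL << 2)
#define GUEST_STATE_EFER	(1UL << 3)
#define ENTRY_CONTROLS_FULL	(1UL << 4)
#define EXIT_CONTROLS		(1UL << 5)
#define HOST_STATE_FULL		(1UL << 6)
#define GUEST_STATE_FULL	(1UL << 7)

/* Hypothetical groupings: checks run always vs. only when dirty. */
#define ALWAYS_CHECKS	  (EXEC_CONTROLS | HOST_STATE_FSGS | \
			   ENTRY_CONTROLS_INTRINFO | GUEST_STATE_EFER)
#define DIRTY_ONLY_CHECKS (ENTRY_CONTROLS_FULL | EXIT_CONTROLS | \
			   HOST_STATE_FULL | GUEST_STATE_FULL)

/* Start from the always-on set, then add the rest if VMCS12 is dirty. */
unsigned long compute_dirty_checks(bool dirty_vmcs12)
{
	unsigned long dirty_checks = ALWAYS_CHECKS;

	if (dirty_vmcs12)
		dirty_checks |= DIRTY_ONLY_CHECKS;
	return dirty_checks;
}

/* True if every bit of the given check group is selected in the mask. */
bool check_selected(unsigned long dirty_checks, unsigned long group)
{
	return (dirty_checks & group) == group;
}
```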

Paolo