diff mbox

[RFC,2/5] KVM: add KVM_EXIT_MSR exit reason and capability.

Message ID CA+0KQ4P1M6mb8-BtKg9Ss5PRk5ZAqPRLZ1WwkQzuHhHbFturKg@mail.gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Peter Hornyack Dec. 21, 2015, 6:58 p.m. UTC
On Fri, Dec 18, 2015 at 1:25 PM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 18/08/2015 20:46, Peter Hornyack wrote:
>> Define KVM_EXIT_MSR, a new exit reason for accesses to MSRs that kvm
>> does not handle. Define KVM_CAP_UNHANDLED_MSR_EXITS, a vm-wide
>> capability that guards the new exit reason and which can be enabled via
>> the KVM_ENABLE_CAP ioctl.
>>
>> Signed-off-by: Peter Hornyack <peterhornyack@google.com>
>> ---
>>  Documentation/virtual/kvm/api.txt | 48 +++++++++++++++++++++++++++++++++++++++
>>  include/uapi/linux/kvm.h          | 14 ++++++++++++
>>  2 files changed, 62 insertions(+)
>
> Let's instead add:
>
> - KVM_CAP_MSR_EXITS
>
> - KVM_CAP_ENABLE_MSR_EXIT
>
> - KVM_CAP_DISABLE_MSR_EXIT
>
> and 3 kinds of exits: KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> KVM_EXIT_MSR_AFTER_WRITE.  The first two use four fields (type, msr,
> data, handled) and #GP if handled=0 is zero on the next entry.  The last
> one uses the first three fields and can be used for HyperV.
>
> Paolo

Paolo, I've included an updated version of this patch below. Does this
match the API that you had in mind? If so, I'll continue adjusting the
rest of the code and will send the entire new patchset.

This updated version of the API seems more cumbersome to me (three new
capabilities, three exit reasons when just one or two might suffice)
than the original interface I used, but maybe you have a good reason
for that. Also, will it be ok to have just one capability to enable
all three kinds of exits, or will KVM_EXIT_MSR_AFTER_WRITE want a
separate capability?


commit a684d520ed62cf0db4495e5197d5bf722e4f8109
Author: Peter Hornyack <peterhornyack@google.com>
Date:   Fri Dec 18 14:44:04 2015 -0800

    KVM: add capabilities and exit reasons for MSRs.

    Define KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and
    KVM_EXIT_MSR_AFTER_WRITE, new exit reasons for accesses to MSRs that kvm
    does not handle or that user space needs to be notified about. Define the
    KVM_CAP_MSR_EXITS, KVM_CAP_ENABLE_MSR_EXITS, and KVM_CAP_DISABLE_MSR_EXITS
    capabilities to control these new exits for a VM.

the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Pavel Fedin Dec. 22, 2015, 7:24 a.m. UTC | #1
Hello!

> commit a684d520ed62cf0db4495e5197d5bf722e4f8109
> Author: Peter Hornyack <peterhornyack@google.com>
> Date:   Fri Dec 18 14:44:04 2015 -0800
> 
>     KVM: add capabilities and exit reasons for MSRs.
> 
>     Define KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and
>     KVM_EXIT_MSR_AFTER_WRITE, new exit reasons for accesses to MSRs that kvm
>     does not handle or that user space needs to be notified about. Define the
>     KVM_CAP_MSR_EXITS, KVM_CAP_ENABLE_MSR_EXITS, and KVM_CAP_DISABLE_MSR_EXITS
>     capabilities to control these new exits for a VM.
> 
> diff --git a/Documentation/virtual/kvm/api.txt
> b/Documentation/virtual/kvm/api.txt
> index 053f613fc9a9..3bba3248df3d 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -3359,6 +3359,43 @@ Hyper-V SynIC state change. Notification is
> used to remap SynIC
>  event/message pages and to enable/disable SynIC messages/events processing
>  in userspace.
> 
> + /*
> + * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> + * KVM_EXIT_MSR_AFTER_WRITE
> + */
> + struct {
> + __u32 index;    /* i.e. ecx; out */
> + __u64 data;     /* out (wrmsr) / in (rdmsr) */
> +#define KVM_EXIT_MSR_COMPLETION_FAILED 1
> + __u64 type;     /* out */
> +#define KVM_EXIT_MSR_UNHANDLED 0
> +#define KVM_EXIT_MSR_HANDLED   1
> + __u8 handled;   /* in */
> + } msr;
> +
> +If exit_reason is KVM_EXIT_MSR_READ or KVM_EXIT_MSR_WRITE, then the vcpu has
> +executed a rdmsr or wrmsr instruction which could not be satisfied by kvm. The
> +msr struct is used for both output to and input from user space. index is the
> +target MSR number held in ecx; user space must not modify this field. data

 In 'index', you meant?
 I would enlarge it to __u64 and use generalized encoding, the same as for KVM_SET_ONE_REG ioctl. I already wrote about it.
 And i would use simply "REG" instead of "MSR" denotion. Because on different architectures they can have different names (e. g. on ARM32 they are called "coprocessor registers" and on ARM64 these are "system registers"), however the common thing between them is that it is some special CPU register, access to which can be trapped and emulated. Therefore KVM_EXIT_REG_xxx.

> +holds the payload from a wrmsr or must be filled in with a payload on a rdmsr.
> +For a normal exit, type will be 0.
> +
> +On the return path into kvm, user space should set handled to
> +KVM_EXIT_MSR_HANDLED if it successfully handled the MSR access. Otherwise,
> +handled should be set to KVM_EXIT_MSR_UNHANDLED, which will cause a general
> +protection fault to be injected into the vcpu. If an error occurs during the
> +return into kvm, the vcpu will not be run and another exit will be generated
> +with type set to KVM_EXIT_MSR_COMPLETION_FAILED.
> +
> +If exit_reason is KVM_EXIT_MSR_AFTER_WRITE, then the vcpu has executed a wrmsr
> +instruction which is handled by kvm but which user space may need to be
> +notified about. index and data are set as described above; the value of type
> +depends on the MSR that was written. handled is ignored on reentry into kvm.

1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the same.
2. Why do WRITE and READ have to be different exit codes? We could use something like "u8 is_write" in our structure, this would be more in line with PIO and MMIO handling.

> +
> +KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and KVM_EXIT_MSR_AFTER_WRITE can only
> +occur when KVM_CAP_MSR_EXITS is present and KVM_CAP_ENABLE_MSR_EXITS has been
> +set. A detailed description of these capabilities is below.
> +
>   /* Fix the size of the union. */
>   char padding[256];
>   };
> @@ -3697,6 +3734,26 @@ a KVM_EXIT_IOAPIC_EOI vmexit will be reported
> to userspace.
>  Fails if VCPU has already been created, or if the irqchip is already in the
>  kernel (i.e. KVM_CREATE_IRQCHIP has already been called).
> 
> +7.6 KVM_CAP_ENABLE_MSR_EXITS, KVM_CAP_DISABLE_MSR_EXITS
> +
> +Architectures: x86 (vmx-only)
> +Parameters: none
> +Returns: 0 on success, -1 on error
> +
> +These capabilities enable and disable exits to user space for certain guest MSR
> +accesses. These capabilities are only available if KVM_CHECK_EXTENSION
> +indicates that KVM_CAP_MSR_EXITS is present.
> +
> +When enabled, kvm will exit to user space when the guest reads
> +an MSR that kvm does not handle (KVM_EXIT_MSR_READ), writes an MSR that kvm
> +does not handle (KVM_EXIT_MSR_WRITE), or writes an MSR that kvm handles but for
> +which user space should be notified (KVM_EXIT_MSR_AFTER_WRITE).
> +
> +These exits are currently only implemented for vmx. Also, note that if the kvm
> +module's ignore_msrs flag is set then KVM_EXIT_MSR_READ and KVM_EXIT_MSR_WRITE
> +will not be generated, and unhandled MSR accesses will simply be ignored and
> +the guest re-entered immediately.
> +
> 
>  8. Other capabilities.
>  ----------------------
> @@ -3726,3 +3783,11 @@ In order to use SynIC, it has to be activated
> by setting this
>  capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
>  will disable the use of APIC hardware virtualization even if supported
>  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
> +
> +8.3 KVM_CAP_MSR_EXITS
> +
> +Architectures: x86 (vmx-only)
> +
> +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
> +that the kernel implements the KVM_CAP_ENABLE_MSR_EXITS and
> +KVM_CAP_DISABLE_MSR_EXITS capabilities for VMs.
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 6e32f7599081..431fd1ec0d06 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -199,6 +199,9 @@ struct kvm_hyperv_exit {
>  #define KVM_EXIT_S390_STSI        25
>  #define KVM_EXIT_IOAPIC_EOI       26
>  #define KVM_EXIT_HYPERV           27
> +#define KVM_EXIT_MSR_READ         28
> +#define KVM_EXIT_MSR_WRITE        29
> +#define KVM_EXIT_MSR_AFTER_WRITE  30
> 
>  /* For KVM_EXIT_INTERNAL_ERROR */
>  /* Emulate instruction failed. */
> @@ -355,6 +358,18 @@ struct kvm_run {
>   } eoi;
>   /* KVM_EXIT_HYPERV */
>   struct kvm_hyperv_exit hyperv;
> + /*
> + * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
> + * KVM_EXIT_MSR_AFTER_WRITE
> + */
> + struct {
> + __u32 index;    /* i.e. ecx; out */
> + __u64 data;     /* out (wrmsr) / in (rdmsr) */
> + __u64 type;     /* out */
> +#define KVM_EXIT_MSR_UNHANDLED 0
> +#define KVM_EXIT_MSR_HANDLED   1
> + __u8 handled;   /* in */
> + } msr;
>   /* Fix the size of the union. */
>   char padding[256];
>   };
> @@ -849,6 +864,9 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_SPLIT_IRQCHIP 121
>  #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
>  #define KVM_CAP_HYPERV_SYNIC 123
> +#define KVM_CAP_MSR_EXITS 124
> +#define KVM_CAP_DISABLE_MSR_EXITS 125
> +#define KVM_CAP_ENABLE_MSR_EXITS 126
> 
>  #ifdef KVM_CAP_IRQ_ROUTING

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roman Kagan Dec. 22, 2015, 12:01 p.m. UTC | #2
On Tue, Dec 22, 2015 at 10:24:13AM +0300, Pavel Fedin wrote:
> > +On the return path into kvm, user space should set handled to
> > +KVM_EXIT_MSR_HANDLED if it successfully handled the MSR access. Otherwise,
> > +handled should be set to KVM_EXIT_MSR_UNHANDLED, which will cause a general
> > +protection fault to be injected into the vcpu. If an error occurs during the
> > +return into kvm, the vcpu will not be run and another exit will be generated
> > +with type set to KVM_EXIT_MSR_COMPLETION_FAILED.
> > +
> > +If exit_reason is KVM_EXIT_MSR_AFTER_WRITE, then the vcpu has executed a wrmsr
> > +instruction which is handled by kvm but which user space may need to be
> > +notified about. index and data are set as described above; the value of type
> > +depends on the MSR that was written. handled is ignored on reentry into kvm.
> 
> 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the same.

Indeed.  Perhaps the kernel can set .handled to true to let userspace
know it already took care of it, instead of introducing yet another
exit_reason.  The field would need to be marked in/out, then.

Roman.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Fedin Dec. 22, 2015, 12:51 p.m. UTC | #3
Hello!

> > 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and
> KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the same.
> 
> Indeed.  Perhaps the kernel can set .handled to true to let userspace
> know it already took care of it, instead of introducing yet another
> exit_reason.  The field would need to be marked in/out, then.

 I'm not sure that you need even this. Anyway, particular MSRs are function-specific, and if you're emulating an MSR in userspace,
then, i believe, you know the function behind it. And it's IMHO safe to just know that SynIC MSRs have some extra handling in
kernel. And i believe this has no direct impact on userland's behavior.
 But, you better know the details.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Roman Kagan Dec. 22, 2015, 2:09 p.m. UTC | #4
On Tue, Dec 22, 2015 at 03:51:52PM +0300, Pavel Fedin wrote:
>  Hello!
> 
> > > 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and
> > KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the same.
> > 
> > Indeed.  Perhaps the kernel can set .handled to true to let userspace
> > know it already took care of it, instead of introducing yet another
> > exit_reason.  The field would need to be marked in/out, then.
> 
>  I'm not sure that you need even this. Anyway, particular MSRs are function-specific, and if you're emulating an MSR in userspace,
> then, i believe, you know the function behind it. And it's IMHO safe to just know that SynIC MSRs have some extra handling in
> kernel. And i believe this has no direct impact on userland's behavior.

It has: unlike the scenario that was the original motivation for Peter's
patches, where the the userspace wanted to handle register accesses
which the kernel *didn't*, in case of SynIC the userspace wants do
something about MSR accesses *only* if the kernel *also* handles them.

I guess that was the reason why Paolo suggested an extra exit_reason,
and I think .handled field can be used to pass that information instead.

You're probably right that, at least in SynIC case, it should be safe to
assume that, if all the SynIC setup succeeded, the corresponding MSR
accesses would only trigger exits when the kernel processed them
appropriately.

But the proposed use of .handled costs basically nothing, and it may
prove useful in general (as a conisistency proof, if anything).

Roman.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pavel Fedin Dec. 23, 2015, 7:47 a.m. UTC | #5
Hello!

> It has: unlike the scenario that was the original motivation for Peter's
> patches, where the the userspace wanted to handle register accesses
> which the kernel *didn't*, in case of SynIC the userspace wants do
> something about MSR accesses *only* if the kernel *also* handles them.

 Well... I believe, that qemu knows if we are instantiating SynIC. And, if we are, it knows that the kernel will do something about
it. Otherwise these registers don't exist, and, by the way, the guest is not expected to touch them, is it?

> I guess that was the reason why Paolo suggested an extra exit_reason,
> and I think .handled field can be used to pass that information instead.

[skip]

> But the proposed use of .handled costs basically nothing, and it may
> prove useful in general (as a conisistency proof, if anything).

 Well... May be... So, i'm OK with it.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia


--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Hornyack Jan. 12, 2016, 3:21 a.m. UTC | #6
On Mon, Dec 21, 2015 at 11:24 PM, Pavel Fedin <p.fedin@samsung.com> wrote:
>  Hello!
>
>> commit a684d520ed62cf0db4495e5197d5bf722e4f8109
>> Author: Peter Hornyack <peterhornyack@google.com>
>> Date:   Fri Dec 18 14:44:04 2015 -0800
>>
>>     KVM: add capabilities and exit reasons for MSRs.
>>
>>     Define KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and
>>     KVM_EXIT_MSR_AFTER_WRITE, new exit reasons for accesses to MSRs that kvm
>>     does not handle or that user space needs to be notified about. Define the
>>     KVM_CAP_MSR_EXITS, KVM_CAP_ENABLE_MSR_EXITS, and KVM_CAP_DISABLE_MSR_EXITS
>>     capabilities to control these new exits for a VM.
>>
>> diff --git a/Documentation/virtual/kvm/api.txt
>> b/Documentation/virtual/kvm/api.txt
>> index 053f613fc9a9..3bba3248df3d 100644
>> --- a/Documentation/virtual/kvm/api.txt
>> +++ b/Documentation/virtual/kvm/api.txt
>> @@ -3359,6 +3359,43 @@ Hyper-V SynIC state change. Notification is
>> used to remap SynIC
>>  event/message pages and to enable/disable SynIC messages/events processing
>>  in userspace.
>>
>> + /*
>> + * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
>> + * KVM_EXIT_MSR_AFTER_WRITE
>> + */
>> + struct {
>> + __u32 index;    /* i.e. ecx; out */
>> + __u64 data;     /* out (wrmsr) / in (rdmsr) */
>> +#define KVM_EXIT_MSR_COMPLETION_FAILED 1
>> + __u64 type;     /* out */
>> +#define KVM_EXIT_MSR_UNHANDLED 0
>> +#define KVM_EXIT_MSR_HANDLED   1
>> + __u8 handled;   /* in */
>> + } msr;
>> +
>> +If exit_reason is KVM_EXIT_MSR_READ or KVM_EXIT_MSR_WRITE, then the vcpu has
>> +executed a rdmsr or wrmsr instruction which could not be satisfied by kvm. The
>> +msr struct is used for both output to and input from user space. index is the
>> +target MSR number held in ecx; user space must not modify this field. data
>
>  In 'index', you meant?

I'll just remove "held in ecx."

>  I would enlarge it to __u64 and use generalized encoding, the same as for KVM_SET_ONE_REG ioctl. I already wrote about it.

Ok, I would be open to using this sort of encoding.

>  And i would use simply "REG" instead of "MSR" denotion. Because on different architectures they can have different names (e. g. on ARM32 they are called "coprocessor registers" and on ARM64 these are "system registers"), however the common thing between them is that it is some special CPU register, access to which can be trapped and emulated. Therefore KVM_EXIT_REG_xxx.

Sure.

>> +holds the payload from a wrmsr or must be filled in with a payload on a rdmsr.
>> +For a normal exit, type will be 0.
>> +
>> +On the return path into kvm, user space should set handled to
>> +KVM_EXIT_MSR_HANDLED if it successfully handled the MSR access. Otherwise,
>> +handled should be set to KVM_EXIT_MSR_UNHANDLED, which will cause a general
>> +protection fault to be injected into the vcpu. If an error occurs during the
>> +return into kvm, the vcpu will not be run and another exit will be generated
>> +with type set to KVM_EXIT_MSR_COMPLETION_FAILED.
>> +
>> +If exit_reason is KVM_EXIT_MSR_AFTER_WRITE, then the vcpu has executed a wrmsr
>> +instruction which is handled by kvm but which user space may need to be
>> +notified about. index and data are set as described above; the value of type
>> +depends on the MSR that was written. handled is ignored on reentry into kvm.
>
> 1. Is there any real need to distinguish between KVM_EXIT_MSR_WRITE and KVM_EXIT_MSR_AFTER_WRITE ? IMHO from userland's point of view these are the same.

I think I agree with you Pavel, but we could leave the .handled field
as Roman suggests in case user space really does need to distinguish
between these two.

> 2. Why do WRITE and READ have to be different exit codes? We could use something like "u8 is_write" in our structure, this would be more in line with PIO and MMIO handling.

I agree - this is what I had in my original patchset. I updated the
proposed API based on Paolo's suggestions but I'd prefer to go back to
just a single exit reason.

I will send a v2 patchset for KVM_EXIT_REG_IO with the generalized
register encoding.

>> +
>> +KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and KVM_EXIT_MSR_AFTER_WRITE can only
>> +occur when KVM_CAP_MSR_EXITS is present and KVM_CAP_ENABLE_MSR_EXITS has been
>> +set. A detailed description of these capabilities is below.
>> +
>>   /* Fix the size of the union. */
>>   char padding[256];
>>   };
>> @@ -3697,6 +3734,26 @@ a KVM_EXIT_IOAPIC_EOI vmexit will be reported
>> to userspace.
>>  Fails if VCPU has already been created, or if the irqchip is already in the
>>  kernel (i.e. KVM_CREATE_IRQCHIP has already been called).
>>
>> +7.6 KVM_CAP_ENABLE_MSR_EXITS, KVM_CAP_DISABLE_MSR_EXITS
>> +
>> +Architectures: x86 (vmx-only)
>> +Parameters: none
>> +Returns: 0 on success, -1 on error
>> +
>> +These capabilities enable and disable exits to user space for certain guest MSR
>> +accesses. These capabilities are only available if KVM_CHECK_EXTENSION
>> +indicates that KVM_CAP_MSR_EXITS is present.
>> +
>> +When enabled, kvm will exit to user space when the guest reads
>> +an MSR that kvm does not handle (KVM_EXIT_MSR_READ), writes an MSR that kvm
>> +does not handle (KVM_EXIT_MSR_WRITE), or writes an MSR that kvm handles but for
>> +which user space should be notified (KVM_EXIT_MSR_AFTER_WRITE).
>> +
>> +These exits are currently only implemented for vmx. Also, note that if the kvm
>> +module's ignore_msrs flag is set then KVM_EXIT_MSR_READ and KVM_EXIT_MSR_WRITE
>> +will not be generated, and unhandled MSR accesses will simply be ignored and
>> +the guest re-entered immediately.
>> +
>>
>>  8. Other capabilities.
>>  ----------------------
>> @@ -3726,3 +3783,11 @@ In order to use SynIC, it has to be activated
>> by setting this
>>  capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
>>  will disable the use of APIC hardware virtualization even if supported
>>  by the CPU, as it's incompatible with SynIC auto-EOI behavior.
>> +
>> +8.3 KVM_CAP_MSR_EXITS
>> +
>> +Architectures: x86 (vmx-only)
>> +
>> +This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
>> +that the kernel implements the KVM_CAP_ENABLE_MSR_EXITS and
>> +KVM_CAP_DISABLE_MSR_EXITS capabilities for VMs.
>> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
>> index 6e32f7599081..431fd1ec0d06 100644
>> --- a/include/uapi/linux/kvm.h
>> +++ b/include/uapi/linux/kvm.h
>> @@ -199,6 +199,9 @@ struct kvm_hyperv_exit {
>>  #define KVM_EXIT_S390_STSI        25
>>  #define KVM_EXIT_IOAPIC_EOI       26
>>  #define KVM_EXIT_HYPERV           27
>> +#define KVM_EXIT_MSR_READ         28
>> +#define KVM_EXIT_MSR_WRITE        29
>> +#define KVM_EXIT_MSR_AFTER_WRITE  30
>>
>>  /* For KVM_EXIT_INTERNAL_ERROR */
>>  /* Emulate instruction failed. */
>> @@ -355,6 +358,18 @@ struct kvm_run {
>>   } eoi;
>>   /* KVM_EXIT_HYPERV */
>>   struct kvm_hyperv_exit hyperv;
>> + /*
>> + * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
>> + * KVM_EXIT_MSR_AFTER_WRITE
>> + */
>> + struct {
>> + __u32 index;    /* i.e. ecx; out */
>> + __u64 data;     /* out (wrmsr) / in (rdmsr) */
>> + __u64 type;     /* out */
>> +#define KVM_EXIT_MSR_UNHANDLED 0
>> +#define KVM_EXIT_MSR_HANDLED   1
>> + __u8 handled;   /* in */
>> + } msr;
>>   /* Fix the size of the union. */
>>   char padding[256];
>>   };
>> @@ -849,6 +864,9 @@ struct kvm_ppc_smmu_info {
>>  #define KVM_CAP_SPLIT_IRQCHIP 121
>>  #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
>>  #define KVM_CAP_HYPERV_SYNIC 123
>> +#define KVM_CAP_MSR_EXITS 124
>> +#define KVM_CAP_DISABLE_MSR_EXITS 125
>> +#define KVM_CAP_ENABLE_MSR_EXITS 126
>>
>>  #ifdef KVM_CAP_IRQ_ROUTING
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>

Peter
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/Documentation/virtual/kvm/api.txt
b/Documentation/virtual/kvm/api.txt
index 053f613fc9a9..3bba3248df3d 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3359,6 +3359,43 @@  Hyper-V SynIC state change. Notification is
used to remap SynIC
 event/message pages and to enable/disable SynIC messages/events processing
 in userspace.

+ /*
+ * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
+ * KVM_EXIT_MSR_AFTER_WRITE
+ */
+ struct {
+ __u32 index;    /* i.e. ecx; out */
+ __u64 data;     /* out (wrmsr) / in (rdmsr) */
+#define KVM_EXIT_MSR_COMPLETION_FAILED 1
+ __u64 type;     /* out */
+#define KVM_EXIT_MSR_UNHANDLED 0
+#define KVM_EXIT_MSR_HANDLED   1
+ __u8 handled;   /* in */
+ } msr;
+
+If exit_reason is KVM_EXIT_MSR_READ or KVM_EXIT_MSR_WRITE, then the vcpu has
+executed a rdmsr or wrmsr instruction which could not be satisfied by kvm. The
+msr struct is used for both output to and input from user space. index is the
+target MSR number held in ecx; user space must not modify this field. data
+holds the payload from a wrmsr or must be filled in with a payload on a rdmsr.
+For a normal exit, type will be 0.
+
+On the return path into kvm, user space should set handled to
+KVM_EXIT_MSR_HANDLED if it successfully handled the MSR access. Otherwise,
+handled should be set to KVM_EXIT_MSR_UNHANDLED, which will cause a general
+protection fault to be injected into the vcpu. If an error occurs during the
+return into kvm, the vcpu will not be run and another exit will be generated
+with type set to KVM_EXIT_MSR_COMPLETION_FAILED.
+
+If exit_reason is KVM_EXIT_MSR_AFTER_WRITE, then the vcpu has executed a wrmsr
+instruction which is handled by kvm but which user space may need to be
+notified about. index and data are set as described above; the value of type
+depends on the MSR that was written. handled is ignored on reentry into kvm.
+
+KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE, and KVM_EXIT_MSR_AFTER_WRITE can only
+occur when KVM_CAP_MSR_EXITS is present and KVM_CAP_ENABLE_MSR_EXITS has been
+set. A detailed description of these capabilities is below.
+
  /* Fix the size of the union. */
  char padding[256];
  };
@@ -3697,6 +3734,26 @@  a KVM_EXIT_IOAPIC_EOI vmexit will be reported
to userspace.
 Fails if VCPU has already been created, or if the irqchip is already in the
 kernel (i.e. KVM_CREATE_IRQCHIP has already been called).

+7.6 KVM_CAP_ENABLE_MSR_EXITS, KVM_CAP_DISABLE_MSR_EXITS
+
+Architectures: x86 (vmx-only)
+Parameters: none
+Returns: 0 on success, -1 on error
+
+These capabilities enable and disable exits to user space for certain guest MSR
+accesses. These capabilities are only available if KVM_CHECK_EXTENSION
+indicates that KVM_CAP_MSR_EXITS is present.
+
+When enabled, kvm will exit to user space when the guest reads
+an MSR that kvm does not handle (KVM_EXIT_MSR_READ), writes an MSR that kvm
+does not handle (KVM_EXIT_MSR_WRITE), or writes an MSR that kvm handles but for
+which user space should be notified (KVM_EXIT_MSR_AFTER_WRITE).
+
+These exits are currently only implemented for vmx. Also, note that if the kvm
+module's ignore_msrs flag is set then KVM_EXIT_MSR_READ and KVM_EXIT_MSR_WRITE
+will not be generated, and unhandled MSR accesses will simply be ignored and
+the guest re-entered immediately.
+

 8. Other capabilities.
 ----------------------
@@ -3726,3 +3783,11 @@  In order to use SynIC, it has to be activated
by setting this
 capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
 will disable the use of APIC hardware virtualization even if supported
 by the CPU, as it's incompatible with SynIC auto-EOI behavior.
+
+8.3 KVM_CAP_MSR_EXITS
+
+Architectures: x86 (vmx-only)
+
+This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
+that the kernel implements the KVM_CAP_ENABLE_MSR_EXITS and
+KVM_CAP_DISABLE_MSR_EXITS capabilities for VMs.
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 6e32f7599081..431fd1ec0d06 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -199,6 +199,9 @@  struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
 #define KVM_EXIT_HYPERV           27
+#define KVM_EXIT_MSR_READ         28
+#define KVM_EXIT_MSR_WRITE        29
+#define KVM_EXIT_MSR_AFTER_WRITE  30

 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -355,6 +358,18 @@  struct kvm_run {
  } eoi;
  /* KVM_EXIT_HYPERV */
  struct kvm_hyperv_exit hyperv;
+ /*
+ * KVM_EXIT_MSR_READ, KVM_EXIT_MSR_WRITE,
+ * KVM_EXIT_MSR_AFTER_WRITE
+ */
+ struct {
+ __u32 index;    /* i.e. ecx; out */
+ __u64 data;     /* out (wrmsr) / in (rdmsr) */
+ __u64 type;     /* out */
+#define KVM_EXIT_MSR_UNHANDLED 0
+#define KVM_EXIT_MSR_HANDLED   1
+ __u8 handled;   /* in */
+ } msr;
  /* Fix the size of the union. */
  char padding[256];
  };
@@ -849,6 +864,9 @@  struct kvm_ppc_smmu_info {
 #define KVM_CAP_SPLIT_IRQCHIP 121
 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
 #define KVM_CAP_HYPERV_SYNIC 123
+#define KVM_CAP_MSR_EXITS 124
+#define KVM_CAP_DISABLE_MSR_EXITS 125
+#define KVM_CAP_ENABLE_MSR_EXITS 126

 #ifdef KVM_CAP_IRQ_ROUTING
--
To unsubscribe from this list: send the line "unsubscribe kvm" in