Message ID | 20220409184549.1681189-8-oupton@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: arm64: PSCI SYSTEM_SUSPEND support | expand |
Hi Oliver, On Sat, Apr 9, 2022 at 11:46 AM Oliver Upton <oupton@google.com> wrote: > > Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU > is in a suspended state. In the suspended state the vCPU will block > until a wakeup event (pending interrupt) is recognized. > > Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to > userspace that KVM has recognized one such wakeup event. It is the > responsibility of userspace to then make the vCPU runnable, or leave it > suspended until the next wakeup event. > > Signed-off-by: Oliver Upton <oupton@google.com> > --- > Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++-- > arch/arm64/include/asm/kvm_host.h | 1 + > arch/arm64/kvm/arm.c | 49 +++++++++++++++++++++++++++++++ > include/uapi/linux/kvm.h | 2 ++ > 4 files changed, 87 insertions(+), 2 deletions(-) > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > index d13fa6600467..d104e34ad703 100644 > --- a/Documentation/virt/kvm/api.rst > +++ b/Documentation/virt/kvm/api.rst > @@ -1476,14 +1476,43 @@ Possible values are: > [s390] > KVM_MP_STATE_LOAD the vcpu is in a special load/startup state > [s390] > + KVM_MP_STATE_SUSPENDED the vcpu is in a suspend state and is waiting > + for a wakeup event [arm64] > ========================== =============================================== > > On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > in-kernel irqchip, the multiprocessing state must be maintained by userspace on > these architectures. > > -For arm64/riscv: > -^^^^^^^^^^^^^^^^ > +For arm64: > +^^^^^^^^^^ > + > +If a vCPU is in the KVM_MP_STATE_SUSPENDED state, KVM will emulate the > +architectural execution of a WFI instruction. > + > +If a wakeup event is recognized, KVM will exit to userspace with a > +KVM_SYSTEM_EVENT exit, where the event type is KVM_SYSTEM_EVENT_WAKEUP. If > +userspace wants to honor the wakeup, it must set the vCPU's MP state to > +KVM_MP_STATE_RUNNABLE. If it does not, KVM will continue to await a wakeup > +event in subsequent calls to KVM_RUN. > + > +.. warning:: > + > + If userspace intends to keep the vCPU in a SUSPENDED state, it is > + strongly recommended that userspace take action to suppress the > + wakeup event (such as masking an interrupt). Otherwise, subsequent > + calls to KVM_RUN will immediately exit with a KVM_SYSTEM_EVENT_WAKEUP > + event and inadvertently waste CPU cycles. > + > + Additionally, if userspace takes action to suppress a wakeup event, > + it is strongly recommended that it also restores the vCPU to its > + original state when the vCPU is made RUNNABLE again. For example, > + if userspace masked a pending interrupt to suppress the wakeup, > + the interrupt should be unmasked before returning control to the > + guest. > + > +For riscv: > +^^^^^^^^^^ > > The only states that are valid are KVM_MP_STATE_STOPPED and > KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. > @@ -5985,6 +6014,7 @@ should put the acknowledged interrupt vector into the 'epr' field. > #define KVM_SYSTEM_EVENT_SHUTDOWN 1 > #define KVM_SYSTEM_EVENT_RESET 2 > #define KVM_SYSTEM_EVENT_CRASH 3 > + #define KVM_SYSTEM_EVENT_WAKEUP 4 > __u32 type; > __u64 flags; > } system_event; > @@ -6009,6 +6039,9 @@ Valid values for 'type' are: > has requested a crash condition maintenance. Userspace can choose > to ignore the request, or to gather VM memory core dump and/or > reset/shutdown of the VM. > + - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and > + KVM has recognized a wakeup event. Userspace may honor this event by > + marking the exiting vCPU as runnable, or deny it and call KVM_RUN again. > > Valid flags are: > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > index f3f93d48e21a..46027b9b80ca 100644 > --- a/arch/arm64/include/asm/kvm_host.h > +++ b/arch/arm64/include/asm/kvm_host.h > @@ -46,6 +46,7 @@ > #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) > #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) > #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) > +#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) > > #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ > KVM_DIRTY_LOG_INITIALLY_SET) > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > index efe54aba5cce..e9641b86d375 100644 > --- a/arch/arm64/kvm/arm.c > +++ b/arch/arm64/kvm/arm.c > @@ -444,6 +444,18 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu) > return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_STOPPED; > } > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > +{ > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > + kvm_vcpu_kick(vcpu); > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > +{ > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > + kvm_vcpu_kick(vcpu); Considering the patch 8 will remove the call to kvm_vcpu_kick() (BTW, I wonder why you wanted to make that change in the patch-8 instead of the patch-7), it looks like we could use the mp_state KVM_MP_STATE_SUSPENDED instead of using KVM_REQ_SUSPEND. What is the reason why you prefer to introduce KVM_REQ_SUSPEND rather than simply using KVM_MP_STATE_SUSPENDED ? Thanks, Reiji > +} > + > +static bool kvm_arm_vcpu_suspended(struct kvm_vcpu *vcpu) > +{ > + return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_SUSPENDED; > +} > + > int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, > struct kvm_mp_state *mp_state) > { > @@ -464,6 +476,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, > case KVM_MP_STATE_STOPPED: > kvm_arm_vcpu_power_off(vcpu); > break; > + case KVM_MP_STATE_SUSPENDED: > + kvm_arm_vcpu_suspend(vcpu); > + break; > default: > ret = -EINVAL; > } > @@ -648,6 +663,37 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu) > preempt_enable(); > } > > +static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu) > +{ > + if (!kvm_arm_vcpu_suspended(vcpu)) > + return 1; > + > + kvm_vcpu_wfi(vcpu); > + > + /* > + * The suspend state is sticky; we do not leave it until userspace > + * explicitly marks the vCPU as runnable. Request that we suspend again > + * later. > + */ > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > + > + /* > + * Check to make sure the vCPU is actually runnable. If so, exit to > + * userspace informing it of the wakeup condition. > + */ > + if (kvm_arch_vcpu_runnable(vcpu)) { > + kvm_vcpu_set_system_event_exit(vcpu, KVM_SYSTEM_EVENT_WAKEUP, 0); > + return 0; > + } > + > + /* > + * Otherwise, we were unblocked to process a different event, such as a > + * pending signal. Return 1 and allow kvm_arch_vcpu_ioctl_run() to > + * process the event. > + */ > + return 1; > +} > + > /** > * check_vcpu_requests - check and handle pending vCPU requests > * @vcpu: the VCPU pointer > @@ -686,6 +732,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) > if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu)) > kvm_pmu_handle_pmcr(vcpu, > __vcpu_sys_reg(vcpu, PMCR_EL0)); > + > + if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) > + return kvm_vcpu_suspend(vcpu); > } > > return 1; > diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h > index 91a6fe4e02c0..64e5f9d83a7a 100644 > --- a/include/uapi/linux/kvm.h > +++ b/include/uapi/linux/kvm.h > @@ -444,6 +444,7 @@ struct kvm_run { > #define KVM_SYSTEM_EVENT_SHUTDOWN 1 > #define KVM_SYSTEM_EVENT_RESET 2 > #define KVM_SYSTEM_EVENT_CRASH 3 > +#define KVM_SYSTEM_EVENT_WAKEUP 4 > __u32 type; > __u64 flags; > } system_event; > @@ -640,6 +641,7 @@ struct kvm_vapic_addr { > #define KVM_MP_STATE_OPERATING 7 > #define KVM_MP_STATE_LOAD 8 > #define KVM_MP_STATE_AP_RESET_HOLD 9 > +#define KVM_MP_STATE_SUSPENDED 10 > > struct kvm_mp_state { > __u32 mp_state; > -- > 2.35.1.1178.g4f1659d476-goog >
Hi Reiji, On Wed, Apr 20, 2022 at 8:13 PM Reiji Watanabe <reijiw@google.com> wrote: > > Hi Oliver, > > On Sat, Apr 9, 2022 at 11:46 AM Oliver Upton <oupton@google.com> wrote: > > > > Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU > > is in a suspended state. In the suspended state the vCPU will block > > until a wakeup event (pending interrupt) is recognized. > > > > Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to > > userspace that KVM has recognized one such wakeup event. It is the > > responsibility of userspace to then make the vCPU runnable, or leave it > > suspended until the next wakeup event. > > > > Signed-off-by: Oliver Upton <oupton@google.com> > > --- > > Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++-- > > arch/arm64/include/asm/kvm_host.h | 1 + > > arch/arm64/kvm/arm.c | 49 +++++++++++++++++++++++++++++++ > > include/uapi/linux/kvm.h | 2 ++ > > 4 files changed, 87 insertions(+), 2 deletions(-) > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > > index d13fa6600467..d104e34ad703 100644 > > --- a/Documentation/virt/kvm/api.rst > > +++ b/Documentation/virt/kvm/api.rst > > @@ -1476,14 +1476,43 @@ Possible values are: > > [s390] > > KVM_MP_STATE_LOAD the vcpu is in a special load/startup state > > [s390] > > + KVM_MP_STATE_SUSPENDED the vcpu is in a suspend state and is waiting > > + for a wakeup event [arm64] > > ========================== =============================================== > > > > On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > > in-kernel irqchip, the multiprocessing state must be maintained by userspace on > > these architectures. > > > > -For arm64/riscv: > > -^^^^^^^^^^^^^^^^ > > +For arm64: > > +^^^^^^^^^^ > > + > > +If a vCPU is in the KVM_MP_STATE_SUSPENDED state, KVM will emulate the > > +architectural execution of a WFI instruction. > > + > > +If a wakeup event is recognized, KVM will exit to userspace with a > > +KVM_SYSTEM_EVENT exit, where the event type is KVM_SYSTEM_EVENT_WAKEUP. If > > +userspace wants to honor the wakeup, it must set the vCPU's MP state to > > +KVM_MP_STATE_RUNNABLE. If it does not, KVM will continue to await a wakeup > > +event in subsequent calls to KVM_RUN. > > + > > +.. warning:: > > + > > + If userspace intends to keep the vCPU in a SUSPENDED state, it is > > + strongly recommended that userspace take action to suppress the > > + wakeup event (such as masking an interrupt). Otherwise, subsequent > > + calls to KVM_RUN will immediately exit with a KVM_SYSTEM_EVENT_WAKEUP > > + event and inadvertently waste CPU cycles. > > + > > + Additionally, if userspace takes action to suppress a wakeup event, > > + it is strongly recommended that it also restores the vCPU to its > > + original state when the vCPU is made RUNNABLE again. For example, > > + if userspace masked a pending interrupt to suppress the wakeup, > > + the interrupt should be unmasked before returning control to the > > + guest. > > + > > +For riscv: > > +^^^^^^^^^^ > > > > The only states that are valid are KVM_MP_STATE_STOPPED and > > KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. > > @@ -5985,6 +6014,7 @@ should put the acknowledged interrupt vector into the 'epr' field. > > #define KVM_SYSTEM_EVENT_SHUTDOWN 1 > > #define KVM_SYSTEM_EVENT_RESET 2 > > #define KVM_SYSTEM_EVENT_CRASH 3 > > + #define KVM_SYSTEM_EVENT_WAKEUP 4 > > __u32 type; > > __u64 flags; > > } system_event; > > @@ -6009,6 +6039,9 @@ Valid values for 'type' are: > > has requested a crash condition maintenance. Userspace can choose > > to ignore the request, or to gather VM memory core dump and/or > > reset/shutdown of the VM. > > + - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and > > + KVM has recognized a wakeup event. Userspace may honor this event by > > + marking the exiting vCPU as runnable, or deny it and call KVM_RUN again. > > > > Valid flags are: > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > index f3f93d48e21a..46027b9b80ca 100644 > > --- a/arch/arm64/include/asm/kvm_host.h > > +++ b/arch/arm64/include/asm/kvm_host.h > > @@ -46,6 +46,7 @@ > > #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) > > #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) > > #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) > > +#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) > > > > #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ > > KVM_DIRTY_LOG_INITIALLY_SET) > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > index efe54aba5cce..e9641b86d375 100644 > > --- a/arch/arm64/kvm/arm.c > > +++ b/arch/arm64/kvm/arm.c > > @@ -444,6 +444,18 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu) > > return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_STOPPED; > > } > > > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > > +{ > > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > > + kvm_vcpu_kick(vcpu); > > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > > +{ > > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > > + kvm_vcpu_kick(vcpu); > > Considering the patch 8 will remove the call to kvm_vcpu_kick() > (BTW, I wonder why you wanted to make that change in the patch-8 > instead of the patch-7), Squashed the diff into the wrong patch! Marc pointed out this is of course cargo-culted as I was following the pattern laid down by KVM_REQ_SLEEP :) > it looks like we could use the mp_state > KVM_MP_STATE_SUSPENDED instead of using KVM_REQ_SUSPEND. > What is the reason why you prefer to introduce KVM_REQ_SUSPEND > rather than simply using KVM_MP_STATE_SUSPENDED ? I was trying to avoid any heavy refactoring in adding new functionality here, as we handle KVM_MP_STATE_STOPPED similarly (make a request). ARM is definitely a bit different than x86 in the way that we handle the MP states, as x86 doesn't bounce through vCPU requests to do it and instead directly checks the mp_state value. Do you think it's fair to defer on repainting to a later series? We probably will need to touch up the main run loop quite a lot along the way. -- Thanks, Oliver
Hi Oliver, On Wed, Apr 20, 2022 at 8:24 PM Oliver Upton <oupton@google.com> wrote: > > Hi Reiji, > > On Wed, Apr 20, 2022 at 8:13 PM Reiji Watanabe <reijiw@google.com> wrote: > > > > Hi Oliver, > > > > On Sat, Apr 9, 2022 at 11:46 AM Oliver Upton <oupton@google.com> wrote: > > > > > > Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU > > > is in a suspended state. In the suspended state the vCPU will block > > > until a wakeup event (pending interrupt) is recognized. > > > > > > Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to > > > userspace that KVM has recognized one such wakeup event. It is the > > > responsibility of userspace to then make the vCPU runnable, or leave it > > > suspended until the next wakeup event. > > > > > > Signed-off-by: Oliver Upton <oupton@google.com> > > > --- > > > Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++-- > > > arch/arm64/include/asm/kvm_host.h | 1 + > > > arch/arm64/kvm/arm.c | 49 +++++++++++++++++++++++++++++++ > > > include/uapi/linux/kvm.h | 2 ++ > > > 4 files changed, 87 insertions(+), 2 deletions(-) > > > > > > diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst > > > index d13fa6600467..d104e34ad703 100644 > > > --- a/Documentation/virt/kvm/api.rst > > > +++ b/Documentation/virt/kvm/api.rst > > > @@ -1476,14 +1476,43 @@ Possible values are: > > > [s390] > > > KVM_MP_STATE_LOAD the vcpu is in a special load/startup state > > > [s390] > > > + KVM_MP_STATE_SUSPENDED the vcpu is in a suspend state and is waiting > > > + for a wakeup event [arm64] > > > ========================== =============================================== > > > > > > On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an > > > in-kernel irqchip, the multiprocessing state must be maintained by userspace on > > > these architectures. > > > > > > -For arm64/riscv: > > > -^^^^^^^^^^^^^^^^ > > > +For arm64: > > > +^^^^^^^^^^ > > > + > > > +If a vCPU is in the KVM_MP_STATE_SUSPENDED state, KVM will emulate the > > > +architectural execution of a WFI instruction. > > > + > > > +If a wakeup event is recognized, KVM will exit to userspace with a > > > +KVM_SYSTEM_EVENT exit, where the event type is KVM_SYSTEM_EVENT_WAKEUP. If > > > +userspace wants to honor the wakeup, it must set the vCPU's MP state to > > > +KVM_MP_STATE_RUNNABLE. If it does not, KVM will continue to await a wakeup > > > +event in subsequent calls to KVM_RUN. > > > + > > > +.. warning:: > > > + > > > + If userspace intends to keep the vCPU in a SUSPENDED state, it is > > > + strongly recommended that userspace take action to suppress the > > > + wakeup event (such as masking an interrupt). Otherwise, subsequent > > > + calls to KVM_RUN will immediately exit with a KVM_SYSTEM_EVENT_WAKEUP > > > + event and inadvertently waste CPU cycles. > > > + > > > + Additionally, if userspace takes action to suppress a wakeup event, > > > + it is strongly recommended that it also restores the vCPU to its > > > + original state when the vCPU is made RUNNABLE again. For example, > > > + if userspace masked a pending interrupt to suppress the wakeup, > > > + the interrupt should be unmasked before returning control to the > > > + guest. > > > + > > > +For riscv: > > > +^^^^^^^^^^ > > > > > > The only states that are valid are KVM_MP_STATE_STOPPED and > > > KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. > > > @@ -5985,6 +6014,7 @@ should put the acknowledged interrupt vector into the 'epr' field. > > > #define KVM_SYSTEM_EVENT_SHUTDOWN 1 > > > #define KVM_SYSTEM_EVENT_RESET 2 > > > #define KVM_SYSTEM_EVENT_CRASH 3 > > > + #define KVM_SYSTEM_EVENT_WAKEUP 4 > > > __u32 type; > > > __u64 flags; > > > } system_event; > > > @@ -6009,6 +6039,9 @@ Valid values for 'type' are: > > > has requested a crash condition maintenance. Userspace can choose > > > to ignore the request, or to gather VM memory core dump and/or > > > reset/shutdown of the VM. > > > + - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and > > > + KVM has recognized a wakeup event. Userspace may honor this event by > > > + marking the exiting vCPU as runnable, or deny it and call KVM_RUN again. > > > > > > Valid flags are: > > > > > > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h > > > index f3f93d48e21a..46027b9b80ca 100644 > > > --- a/arch/arm64/include/asm/kvm_host.h > > > +++ b/arch/arm64/include/asm/kvm_host.h > > > @@ -46,6 +46,7 @@ > > > #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) > > > #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) > > > #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) > > > +#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) > > > > > > #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ > > > KVM_DIRTY_LOG_INITIALLY_SET) > > > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c > > > index efe54aba5cce..e9641b86d375 100644 > > > --- a/arch/arm64/kvm/arm.c > > > +++ b/arch/arm64/kvm/arm.c > > > @@ -444,6 +444,18 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu) > > > return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_STOPPED; > > > } > > > > > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > > > +{ > > > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > > > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > > > + kvm_vcpu_kick(vcpu); > > > > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > > > +{ > > > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > > > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > > > + kvm_vcpu_kick(vcpu); > > > > Considering the patch 8 will remove the call to kvm_vcpu_kick() > > (BTW, I wonder why you wanted to make that change in the patch-8 > > instead of the patch-7), > > Squashed the diff into the wrong patch! Marc pointed out this is of > course cargo-culted as I was following the pattern laid down by > KVM_REQ_SLEEP :) I see. Thanks for the clarification ! > > it looks like we could use the mp_state > > KVM_MP_STATE_SUSPENDED instead of using KVM_REQ_SUSPEND. > > What is the reason why you prefer to introduce KVM_REQ_SUSPEND > > rather than simply using KVM_MP_STATE_SUSPENDED ? > > I was trying to avoid any heavy refactoring in adding new > functionality here, as we handle KVM_MP_STATE_STOPPED similarly (make > a request). ARM is definitely a bit different than x86 in the way that > we handle the MP states, as x86 doesn't bounce through vCPU requests > to do it and instead directly checks the mp_state value. The difference from KVM_MP_STATE_STOPPED is that kvm_arm_vcpu_power_off() calls kvm_vcpu_kick(), which made me think having KVM_REQ_SLEEP was reasonable (it appears kvm_vcpu_kick() won't be needed there due to the same reason as kvm_arm_vcpu_suspend). > Do you think it's fair to defer on repainting to a later series? We > probably will need to touch up the main run loop quite a lot along the > way. Yes, I'm fine with that :-) Thanks, Reiji
On Thu, Apr 21, 2022 at 11:28:42PM -0700, Reiji Watanabe wrote: [...] > > > > +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) > > > > +{ > > > > + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; > > > > + kvm_make_request(KVM_REQ_SUSPEND, vcpu); > > > > + kvm_vcpu_kick(vcpu); > > > > > > Considering the patch 8 will remove the call to kvm_vcpu_kick() > > > (BTW, I wonder why you wanted to make that change in the patch-8 > > > instead of the patch-7), > > > > Squashed the diff into the wrong patch! Marc pointed out this is of > > course cargo-culted as I was following the pattern laid down by > > KVM_REQ_SLEEP :) > > I see. Thanks for the clarification ! > > > > it looks like we could use the mp_state > > > KVM_MP_STATE_SUSPENDED instead of using KVM_REQ_SUSPEND. > > > What is the reason why you prefer to introduce KVM_REQ_SUSPEND > > > rather than simply using KVM_MP_STATE_SUSPENDED ? > > > > I was trying to avoid any heavy refactoring in adding new > > functionality here, as we handle KVM_MP_STATE_STOPPED similarly (make > > a request). ARM is definitely a bit different than x86 in the way that > > we handle the MP states, as x86 doesn't bounce through vCPU requests > > to do it and instead directly checks the mp_state value. > > The difference from KVM_MP_STATE_STOPPED is that kvm_arm_vcpu_power_off() > calls kvm_vcpu_kick(), which made me think having KVM_REQ_SLEEP was > reasonable (it appears kvm_vcpu_kick() won't be needed there due to > the same reason as kvm_arm_vcpu_suspend). Just to finish the thought on this before mailing out what I hope is the last take on all of this. I'm going to leave the pointless call to kvm_vcpu_kick() in place, if only to follow the pattern of other MP states. That will all get cleaned up later on, as discussed :) -- Thanks, Oliver
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index d13fa6600467..d104e34ad703 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1476,14 +1476,43 @@ Possible values are: [s390] KVM_MP_STATE_LOAD the vcpu is in a special load/startup state [s390] + KVM_MP_STATE_SUSPENDED the vcpu is in a suspend state and is waiting + for a wakeup event [arm64] ========================== =============================================== On x86, this ioctl is only useful after KVM_CREATE_IRQCHIP. Without an in-kernel irqchip, the multiprocessing state must be maintained by userspace on these architectures. -For arm64/riscv: -^^^^^^^^^^^^^^^^ +For arm64: +^^^^^^^^^^ + +If a vCPU is in the KVM_MP_STATE_SUSPENDED state, KVM will emulate the +architectural execution of a WFI instruction. + +If a wakeup event is recognized, KVM will exit to userspace with a +KVM_SYSTEM_EVENT exit, where the event type is KVM_SYSTEM_EVENT_WAKEUP. If +userspace wants to honor the wakeup, it must set the vCPU's MP state to +KVM_MP_STATE_RUNNABLE. If it does not, KVM will continue to await a wakeup +event in subsequent calls to KVM_RUN. + +.. warning:: + + If userspace intends to keep the vCPU in a SUSPENDED state, it is + strongly recommended that userspace take action to suppress the + wakeup event (such as masking an interrupt). Otherwise, subsequent + calls to KVM_RUN will immediately exit with a KVM_SYSTEM_EVENT_WAKEUP + event and inadvertently waste CPU cycles. + + Additionally, if userspace takes action to suppress a wakeup event, + it is strongly recommended that it also restores the vCPU to its + original state when the vCPU is made RUNNABLE again. For example, + if userspace masked a pending interrupt to suppress the wakeup, + the interrupt should be unmasked before returning control to the + guest. + +For riscv: +^^^^^^^^^^ The only states that are valid are KVM_MP_STATE_STOPPED and KVM_MP_STATE_RUNNABLE which reflect if the vcpu is paused or not. @@ -5985,6 +6014,7 @@ should put the acknowledged interrupt vector into the 'epr' field. #define KVM_SYSTEM_EVENT_SHUTDOWN 1 #define KVM_SYSTEM_EVENT_RESET 2 #define KVM_SYSTEM_EVENT_CRASH 3 + #define KVM_SYSTEM_EVENT_WAKEUP 4 __u32 type; __u64 flags; } system_event; @@ -6009,6 +6039,9 @@ Valid values for 'type' are: has requested a crash condition maintenance. Userspace can choose to ignore the request, or to gather VM memory core dump and/or reset/shutdown of the VM. + - KVM_SYSTEM_EVENT_WAKEUP -- the exiting vCPU is in a suspended state and + KVM has recognized a wakeup event. Userspace may honor this event by + marking the exiting vCPU as runnable, or deny it and call KVM_RUN again. Valid flags are: diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index f3f93d48e21a..46027b9b80ca 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -46,6 +46,7 @@ #define KVM_REQ_RECORD_STEAL KVM_ARCH_REQ(3) #define KVM_REQ_RELOAD_GICv4 KVM_ARCH_REQ(4) #define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5) +#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6) #define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \ KVM_DIRTY_LOG_INITIALLY_SET) diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index efe54aba5cce..e9641b86d375 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -444,6 +444,18 @@ bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu) return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_STOPPED; } +static void kvm_arm_vcpu_suspend(struct kvm_vcpu *vcpu) +{ + vcpu->arch.mp_state.mp_state = KVM_MP_STATE_SUSPENDED; + kvm_make_request(KVM_REQ_SUSPEND, vcpu); + kvm_vcpu_kick(vcpu); +} + +static bool kvm_arm_vcpu_suspended(struct kvm_vcpu *vcpu) +{ + return vcpu->arch.mp_state.mp_state == KVM_MP_STATE_SUSPENDED; +} + int kvm_arch_vcpu_ioctl_get_mpstate(struct kvm_vcpu *vcpu, struct kvm_mp_state *mp_state) { @@ -464,6 +476,9 @@ int kvm_arch_vcpu_ioctl_set_mpstate(struct kvm_vcpu *vcpu, case KVM_MP_STATE_STOPPED: kvm_arm_vcpu_power_off(vcpu); break; + case KVM_MP_STATE_SUSPENDED: + kvm_arm_vcpu_suspend(vcpu); + break; default: ret = -EINVAL; } @@ -648,6 +663,37 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu) preempt_enable(); } +static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu) +{ + if (!kvm_arm_vcpu_suspended(vcpu)) + return 1; + + kvm_vcpu_wfi(vcpu); + + /* + * The suspend state is sticky; we do not leave it until userspace + * explicitly marks the vCPU as runnable. Request that we suspend again + * later. + */ + kvm_make_request(KVM_REQ_SUSPEND, vcpu); + + /* + * Check to make sure the vCPU is actually runnable. If so, exit to + * userspace informing it of the wakeup condition. + */ + if (kvm_arch_vcpu_runnable(vcpu)) { + kvm_vcpu_set_system_event_exit(vcpu, KVM_SYSTEM_EVENT_WAKEUP, 0); + return 0; + } + + /* + * Otherwise, we were unblocked to process a different event, such as a + * pending signal. Return 1 and allow kvm_arch_vcpu_ioctl_run() to + * process the event. + */ + return 1; +} + /** * check_vcpu_requests - check and handle pending vCPU requests * @vcpu: the VCPU pointer @@ -686,6 +732,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu) if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu)) kvm_pmu_handle_pmcr(vcpu, __vcpu_sys_reg(vcpu, PMCR_EL0)); + + if (kvm_check_request(KVM_REQ_SUSPEND, vcpu)) + return kvm_vcpu_suspend(vcpu); } return 1; diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 91a6fe4e02c0..64e5f9d83a7a 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -444,6 +444,7 @@ struct kvm_run { #define KVM_SYSTEM_EVENT_SHUTDOWN 1 #define KVM_SYSTEM_EVENT_RESET 2 #define KVM_SYSTEM_EVENT_CRASH 3 +#define KVM_SYSTEM_EVENT_WAKEUP 4 __u32 type; __u64 flags; } system_event; @@ -640,6 +641,7 @@ struct kvm_vapic_addr { #define KVM_MP_STATE_OPERATING 7 #define KVM_MP_STATE_LOAD 8 #define KVM_MP_STATE_AP_RESET_HOLD 9 +#define KVM_MP_STATE_SUSPENDED 10 struct kvm_mp_state { __u32 mp_state;
Introduce a new MP state, KVM_MP_STATE_SUSPENDED, which indicates a vCPU is in a suspended state. In the suspended state the vCPU will block until a wakeup event (pending interrupt) is recognized. Add a new system event type, KVM_SYSTEM_EVENT_WAKEUP, to indicate to userspace that KVM has recognized one such wakeup event. It is the responsibility of userspace to then make the vCPU runnable, or leave it suspended until the next wakeup event. Signed-off-by: Oliver Upton <oupton@google.com> --- Documentation/virt/kvm/api.rst | 37 +++++++++++++++++++++-- arch/arm64/include/asm/kvm_host.h | 1 + arch/arm64/kvm/arm.c | 49 +++++++++++++++++++++++++++++++ include/uapi/linux/kvm.h | 2 ++ 4 files changed, 87 insertions(+), 2 deletions(-)