Message ID | 20210608154805.216869-1-jean-philippe@linaro.org (mailing list archive) |
---|---|
Series | KVM: arm64: Pass PSCI to userspace |
Hi Jean-Philippe,

I'm not really familiar with this part of KVM, and I'm still trying to get my
head around how this works, so please bear with me if I ask silly questions.

This is how I understand this will work:

1. VMM opts in to forward HVC calls not handled by KVM.

2. VMM opts in to forward PSCI calls, other than
PSCI_1_0_FN_PSCI_FEATURES(ARM_SMCCC_VERSION_FUNC_ID).

3. Guest emulates PSCI calls (and all the other HVC calls).

3.a For CPU_SUSPEND coming from VCPU A, userspace does a
KVM_SET_MP_STATE(KVM_MP_STATE_HALTED) ioctl on the VCPU fd which sets the
request KVM_REQ_SUSPEND.

3.b The next time the VCPU is run, KVM blocks the VCPU as a result of the
request. kvm_vcpu_block() does a schedule() in a loop until it decides that the
CPU must unblock.

3.c The VCPU will run as normal after kvm_vcpu_block() returns.

Please correct me if I got something wrong.

I have a few general questions. It doesn't mean there's something wrong with
your approach, I'm just trying to understand it better.

1. Why does forwarding PSCI calls to userspace depend on enabling forwarding for
other HVC calls? As I understand from the patches, those handle distinct
function IDs.

2. HVC call forwarding to userspace also forwards PSCI functions which are
defined in ARM DEN 0022D, but not (yet) implemented by KVM. What happens if
KVM's PSCI implementation gets support for one of those functions? How does
userspace know that now it also needs to enable PSCI call forwarding to be able
to handle that function?

It looks to me like the boundary between the functions that are forwarded when
HVC call forwarding is enabled and the functions that are forwarded when PSCI
call forwarding is enabled is based on what Linux v5.13 handles. Have you
considered choosing this boundary based on something less arbitrary, like the
function types specified in ARM DEN 0028C, table 2-1?

In my opinion, setting the MP state to HALTED looks like a sensible approach to
implementing PSCI_SUSPEND.
I'll take a closer look at the patches after I get a better understanding about
what is going on.

On 6/8/21 4:48 PM, Jean-Philippe Brucker wrote:
> Allow userspace to request handling PSCI calls from guests. Our goal is
> to enable a vCPU hot-add solution for Arm where the VMM presents
> possible resources to the guest at boot, and controls which vCPUs can be
> brought up by allowing or denying PSCI CPU_ON calls. Passing HVC and
> PSCI to userspace has been discussed on the list in the context of vCPU
> hot-add [1,2] but it can also be useful for implementing other SMCCC and
> vendor hypercalls [3,4,5].
>
> Patches 1-3 allow userspace to request WFI to be executed in KVM. That

I don't understand this. KVM, in kvm_vcpu_block(), does not execute a WFI.
PSCI_SUSPEND is documented as being indistinguishable from a WFI from the
guest's point of view, but its implementation is not architecturally defined.

Thanks,

Alex

> way the VMM can easily implement the PSCI CPU_SUSPEND function, which is
> mandatory from PSCI v0.2 onwards (even if it doesn't have a more useful
> implementation than WFI, natively available to the guest).
>
> Patch 4 lets userspace request any HVC that isn't handled by KVM, and
> patch 5 lets userspace request PSCI calls, disabling in-kernel PSCI
> handling.
>
> I'm focusing on the PSCI bits, but a complete prototype of vCPU hot-add
> for arm64 on Linux and QEMU, most of it from Salil and James, is
> available at [6].
>
> [1] https://lore.kernel.org/kvmarm/82879258-46a7-a6e9-ee54-fc3692c1cdc3@arm.com/
> [2] https://lore.kernel.org/linux-arm-kernel/20200625133757.22332-1-salil.mehta@huawei.com/
>     (Followed by KVM forum and Linaro Open discussions)
> [3] https://lore.kernel.org/linux-arm-kernel/f56cf420-affc-35f0-2355-801a924b8a35@arm.com/
> [4] https://lore.kernel.org/kvm/bf7e83f1-c58e-8d65-edd0-d08f27b8b766@arm.com/
> [5] https://lore.kernel.org/kvm/1569338454-26202-2-git-send-email-guoheyi@huawei.com/
> [6] https://jpbrucker.net/git/linux/log/?h=cpuhp/devel
>     https://jpbrucker.net/git/qemu/log/?h=cpuhp/devel
>
> Jean-Philippe Brucker (5):
>   KVM: arm64: Replace power_off with mp_state in struct kvm_vcpu_arch
>   KVM: arm64: Move WFI execution to check_vcpu_requests()
>   KVM: arm64: Allow userspace to request WFI
>   KVM: arm64: Pass hypercalls to userspace
>   KVM: arm64: Pass PSCI calls to userspace
>
>  Documentation/virt/kvm/api.rst      | 46 +++++++++++++++----
>  Documentation/virt/kvm/arm/psci.rst |  1 +
>  arch/arm64/include/asm/kvm_host.h   | 10 +++-
>  include/kvm/arm_hypercalls.h        |  1 +
>  include/kvm/arm_psci.h              |  4 ++
>  include/uapi/linux/kvm.h            |  3 ++
>  arch/arm64/kvm/arm.c                | 71 +++++++++++++++++++++--------
>  arch/arm64/kvm/handle_exit.c        |  3 +-
>  arch/arm64/kvm/hypercalls.c         | 28 +++++++++++-
>  arch/arm64/kvm/psci.c               | 69 ++++++++++++++--------------
>  10 files changed, 170 insertions(+), 66 deletions(-)
>
Hi Alex,

I'm not planning to resend this work at the moment, because it looks like
vcpu hot-add will go a different way so I don't have a user. But I'll
probably address the feedback so far and park it on some branch, in case
anyone else needs it.

On Mon, Jul 19, 2021 at 04:29:18PM +0100, Alexandru Elisei wrote:
> 1. Why does forwarding PSCI calls to userspace depend on enabling forwarding for other
> HVC calls? As I understand from the patches, those handle distinct function IDs.

The HVC cap from patch 4 enables returning from the KVM_RUN ioctl with
KVM_EXIT_HYPERCALL, for any HVC not handled by KVM. This one should
definitely be improved, either by letting userspace choose the ranges of
HVC it wants, or at least by reporting ranges reserved by KVM to
userspace.

The PSCI cap from patch 5 disables the in-kernel PSCI implementation. As a
result those HVCs are forwarded to userspace.

It was suggested that other users will want to handle HVC calls (SDEI for
example [1]), hence splitting into two capabilities rather than just the
PSCI cap. In v5.14 x86 added KVM_CAP_EXIT_HYPERCALL [2], which lets
userspace receive specific hypercalls. We could reuse that and have PSCI
be one bit of that capability's parameter.

[1] https://lore.kernel.org/linux-arm-kernel/20170808164616.25949-12-james.morse@arm.com/
[2] https://lore.kernel.org/kvm/90778988e1ee01926ff9cac447aacb745f954c8c.1623174621.git.ashish.kalra@amd.com/

> 2. HVC call forwarding to userspace also forwards PSCI functions which are defined
> in ARM DEN 0022D, but not (yet) implemented by KVM. What happens if KVM's PSCI
> implementation gets support for one of those functions? How does userspace know
> that now it also needs to enable PSCI call forwarding to be able to handle that
> function?

We forward the whole PSCI function range, so it's either KVM or userspace.
If KVM manages PSCI and the guest calls an unimplemented function, that
returns directly to the guest without going to userspace.

The concern is valid for any other range, though. If userspace enables the
HVC cap it receives function calls that at some point KVM might need to
handle itself. So we need some negotiation between user and KVM about the
specific HVC ranges that userspace can and will handle.

> It looks to me like the boundary between the functions that are forwarded when HVC
> call forwarding is enabled and the functions that are forwarded when PSCI call
> forwarding is enabled is based on what Linux v5.13 handles. Have you considered
> choosing this boundary based on something less arbitrary, like the function types
> specified in ARM DEN 0028C, table 2-1?

For PSCI I've used the range 0-0x1f as the boundary, which is reserved for
PSCI by SMCCC (table 6-4 in that document).

>
> In my opinion, setting the MP state to HALTED looks like a sensible approach to
> implementing PSCI_SUSPEND. I'll take a closer look at the patches after I get a
> better understanding about what is going on.
>
> On 6/8/21 4:48 PM, Jean-Philippe Brucker wrote:
> > Allow userspace to request handling PSCI calls from guests. Our goal is
> > to enable a vCPU hot-add solution for Arm where the VMM presents
> > possible resources to the guest at boot, and controls which vCPUs can be
> > brought up by allowing or denying PSCI CPU_ON calls. Passing HVC and
> > PSCI to userspace has been discussed on the list in the context of vCPU
> > hot-add [1,2] but it can also be useful for implementing other SMCCC and
> > vendor hypercalls [3,4,5].
> >
> > Patches 1-3 allow userspace to request WFI to be executed in KVM. That
>
> I don't understand this. KVM, in kvm_vcpu_block(), does not execute a WFI.
> PSCI_SUSPEND is documented as being indistinguishable from a WFI from the guest's
> point of view, but its implementation is not architecturally defined.

Yes, that was an oversimplification on my part.

Thanks,
Jean
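To make the "range 0-0x1f" boundary concrete, here is a small sketch of how a
function ID could be classified as belonging to the PSCI range. It is not
taken from the patches. The macro and function names are made up for the
example, and it relies on the SMCCC function ID layout (bit 31 fast call,
bit 30 64-bit convention, bits 29:24 owning entity, bits 15:0 function
number) with PSCI living under the Standard Secure Service owner (4). The
remaining bits (23:16) are deliberately not checked here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local names for this example only; the layout follows SMCCC. */
#define EX_SMCCC_FAST_CALL	(UINT32_C(1) << 31)
#define EX_SMCCC_OWNER_SHIFT	24
#define EX_SMCCC_OWNER_MASK	UINT32_C(0x3f)
#define EX_SMCCC_OWNER_STANDARD	UINT32_C(4)
#define EX_SMCCC_FUNC_MASK	UINT32_C(0xffff)
#define EX_PSCI_FUNC_LAST	UINT32_C(0x1f)

/*
 * Return true if fn_id is a fast call owned by the Standard Secure Service
 * with a function number in 0-0x1f, i.e. 0x84000000-0x8400001f (32-bit
 * convention) or 0xc4000000-0xc400001f (64-bit convention).
 */
bool ex_is_psci_range(uint32_t fn_id)
{
	uint32_t owner = (fn_id >> EX_SMCCC_OWNER_SHIFT) & EX_SMCCC_OWNER_MASK;

	if (!(fn_id & EX_SMCCC_FAST_CALL))
		return false;
	if (owner != EX_SMCCC_OWNER_STANDARD)
		return false;

	return (fn_id & EX_SMCCC_FUNC_MASK) <= EX_PSCI_FUNC_LAST;
}
```

With this check, CPU_ON (0x84000003 or 0xc4000003) falls inside the range,
while 0x84000020 and SMCCC_VERSION (0x80000000) fall outside it.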
On Mon, Jul 19, 2021 at 11:02 AM Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
> We forward the whole PSCI function range, so it's either KVM or userspace.
> If KVM manages PSCI and the guest calls an unimplemented function, that
> returns directly to the guest without going to userspace.
>
> The concern is valid for any other range, though. If userspace enables the
> HVC cap it receives function calls that at some point KVM might need to
> handle itself. So we need some negotiation between user and KVM about the
> specific HVC ranges that userspace can and will handle.

Are we going to use KVM_CAPs for every interesting HVC range that
userspace may want to trap? I wonder if a more generic interface for
hypercall filtering would have merit to handle the aforementioned
cases, and whatever else a VMM will want to intercept down the line.

For example, x86 has the concept of 'MSR filtering', wherein userspace
can specify a set of registers that it wants to intercept. Doing
something similar for HVCs would avoid the need for a kernel change
each time a VMM wishes to intercept a new hypercall.

--
Thanks,
Oliver
On Mon, Jul 19, 2021 at 12:37:52PM -0700, Oliver Upton wrote:
> On Mon, Jul 19, 2021 at 11:02 AM Jean-Philippe Brucker
> <jean-philippe@linaro.org> wrote:
> > We forward the whole PSCI function range, so it's either KVM or userspace.
> > If KVM manages PSCI and the guest calls an unimplemented function, that
> > returns directly to the guest without going to userspace.
> >
> > The concern is valid for any other range, though. If userspace enables the
> > HVC cap it receives function calls that at some point KVM might need to
> > handle itself. So we need some negotiation between user and KVM about the
> > specific HVC ranges that userspace can and will handle.
>
> Are we going to use KVM_CAPs for every interesting HVC range that
> userspace may want to trap? I wonder if a more generic interface for
> hypercall filtering would have merit to handle the aforementioned
> cases, and whatever else a VMM will want to intercept down the line.
>
> For example, x86 has the concept of 'MSR filtering', wherein userspace
> can specify a set of registers that it wants to intercept. Doing
> something similar for HVCs would avoid the need for a kernel change
> each time a VMM wishes to intercept a new hypercall.

Yes we could introduce a VM device group for this:
* User reads attribute KVM_ARM_VM_HVC_NR_SLOTS, which defines the number of
  available HVC ranges.
* User writes attribute KVM_ARM_VM_HVC_SET_RANGE with one range

    struct kvm_arm_hvc_range {
            __u32 slot;
    #define KVM_ARM_HVC_USER (1 << 0) /* Enable range. 0 disables it */
            __u16 flags;
            __u16 imm;
            __u32 fn_start;
            __u32 fn_end;
    };

* KVM forwards any HVC within this range to userspace.
* If one of the ranges is PSCI functions, disable KVM PSCI.

Since it's more work for KVM to keep track of ranges, I didn't include it
in the RFC, and I'm going to leave it to the next person dealing with this
stuff :)

Thanks,
Jean
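For illustration, the range bookkeeping described above could be as small as
the following userspace-compilable sketch. None of this exists in KVM: the
struct mirrors the proposal (with stdint types standing in for __u32/__u16),
while EX_HVC_NR_SLOTS, struct ex_hvc_filter and both functions are
hypothetical names. It also assumes fn_end is inclusive and that a range
matches only when the HVC immediate is equal, neither of which the proposal
spells out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define KVM_ARM_HVC_USER	(1 << 0) /* Enable range. 0 disables it */

/* Mirrors the proposed struct, using stdint types instead of __u32/__u16. */
struct kvm_arm_hvc_range {
	uint32_t slot;
	uint16_t flags;
	uint16_t imm;
	uint32_t fn_start;
	uint32_t fn_end;	/* assumed inclusive */
};

/* Hypothetical value that KVM_ARM_VM_HVC_NR_SLOTS would report. */
#define EX_HVC_NR_SLOTS		4

struct ex_hvc_filter {
	struct kvm_arm_hvc_range ranges[EX_HVC_NR_SLOTS];
};

/* Store one range in its slot. Returns 0 on success, -1 on a bad request. */
int ex_hvc_set_range(struct ex_hvc_filter *f,
		     const struct kvm_arm_hvc_range *r)
{
	if (r->slot >= EX_HVC_NR_SLOTS || r->fn_start > r->fn_end)
		return -1;

	f->ranges[r->slot] = *r;
	return 0;
}

/* Return true if an HVC with this immediate and function ID goes to userspace. */
bool ex_hvc_forward_to_user(const struct ex_hvc_filter *f,
			    uint16_t imm, uint32_t fn)
{
	for (size_t i = 0; i < EX_HVC_NR_SLOTS; i++) {
		const struct kvm_arm_hvc_range *r = &f->ranges[i];

		if (!(r->flags & KVM_ARM_HVC_USER))
			continue;
		if (r->imm == imm && fn >= r->fn_start && fn <= r->fn_end)
			return true;
	}

	return false;
}
```

A linear scan is enough for a handful of slots. The last bullet above
(disabling in-kernel PSCI when a range covers the PSCI functions) would be an
extra overlap check at set time, which is left out here.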
On Tue, Jun 08, 2021 at 05:48:01PM +0200, Jean-Philippe Brucker wrote:
> Allow userspace to request handling PSCI calls from guests. Our goal is
> to enable a vCPU hot-add solution for Arm where the VMM presents
> possible resources to the guest at boot, and controls which vCPUs can be
> brought up by allowing or denying PSCI CPU_ON calls.

Since it looks like vCPU hot-add will be implemented differently, I don't
intend to resend this series at the moment. But some of it could be useful
for other projects, and to avoid the helpful review effort going to waste
I fixed it up and will leave it on branch
https://jpbrucker.net/git/linux/log/?h=kvm/psci-to-userspace

It now only uses KVM_CAP_EXIT_HYPERCALL introduced in v5.14.

Thanks,
Jean