Message ID | 20190418103942.2883-1-clg@kaod.org (mailing list archive) |
---|---|
Series | KVM: PPC: Book3S HV: add XIVE native exploitation mode |
On Thu, Apr 18, 2019 at 12:39:25PM +0200, Cédric Le Goater wrote:
> On the POWER9 processor, the XIVE interrupt controller can control
> interrupt sources using MMIOs to trigger events, to EOI or to turn off
> the sources. Priority management and interrupt acknowledgment are also
> controlled by MMIO in the CPU presenter sub-engine.
>
> PowerNV/baremetal Linux runs natively under XIVE but sPAPR guests need
> special support from the hypervisor to do the same. This is called the
> XIVE native exploitation mode and today, it can be activated under the
> PowerPC Hypervisor, pHyp. However, Linux/KVM lacks XIVE native support
> and still offers the old interrupt mode interface using a KVM device
> implementing the XICS hcalls over XIVE.
>
> The following series is a proposal to add the same support under KVM.
>
> A new KVM device is introduced for the XIVE native exploitation
> mode. It reuses most of the XICS-over-XIVE glue implementation
> structures which are internal to KVM but has a completely different
> interface. A set of KVM device ioctls provide support for the
> hypervisor calls, all handled in QEMU, to configure the sources and
> the event queues. From there, all interrupt control is transferred to
> the guest which can use MMIOs.
>
> These MMIO regions (ESB and TIMA) are exposed to guests in QEMU,
> similarly to VFIO, and the associated VMAs are populated dynamically
> with the appropriate pages using a fault handler. These are now
> implemented using mmap()s of the KVM device fd.
>
> Migration has its own specific needs regarding memory. The patchset
> provides a specific control to quiesce XIVE before capturing the
> memory. The save and restore of the internal state is based on the
> same ioctls used for the hcalls.
>
> On a POWER9 sPAPR machine, the Client Architecture Support (CAS)
> negotiation process determines whether the guest operates with an
> interrupt controller using the XICS legacy model, as found on POWER8,
> or in XIVE exploitation mode, which means that the KVM interrupt
> device should be created at run-time, after the machine has started.
> This requires extra support from KVM to destroy KVM devices. It is
> introduced at the end of the patchset and requires some attention.
>
> This is based on Linux 5.1-rc5 and is a candidate for 5.2. The OPAL
> patches have been merged now.

Thanks, patch series applied to my kvm-ppc-next tree. I added two
patches of mine on top to make sure we exclude other execution paths
in the device release method, and to clear the escalation interrupt
hardware pointers on release. I also modified your last patch to free
the xive structures in book3s.c rather than powerpc.c in order to fix
compilation for Book E configs.

Paul.
On 4/30/19 12:11 PM, Paul Mackerras wrote:
> On Thu, Apr 18, 2019 at 12:39:25PM +0200, Cédric Le Goater wrote:
>> [...]
>
> Thanks, patch series applied to my kvm-ppc-next tree. I added two
> patches of mine on top to make sure we exclude other execution paths
> in the device release method, and to clear the escalation interrupt
> hardware pointers on release. I also modified your last patch to free
> the xive structures in book3s.c rather than powerpc.c in order to fix
> compilation for Book E configs.

OK.

I have one minor cleanup removing bogus checks in the release method
of the KVM device.

Thanks,

C.
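The cover letter says the ESB VMAs are populated dynamically by a fault
handler. As a rough illustration of the arithmetic such a handler might
perform, here is a minimal sketch. It assumes a layout of two ESB pages
per interrupt source (even page for trigger, odd page for
EOI/management); that layout, and every name below (esb_target,
esb_decode, esb_offset), is hypothetical and not taken from the
patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: each interrupt source owns two consecutive ESB pages. */
#define ESB_PAGES_PER_SOURCE 2u

/* Hypothetical result of decoding a faulting page in the ESB mapping. */
struct esb_target {
    uint32_t source;  /* interrupt source number within the mapping */
    bool     is_eoi;  /* false: trigger page, true: EOI/management page */
};

/*
 * Decode a page index, counted from the start of the ESB mapping,
 * into the source it belongs to and which of the two pages was hit.
 */
static struct esb_target esb_decode(uint64_t page_index)
{
    struct esb_target t;

    t.source = (uint32_t)(page_index / ESB_PAGES_PER_SOURCE);
    t.is_eoi = (page_index % ESB_PAGES_PER_SOURCE) != 0;
    return t;
}

/*
 * Inverse operation: byte offset of a source's page from the start of
 * the ESB mapping, for a given page size.
 */
static uint64_t esb_offset(uint32_t source, bool is_eoi, uint64_t page_size)
{
    assert(page_size > 0);
    return ((uint64_t)source * ESB_PAGES_PER_SOURCE + (is_eoi ? 1u : 0u))
           * page_size;
}
```

Under these assumptions, page index 5 decodes to the EOI/management
page of source 2, and esb_offset(2, true, 65536) gives the byte offset
of that same page for 64K pages. A real handler would then look up the
hardware page backing that source and insert it into the VMA.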