Message ID: 20241011075053.80540-1-shameerali.kolothum.thodi@huawei.com (mailing list archive)
Series: KVM: arm64: Errata management for VM Live migration
Hi Shameer, Thanks for getting the ball rolling on this one, much appreciated. On Fri, 11 Oct 2024 08:50:47 +0100, Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote: > > Hi, > > On ARM64 platforms most of the errata workarounds are based on CPU > MIDR/REVIDR values and a number of these workarounds need to be > implemented by the Guest kernel as well. This creates a problem when > Guest needs to be migrated to a platform that differs in these > MIDR/REVIDR values even if the VMM can come up with a common minimum > feature list for the Guest using the recently introduced "Writable > ID registers" support. > > (This is roughly based on a discussion I had with Marc and Oliver > at KVM forum. Marc outlined his idea for a solution and this is an > attempt to implement it. Thanks to both and I take all the blame > if this is nowhere near what is intended/required) > > This RFC proposes a solution to handle the above issue by introducing > the following, > > 1. A new VM IOCTL, > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct kvm_arm_migrn_cpus) > This can be used by the userspace(VMM) to set the target CPUs the > Guest will run in its lifetime. See patch #2 > 2. Add hypercall support for Guest kernel to retrieve any migration > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) > The above will return the bitmaps in R0-R3 registers. See patch #4 > 3. The "capability" field in struct arm64_cpu_capabilities is a generated > one at present and may get renumbered or reordered. Hence, we can't use > this directly for migration errata bitmaps. Instead, introduced > "migartion_safe_cap", which has to be set statically for any > erratum that needs to be enabled and is safe for migration > purposes. See patches 3 & 6. > 4. Rest of the patches includes the plumbing required to populate the > errata bitmap based on the target CPUs set by the VMM and update the > system_cap based on it. > > ToDos:- > -We still need a way to handle the error in setting the invariant > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can > handle it in userspace? > - Possibly we could do better to avoid the additional "migartion_safe_cap" use. > Suggestions welcome. > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0. > How to handle those? > -Check for locking requirements if any. > > This is lightly tested on a HiSilicon ARM64 platform. > > Please take a look and let me know your thoughts. Having eyeballed this very superficially, I think we can do something simpler, and maybe more future-proof: - I don't think KVM should be concerned about the description of the target CPUs. The hypercall you defined is the right thing to do, but the VMM should completely handle it. That's an implementation detail, but it would make things much simpler. - I don't think the "errata bitmap" works. That's a construct that is specific to Linux, and that cannot be supported for other OSs. It also limits the described issues to those the host knows, instead of the guest. The host doesn't have a clue what the guest really wants. Really, the guest should have enough information to decide what to do based on its own view of the ID registers and the list of CPUs it runs on. - To answer your question about CTR_EL0: KVM should (and does) sanitise that register by trapping it. This should be the default behaviour for things that need to be mitigated outside of MIDR/REVIDR. Thanks, M.
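For orientation, here is roughly how userspace might drive the ioctl proposed above. The payload layout lives in patch #2, which is not quoted here, so struct kvm_arm_migrn_cpus below is only a guess at its shape (an array of MIDR/REVIDR pairs), not the actual UAPI from the series.

    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    struct kvm_arm_migrn_cpu {              /* guessed layout, see patch #2 */
            uint64_t midr;
            uint64_t revidr;
    };

    struct kvm_arm_migrn_cpus {             /* guessed layout, see patch #2 */
            uint32_t ncpus;
            uint32_t pad;
            struct kvm_arm_migrn_cpu cpus[8];
    };

    #define KVM_ARM_SET_MIGRN_TARGET_CPUS \
            _IOW(KVMIO, 0xb7, struct kvm_arm_migrn_cpus)

    /* Tell KVM every implementation this guest may run on during its lifetime. */
    static int set_migration_target_cpus(int vm_fd, struct kvm_arm_migrn_cpus *cpus)
    {
            return ioctl(vm_fd, KVM_ARM_SET_MIGRN_TARGET_CPUS, cpus);
    }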
> -----Original Message----- > From: Marc Zyngier <maz@kernel.org> > Sent: Friday, October 11, 2024 11:37 AM > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> > Cc: kvmarm@lists.linux.dev; oliver.upton@linux.dev; > catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; > cohuck@redhat.com; eric.auger@redhat.com; yuzenghui > <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron > <jonathan.cameron@huawei.com>; Anthony Jebson > <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; > Linuxarm <linuxarm@huawei.com> > Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live > migration > > Hi Shameer, > > Thanks for getting the ball rolling on this one, much appreciated. > > On Fri, 11 Oct 2024 08:50:47 +0100, > Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote: > > > > Hi, > > > > On ARM64 platforms most of the errata workarounds are based on CPU > > MIDR/REVIDR values and a number of these workarounds need to be > > implemented by the Guest kernel as well. This creates a problem when > > Guest needs to be migrated to a platform that differs in these > > MIDR/REVIDR values even if the VMM can come up with a common > minimum > > feature list for the Guest using the recently introduced "Writable > > ID registers" support. > > > > (This is roughly based on a discussion I had with Marc and Oliver > > at KVM forum. Marc outlined his idea for a solution and this is an > > attempt to implement it. Thanks to both and I take all the blame > > if this is nowhere near what is intended/required) > > > > This RFC proposes a solution to handle the above issue by introducing > > the following, > > > > 1. A new VM IOCTL, > > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct > kvm_arm_migrn_cpus) > > This can be used by the userspace(VMM) to set the target CPUs the > > Guest will run in its lifetime. See patch #2 > > 2. Add hypercall support for Guest kernel to retrieve any migration > > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) > > The above will return the bitmaps in R0-R3 registers. See patch #4 > > 3. The "capability" field in struct arm64_cpu_capabilities is a generated > > one at present and may get renumbered or reordered. Hence, we can't > use > > this directly for migration errata bitmaps. Instead, introduced > > "migartion_safe_cap", which has to be set statically for any > > erratum that needs to be enabled and is safe for migration > > purposes. See patches 3 & 6. > > 4. Rest of the patches includes the plumbing required to populate the > > errata bitmap based on the target CPUs set by the VMM and update the > > system_cap based on it. > > > > ToDos:- > > -We still need a way to handle the error in setting the invariant > > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can > > handle it in userspace? > > - Possibly we could do better to avoid the additional > "migartion_safe_cap" use. > > Suggestions welcome. > > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0. > > How to handle those? > > -Check for locking requirements if any. > > > > This is lightly tested on a HiSilicon ARM64 platform. > > > > Please take a look and let me know your thoughts. > > Having eyeballed this very superficially, I think we can do something > simpler, and maybe more future-proof: Thanks Marc for taking a look and the quick feedback. > > - I don't think KVM should be concerned about the description of the > target CPUs. 
The hypercall you defined is the right thing to do, > but the VMM should completely handle it. That's an implementation > detail, but it would make things much simpler. Ok. So does that mean the hypercall will use some sort of shared memory to retrieve the list of target CPUs from VMM? > - I don't think the "errata bitmap" works. That's a construct that is > specific to Linux, and that cannot be supported for other OSs. It > also limits the described issues to those the host knows, instead of > the guest. The host doesn't have a clue what the guest really wants. > Really, the guest should have enough information to decide what to > do based on its own view of the ID registers and the list of CPUs it > runs on. Yes. "errata bitmap" is specific to Linux. So if we go with the above hypercall-->VMM path and get the target CPU list, Guest can directly use that. > > - To answer your question about CTR_EL0: KVM should (and does) > sanitise that register by trapping it. This should be the default > behaviour for things that need to be mitigated outside of > MIDR/REVIDR. Ok. Makes sense and simplifies things. Please let me know whether my understanding on hypercall<->VMM is correct or not. I can take a look at that route. Thanks, Shameer
On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: > > > > > > Please take a look and let me know your thoughts. > > > > Having eyeballed this very superficially, I think we can do something > > simpler, and maybe more future-proof: > > Thanks Marc for taking a look and the quick feedback. No worries, that's the least I could do given that you put the effort implementing my silly ideas! ;-) > > - I don't think KVM should be concerned about the description of the > > target CPUs. The hypercall you defined is the right thing to do, > > but the VMM should completely handle it. That's an implementation > > detail, but it would make things much simpler. > > Ok. So does that mean the hypercall will use some sort of shared memory > to retrieve the list of target CPUs from VMM? Two possibilities: - either shared memory, in which case the hypercall would require the guest to give an IPA and size for the VMM to write its stuff into the guest memory, - or more simply return the data as an MIDR/REVIDR pair in registers, the guest requesting an index, and getting an error when out of range, leaving it with the freedom to organise the storage. The second option is a bit slower, but way simpler, and it only happens once per guest boot, so it would probably be my preferred option unless this is proved to be impractical. > > > - I don't think the "errata bitmap" works. That's a construct that is > > specific to Linux, and that cannot be supported for other OSs. It > > also limits the described issues to those the host knows, instead of > > the guest. The host doesn't have a clue what the guest really wants. > > Really, the guest should have enough information to decide what to > > do based on its own view of the ID registers and the list of CPUs it > > runs on. > > Yes. "errata bitmap" is specific to Linux. So if we go with the above > hypercall-->VMM path and get the target CPU list, Guest can directly > use that. Indeed, and this is something that is much easier to standardise upon, as I really want this to be a universal mechanism. > > - To answer your question about CTR_EL0: KVM should (and does) > > sanitise that register by trapping it. This should be the default > > behaviour for things that need to be mitigated outside of > > MIDR/REVIDR. > > Ok. Make sense and simplifies things. > > Please let me know whether my understanding on hypercall<->VMM is > correct or not. I can take a look at that route. Yes, I think we are aligned on it. Thanks again, M.
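To make the second option concrete, the guest-side discovery could be a simple indexed loop along the lines below. The function ID, the "error means end of list" convention and record_target_cpu() are placeholders for illustration only; no ABI has been defined yet.

    #include <linux/arm-smccc.h>
    #include <linux/types.h>

    #define HYP_KVM_DISCOVER_TARGET_CPUS    0xc6000042  /* placeholder ID, not allocated */

    static void discover_target_cpus(void)
    {
            struct arm_smccc_res res;
            u64 idx;

            for (idx = 0; ; idx++) {
                    arm_smccc_1_1_invoke(HYP_KVM_DISCOVER_TARGET_CPUS, idx, &res);
                    if (res.a0 != SMCCC_RET_SUCCESS)
                            break;  /* index out of range: enumeration done */
                    /* a1/a2 assumed to carry the MIDR/REVIDR pair for entry idx */
                    record_target_cpu(idx, res.a1, res.a2); /* placeholder helper */
            }
    }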
On Fri, Oct 11 2024, Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> wrote: > Hi, > > On ARM64 platforms most of the errata workarounds are based on CPU > MIDR/REVIDR values and a number of these workarounds need to be > implemented by the Guest kernel as well. This creates a problem when > Guest needs to be migrated to a platform that differs in these > MIDR/REVIDR values even if the VMM can come up with a common minimum > feature list for the Guest using the recently introduced "Writable > ID registers" support. > > (This is roughly based on a discussion I had with Marc and Oliver > at KVM forum. Marc outlined his idea for a solution and this is an > attempt to implement it. Thanks to both and I take all the blame > if this is nowhere near what is intended/required) > > This RFC proposes a solution to handle the above issue by introducing > the following, > > 1. A new VM IOCTL, > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct kvm_arm_migrn_cpus) > This can be used by the userspace(VMM) to set the target CPUs the > Guest will run in its lifetime. See patch #2 > 2. Add hypercall support for Guest kernel to retrieve any migration > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) > The above will return the bitmaps in R0-R3 registers. See patch #4 > 3. The "capability" field in struct arm64_cpu_capabilities is a generated > one at present and may get renumbered or reordered. Hence, we can't use > this directly for migration errata bitmaps. Instead, introduced > "migartion_safe_cap", which has to be set statically for any > erratum that needs to be enabled and is safe for migration > purposes. See patches 3 & 6. > 4. Rest of the patches includes the plumbing required to populate the > errata bitmap based on the target CPUs set by the VMM and update the > system_cap based on it. > > ToDos:- > -We still need a way to handle the error in setting the invariant > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can > handle it in userspace? > - Possibly we could do better to avoid the additional "migartion_safe_cap" use. > Suggestions welcome. > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0. > How to handle those? > -Check for locking requirements if any. > > This is lightly tested on a HiSilicon ARM64 platform. > > Please take a look and let me know your thoughts. So, I've only taken a very quick look at it, but IIUC, the idea is for the VMM to do the following: - figure out where we want to possibly run - figure out the least common denominator for writable features - tell KVM about the possible target cpu resp. the errata it wants - build a frankencpu via the writable id reg infrastructure, fiddling with invariant handling as needed? (Sorry, it's late in the week :)
> -----Original Message----- > From: Cornelia Huck <cohuck@redhat.com> > Sent: Friday, October 11, 2024 2:18 PM > To: Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; > maz@kernel.org; oliver.upton@linux.dev > Cc: catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; > eric.auger@redhat.com; yuzenghui <yuzenghui@huawei.com>; Wangzhou > (B) <wangzhou1@hisilicon.com>; jiangkunkun > <jiangkunkun@huawei.com>; Jonathan Cameron > <jonathan.cameron@huawei.com>; Anthony Jebson > <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; > Linuxarm <linuxarm@huawei.com> > Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live > migration > > On Fri, Oct 11 2024, Shameer Kolothum > <shameerali.kolothum.thodi@huawei.com> wrote: > > > Hi, > > > > On ARM64 platforms most of the errata workarounds are based on CPU > > MIDR/REVIDR values and a number of these workarounds need to be > > implemented by the Guest kernel as well. This creates a problem when > > Guest needs to be migrated to a platform that differs in these > > MIDR/REVIDR values even if the VMM can come up with a common > minimum > > feature list for the Guest using the recently introduced "Writable > > ID registers" support. > > > > (This is roughly based on a discussion I had with Marc and Oliver > > at KVM forum. Marc outlined his idea for a solution and this is an > > attempt to implement it. Thanks to both and I take all the blame > > if this is nowhere near what is intended/required) > > > > This RFC proposes a solution to handle the above issue by introducing > > the following, > > > > 1. A new VM IOCTL, > > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct > kvm_arm_migrn_cpus) > > This can be used by the userspace(VMM) to set the target CPUs the > > Guest will run in its lifetime. See patch #2 > > 2. Add hypercall support for Guest kernel to retrieve any migration > > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) > > The above will return the bitmaps in R0-R3 registers. See patch #4 > > 3. The "capability" field in struct arm64_cpu_capabilities is a generated > > one at present and may get renumbered or reordered. Hence, we can't > use > > this directly for migration errata bitmaps. Instead, introduced > > "migartion_safe_cap", which has to be set statically for any > > erratum that needs to be enabled and is safe for migration > > purposes. See patches 3 & 6. > > 4. Rest of the patches includes the plumbing required to populate the > > errata bitmap based on the target CPUs set by the VMM and update the > > system_cap based on it. > > > > ToDos:- > > -We still need a way to handle the error in setting the invariant > > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can > > handle it in userspace? > > - Possibly we could do better to avoid the additional > "migartion_safe_cap" use. > > Suggestions welcome. > > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0. > > How to handle those? > > -Check for locking requirements if any. > > > > This is lightly tested on a HiSilicon ARM64 platform. > > > > Please take a look and let me know your thoughts. > > So, I've only taken a very quick look at it, but IIUC, the idea is for > the VMM to do the following: > - figure out where we want to possibly run Yes. > - figure out the least common denominator for writable features Yes > - tell KVM about the possible target cpu resp. the errata it wants This might change as per Marc's comments. 
VMM has to handle hypercall directly and provide the target CPU list to Guest. > - build a frankencpu via the writable id reg infrastructure, fiddling > with invariant handling as needed? That is my idea of handling the invariant registers. If the VMM has provided a list of target CPUs for the Guest, make sure the incoming CPU is part of the list during migration and ignore invariant(MIDR/REVIDR) reg SET errors. Thanks, Shameer
On Fri, Oct 11 2024, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: >> -----Original Message----- >> From: Cornelia Huck <cohuck@redhat.com> >> Sent: Friday, October 11, 2024 2:18 PM >> To: Shameerali Kolothum Thodi >> <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; >> maz@kernel.org; oliver.upton@linux.dev >> Cc: catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; >> eric.auger@redhat.com; yuzenghui <yuzenghui@huawei.com>; Wangzhou >> (B) <wangzhou1@hisilicon.com>; jiangkunkun >> <jiangkunkun@huawei.com>; Jonathan Cameron >> <jonathan.cameron@huawei.com>; Anthony Jebson >> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; >> Linuxarm <linuxarm@huawei.com> >> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live >> migration >> >> On Fri, Oct 11 2024, Shameer Kolothum >> <shameerali.kolothum.thodi@huawei.com> wrote: >> >> > Hi, >> > >> > On ARM64 platforms most of the errata workarounds are based on CPU >> > MIDR/REVIDR values and a number of these workarounds need to be >> > implemented by the Guest kernel as well. This creates a problem when >> > Guest needs to be migrated to a platform that differs in these >> > MIDR/REVIDR values even if the VMM can come up with a common >> minimum >> > feature list for the Guest using the recently introduced "Writable >> > ID registers" support. >> > >> > (This is roughly based on a discussion I had with Marc and Oliver >> > at KVM forum. Marc outlined his idea for a solution and this is an >> > attempt to implement it. Thanks to both and I take all the blame >> > if this is nowhere near what is intended/required) >> > >> > This RFC proposes a solution to handle the above issue by introducing >> > the following, >> > >> > 1. A new VM IOCTL, >> > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct >> kvm_arm_migrn_cpus) >> > This can be used by the userspace(VMM) to set the target CPUs the >> > Guest will run in its lifetime. See patch #2 >> > 2. Add hypercall support for Guest kernel to retrieve any migration >> > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) >> > The above will return the bitmaps in R0-R3 registers. See patch #4 >> > 3. The "capability" field in struct arm64_cpu_capabilities is a generated >> > one at present and may get renumbered or reordered. Hence, we can't >> use >> > this directly for migration errata bitmaps. Instead, introduced >> > "migartion_safe_cap", which has to be set statically for any >> > erratum that needs to be enabled and is safe for migration >> > purposes. See patches 3 & 6. >> > 4. Rest of the patches includes the plumbing required to populate the >> > errata bitmap based on the target CPUs set by the VMM and update the >> > system_cap based on it. >> > >> > ToDos:- >> > -We still need a way to handle the error in setting the invariant >> > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can >> > handle it in userspace? >> > - Possibly we could do better to avoid the additional >> "migartion_safe_cap" use. >> > Suggestions welcome. >> > -There are errata that require more than MIDR/REVIDR, eg: CTR_EL0. >> > How to handle those? >> > -Check for locking requirements if any. >> > >> > This is lightly tested on a HiSilicon ARM64 platform. >> > >> > Please take a look and let me know your thoughts. >> >> So, I've only taken a very quick look at it, but IIUC, the idea is for >> the VMM to do the following: >> - figure out where we want to possibly run > > > Yes. 
> >> - figure out the least common denominator for writable features > > Yes > >> - tell KVM about the possible target cpu resp. the errata it wants > > This might change as per Marc's comments. VMM has to handle hypercall > directly and provide the target CPU list to Guest. > >> - build a frankencpu via the writable id reg infrastructure, fiddling >> with invariant handling as needed? > > That is my idea of handling the invariant registers. If the VMM has provided a list of > target CPUs for the Guest, make sure the incoming CPU is part of the list > during migration and ignore invariant(MIDR/REVIDR) reg SET errors. Thanks! I hope I'll be able to spend some time on this next week.
On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: > On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: > > > > > > > > Please take a look and let me know your thoughts. > > > > > > Having eyeballed this very superficially, I think we can do something > > > simpler, and maybe more future-proof: > > > > Thanks Marc for taking a look and the quick feedback. > > No worries, that's the least I could do given that you put the effort > implementing my silly ideas! ;-) > > > > - I don't think KVM should be concerned about the description of the > > > target CPUs. The hypercall you defined is the right thing to do, > > > but the VMM should completely handle it. That's an implementation > > > detail, but it would make things much simpler. > > > > Ok. So does that mean the hypercall will use some sort of shared memory > > to retrieve the list of target CPUs from VMM? > > Two possibilities: > > - either shared memory, in which case the hypercall would require the > guest to give an IPA and size for the VMM to write its stuff into > the guest memory, > > - or more simply return the data as an MIDR/REVIDR pair in registers, > the guest requesting an index, and getting an error when out of > range, leaving it with the freedom to organise the storage. > > The second option is a bit slower, but way simpler, and it only > happens once per guest boot, so it would probably be my preferred > option unless this is proved to be impractical. Also worth noting there's existing UAPI [*] for allowing userspace to register range(s) of hypercalls that it services directly. It's a bit weird that we'd allow userspace to do stuff in KVM's own hypercall range, but I don't think it really matters at this point since this is all prototyping. [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm-arm-vm-smccc-filter-w-o
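As a sketch of that existing UAPI, a VMM could claim an SMCCC function range for itself roughly as follows; the base/count would be whatever range ends up being used for the discovery call, and error handling is elided.

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Forward an SMCCC function range to userspace; filtered calls then reach
     * the VMM as KVM_EXIT_HYPERCALL instead of being handled in the kernel. */
    static int forward_hvc_range_to_vmm(int vm_fd, __u32 base, __u32 nr_functions)
    {
            struct kvm_smccc_filter filter = {
                    .base           = base,
                    .nr_functions   = nr_functions,
                    .action         = KVM_SMCCC_FILTER_FWD_TO_USER,
            };
            struct kvm_device_attr attr = {
                    .group  = KVM_ARM_VM_SMCCC_CTRL,
                    .attr   = KVM_ARM_VM_SMCCC_FILTER,
                    .addr   = (__u64)(unsigned long)&filter,
            };

            return ioctl(vm_fd, KVM_SET_DEVICE_ATTR, &attr);
    }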
> -----Original Message----- > From: Oliver Upton <oliver.upton@linux.dev> > Sent: Friday, October 11, 2024 4:11 PM > To: Marc Zyngier <maz@kernel.org> > Cc: Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; > catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; > cohuck@redhat.com; eric.auger@redhat.com; yuzenghui > <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; > jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron > <jonathan.cameron@huawei.com>; Anthony Jebson > <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; > Linuxarm <linuxarm@huawei.com> > Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live > migration > > On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: > > On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com> wrote: > > > > > > > > > > Please take a look and let me know your thoughts. > > > > > > > > Having eyeballed this very superficially, I think we can do something > > > > simpler, and maybe more future-proof: > > > > > > Thanks Marc for taking a look and the quick feedback. > > > > No worries, that's the least I could do given that you put the effort > > implementing my silly ideas! ;-) > > > > > > - I don't think KVM should be concerned about the description of the > > > > target CPUs. The hypercall you defined is the right thing to do, > > > > but the VMM should completely handle it. That's an implementation > > > > detail, but it would make things much simpler. > > > > > > Ok. So does that mean the hypercall will use some sort of shared > memory > > > to retrieve the list of target CPUs from VMM? > > > > Two possibilities: > > > > - either shared memory, in which case the hypercall would require the > > guest to give an IPA and size for the VMM to write its stuff into > > the guest memory, > > > > - or more simply return the data as an MIDR/REVIDR pair in registers, > > the guest requesting an index, and getting an error when out of > > range, leaving it with the freedom to organise the storage. > > > > The second option is a bit slower, but way simpler, and it only > > happens once per guest boot, so it would probably be my preferred > > option unless this is proved to be impractical. > > Also worth noting there's existing UAPI [*] for allowing userspace to > register range(s) of hypercalls that it services directly. It's a bit > weird that we'd allow userspace to do stuff in KVM's own hypercall > range, but I don't think it really matters at this point since this is > all prototyping. > > [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm-arm- > vm-smccc-filter-w-o Thanks. Yes and there are attempts to add that handling in Qemu[*] in the context of vCPU hotplug support(PSCI related ones though). Will take a look. Thanks, Shameer [1] https://lore.kernel.org/qemu-devel/20241009033704.250287-1-salil.mehta@huawei.com/
Hey Shameer, Thank you for posting this series. This'll help make cross-host migrations on arm64 a little less sketch :) On Fri, Oct 11, 2024 at 08:50:47AM +0100, Shameer Kolothum wrote: > Hi, > > On ARM64 platforms most of the errata workarounds are based on CPU > MIDR/REVIDR values and a number of these workarounds need to be > implemented by the Guest kernel as well. This creates a problem when > Guest needs to be migrated to a platform that differs in these > MIDR/REVIDR values even if the VMM can come up with a common minimum > feature list for the Guest using the recently introduced "Writable > ID registers" support. > > (This is roughly based on a discussion I had with Marc and Oliver > at KVM forum. Marc outlined his idea for a solution and this is an > attempt to implement it. Thanks to both and I take all the blame > if this is nowhere near what is intended/required) > > This RFC proposes a solution to handle the above issue by introducing > the following, > > 1. A new VM IOCTL, > KVM_ARM_SET_MIGRN_TARGET_CPUS _IOW(KVMIO, 0xb7, struct kvm_arm_migrn_cpus) > This can be used by the userspace(VMM) to set the target CPUs the > Guest will run in its lifetime. See patch #2 > 2. Add hypercall support for Guest kernel to retrieve any migration > errata bitmap(ARM_SMCCC_VENDOR_HYP_KVM_MIGRN_ERRATA) > The above will return the bitmaps in R0-R3 registers. See patch #4 > 3. The "capability" field in struct arm64_cpu_capabilities is a generated > one at present and may get renumbered or reordered. Hence, we can't use > this directly for migration errata bitmaps. Instead, introduced > "migartion_safe_cap", which has to be set statically for any > erratum that needs to be enabled and is safe for migration > purposes. See patches 3 & 6. > 4. Rest of the patches includes the plumbing required to populate the > errata bitmap based on the target CPUs set by the VMM and update the > system_cap based on it. > > ToDos:- > -We still need a way to handle the error in setting the invariant > registers(MIDR/REVIDR/AIDR) during Guest migration. Perhaps we can > handle it in userspace? This seems entirely reasonable. There's nothing KVM can do to unilaterally prevent userspace from migrating VMs between different implementations already, and the invariance of MIDR/REVIDR is just a speed bump and not an outright blocker. Ultimately this is now a contract between the VMM and guest, so we may as well let userspace advertise whatever implementation it wants in the ID registers. > Please take a look and let me know your thoughts. Looking at the guest side of things, it'd be nice if we hide the entire MIDR abstraction behind is_midr_in_range() et al and stop letting callers provide the MIDR. Internally that can find the MIDR of the current CPU or walk the array of MIDRs supplied by the hypervisor. Beyond that, we probably need a way to clue in userspace on the game in case it has some behaviors keyed off MIDR too.
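A rough illustration of that guest-side direction: drop the MIDR argument and let the helper consult either the local MIDR or the hypervisor-supplied list. target_impl_cpus[]/target_impl_cpu_num stand in for whatever the discovery hypercall ends up populating; they are not existing kernel symbols.

    #include <linux/types.h>
    #include <asm/cputype.h>

    #define MAX_TARGET_CPUS         8       /* illustrative bound */

    struct target_impl_cpu {                /* illustrative only */
            u64 midr;
            u64 revidr;
    };

    static struct target_impl_cpu target_impl_cpus[MAX_TARGET_CPUS];
    static unsigned int target_impl_cpu_num;

    /* Proposed shape: callers no longer pass a MIDR of their own. */
    static bool is_midr_in_range(struct midr_range const *range)
    {
            unsigned int i;

            /* No hypervisor-provided list: behave as today, on the local MIDR. */
            if (!target_impl_cpu_num)
                    return midr_is_cpu_model_range(read_cpuid_id(), range->model,
                                                   range->rv_min, range->rv_max);

            /* Otherwise the erratum applies if any possible host CPU matches. */
            for (i = 0; i < target_impl_cpu_num; i++) {
                    if (midr_is_cpu_model_range(target_impl_cpus[i].midr,
                                                range->model, range->rv_min,
                                                range->rv_max))
                            return true;
            }
            return false;
    }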
On Fri, Oct 11 2024, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: >> -----Original Message----- >> From: Oliver Upton <oliver.upton@linux.dev> >> Sent: Friday, October 11, 2024 4:11 PM >> To: Marc Zyngier <maz@kernel.org> >> Cc: Shameerali Kolothum Thodi >> <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; >> catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; >> cohuck@redhat.com; eric.auger@redhat.com; yuzenghui >> <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; >> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron >> <jonathan.cameron@huawei.com>; Anthony Jebson >> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; >> Linuxarm <linuxarm@huawei.com> >> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live >> migration >> >> On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: >> > On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi >> <shameerali.kolothum.thodi@huawei.com> wrote: >> > > > > >> > > > > Please take a look and let me know your thoughts. >> > > > >> > > > Having eyeballed this very superficially, I think we can do something >> > > > simpler, and maybe more future-proof: >> > > >> > > Thanks Marc for taking a look and the quick feedback. >> > >> > No worries, that's the least I could do given that you put the effort >> > implementing my silly ideas! ;-) >> > >> > > > - I don't think KVM should be concerned about the description of the >> > > > target CPUs. The hypercall you defined is the right thing to do, >> > > > but the VMM should completely handle it. That's an implementation >> > > > detail, but it would make things much simpler. >> > > >> > > Ok. So does that mean the hypercall will use some sort of shared >> memory >> > > to retrieve the list of target CPUs from VMM? >> > >> > Two possibilities: >> > >> > - either shared memory, in which case the hypercall would require the >> > guest to give an IPA and size for the VMM to write its stuff into >> > the guest memory, >> > >> > - or more simply return the data as an MIDR/REVIDR pair in registers, >> > the guest requesting an index, and getting an error when out of >> > range, leaving it with the freedom to organise the storage. >> > >> > The second option is a bit slower, but way simpler, and it only >> > happens once per guest boot, so it would probably be my preferred >> > option unless this is proved to be impractical. >> >> Also worth noting there's existing UAPI [*] for allowing userspace to >> register range(s) of hypercalls that it services directly. It's a bit >> weird that we'd allow userspace to do stuff in KVM's own hypercall >> range, but I don't think it really matters at this point since this is >> all prototyping. >> >> [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm-arm- >> vm-smccc-filter-w-o > > Thanks. Yes and there are attempts to add that handling in Qemu[*] in the context > of vCPU hotplug support(PSCI related ones though). Will take a look. > > Thanks, > Shameer > [1] https://lore.kernel.org/qemu-devel/20241009033704.250287-1-salil.mehta@huawei.com/ Speaking of QEMU: Do you maybe already have some prototype code that tries to do something with the setup here? (I don't think QEMU currently mucks around with MIDR and friends when running with KVM; I wonder what it should provide to the guest and if it should care to set something as a base level that gives guests not using the hypercall a chance to work properly.)
Hi Shameer, On 10/17/24 17:49, Cornelia Huck wrote: > On Fri, Oct 11 2024, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: > >>> -----Original Message----- >>> From: Oliver Upton <oliver.upton@linux.dev> >>> Sent: Friday, October 11, 2024 4:11 PM >>> To: Marc Zyngier <maz@kernel.org> >>> Cc: Shameerali Kolothum Thodi >>> <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; >>> catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; >>> cohuck@redhat.com; eric.auger@redhat.com; yuzenghui >>> <yuzenghui@huawei.com>; Wangzhou (B) <wangzhou1@hisilicon.com>; >>> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron >>> <jonathan.cameron@huawei.com>; Anthony Jebson >>> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; >>> Linuxarm <linuxarm@huawei.com> >>> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live >>> migration >>> >>> On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: >>>> On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi >>> <shameerali.kolothum.thodi@huawei.com> wrote: >>>>>>> Please take a look and let me know your thoughts. >>>>>> Having eyeballed this very superficially, I think we can do something >>>>>> simpler, and maybe more future-proof: >>>>> Thanks Marc for taking a look and the quick feedback. >>>> No worries, that's the least I could do given that you put the effort >>>> implementing my silly ideas! ;-) >>>> >>>>>> - I don't think KVM should be concerned about the description of the >>>>>> target CPUs. The hypercall you defined is the right thing to do, >>>>>> but the VMM should completely handle it. That's an implementation >>>>>> detail, but it would make things much simpler. >>>>> Ok. So does that mean the hypercall will use some sort of shared >>> memory >>>>> to retrieve the list of target CPUs from VMM? >>>> Two possibilities: >>>> >>>> - either shared memory, in which case the hypercall would require the >>>> guest to give an IPA and size for the VMM to write its stuff into >>>> the guest memory, >>>> >>>> - or more simply return the data as an MIDR/REVIDR pair in registers, >>>> the guest requesting an index, and getting an error when out of >>>> range, leaving it with the freedom to organise the storage. >>>> >>>> The second option is a bit slower, but way simpler, and it only >>>> happens once per guest boot, so it would probably be my preferred >>>> option unless this is proved to be impractical. >>> Also worth noting there's existing UAPI [*] for allowing userspace to >>> register range(s) of hypercalls that it services directly. It's a bit >>> weird that we'd allow userspace to do stuff in KVM's own hypercall >>> range, but I don't think it really matters at this point since this is >>> all prototyping. >>> >>> [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm-arm- >>> vm-smccc-filter-w-o >> Thanks. Yes and there are attempts to add that handling in Qemu[*] in the context >> of vCPU hotplug support(PSCI related ones though). Will take a look. >> >> Thanks, >> Shameer >> [1] https://lore.kernel.org/qemu-devel/20241009033704.250287-1-salil.mehta@huawei.com/ > Speaking of QEMU: Do you maybe already have some prototype code that > tries to do something with the setup here? (I don't think QEMU currently > mucks around with MIDR and friends when running with KVM; I wonder what > it should provide to the guest and if it should care to set something as > a base level that gives guests not using the hypercall a chance to work > properly.) 
As discussed during the KVM forum we are working on a qemu integration for writable ID regs. The first goal is to be able to specialize the host passthrough model (custom host model). Maybe this will trigger more discussions on named models too. This is complementary to the MIDR/REVIDR problematic and I hope we will be able to consolidate our works at some point. Thanks Eric
> -----Original Message----- > From: Eric Auger <eric.auger@redhat.com> > Sent: Thursday, October 17, 2024 6:16 PM > To: Cornelia Huck <cohuck@redhat.com>; Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com>; Oliver Upton > <oliver.upton@linux.dev>; Marc Zyngier <maz@kernel.org> > Cc: kvmarm@lists.linux.dev; catalin.marinas@arm.com; will@kernel.org; > mark.rutland@arm.com; yuzenghui <yuzenghui@huawei.com>; Wangzhou > (B) <wangzhou1@hisilicon.com>; jiangkunkun > <jiangkunkun@huawei.com>; Jonathan Cameron > <jonathan.cameron@huawei.com>; Anthony Jebson > <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; > Linuxarm <linuxarm@huawei.com> > Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live > migration > > Hi Shameer, > > On 10/17/24 17:49, Cornelia Huck wrote: > > On Fri, Oct 11 2024, Shameerali Kolothum Thodi > <shameerali.kolothum.thodi@huawei.com> wrote: > > > >>> -----Original Message----- > >>> From: Oliver Upton <oliver.upton@linux.dev> > >>> Sent: Friday, October 11, 2024 4:11 PM > >>> To: Marc Zyngier <maz@kernel.org> > >>> Cc: Shameerali Kolothum Thodi > >>> <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; > >>> catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; > >>> cohuck@redhat.com; eric.auger@redhat.com; yuzenghui > >>> <yuzenghui@huawei.com>; Wangzhou (B) > <wangzhou1@hisilicon.com>; > >>> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron > >>> <jonathan.cameron@huawei.com>; Anthony Jebson > >>> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; > >>> Linuxarm <linuxarm@huawei.com> > >>> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM > Live > >>> migration > >>> > >>> On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: > >>>> On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi > >>> <shameerali.kolothum.thodi@huawei.com> wrote: > >>>>>>> Please take a look and let me know your thoughts. > >>>>>> Having eyeballed this very superficially, I think we can do > something > >>>>>> simpler, and maybe more future-proof: > >>>>> Thanks Marc for taking a look and the quick feedback. > >>>> No worries, that's the least I could do given that you put the effort > >>>> implementing my silly ideas! ;-) > >>>> > >>>>>> - I don't think KVM should be concerned about the description of > the > >>>>>> target CPUs. The hypercall you defined is the right thing to do, > >>>>>> but the VMM should completely handle it. That's an > implementation > >>>>>> detail, but it would make things much simpler. > >>>>> Ok. So does that mean the hypercall will use some sort of shared > >>> memory > >>>>> to retrieve the list of target CPUs from VMM? > >>>> Two possibilities: > >>>> > >>>> - either shared memory, in which case the hypercall would require the > >>>> guest to give an IPA and size for the VMM to write its stuff into > >>>> the guest memory, > >>>> > >>>> - or more simply return the data as an MIDR/REVIDR pair in registers, > >>>> the guest requesting an index, and getting an error when out of > >>>> range, leaving it with the freedom to organise the storage. > >>>> > >>>> The second option is a bit slower, but way simpler, and it only > >>>> happens once per guest boot, so it would probably be my preferred > >>>> option unless this is proved to be impractical. > >>> Also worth noting there's existing UAPI [*] for allowing userspace to > >>> register range(s) of hypercalls that it services directly. 
It's a bit > >>> weird that we'd allow userspace to do stuff in KVM's own hypercall > >>> range, but I don't think it really matters at this point since this is > >>> all prototyping. > >>> > >>> [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm- > arm- > >>> vm-smccc-filter-w-o > >> Thanks. Yes and there are attempts to add that handling in Qemu[*] in > the context > >> of vCPU hotplug support(PSCI related ones though). Will take a look. > >> > >> Thanks, > >> Shameer > >> [1] https://lore.kernel.org/qemu-devel/20241009033704.250287-1- > salil.mehta@huawei.com/ > > Speaking of QEMU: Do you maybe already have some prototype code that > > tries to do something with the setup here? (I don't think QEMU currently > > mucks around with MIDR and friends when running with KVM; I wonder > what > > it should provide to the guest and if it should care to set something as > > a base level that gives guests not using the hypercall a chance to work > > properly.) I had a hacked Qemu to test this RFC. Nothing fancy but just to verify the RFC. It basically adds a named CPU to virt as below, 1. Checks the host MIDR and REVIDR and if it is one of the supported platforms, use the writable register interface to update the ID registers with a common minimum feature list. 2. Use the IOCTL in this RFC to update the target CPU list info. 3. Ignore the invariant register SET errors. > > > As discussed during the KVM forum we are working on a qemu integration > for writable ID regs. The first goal is to be able to specialize the > host passthrough model (custom host model). Maybe this will trigger more > discussions on named models too. This is complementary to the > MIDR/REVIDR problematic and I hope we will be able to consolidate our > works at some point. Cool!. Happy to know that it is making progress. Once I rework this RFC based on the suggestions, I will rework the Qemu to have a hacked prototype to test it as well and probably can share that. Please keep me in loop if you have plans to share your Qemu work soon. That will be very helpful. Thanks, Shameer
On Fri, Oct 18 2024, Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com> wrote: >> -----Original Message----- >> From: Eric Auger <eric.auger@redhat.com> >> Sent: Thursday, October 17, 2024 6:16 PM >> To: Cornelia Huck <cohuck@redhat.com>; Shameerali Kolothum Thodi >> <shameerali.kolothum.thodi@huawei.com>; Oliver Upton >> <oliver.upton@linux.dev>; Marc Zyngier <maz@kernel.org> >> Cc: kvmarm@lists.linux.dev; catalin.marinas@arm.com; will@kernel.org; >> mark.rutland@arm.com; yuzenghui <yuzenghui@huawei.com>; Wangzhou >> (B) <wangzhou1@hisilicon.com>; jiangkunkun >> <jiangkunkun@huawei.com>; Jonathan Cameron >> <jonathan.cameron@huawei.com>; Anthony Jebson >> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; >> Linuxarm <linuxarm@huawei.com> >> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM Live >> migration >> >> Hi Shameer, >> >> On 10/17/24 17:49, Cornelia Huck wrote: >> > On Fri, Oct 11 2024, Shameerali Kolothum Thodi >> <shameerali.kolothum.thodi@huawei.com> wrote: >> > >> >>> -----Original Message----- >> >>> From: Oliver Upton <oliver.upton@linux.dev> >> >>> Sent: Friday, October 11, 2024 4:11 PM >> >>> To: Marc Zyngier <maz@kernel.org> >> >>> Cc: Shameerali Kolothum Thodi >> >>> <shameerali.kolothum.thodi@huawei.com>; kvmarm@lists.linux.dev; >> >>> catalin.marinas@arm.com; will@kernel.org; mark.rutland@arm.com; >> >>> cohuck@redhat.com; eric.auger@redhat.com; yuzenghui >> >>> <yuzenghui@huawei.com>; Wangzhou (B) >> <wangzhou1@hisilicon.com>; >> >>> jiangkunkun <jiangkunkun@huawei.com>; Jonathan Cameron >> >>> <jonathan.cameron@huawei.com>; Anthony Jebson >> >>> <anthony.jebson@huawei.com>; linux-arm-kernel@lists.infradead.org; >> >>> Linuxarm <linuxarm@huawei.com> >> >>> Subject: Re: [RFC PATCH 0/6] KVM: arm64: Errata management for VM >> Live >> >>> migration >> >>> >> >>> On Fri, Oct 11, 2024 at 12:43:28PM +0100, Marc Zyngier wrote: >> >>>> On Fri, 11 Oct 2024 11:57:10 +0100, Shameerali Kolothum Thodi >> >>> <shameerali.kolothum.thodi@huawei.com> wrote: >> >>>>>>> Please take a look and let me know your thoughts. >> >>>>>> Having eyeballed this very superficially, I think we can do >> something >> >>>>>> simpler, and maybe more future-proof: >> >>>>> Thanks Marc for taking a look and the quick feedback. >> >>>> No worries, that's the least I could do given that you put the effort >> >>>> implementing my silly ideas! ;-) >> >>>> >> >>>>>> - I don't think KVM should be concerned about the description of >> the >> >>>>>> target CPUs. The hypercall you defined is the right thing to do, >> >>>>>> but the VMM should completely handle it. That's an >> implementation >> >>>>>> detail, but it would make things much simpler. >> >>>>> Ok. So does that mean the hypercall will use some sort of shared >> >>> memory >> >>>>> to retrieve the list of target CPUs from VMM? >> >>>> Two possibilities: >> >>>> >> >>>> - either shared memory, in which case the hypercall would require the >> >>>> guest to give an IPA and size for the VMM to write its stuff into >> >>>> the guest memory, >> >>>> >> >>>> - or more simply return the data as an MIDR/REVIDR pair in registers, >> >>>> the guest requesting an index, and getting an error when out of >> >>>> range, leaving it with the freedom to organise the storage. >> >>>> >> >>>> The second option is a bit slower, but way simpler, and it only >> >>>> happens once per guest boot, so it would probably be my preferred >> >>>> option unless this is proved to be impractical. 
>> >>> Also worth noting there's existing UAPI [*] for allowing userspace to >> >>> register range(s) of hypercalls that it services directly. It's a bit >> >>> weird that we'd allow userspace to do stuff in KVM's own hypercall >> >>> range, but I don't think it really matters at this point since this is >> >>> all prototyping. >> >>> >> >>> [*]: https://docs.kernel.org/virt/kvm/devices/vm.html#attribute-kvm- >> arm- >> >>> vm-smccc-filter-w-o >> >> Thanks. Yes and there are attempts to add that handling in Qemu[*] in >> the context >> >> of vCPU hotplug support(PSCI related ones though). Will take a look. >> >> >> >> Thanks, >> >> Shameer >> >> [1] https://lore.kernel.org/qemu-devel/20241009033704.250287-1- >> salil.mehta@huawei.com/ >> > Speaking of QEMU: Do you maybe already have some prototype code that >> > tries to do something with the setup here? (I don't think QEMU currently >> > mucks around with MIDR and friends when running with KVM; I wonder >> what >> > it should provide to the guest and if it should care to set something as >> > a base level that gives guests not using the hypercall a chance to work >> > properly.) > > I had a hacked Qemu to test this RFC. Nothing fancy but just to verify the RFC. > It basically adds a named CPU to virt as below, > > 1. Checks the host MIDR and REVIDR and if it is one of the supported platforms, > use the writable register interface to update the ID registers with a common > minimum feature list. > 2. Use the IOCTL in this RFC to update the target CPU list info. > 3. Ignore the invariant register SET errors. Thanks for sharing this information (no need to share the code, I usually do not want to share my hacked-together-for-testing code, either :) > >> > >> As discussed during the KVM forum we are working on a qemu integration >> for writable ID regs. The first goal is to be able to specialize the >> host passthrough model (custom host model). Maybe this will trigger more >> discussions on named models too. This is complementary to the >> MIDR/REVIDR problematic and I hope we will be able to consolidate our >> works at some point. > > Cool!. Happy to know that it is making progress. Once I rework this RFC based on the > suggestions, I will rework the Qemu to have a hacked prototype to test it as well and > probably can share that. The reason I asked was mostly to get a feel about how you wanted to end up using the interface. Did I understand correctly that you derive your custom set of features from whatever comes as common features with the MIDR/REVIDR, and do not treat "frankencpu with features common across all possible targets" and "list of MIDR/REVIDR for all possible targets" separately and tack them together in the end? It would also be interesting to discuss what actual named models could look like (models we could use for baselining etc.) But that's probably something that needs to be discussed on qemu-devel anyway, and I certainly don't want to derail the discussions about the KVM changes. (Agreeing on QEMU interfaces is going to take a lot of work anyway :) > > Please keep me in loop if you have plans to share your Qemu work soon. That will > be very helpful. > > Thanks, > Shameer
On Thu, 17 Oct 2024 18:16:04 +0100, Eric Auger <eric.auger@redhat.com> wrote: > > Hi Shameer, > > On 10/17/24 17:49, Cornelia Huck wrote: > > Speaking of QEMU: Do you maybe already have some prototype code that > > tries to do something with the setup here? (I don't think QEMU currently > > mucks around with MIDR and friends when running with KVM; I wonder what > > it should provide to the guest and if it should care to set something as > > a base level that gives guests not using the hypercall a chance to work > > properly.) > > > As discussed during the KVM forum we are working on a qemu integration > for writable ID regs. The first goal is to be able to specialize the > host passthrough model (custom host model). Maybe this will trigger more > discussions on named models too. This is complementary to the > MIDR/REVIDR problematic and I hope we will be able to consolidate our > works at some point. Complementary to the MIDR/REVIDR work, I would also like to make MIDR/REVIDR writable when this scheme is available. Ideally reporting a synthetic CPU description (with MIDR_EL1.Implementer returning 0, and the rest being VMM-specific, but with a clear definition for the IMPDEF fields so that we can version the ABI). Thoughts? M.
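To make the synthetic encoding concrete: the field split below is the architectural MIDR_EL1 layout, with Implementer = 0 as suggested; how the remaining fields are carved up for ABI versioning is purely an assumption for illustration.

    /*
     * MIDR_EL1: Implementer[31:24] Variant[23:20] Architecture[19:16]
     *           PartNum[15:4] Revision[3:0]
     *
     * One possible VMM-defined synthetic encoding (illustrative only):
     */
    #define SYNTH_MIDR(part, abi_major, abi_minor)                            \
            ((0x00u << 24)                | /* Implementer: 0, synthetic CPU */ \
             (((abi_major) & 0xfu) << 20) | /* Variant: ABI major (assumed)  */ \
             (0xfu << 16)                 | /* Architecture: by ID registers */ \
             (((part) & 0xfffu) << 4)     | /* PartNum: VMM-defined part     */ \
             ((abi_minor) & 0xfu))          /* Revision: ABI minor (assumed) */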
On Fri, Oct 18, 2024 at 02:17:17PM +0100, Marc Zyngier wrote: > On Thu, 17 Oct 2024 18:16:04 +0100, > Eric Auger <eric.auger@redhat.com> wrote: > > > > Hi Shameer, > > > > On 10/17/24 17:49, Cornelia Huck wrote: > > > > Speaking of QEMU: Do you maybe already have some prototype code that > > > tries to do something with the setup here? (I don't think QEMU currently > > > mucks around with MIDR and friends when running with KVM; I wonder what > > > it should provide to the guest and if it should care to set something as > > > a base level that gives guests not using the hypercall a chance to work > > > properly.) > > > > > As discussed during the KVM forum we are working on a qemu integration > > for writable ID regs. The first goal is to be able to specialize the > > host passthrough model (custom host model). Maybe this will trigger more > > discussions on named models too. This is complementary to the > > MIDR/REVIDR problematic and I hope we will be able to consolidate our > > works at some point. > > Complementary to the MIDR/REVIDR work, I would also like to make > MIDR/REVIDR writable when this scheme is available. Ideally reporting > a synthetic CPU description (with MIDR_EL1.Implementer returning 0, > and the rest being VMM-specific, but with a clear definition for the > IMPDEF fields so that we can version the ABI). When the VMM is using the hypercall mechanism to describe implementations, 100% agree. > Thoughts? I think we should go a step further and allow userspace full control of known fields in these registers, even for a nonzero Implementer code. We're already affording userspace full control of what implementation(s) the guest sees anyway via hypercalls, so there isn't much left for KVM to enforce. Ignoring errata, it'd let folks spoof an old implementation on newer hardware for testing, especially when dealing w/ older software.
On Fri, Oct 18 2024, Oliver Upton <oliver.upton@linux.dev> wrote: > On Fri, Oct 18, 2024 at 02:17:17PM +0100, Marc Zyngier wrote: >> On Thu, 17 Oct 2024 18:16:04 +0100, >> Eric Auger <eric.auger@redhat.com> wrote: >> > >> > Hi Shameer, >> > >> > On 10/17/24 17:49, Cornelia Huck wrote: >> >> > > Speaking of QEMU: Do you maybe already have some prototype code that >> > > tries to do something with the setup here? (I don't think QEMU currently >> > > mucks around with MIDR and friends when running with KVM; I wonder what >> > > it should provide to the guest and if it should care to set something as >> > > a base level that gives guests not using the hypercall a chance to work >> > > properly.) >> > > >> > As discussed during the KVM forum we are working on a qemu integration >> > for writable ID regs. The first goal is to be able to specialize the >> > host passthrough model (custom host model). Maybe this will trigger more >> > discussions on named models too. This is complementary to the >> > MIDR/REVIDR problematic and I hope we will be able to consolidate our >> > works at some point. >> >> Complementary to the MIDR/REVIDR work, I would also like to make >> MIDR/REVIDR writable when this scheme is available. Ideally reporting >> a synthetic CPU description (with MIDR_EL1.Implementer returning 0, >> and the rest being VMM-specific, but with a clear definition for the >> IMPDEF fields so that we can version the ABI). > > When the VMM is using the hypercall mechanism to describe > implementations, 100% agree. > >> Thoughts? > > I think we should go a step further and allow userspace full control of > known fields in these registers, even for a nonzero Implementer code. > > We're already affording userspace full control of what implementation(s) > the guest sees anyway via hypercalls, so there isn't much left for KVM > to enforce. > > Ignoring errata, it'd let folks spoof an old implementation on newer > hardware for testing, especially when dealing w/ older software. So basically, the VMM would have to make a choice which guests it wants to support? Modern guests, which are aware through the hypercall interface what is going on and are able to parse any information in our synthetic MIDR correctly, or older guests, which should be presented with a MIDR that makes them hopefully act in the way we want it to act? I guess that would mean that for full migration support between different hosts, we'd want enlightened guests, and for (some) other cases, we could generate a configuration that will hopefully work? (Spoofing for testing things sounds useful regardless.)
On Tue, 22 Oct 2024 11:58:02 +0100, Cornelia Huck <cohuck@redhat.com> wrote: > > On Fri, Oct 18 2024, Oliver Upton <oliver.upton@linux.dev> wrote: > > > On Fri, Oct 18, 2024 at 02:17:17PM +0100, Marc Zyngier wrote: > >> On Thu, 17 Oct 2024 18:16:04 +0100, > >> Eric Auger <eric.auger@redhat.com> wrote: > >> > > >> > Hi Shameer, > >> > > >> > On 10/17/24 17:49, Cornelia Huck wrote: > >> > >> > > Speaking of QEMU: Do you maybe already have some prototype code that > >> > > tries to do something with the setup here? (I don't think QEMU currently > >> > > mucks around with MIDR and friends when running with KVM; I wonder what > >> > > it should provide to the guest and if it should care to set something as > >> > > a base level that gives guests not using the hypercall a chance to work > >> > > properly.) > >> > > > >> > As discussed during the KVM forum we are working on a qemu integration > >> > for writable ID regs. The first goal is to be able to specialize the > >> > host passthrough model (custom host model). Maybe this will trigger more > >> > discussions on named models too. This is complementary to the > >> > MIDR/REVIDR problematic and I hope we will be able to consolidate our > >> > works at some point. > >> > >> Complementary to the MIDR/REVIDR work, I would also like to make > >> MIDR/REVIDR writable when this scheme is available. Ideally reporting > >> a synthetic CPU description (with MIDR_EL1.Implementer returning 0, > >> and the rest being VMM-specific, but with a clear definition for the > >> IMPDEF fields so that we can version the ABI). > > > > When the VMM is using the hypercall mechanism to describe > > implementations, 100% agree. > > > >> Thoughts? > > > > I think we should go a step further and allow userspace full control of > > known fields in these registers, even for a nonzero Implementer code. > > > > We're already affording userspace full control of what implementation(s) > > the guest sees anyway via hypercalls, so there isn't much left for KVM > > to enforce. > > > > Ignoring errata, it'd let folks spoof an old implementation on newer > > hardware for testing, especially when dealing w/ older software. > > So basically, the VMM would have to make a choice which guests it wants > to support? Modern guests, which are aware through the hypercall > interface what is going on and are able to parse any information in our > synthetic MIDR correctly, or older guests, which should be presented > with a MIDR that makes them hopefully act in the way we want it to act? > > I guess that would mean that for full migration support between > different hosts, we'd want enlightened guests, and for (some) other > cases, we could generate a configuration that will hopefully work? That sums it up, I think. The way I think of it is that, at VM creation, you decide what you want to support: - homogeneous migration: that's basically what we have today, with some limited flexibility. - heterogeneous migration: synthetic MIDR, discovery hypercall, the works. > (Spoofing for testing things sounds useful regardless.) My problem with that is that there is no difference in ABI between "I want this for testing" and "I expect this to fully work". The latter is obviously not achievable... Thanks, M.