Message ID | 20230428095533.21747-1-cohuck@redhat.com (mailing list archive) |
---|---|
Series | arm: enable MTE for QEMU + kvm |

On Fri, 28 Apr 2023 at 10:55, Cornelia Huck <cohuck@redhat.com> wrote:
>
> v7 takes a different approach to wiring up MTE, so I still include a cover
> letter where I can explain things better, even though it is now only a
> single patch :)

Applied to target-arm.next, thanks.

-- PMM

On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote:

> Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest
> whatever the host supports. Without migration support, this is not too much
> of a problem yet, but for compatibility handling, we'll need a way to keep
> QEMU from handing out mte3 for guests that might be migrated to a mte3-less
> host. We could tack this onto the mte property (specifying the version or
> max supported), or we could handle this via cpu properties if we go with
> handling compatibility via cpu models (sorting this out for kvm is probably
> going to be interesting in general.) In any case, I think we'll need a way
> to inform kvm of it.

Before I start to figure out the initialization breakage, I think it might be worth pointing to this open issue again. As Andrea mentioned in https://listman.redhat.com/archives/libvir-list/2023-May/239926.html, libvirt wants to provide a stable guest ABI, not only in the context of migration compatibility (which we can handwave away via the migration blocker.)

The part I'm mostly missing right now is how to tell KVM to not present mte3 to a guest while running on a mte3 capable host (i.e. the KVM interface for that; it's more a case of "we don't have it right now", though.) I'd expect it to be on the cpu level, rather than on the vm level, but it's not there yet; we also probably want something that's not fighting whatever tcg (or other accels) end up doing.

I see several options here:

- Continue to ignore mte3 and its implications for now. The big risk is that someone might end up implementing support for MTE in libvirt again, with the same stable guest ABI issues as for this version.
- Add a "version" qualifier to the mte machine prop (probably with semantics similar to the gic stuff), with the default working with tcg as it does right now (i.e. defaulting to mte3). KVM would only support "no mte" or "same as host" (with no stable guest ABI guarantees) for now. I'm not sure how hairy this might get if we end up with a per-cpu configuration of mte (and other features) with kvm.
- Add cpu properties for mte and mte3. I think we've been there before :) It would likely match any KVM infrastructure well, but we gain interactions with the machine property. Also, there's a lot in the whole CPU model area that needs proper figuring out first... if we go that route, we won't be able to add MTE support with KVM anytime soon, I fear.

The second option might be the most promising, except for potential future headaches; but a lot depends on where we want to be going with cpu models for KVM in general.

On Mon, May 22, 2023 at 02:04:28PM +0200, Cornelia Huck wrote:
> On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote:
> > Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest
> > whatever the host supports. Without migration support, this is not too much
> > of a problem yet, but for compatibility handling, we'll need a way to keep
> > QEMU from handing out mte3 for guests that might be migrated to a mte3-less
> > host. We could tack this onto the mte property (specifying the version or
> > max supported), or we could handle this via cpu properties if we go with
> > handling compatibility via cpu models (sorting this out for kvm is probably
> > going to be interesting in general.) In any case, I think we'll need a way
> > to inform kvm of it.
>
> Before I start to figure out the initialization breakage, I think it
> might be worth pointing to this open issue again. As Andrea mentioned in
> https://listman.redhat.com/archives/libvir-list/2023-May/239926.html,
> libvirt wants to provide a stable guest ABI, not only in the context of
> migration compatibility (which we can handwave away via the migration
> blocker.)

Yeah, in order to guarantee a stable guest ABI it's critical that libvirt can ask for a *specific* version of MTE (MTE or MTE3) and either get exactly that version, or an error on QEMU's side.

> The part I'm mostly missing right now is how to tell KVM to not present
> mte3 to a guest while running on a mte3 capable host (i.e. the KVM
> interface for that; it's more a case of "we don't have it right now",
> though.) I'd expect it to be on the cpu level, rather than on the vm
> level, but it's not there yet; we also probably want something that's
> not fighting whatever tcg (or other accels) end up doing.
>
> I see several options here:
> - Continue to ignore mte3 and its implications for now. The big risk is
>   that someone might end up implementing support for MTE in libvirt again,
>   with the same stable guest ABI issues as for this version.
> - Add a "version" qualifier to the mte machine prop (probably with
>   semantics similar to the gic stuff), with the default working with tcg
>   as it does right now (i.e. defaulting to mte3). KVM would only support
>   "no mte" or "same as host" (with no stable guest ABI guarantees) for
>   now. I'm not sure how hairy this might get if we end up with a per-cpu
>   configuration of mte (and other features) with kvm.
> - Add cpu properties for mte and mte3. I think we've been there before
>   :) It would likely match any KVM infrastructure well, but we gain
>   interactions with the machine property. Also, there's a lot in the
>   whole CPU model area that needs proper figuring out first... if we go
>   that route, we won't be able to add MTE support with KVM anytime soon,
>   I fear.
>
> The second option might be the most promising, except for potential
> future headaches; but a lot depends on where we want to be going with
> cpu models for KVM in general.

What are the arguments for/against making MTE a machine type option or CPU feature flag? IIUC on real hardware you get "mte" or "mte3" listed in /proc/cpuinfo, so a CPU feature would seem a pretty natural fit to me, but I seem to recall that Peter was pushing for keeping it a machine property instead.

Working off the assumption that Peter knows what he's doing :) can we do something like this?

* introduce a new machine type property mte-version, which accepts either a specific version (2 for MTE and 3 for MTE3), an abstract setting (max and host) or a way to disable MTE entirely (none);

* turn the existing mte machine type option into an alias with the mapping

      mte=off -> mte-version=none
      mte=on  -> mte-version=max   for TCG
      mte=on  -> mte-version=host  for KVM

  and deprecate it;

* optionally introduce a new QMP command query-mte-capabilities that can be used by libvirt to figure out ahead of time which MTE versions are available for use on the current hardware.

Yes, this is basically a shameless rip-off of how GIC is handled :) I'm pretty satisfied with how that works and see no reason to reinvent the wheel.

Note that it's perfectly fine if the lack of KVM-level APIs results in mte-version=2 being rejected on MTE3-capable hardware for now! What's important is that you don't get a different MTE version than what you asked for.

I assume that the existing KVM API for enabling MTE has good enough granularity to make this work? If not, that's going to be a problem :)
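
The alias mapping and the "exactly what you asked for, or an error" rule proposed above could be sketched roughly as follows. This is not QEMU code: every name here (`MteVersion`, `Accel`, `mte_legacy_to_version()`, `mte_resolve()`, `MTE_RESOLVE_ERROR`) is hypothetical and made up for illustration. The sketch also assumes that TCG could be limited to version 2 on request, that `host` is only meaningful under KVM, and that KVM can currently expose nothing but the host's own version; none of that is settled in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for the proposed mte-version machine property. */
typedef enum {
    MTE_VERSION_NONE = 0,   /* MTE disabled */
    MTE_VERSION_2    = 2,   /* "MTE" in the proposal */
    MTE_VERSION_3    = 3,   /* "MTE3" in the proposal */
    MTE_VERSION_MAX,        /* best the accelerator can offer */
    MTE_VERSION_HOST        /* whatever the host CPU has (assumed KVM only) */
} MteVersion;

typedef enum { ACCEL_TCG, ACCEL_KVM } Accel;

#define MTE_RESOLVE_ERROR (-1)

/* Legacy alias: mte=off -> none, mte=on -> max (TCG) or host (KVM). */
MteVersion mte_legacy_to_version(bool mte_on, Accel accel)
{
    if (!mte_on) {
        return MTE_VERSION_NONE;
    }
    return accel == ACCEL_TCG ? MTE_VERSION_MAX : MTE_VERSION_HOST;
}

/*
 * Resolve a request to the concrete version the guest would see (0, 2 or 3),
 * or MTE_RESOLVE_ERROR if that exact request cannot be honoured.
 * host_version is what the host CPU provides (0, 2 or 3); it only matters
 * under KVM.
 */
int mte_resolve(MteVersion req, Accel accel, int host_version)
{
    assert(host_version == 0 || host_version == 2 || host_version == 3);

    switch (req) {
    case MTE_VERSION_NONE:
        return 0;
    case MTE_VERSION_MAX:
        if (accel == ACCEL_TCG) {
            return 3;                   /* tcg emulates mte3 */
        }
        return host_version ? host_version : MTE_RESOLVE_ERROR;
    case MTE_VERSION_HOST:
        if (accel != ACCEL_KVM || host_version == 0) {
            return MTE_RESOLVE_ERROR;
        }
        return host_version;
    case MTE_VERSION_2:
    case MTE_VERSION_3:
        if (accel == ACCEL_TCG) {
            return (int)req;            /* assumption: TCG can cap the version */
        }
        /* Assumption: no KVM interface to hide mte3, so only an exact match works. */
        return host_version == (int)req ? (int)req : MTE_RESOLVE_ERROR;
    }
    return MTE_RESOLVE_ERROR;
}
```

With these assumptions, `mte_resolve(MTE_VERSION_2, ACCEL_KVM, 3)` yields `MTE_RESOLVE_ERROR`, which is the "mte-version=2 rejected on MTE3-capable hardware" case described above, while `mte_resolve(MTE_VERSION_3, ACCEL_KVM, 3)` yields 3.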

On Mon, 29 May 2023 at 11:15, Andrea Bolognani <abologna@redhat.com> wrote:
>
> On Mon, May 22, 2023 at 02:04:28PM +0200, Cornelia Huck wrote:
> > On Fri, Apr 28 2023, Cornelia Huck <cohuck@redhat.com> wrote:
> > > Another open problem is mte vs mte3: tcg emulates mte3, kvm gives the guest
> > > whatever the host supports. Without migration support, this is not too much
> > > of a problem yet, but for compatibility handling, we'll need a way to keep
> > > QEMU from handing out mte3 for guests that might be migrated to a mte3-less
> > > host. We could tack this onto the mte property (specifying the version or
> > > max supported), or we could handle this via cpu properties if we go with
> > > handling compatibility via cpu models (sorting this out for kvm is probably
> > > going to be interesting in general.) In any case, I think we'll need a way
> > > to inform kvm of it.
> >
> > Before I start to figure out the initialization breakage, I think it
> > might be worth pointing to this open issue again. As Andrea mentioned in
> > https://listman.redhat.com/archives/libvir-list/2023-May/239926.html,
> > libvirt wants to provide a stable guest ABI, not only in the context of
> > migration compatibility (which we can handwave away via the migration
> > blocker.)
>
> Yeah, in order to guarantee a stable guest ABI it's critical that
> libvirt can ask for a *specific* version of MTE (MTE or MTE3) and
> either get exactly that version, or an error on QEMU's side.
>
> > The part I'm mostly missing right now is how to tell KVM to not present
> > mte3 to a guest while running on a mte3 capable host (i.e. the KVM
> > interface for that; it's more a case of "we don't have it right now",
> > though.) I'd expect it to be on the cpu level, rather than on the vm
> > level, but it's not there yet; we also probably want something that's
> > not fighting whatever tcg (or other accels) end up doing.
> >
> > I see several options here:
> > - Continue to ignore mte3 and its implications for now. The big risk is
> >   that someone might end up implementing support for MTE in libvirt again,
> >   with the same stable guest ABI issues as for this version.
> > - Add a "version" qualifier to the mte machine prop (probably with
> >   semantics similar to the gic stuff), with the default working with tcg
> >   as it does right now (i.e. defaulting to mte3). KVM would only support
> >   "no mte" or "same as host" (with no stable guest ABI guarantees) for
> >   now. I'm not sure how hairy this might get if we end up with a per-cpu
> >   configuration of mte (and other features) with kvm.
> > - Add cpu properties for mte and mte3. I think we've been there before
> >   :) It would likely match any KVM infrastructure well, but we gain
> >   interactions with the machine property. Also, there's a lot in the
> >   whole CPU model area that needs proper figuring out first... if we go
> >   that route, we won't be able to add MTE support with KVM anytime soon,
> >   I fear.
> >
> > The second option might be the most promising, except for potential
> > future headaches; but a lot depends on where we want to be going with
> > cpu models for KVM in general.
>
> What are the arguments for/against making MTE a machine type option
> or CPU feature flag? IIUC on real hardware you get "mte" or "mte3"
> listed in /proc/cpuinfo, so a CPU feature would seem a pretty natural
> fit to me, but I seem to recall that Peter was pushing for keeping it
> a machine property instead.

The arguments for a machine property are:

* MTE needs not just CPU support but also system level support (on real hardware there needs to be actual tag RAM somewhere out there in the system; for TCG we need to create the tag RAM in the virt board code)
* it's what we're already doing for TCG, so it keeps the UI consistent between TCG and KVM
* it's what we already do for things like the virtualization extensions and TrustZone emulation (which also generally need to be supported not just by the CPU but also by the board)

thanks

-- PMM