Message ID | 20191220143025.33853-1-andrew.murray@arm.com (mailing list archive) |
---|---|
Series | arm64: KVM: add SPE profiling support |

Hi Andrew,

On Fri, Dec 20, 2019 at 02:30:07PM +0000, Andrew Murray wrote:
> This series implements support for allowing KVM guests to use the Arm
> Statistical Profiling Extension (SPE).
>
> It has been tested on a model to ensure that both host and guest can
> simultaneously use SPE with valid data. E.g.
>
> $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
>   dd if=/dev/zero of=/dev/null count=1000
> $ perf report --dump-raw-trace > spe_buf.txt

What happens if I run perf record on the VMM, or on the CPU(s) that the
VMM is running on? i.e.

$ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
  lkvm ${OPTIONS_FOR_GUEST_USING_SPE}

... or:

$ perf record -a -c 0 -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
  sleep 1000 &
$ taskset -c 0 lkvm ${OPTIONS_FOR_GUEST_USING_SPE} &

> As we save and restore the SPE context, the guest can access the SPE
> registers directly, thus in this version of the series we remove the
> trapping and emulation.
>
> In the previous series of this support, when KVM SPE isn't supported
> (e.g. via CONFIG_KVM_ARM_SPE) we were able to return a value of 0 to
> all reads of the SPE registers - as we can no longer do this there isn't
> a mechanism to prevent the guest from using SPE - thus I'm keen for
> feedback on the best way of resolving this.

When not providing SPE to the guest, surely we should be trapping the
registers and injecting an UNDEF?

What happens today, without these patches?

> It appears necessary to pin the entire guest memory in order to provide
> guest SPE access - otherwise it is possible for the guest to receive
> Stage-2 faults.

AFAICT these patches do not implement this. I assume that's what you're
trying to point out here, but I just want to make sure that's explicit.

Maybe this is a reason to trap+emulate if there's something more
sensible that hyp can do if it sees a Stage-2 fault.

Thanks,
Mark.

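The policy suggested in the message above (trap and inject an UNDEF when SPE is not provided to the guest, allow direct access when it is) can be sketched as a small decision function. This is only an illustrative model, not code from the series: `struct vcpu_model`, `enum spe_access` and `spe_access_policy()` are hypothetical names invented for this sketch.

```c
#include <stdbool.h>

/* Hypothetical model of how a guest access to an SPE register is handled. */
enum spe_access {
    SPE_ACCESS_DIRECT,      /* no trap: guest reads/writes the register itself */
    SPE_ACCESS_TRAP_UNDEF,  /* trap to the hypervisor, which injects an UNDEF */
};

/* Hypothetical per-vCPU state: only the "has SPE" feature flag matters here. */
struct vcpu_model {
    bool has_spe;
};

/*
 * Decide how SPE register accesses should behave for this vCPU.
 * With SPE enabled, the register context is saved and restored, so the
 * guest may touch the registers directly. Without it, every access must
 * trap and look like an undefined instruction to the guest.
 */
enum spe_access spe_access_policy(const struct vcpu_model *vcpu)
{
    return vcpu->has_spe ? SPE_ACCESS_DIRECT : SPE_ACCESS_TRAP_UNDEF;
}
```

The point of modelling it per vCPU is that the choice is a property of the guest configuration rather than of the kernel build, which is the same shape as the conditional trapping the thread mentions for PtrAuth.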
[fixing email addresses]

Hi Andrew,

On 2019-12-20 14:30, Andrew Murray wrote:
> This series implements support for allowing KVM guests to use the Arm
> Statistical Profiling Extension (SPE).

Thanks for this. In future, please Cc me and Will on email addresses
we can actually read.

> It has been tested on a model to ensure that both host and guest can
> simultaneously use SPE with valid data. E.g.
>
> $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
>   dd if=/dev/zero of=/dev/null count=1000
> $ perf report --dump-raw-trace > spe_buf.txt
>
> As we save and restore the SPE context, the guest can access the SPE
> registers directly, thus in this version of the series we remove the
> trapping and emulation.
>
> In the previous series of this support, when KVM SPE isn't supported
> (e.g. via CONFIG_KVM_ARM_SPE) we were able to return a value of 0 to
> all reads of the SPE registers - as we can no longer do this there
> isn't a mechanism to prevent the guest from using SPE - thus I'm keen
> for feedback on the best way of resolving this.

Surely there is a way to conditionally trap SPE registers, right? You
should still be able to do this if SPE is not configured for a given
guest (as we do for other features such as PtrAuth).

> It appears necessary to pin the entire guest memory in order to
> provide guest SPE access - otherwise it is possible for the guest to
> receive Stage-2 faults.

Really? How can the guest receive a stage-2 fault? This doesn't fit
what I understand of the ARMv8 exception model. Or do you mean a SPE
interrupt describing a S2 fault?

And this is not just pinning the memory either. You have to ensure that
all S2 page tables are created ahead of SPE being able to DMA to guest
memory. This may have some impacts on the THP code...

I'll have a look at the actual series ASAP (but that's not very soon).

Thanks,

M.

On Sat, 21 Dec 2019 10:48:16 +0000,
Marc Zyngier <maz@kernel.org> wrote:
>
> [fixing email addresses]
>
> Hi Andrew,
>
> On 2019-12-20 14:30, Andrew Murray wrote:
> > This series implements support for allowing KVM guests to use the Arm
> > Statistical Profiling Extension (SPE).
>
> Thanks for this. In future, please Cc me and Will on email addresses
> we can actually read.
>
> > It has been tested on a model to ensure that both host and guest can
> > simultaneously use SPE with valid data. E.g.
> >
> > $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
> >   dd if=/dev/zero of=/dev/null count=1000
> > $ perf report --dump-raw-trace > spe_buf.txt
> >
> > As we save and restore the SPE context, the guest can access the SPE
> > registers directly, thus in this version of the series we remove the
> > trapping and emulation.
> >
> > In the previous series of this support, when KVM SPE isn't
> > supported (e.g. via CONFIG_KVM_ARM_SPE) we were able to return a
> > value of 0 to all reads of the SPE registers - as we can no longer
> > do this there isn't a mechanism to prevent the guest from using
> > SPE - thus I'm keen for feedback on the best way of resolving
> > this.
>
> Surely there is a way to conditionally trap SPE registers, right? You
> should still be able to do this if SPE is not configured for a given
> guest (as we do for other features such as PtrAuth).
>
> > It appears necessary to pin the entire guest memory in order to
> > provide guest SPE access - otherwise it is possible for the guest
> > to receive Stage-2 faults.
>
> Really? How can the guest receive a stage-2 fault? This doesn't fit
> what I understand of the ARMv8 exception model. Or do you mean a SPE
> interrupt describing a S2 fault?
>
> And this is not just pinning the memory either. You have to ensure that
> all S2 page tables are created ahead of SPE being able to DMA to guest
> memory. This may have some impacts on the THP code...
>
> I'll have a look at the actual series ASAP (but that's not very soon).

I found some time to go through the series, and there is clearly a lot
of work left to do:

- There is nothing here to handle memory pinning whatsoever. If it
  works, it is only thanks to some side effect.

- The missing trapping is deeply worrying. Given that this is an
  optional feature, you cannot just let the guest do whatever it wants
  in an uncontrolled manner.

- The interrupt handling is busted. You mix concepts picked from both
  the PMU and the timer code, while the SPE device doesn't behave like
  any of these two (it is neither a fully emulated device, nor a
  device that is exclusively owned by a guest at any given time).

I expect some level of discussion on the list including at least Will
and myself before you respin this.

M.

On Fri, Dec 20, 2019 at 05:55:25PM +0000, Mark Rutland wrote:
> Hi Andrew,
>
> On Fri, Dec 20, 2019 at 02:30:07PM +0000, Andrew Murray wrote:
> > This series implements support for allowing KVM guests to use the Arm
> > Statistical Profiling Extension (SPE).
> >
> > It has been tested on a model to ensure that both host and guest can
> > simultaneously use SPE with valid data. E.g.
> >
> > $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
> >   dd if=/dev/zero of=/dev/null count=1000
> > $ perf report --dump-raw-trace > spe_buf.txt
>
> What happens if I run perf record on the VMM, or on the CPU(s) that the
> VMM is running on? i.e.
>
> $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
>   lkvm ${OPTIONS_FOR_GUEST_USING_SPE}
>

By default perf excludes the guest, so this works as expected, just
recording activity of the process when it is outside the guest. (perf
report appears to give valid output).

Patch 15 currently prevents using perf to record inside the guest.

> ... or:
>
> $ perf record -a -c 0 -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
>   sleep 1000 &
> $ taskset -c 0 lkvm ${OPTIONS_FOR_GUEST_USING_SPE} &
>
> > As we save and restore the SPE context, the guest can access the SPE
> > registers directly, thus in this version of the series we remove the
> > trapping and emulation.
> >
> > In the previous series of this support, when KVM SPE isn't supported
> > (e.g. via CONFIG_KVM_ARM_SPE) we were able to return a value of 0 to
> > all reads of the SPE registers - as we can no longer do this there isn't
> > a mechanism to prevent the guest from using SPE - thus I'm keen for
> > feedback on the best way of resolving this.
>
> When not providing SPE to the guest, surely we should be trapping the
> registers and injecting an UNDEF?

Yes we should, I'll update the series.

>
> What happens today, without these patches?
>

Prior to this series MDCR_EL2_TPMS is set and E2PB is unset resulting in all
SPE registers being trapped (with NULL handlers).

> > It appears necessary to pin the entire guest memory in order to provide
> > guest SPE access - otherwise it is possible for the guest to receive
> > Stage-2 faults.
>
> AFAICT these patches do not implement this. I assume that's what you're
> trying to point out here, but I just want to make sure that's explicit.

That's right.

>
> Maybe this is a reason to trap+emulate if there's something more
> sensible that hyp can do if it sees a Stage-2 fault.

Yes it's not really clear to me at the moment what to do about this.

Thanks,

Andrew Murray

>
> Thanks,
> Mark.

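The trap configuration described in the reply above (TPMS set and E2PB clear means every SPE register access from the guest traps) can be sketched as a helper that computes the SPE-related bits of MDCR_EL2. This is a minimal sketch rather than code from the series: the field positions (E2PB at bits [13:12], TPMS at bit 14) follow the Arm architecture's MDCR_EL2 layout as I understand it, and `mdcr_el2_for_guest()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* SPE-related MDCR_EL2 fields (positions assumed from the Arm ARM). */
#define MDCR_EL2_E2PB_SHIFT 12
#define MDCR_EL2_E2PB_MASK  UINT64_C(0x3)
#define MDCR_EL2_TPMS       (UINT64_C(1) << 14)

/*
 * Hypothetical helper: return the MDCR_EL2 value to use while a guest runs.
 *
 * - Guest without SPE: TPMS = 1 and E2PB = 0b00, so accesses to the
 *   profiling control and buffer registers from EL1 trap to EL2.
 * - Guest with SPE: TPMS = 0 and E2PB = 0b11, so the buffer uses the
 *   EL1&0 translation regime and register accesses are not trapped.
 *
 * All other bits of the incoming value are preserved.
 */
uint64_t mdcr_el2_for_guest(uint64_t mdcr, bool guest_has_spe)
{
    /* Start from a known state for the fields this helper owns. */
    mdcr &= ~(MDCR_EL2_TPMS | (MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT));

    if (guest_has_spe)
        mdcr |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
    else
        mdcr |= MDCR_EL2_TPMS;

    return mdcr;
}
```

For example, `mdcr_el2_for_guest(0, false)` yields only the TPMS bit, which corresponds to the pre-series behaviour described above, while `mdcr_el2_for_guest(0, true)` yields only the two E2PB bits.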
On Sun, Dec 22, 2019 at 12:22:10PM +0000, Marc Zyngier wrote:
> On Sat, 21 Dec 2019 10:48:16 +0000,
> Marc Zyngier <maz@kernel.org> wrote:
> >
> > [fixing email addresses]
> >
> > Hi Andrew,
> >
> > On 2019-12-20 14:30, Andrew Murray wrote:
> > > This series implements support for allowing KVM guests to use the Arm
> > > Statistical Profiling Extension (SPE).
> >
> > Thanks for this. In future, please Cc me and Will on email addresses
> > we can actually read.
> >
> > > It has been tested on a model to ensure that both host and guest can
> > > simultaneously use SPE with valid data. E.g.
> > >
> > > $ perf record -e arm_spe/ts_enable=1,pa_enable=1,pct_enable=1/ \
> > >   dd if=/dev/zero of=/dev/null count=1000
> > > $ perf report --dump-raw-trace > spe_buf.txt
> > >
> > > As we save and restore the SPE context, the guest can access the SPE
> > > registers directly, thus in this version of the series we remove the
> > > trapping and emulation.
> > >
> > > In the previous series of this support, when KVM SPE isn't
> > > supported (e.g. via CONFIG_KVM_ARM_SPE) we were able to return a
> > > value of 0 to all reads of the SPE registers - as we can no longer
> > > do this there isn't a mechanism to prevent the guest from using
> > > SPE - thus I'm keen for feedback on the best way of resolving
> > > this.
> >
> > Surely there is a way to conditionally trap SPE registers, right? You
> > should still be able to do this if SPE is not configured for a given
> > guest (as we do for other features such as PtrAuth).
> >
> > > It appears necessary to pin the entire guest memory in order to
> > > provide guest SPE access - otherwise it is possible for the guest
> > > to receive Stage-2 faults.
> >
> > Really? How can the guest receive a stage-2 fault? This doesn't fit
> > what I understand of the ARMv8 exception model. Or do you mean a SPE
> > interrupt describing a S2 fault?

Yes the latter.

> >
> > And this is not just pinning the memory either. You have to ensure that
> > all S2 page tables are created ahead of SPE being able to DMA to guest
> > memory. This may have some impacts on the THP code...
> >
> > I'll have a look at the actual series ASAP (but that's not very soon).
>
> I found some time to go through the series, and there is clearly a lot
> of work left to do:
>
> - There is nothing here to handle memory pinning whatsoever. If it
>   works, it is only thanks to some side effect.
>
> - The missing trapping is deeply worrying. Given that this is an
>   optional feature, you cannot just let the guest do whatever it wants
>   in an uncontrolled manner.

Yes I'll add this.

>
> - The interrupt handling is busted. You mix concepts picked from both
>   the PMU and the timer code, while the SPE device doesn't behave like
>   any of these two (it is neither a fully emulated device, nor a
>   device that is exclusively owned by a guest at any given time).
>
> I expect some level of discussion on the list including at least Will
> and myself before you respin this.

Thanks for the quick feedback.

Andrew Murray

>
> M.
>
> --
> Jazz is not dead, it just smells funny.