Message ID | 20210104131542.495413-1-like.xu@linux.intel.com (mailing list archive) |
---|---|
Headers | show |
Series | KVM: x86/pmu: Add support to enable Guest PEBS via DS | expand |
On Mon, Jan 04, 2021, Like Xu wrote: > 2) Slow path (part 3, patch 0012-0017) > > This is when the host assigned physical PMC has a different index > from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0) > In this case, KVM needs to rewrite the PEBS records to change the > applicable counter indexes to the virtual PMC indexes, which would > otherwise contain the physical counter index written by PEBS facility, > and switch the counter reset values to the offset corresponding to > the physical counter indexes in the DS data structure. > > Large PEBS needs to be disabled by KVM rewriting the > pebs_interrupt_threshold filed in DS to only one record in > the slow path. This is because a guest may implicitly drain PEBS buffer, > e.g., context switch. KVM doesn't get a chance to update the PEBS buffer. Are the PEBS record write, PEBS index update, and subsequent PMI atomic with respect to instruction execution? If not, doesn't this approach still leave a window where the guest could see the wrong counter? The virtualization hole is also visible if the guest is reading the PEBS records from a different vCPU, though I assume no sane kernel does that? > The physical PMC index will confuse the guest. The difficulty comes > when multiple events get rescheduled inside the guest. Hence disabling > large PEBS in this case might be an easy and safe way to keep it corrects > as an initial step here.
Hi Sean, Thanks for your comments ! On 2021/1/15 3:10, Sean Christopherson wrote: > On Mon, Jan 04, 2021, Like Xu wrote: >> 2) Slow path (part 3, patch 0012-0017) >> >> This is when the host assigned physical PMC has a different index >> from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0) >> In this case, KVM needs to rewrite the PEBS records to change the >> applicable counter indexes to the virtual PMC indexes, which would >> otherwise contain the physical counter index written by PEBS facility, >> and switch the counter reset values to the offset corresponding to >> the physical counter indexes in the DS data structure. >> >> Large PEBS needs to be disabled by KVM rewriting the >> pebs_interrupt_threshold filed in DS to only one record in >> the slow path. This is because a guest may implicitly drain PEBS buffer, >> e.g., context switch. KVM doesn't get a chance to update the PEBS buffer. > Are the PEBS record write, PEBS index update, and subsequent PMI atomic with > respect to instruction execution? If not, doesn't this approach still leave a > window where the guest could see the wrong counter? First, KVM would limit/rewrite guest DS pebs_interrupt_threshold to one record before vm-entry, (see patch [PATCH v3 14/17] KVM: vmx/pmu: Limit pebs_interrupt_threshold in the guest DS area) which means once a PEBS record is written into the guest pebs buffer, a PEBS PMI will be generated immediately and thus vm-exit. Second, KVM would complete the PEBS record rewriting, PEBS index update, and inject vPMI before the next vm-entry (we deal with these separately in patches 15-17 for easy review). After the updated PEBS record(s) are (atomically?) prepared, guests will be notified via PMI and there is no window for vcpu to check whether there is a PEBS record due to vm-exit. > The virtualization hole is also visible if the guest is reading the PEBS records > from a different vCPU, though I assume no sane kernel does that? I have checked the guest PEBS driver behavior for Linux and Windows, and they're sane. Theoretically, it's true for busy-poll PBES buffer readers from other vCPUs and to fix it, making all vCPUs vm-exit is onerous for a large-size guest and I don't think you would accept this or do we have a better idea ? In fact, we don't think it's a hole or vulnerability because the motivation for correcting the counter index(s) is to help guest PEBS reader understand their PEBS records correctly and provide the same sampling accuracy as the non-cross mapped case, instead of providing a new attack interface from guest to host. PeterZ commented on the V1 version and insisted that the host perf allows the guest counter to be assigned a cross-mapped back-end counter. In this case, the slow path patches (13-17) are introduced to ensure that from the guest counter perspective, the PEBS records are also correct. We do not want these records to be invalid and ignored, which would undermine the accuracy of PEBS. In the practical use, the slow patch rarely happens and we're glad to see if the fast patch could be upstream and the cross-mapped case is teamprily disabled until we're on the same page for the cross mapped case. In actual use, slow path rarely occur. As a first step, we propose to upstream the quick patches (patch 01-12) with your help. The guest PEBS would been disabled temporarily when guest PEBS counters are cross-mapped until we figure out a satisfactory cross-mapping solution. --- thx,likexu > >> The physical PMC index will confuse the guest. The difficulty comes >> when multiple events get rescheduled inside the guest. Hence disabling >> large PEBS in this case might be an easy and safe way to keep it corrects >> as an initial step here.
On Fri, Jan 15, 2021, Xu, Like wrote: > Hi Sean, > > Thanks for your comments ! > > On 2021/1/15 3:10, Sean Christopherson wrote: > > On Mon, Jan 04, 2021, Like Xu wrote: > > > 2) Slow path (part 3, patch 0012-0017) > > > > > > This is when the host assigned physical PMC has a different index > > > from the virtual PMC (e.g. using physical PMC1 to emulate virtual PMC0) > > > In this case, KVM needs to rewrite the PEBS records to change the > > > applicable counter indexes to the virtual PMC indexes, which would > > > otherwise contain the physical counter index written by PEBS facility, > > > and switch the counter reset values to the offset corresponding to > > > the physical counter indexes in the DS data structure. > > > > > > Large PEBS needs to be disabled by KVM rewriting the > > > pebs_interrupt_threshold filed in DS to only one record in > > > the slow path. This is because a guest may implicitly drain PEBS buffer, > > > e.g., context switch. KVM doesn't get a chance to update the PEBS buffer. > > Are the PEBS record write, PEBS index update, and subsequent PMI atomic with > > respect to instruction execution? If not, doesn't this approach still leave a > > window where the guest could see the wrong counter? > > First, KVM would limit/rewrite guest DS pebs_interrupt_threshold to one > record before vm-entry, > (see patch [PATCH v3 14/17] KVM: vmx/pmu: Limit pebs_interrupt_threshold in > the guest DS area) > which means once a PEBS record is written into the guest pebs buffer, > a PEBS PMI will be generated immediately and thus vm-exit. I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" guaranteed to be atomic? In practice, under what scenarios will guest counters get cross-mapped? And, how does this support affect guest accuracy? I.e. how bad do things get for the guest if we simply disable guest counters if they can't have a 1:1 association with their physical counter?
> I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" > guaranteed to be atomic? Of course not. > > In practice, under what scenarios will guest counters get cross-mapped? And, > how does this support affect guest accuracy? I.e. how bad do things get for the > guest if we simply disable guest counters if they can't have a 1:1 association > with their physical counter? This would completely break perfmon for the guest, likely with no way to recover. -Andi
On Fri, Jan 15, 2021, Andi Kleen wrote: > > I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" > > guaranteed to be atomic? > > Of course not. So there's still a window where the guest could observe the bad counter index, correct?
On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote: > On Fri, Jan 15, 2021, Andi Kleen wrote: > > > I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" > > > guaranteed to be atomic? > > > > Of course not. > > So there's still a window where the guest could observe the bad counter index, > correct? Yes. But with single record PEBS it doesn't really matter with normal perfmon drivers. -Andi
On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote: > On Fri, Jan 15, 2021, Andi Kleen wrote: > > > I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" > > > guaranteed to be atomic? > > > > Of course not. > > So there's still a window where the guest could observe the bad counter index, > correct? Guest could do a hypercall to fix up the DS area before it tries to read it I suppose. Or the HV could expose the index mapping and have the guest fix up it. Adding a little virt crud on top shouldn't be too hard.
Hi Peter, On 2021/1/22 17:56, Peter Zijlstra wrote: > On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote: >> On Fri, Jan 15, 2021, Andi Kleen wrote: >>>> I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" >>>> guaranteed to be atomic? >>> >>> Of course not. >> >> So there's still a window where the guest could observe the bad counter index, >> correct? > > Guest could do a hypercall to fix up the DS area before it tries to read > it I suppose. Or the HV could expose the index mapping and have the > guest fix up it. A weird (malicious) guest would read unmodified PEBS records in the guest PEBS buffer from other vCPUs without the need for hypercall or index mapping from HV. Do you see any security issues on this host index leak window? > > Adding a little virt crud on top shouldn't be too hard. > The patches 13-17 in this version has modified the guest PEBS buffer to correct the index mapping information in the guest PEBS records. --- thx,likexu
On Mon, Jan 25, 2021 at 04:08:22PM +0800, Like Xu wrote: > Hi Peter, > > On 2021/1/22 17:56, Peter Zijlstra wrote: > > On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote: > > > On Fri, Jan 15, 2021, Andi Kleen wrote: > > > > > I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" > > > > > guaranteed to be atomic? > > > > > > > > Of course not. > > > > > > So there's still a window where the guest could observe the bad counter index, > > > correct? > > > > Guest could do a hypercall to fix up the DS area before it tries to read > > it I suppose. Or the HV could expose the index mapping and have the > > guest fix up it. > > A weird (malicious) guest would read unmodified PEBS records in the > guest PEBS buffer from other vCPUs without the need for hypercall or > index mapping from HV. > > Do you see any security issues on this host index leak window? > > > > > Adding a little virt crud on top shouldn't be too hard. > > > > The patches 13-17 in this version has modified the guest PEBS buffer > to correct the index mapping information in the guest PEBS records. Right, but given there is no atomicity between writing the DS area and triggering the PMI (as already established earlier in this thread), a malicious guest can already access this information, no?
On 2021/1/25 19:13, Peter Zijlstra wrote: > On Mon, Jan 25, 2021 at 04:08:22PM +0800, Like Xu wrote: >> Hi Peter, >> >> On 2021/1/22 17:56, Peter Zijlstra wrote: >>> On Fri, Jan 15, 2021 at 10:51:38AM -0800, Sean Christopherson wrote: >>>> On Fri, Jan 15, 2021, Andi Kleen wrote: >>>>>> I'm asking about ucode/hardare. Is the "guest pebs buffer write -> PEBS PMI" >>>>>> guaranteed to be atomic? >>>>> Of course not. >>>> So there's still a window where the guest could observe the bad counter index, >>>> correct? >>> Guest could do a hypercall to fix up the DS area before it tries to read >>> it I suppose. Or the HV could expose the index mapping and have the >>> guest fix up it. >> A weird (malicious) guest would read unmodified PEBS records in the >> guest PEBS buffer from other vCPUs without the need for hypercall or >> index mapping from HV. >> >> Do you see any security issues on this host index leak window? >> >>> Adding a little virt crud on top shouldn't be too hard. >>> >> The patches 13-17 in this version has modified the guest PEBS buffer >> to correct the index mapping information in the guest PEBS records. > Right, but given there is no atomicity between writing the DS area and > triggering the PMI (as already established earlier in this thread), a > malicious guest can already access this information, no? > So under the premise that counter cross-mapping is allowed, how can hypercall help fix it ? Personally, I think it is acceptable at the moment, and no security issues based on this have been defined and found.
On Mon, Jan 25, 2021 at 08:07:06PM +0800, Xu, Like wrote: > So under the premise that counter cross-mapping is allowed, > how can hypercall help fix it ? Hypercall or otherwise exposing the mapping, will let the guest fix it up when it already touches the data. Which avoids the host from having to access the guest memory and is faster, no?
On 2021/1/25 20:18, Peter Zijlstra wrote: > On Mon, Jan 25, 2021 at 08:07:06PM +0800, Xu, Like wrote: > >> So under the premise that counter cross-mapping is allowed, >> how can hypercall help fix it ? > Hypercall or otherwise exposing the mapping, will let the guest fix it > up when it already touches the data. Which avoids the host from having > to access the guest memory and is faster, no? - as you may know, the mapping table is changing rapidly from the time records to be rewritten to the time records to be read; - the patches will modify the records before it is notified via PMI which means it's transparent to normal guests (including Windows); - a malicious guest would ignore the exposed mapping and the hypercall and I don't think it can solve the leakage issue at all; - make the guest aware of that hypercall or mapping requires more code changes in the guest side; but now we can make it on the KVM side and we also know that cross-mapping case rarely happens, and the overhead is acceptable based on our tests; Please let me know if you or Sean are not going to buy in the PEBS records rewrite proposal in the patch 13 - 17. --- thx,likexu