Message ID | 20241203090045.942078-5-nikunj@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Add Secure TSC support for SNP guests | expand |
On Tue, Dec 03, 2024 at 02:30:36PM +0530, Nikunj A Dadhania wrote: > Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as > the subsequent TSC value reads are undefined. What does that mean exactly? I'd prefer if we issued a WARN_ONCE() there on the write to catch any offenders. *NO ONE* should be writing the TSC MSR but that's a different story. IOW, something like this ontop of yours? --- diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c index c22cb2ea4b99..050170eb28e6 100644 --- a/arch/x86/coco/sev/core.c +++ b/arch/x86/coco/sev/core.c @@ -1443,9 +1443,15 @@ static enum es_result __vc_handle_msr_tsc(struct pt_regs *regs, bool write) { u64 tsc; - if (write) - return ES_OK; + if (!(sev_status & MSR_AMD64_SNP_SECURE_TSC)) + goto read_tsc; + + if (write) { + WARN_ONCE(1, "TSC MSR writes are verboten!\n"); + return ES_UNSUPPORTED; + } +read_tsc: tsc = rdtsc_ordered(); regs->ax = lower_32_bits(tsc); regs->dx = upper_32_bits(tsc); @@ -1462,11 +1468,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) /* Is it a WRMSR? */ write = ctxt->insn.opcode.bytes[1] == 0x30; - if (regs->cx == MSR_SVSM_CAA) + switch(regs->cx) { + case MSR_SVSM_CAA: return __vc_handle_msr_caa(regs, write); - - if (regs->cx == MSR_IA32_TSC && (sev_status & MSR_AMD64_SNP_SECURE_TSC)) + case MSR_IA32_TSC: return __vc_handle_msr_tsc(regs, write); + default: + break; + } ghcb_set_rcx(ghcb, regs->cx); if (write) {
On 12/9/2024 9:27 PM, Borislav Petkov wrote: > On Tue, Dec 03, 2024 at 02:30:36PM +0530, Nikunj A Dadhania wrote: >> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as >> the subsequent TSC value reads are undefined. > > What does that mean exactly? That is the warning from the APM: 15.36.18 Secure TSC "Guests that run with Secure TSC enabled are not expected to perform writes to the TSC MSR (10h). If such a write occurs, subsequent TSC values read are undefined." What I make out of it is: if a write is performed to the TSC MSR, subsequent reads of TSC is not reliable/trusted. That was the reason to ignore such writes in the #VC handler. > > I'd prefer if we issued a WARN_ONCE() there on the write to catch any > offenders. Do you also want to terminate the offending guest? ES_UNSUPPORTED return will do that. > > *NO ONE* should be writing the TSC MSR but that's a different story. > > IOW, something like this ontop of yours? > > --- > > diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c > index c22cb2ea4b99..050170eb28e6 100644 > --- a/arch/x86/coco/sev/core.c > +++ b/arch/x86/coco/sev/core.c > @@ -1443,9 +1443,15 @@ static enum es_result __vc_handle_msr_tsc(struct pt_regs *regs, bool write) > { > u64 tsc; > > - if (write) > - return ES_OK; > + if (!(sev_status & MSR_AMD64_SNP_SECURE_TSC)) > + goto read_tsc; This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC MSR reads are converted to RDTSC. This is a good optimization. But just wanted to bring up the subtle impact. > + > + if (write) { > + WARN_ONCE(1, "TSC MSR writes are verboten!\n"); > + return ES_UNSUPPORTED; Sure, we can add a WARN_ONCE(). > + } > > +read_tsc: > tsc = rdtsc_ordered(); > regs->ax = lower_32_bits(tsc); > regs->dx = upper_32_bits(tsc); > @@ -1462,11 +1468,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) > /* Is it a WRMSR? */ > write = ctxt->insn.opcode.bytes[1] == 0x30; > > - if (regs->cx == MSR_SVSM_CAA) > + switch(regs->cx) { Yes, I was thinking about a switch, as there will be more such instances when we enable newer features. > + case MSR_SVSM_CAA: > return __vc_handle_msr_caa(regs, write); > - > - if (regs->cx == MSR_IA32_TSC && (sev_status & MSR_AMD64_SNP_SECURE_TSC)) > + case MSR_IA32_TSC: > return __vc_handle_msr_tsc(regs, write); > + default: > + break; > + } > > ghcb_set_rcx(ghcb, regs->cx); > if (write) { > Regards, Nikunj
On Tue, Dec 10, 2024 at 10:32:23AM +0530, Nikunj A. Dadhania wrote: > That is the warning from the APM: 15.36.18 Secure TSC > > "Guests that run with Secure TSC enabled are not expected to perform writes to > the TSC MSR (10h). If such a write occurs, subsequent TSC values read are > undefined." > > What I make out of it is: if a write is performed to the TSC MSR, subsequent > reads of TSC is not reliable/trusted. Basically, what happens on baremetal too. > Do you also want to terminate the offending guest? > > ES_UNSUPPORTED return will do that. I guess that would be too harsh. I guess a warn and a ES_OK should be fine for now. > This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC > MSR reads are converted to RDTSC. This is a good optimization. But just > wanted to bring up the subtle impact. That RDTSC happens still in the guest, right? But in its #VC handler. Versus it being a HV GHCB protocol call. I guess this conversion should be a separate patch in case there's some issues like the HV intercepting RDTSC... i.e., VMEXIT_RDTSC. We should probably handle that case too and then fallback to the GHCB call. Or is there a catch 22 I'm missing here... > Yes, I was thinking about a switch, as there will be more such instances when we > enable newer features. Exactly. Thx.
On 12/9/24 23:02, Nikunj A. Dadhania wrote: > On 12/9/2024 9:27 PM, Borislav Petkov wrote: >> On Tue, Dec 03, 2024 at 02:30:36PM +0530, Nikunj A Dadhania wrote: >>> Secure TSC enabled guests should not write to MSR_IA32_TSC(10H) register as >>> the subsequent TSC value reads are undefined. >> >> What does that mean exactly? > > That is the warning from the APM: 15.36.18 Secure TSC > > "Guests that run with Secure TSC enabled are not expected to perform writes to > the TSC MSR (10h). If such a write occurs, subsequent TSC values read are > undefined." > > What I make out of it is: if a write is performed to the TSC MSR, subsequent > reads of TSC is not reliable/trusted. > > That was the reason to ignore such writes in the #VC handler. > >> >> I'd prefer if we issued a WARN_ONCE() there on the write to catch any >> offenders. > > Do you also want to terminate the offending guest? > > ES_UNSUPPORTED return will do that. > >> >> *NO ONE* should be writing the TSC MSR but that's a different story. >> >> IOW, something like this ontop of yours? >> >> --- >> >> diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c >> index c22cb2ea4b99..050170eb28e6 100644 >> --- a/arch/x86/coco/sev/core.c >> +++ b/arch/x86/coco/sev/core.c >> @@ -1443,9 +1443,15 @@ static enum es_result __vc_handle_msr_tsc(struct pt_regs *regs, bool write) >> { >> u64 tsc; >> >> - if (write) >> - return ES_OK; >> + if (!(sev_status & MSR_AMD64_SNP_SECURE_TSC)) >> + goto read_tsc; > > This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC MSR > reads are converted to RDTSC. This is a good optimization. But just wanted to > bring up the subtle impact. Right, I think it should still flow through the GHCB MSR request for non-Secure TSC guests. > >> + >> + if (write) { >> + WARN_ONCE(1, "TSC MSR writes are verboten!\n"); >> + return ES_UNSUPPORTED; > > Sure, we can add a WARN_ONCE(). You'll want to test this... IIRC, I'm not sure if a WARN_ONCE() will be properly printed when issued within the #VC handler (since it will generate a nested #VC). Thanks, Tom > >> + } >> >> +read_tsc: >> tsc = rdtsc_ordered(); >> regs->ax = lower_32_bits(tsc); >> regs->dx = upper_32_bits(tsc); >> @@ -1462,11 +1468,14 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) >> /* Is it a WRMSR? */ >> write = ctxt->insn.opcode.bytes[1] == 0x30; >> >> - if (regs->cx == MSR_SVSM_CAA) >> + switch(regs->cx) { > > Yes, I was thinking about a switch, as there will be more such instances when we > enable newer features. > >> + case MSR_SVSM_CAA: >> return __vc_handle_msr_caa(regs, write); >> - >> - if (regs->cx == MSR_IA32_TSC && (sev_status & MSR_AMD64_SNP_SECURE_TSC)) >> + case MSR_IA32_TSC: >> return __vc_handle_msr_tsc(regs, write); >> + default: >> + break; >> + } >> >> ghcb_set_rcx(ghcb, regs->cx); >> if (write) { >> > > Regards, > Nikunj
On 12/10/2024 5:13 PM, Borislav Petkov wrote: > On Tue, Dec 10, 2024 at 10:32:23AM +0530, Nikunj A. Dadhania wrote: >> Do you also want to terminate the offending guest? >> >> ES_UNSUPPORTED return will do that. > > I guess that would be too harsh. I guess a warn and a ES_OK should be fine for > now. Yes, that will be better. >> This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC >> MSR reads are converted to RDTSC. This is a good optimization. But just >> wanted to bring up the subtle impact. > > That RDTSC happens still in the guest, right? But in its #VC handler. Versus it > being a HV GHCB protocol call. Yes, and the change is working fine, I have verified it. > I guess this conversion should be a separate > patch in case there's some issues like the HV intercepting RDTSC... i.e., > VMEXIT_RDTSC. I tried instrumenting code to intercept RDTSC and RDTSCP, KVM does not handle EXIT_RDTSC and the guest (including non-secure guests) crashes pretty early with the following in the host kernel log: kvm_amd: kvm [2153024]: vcpu0, guest rIP: 0xbbea5fc2 svm: unexpected exit reason 0x6e > We should probably handle that case too and then fallback to the GHCB call. Or > is there a catch 22 I'm missing here... Regards, Nikunj
On 12/10/2024 7:59 PM, Tom Lendacky wrote: > On 12/9/24 23:02, Nikunj A. Dadhania wrote: >> On 12/9/2024 9:27 PM, Borislav Petkov wrote: >>> On Tue, Dec 03, 2024 at 02:30:36PM +0530, Nikunj A Dadhania wrote: >>> + >>> + if (write) { >>> + WARN_ONCE(1, "TSC MSR writes are verboten!\n"); >>> + return ES_UNSUPPORTED; >> >> Sure, we can add a WARN_ONCE(). > > You'll want to test this... IIRC, I'm not sure if a WARN_ONCE() will be > properly printed when issued within the #VC handler (since it will > generate a nested #VC). Right, a write to TSC MSR generates the following splat: [ 17.450076] ------------[ cut here ]------------ [ 17.450077] TSC MSR writes are verboten! [ 17.450079] WARNING: CPU: 0 PID: 617 at arch/x86/coco/sev/core.c:1456 vc_handle_exitcode.part.0+0xe54/0x1110 [ 17.450090] CPU: 0 UID: 0 PID: 617 Comm: wrmsr Tainted: G S 6.13.0-rc1-00093-g5e3143d631a9-dirty #121 Regards, Nikunj
On Tue, Dec 10, 2024 at 08:29:31AM -0600, Tom Lendacky wrote: > > This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC MSR > > reads are converted to RDTSC. This is a good optimization. But just wanted to > > bring up the subtle impact. > > Right, I think it should still flow through the GHCB MSR request for > non-Secure TSC guests. Why? I'm trying to think of a reason but I'm getting confused by what needs to happen where and when... :-\
On 12/11/24 13:00, Borislav Petkov wrote: > On Tue, Dec 10, 2024 at 08:29:31AM -0600, Tom Lendacky wrote: >>> This is changing the behavior for SEV-ES and SNP guests(non SECURE_TSC), TSC MSR >>> reads are converted to RDTSC. This is a good optimization. But just wanted to >>> bring up the subtle impact. >> >> Right, I think it should still flow through the GHCB MSR request for >> non-Secure TSC guests. > > Why? > > I'm trying to think of a reason but I'm getting confused by what needs to > happen where and when... :-\ It could be any reason... maybe the hypervisor wants to know when this MSR used in order to tell the guest owner to update their code. Writing to or reading from that MSR is not that common, so I would think we want to keep the same behavior that has been in effect. But if we do want to make this change, maybe do it separate from the Secure TSC series since it alters the behavior of SEV-ES guests and SEV-SNP guests without Secure TSC. Thanks, Tom >
On Wed, Dec 11, 2024 at 04:01:31PM -0600, Tom Lendacky wrote: > It could be any reason... maybe the hypervisor wants to know when this > MSR used in order to tell the guest owner to update their code. Writing > to or reading from that MSR is not that common, so I would think we want > to keep the same behavior that has been in effect. Ah, I thought you're gonna say something along the lines of, yeah, we must use the HV GHCB protocol because of <raisins> and there's no other way this could work. > But if we do want to make this change, maybe do it separate from the > Secure TSC series since it alters the behavior of SEV-ES guests and > SEV-SNP guests without Secure TSC. Already suggested so - this should be a separate patch. It would be interesting to see if it brings any improvement by avoiding the HV call... especially since RDTSC is used a *lot* and prominently at that in sched_clock, for example.
On 12/11/24 16:22, Borislav Petkov wrote: > On Wed, Dec 11, 2024 at 04:01:31PM -0600, Tom Lendacky wrote: >> It could be any reason... maybe the hypervisor wants to know when this >> MSR used in order to tell the guest owner to update their code. Writing >> to or reading from that MSR is not that common, so I would think we want >> to keep the same behavior that has been in effect. > > Ah, I thought you're gonna say something along the lines of, yeah, we must use > the HV GHCB protocol because of <raisins> and there's no other way this could > work. > >> But if we do want to make this change, maybe do it separate from the >> Secure TSC series since it alters the behavior of SEV-ES guests and >> SEV-SNP guests without Secure TSC. > > Already suggested so - this should be a separate patch. > > It would be interesting to see if it brings any improvement by avoiding the HV > call... especially since RDTSC is used a *lot* and prominently at that in > sched_clock, for example. I doubt you would notice anything since it doesn't look like Linux ever reads from or writes to MSR_IA32_TSC, preferring to just use RDTSC or RDTSCP which aren't usually intercepted at all and so never goes out to the HV. Thanks, Tom >
diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c index 39683101b526..af28fb962309 100644 --- a/arch/x86/coco/sev/core.c +++ b/arch/x86/coco/sev/core.c @@ -1433,6 +1433,31 @@ static enum es_result __vc_handle_msr_caa(struct pt_regs *regs, bool write) return ES_OK; } +/* + * TSC related accesses should not exit to the hypervisor when a guest is + * executing with SecureTSC enabled, so special handling is required for + * accesses of MSR_IA32_TSC: + * + * Writes: Writing to MSR_IA32_TSC can cause subsequent reads + * of the TSC to return undefined values, so ignore all + * writes. + * Reads: Reads of MSR_IA32_TSC should return the current TSC + * value, use the value returned by RDTSC. + */ +static enum es_result __vc_handle_msr_tsc(struct pt_regs *regs, bool write) +{ + u64 tsc; + + if (write) + return ES_OK; + + tsc = rdtsc_ordered(); + regs->ax = lower_32_bits(tsc); + regs->dx = upper_32_bits(tsc); + + return ES_OK; +} + static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) { struct pt_regs *regs = ctxt->regs; @@ -1445,6 +1470,9 @@ static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) if (regs->cx == MSR_SVSM_CAA) return __vc_handle_msr_caa(regs, write); + if (regs->cx == MSR_IA32_TSC && (sev_status & MSR_AMD64_SNP_SECURE_TSC)) + return __vc_handle_msr_tsc(regs, write); + ghcb_set_rcx(ghcb, regs->cx); if (write) { ghcb_set_rax(ghcb, regs->ax);