
[v12,08/12] KVM: x86: Add Intel Processor Trace context switch for each vcpu

Message ID 1532598182-10711-9-git-send-email-luwei.kang@intel.com (mailing list archive)
State New, archived
Series Intel Processor Trace virtualization enabling

Commit Message

Luwei Kang July 26, 2018, 9:42 a.m. UTC
From: Chao Peng <chao.p.peng@linux.intel.com>

Load/store the Intel Processor Trace registers on context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from the VMCS.
In HOST_GUEST mode, the remaining PT MSRs need to be loaded and
restored only when PT is enabled in the guest.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
---
 arch/x86/include/asm/intel_pt.h |  2 +
 arch/x86/kvm/vmx.c              | 94 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 96 insertions(+)
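The pairing this patch introduces (pt_guest_enter saves host PT state and installs the guest's, pt_guest_exit does the reverse) can be sketched in user space with an emulated MSR file. This is purely illustrative: the `msr[]` array and `FAKE_*` indices below are stand-ins for rdmsrl/wrmsrl on real RTIT MSRs, not anything in the kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in MSR file: indices are fake MSR numbers, not real hardware. */
enum { FAKE_RTIT_CTL, FAKE_RTIT_STATUS, FAKE_RTIT_OUTPUT_BASE, FAKE_NR };
static uint64_t msr[FAKE_NR];

struct pt_ctx { uint64_t ctl, status, output_base; };

/* Analogue of pt_save_msr(): snapshot the live (emulated) MSRs. */
static void pt_save(struct pt_ctx *c)
{
	c->status      = msr[FAKE_RTIT_STATUS];
	c->output_base = msr[FAKE_RTIT_OUTPUT_BASE];
}

/* Analogue of pt_load_msr(): install a saved context. */
static void pt_load(const struct pt_ctx *c)
{
	msr[FAKE_RTIT_STATUS]      = c->status;
	msr[FAKE_RTIT_OUTPUT_BASE] = c->output_base;
}

/* Mirror of pt_guest_enter(): stash host state, install guest state. */
static void guest_enter(struct pt_ctx *host, const struct pt_ctx *guest)
{
	host->ctl = msr[FAKE_RTIT_CTL];
	msr[FAKE_RTIT_CTL] = 0;            /* tracing off while swapping */
	pt_save(host);
	pt_load(guest);
	msr[FAKE_RTIT_CTL] = guest->ctl;
}

/* Mirror of pt_guest_exit(): stash guest state, restore host state. */
static void guest_exit(struct pt_ctx *guest, const struct pt_ctx *host)
{
	guest->ctl = msr[FAKE_RTIT_CTL];
	msr[FAKE_RTIT_CTL] = 0;
	pt_save(guest);
	pt_load(host);
	msr[FAKE_RTIT_CTL] = host->ctl;
}
```

Note the invariant the real patch relies on: whatever guest_enter saves into the host context, guest_exit restores verbatim, so the host's PT configuration survives a round trip through the guest (the policy question in the review is whether the host *perf* subsystem is told about the gap, not whether the values come back).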

Comments

Alexander Shishkin Oct. 2, 2018, 10:17 a.m. UTC | #1
Luwei Kang <luwei.kang@intel.com> writes:

> +static void pt_guest_enter(struct vcpu_vmx *vmx)
> +{
> +	if (pt_mode == PT_MODE_SYSTEM)
> +		return;
> +
> +	/* Save host state before VM entry */
> +	rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
> +
> +	/*
> +	 * Set guest state of MSR_IA32_RTIT_CTL MSR (PT will be disabled
> +	 * on VM entry when it has been disabled in guest before).
> +	 */
> +	vmcs_write64(GUEST_IA32_RTIT_CTL, vmx->pt_desc.guest.ctl);
> +
> +	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
> +		wrmsrl(MSR_IA32_RTIT_CTL, 0);
> +		pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.addr_range);
> +		pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.addr_range);

Ok, so this is where the host PT is disabled.

So, my recommendation for this patchset is: we need to address the fact
that we are disabling host PT without telling the host. We need to
either stop doing it or inform the host. I don't think there need to be
any more versions of this patchset until this issue is addressed.

Thanks,
--
Alex
Paolo Bonzini Oct. 3, 2018, 11:56 a.m. UTC | #2
On 02/10/2018 12:17, Alexander Shishkin wrote:
> Ok, so this is where the host PT is disabled.
> 
> So, my recommendation for this patchset is: we need to address the fact
> that we are disabling host PT without telling the host. We need to
> either stop doing it or inform the host. I don't think there need to be
> any more versions of this patchset until this issue is addressed.

If you don't want that effect, do not enable host-guest mode.  The
default is "off" in fact.

I agree that it would be better to inform the host, but I think it's
acceptable to proceed with these patches since the mode is strictly opt
in (kvm_intel.pt_mode=1).  Perfect is the enemy of good, and I'd like to
get the "boring" code in first.

Luwei, can you document this in patch 4 as requested by Alexander?

Thanks,

Paolo
Alexander Shishkin Oct. 4, 2018, 10:20 a.m. UTC | #3
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 02/10/2018 12:17, Alexander Shishkin wrote:
>> Ok, so this is where the host PT is disabled.
>> 
>> So, my recommendation for this patchset is: we need to address the fact
>> that we are disabling host PT without telling the host. We need to
>> either stop doing it or inform the host. I don't think there need to be
>> any more versions of this patchset until this issue is addressed.
>
> If you don't want that effect, do not enable host-guest mode.  The
> default is "off" in fact.

One shouldn't have to enable or disable anything in KVM to stop it from
breaking one's existing workflow. That makes no sense.

There already are controls in perf that enable/disable guest tracing.

Regards,
--
Alex
Paolo Bonzini Oct. 4, 2018, 11:50 a.m. UTC | #4
On 04/10/2018 12:20, Alexander Shishkin wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 02/10/2018 12:17, Alexander Shishkin wrote:
>>> Ok, so this is where the host PT is disabled.
>>>
>>> So, my recommendation for this patchset is: we need to address the fact
>>> that we are disabling host PT without telling the host. We need to
>>> either stop doing it or inform the host. I don't think there need to be
>>> any more versions of this patchset until this issue is addressed.
>>
>> If you don't want that effect, do not enable host-guest mode.  The
>> default is "off" in fact.
> 
> One shouldn't have to enable or disable anything in KVM to stop it from
> breaking one's existing workflow. That makes no sense.

If you "have to enable or disable anything" it means you have to
override the default.  But the default in these patches is "no change
compared to before the patches", leaving tracing of both host and guest
entirely to the host, so I don't understand your remark.  What workflow
is broken?

> There already are controls in perf that enable/disable guest tracing.

You are confusing "tracing guest from the host" and "the guest can trace
itself".  This patchset is adding support for the latter, and that
affects directly whether the tracing CPUID leaf can be added to the
guest.  Therefore it's not perf that can decide whether to turn it on;
KVM must know it when /dev/kvm is opened, which is why it is a module
parameter.
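Concretely, the opt-in Paolo describes is a module parameter that must be set before any VM is created; a sketch of how a host administrator would enable it (using the `kvm_intel.pt_mode=1` spelling quoted earlier in this thread):

```shell
# Host/guest mode is off by default; opt in when loading kvm_intel:
modprobe kvm_intel pt_mode=1

# Or persistently, e.g. via a modprobe.d configuration file:
#   options kvm_intel pt_mode=1
```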

Paolo
Luwei Kang Oct. 8, 2018, 5:48 a.m. UTC | #5
> > Ok, so this is where the host PT is disabled.
> >
> > So, my recommendation for this patchset is: we need to address the
> > fact that we are disabling host PT without telling the host. We need
> > to either stop doing it or inform the host. I don't think there need
> > to be any more versions of this patchset until this issue is addressed.
> 
> If you don't want that effect, do not enable host-guest mode.  The default is "off" in fact.
> 
> I agree that it would be better to inform the host, but I think it's acceptable to proceed with these patches since the mode is strictly
> opt in (kvm_intel.pt_mode=1).  Perfect is the enemy of good, and I'd like to get the "boring" code in first.
> 
> Luwei, can you document this in patch 4 as requested by Alexander?
> 

Sorry for the delayed reply due to the PRC holiday. I will add a more detailed description to the Intel PT virtualization mode patch.

Thanks,
Luwei Kang

Patch

diff --git a/arch/x86/include/asm/intel_pt.h b/arch/x86/include/asm/intel_pt.h
index b05cfa4..4f00636 100644
--- a/arch/x86/include/asm/intel_pt.h
+++ b/arch/x86/include/asm/intel_pt.h
@@ -8,6 +8,8 @@ 
 #define PT_MODE_SYSTEM		0
 #define PT_MODE_HOST_GUEST	1
 
+#define RTIT_ADDR_RANGE		4
+
 enum pt_capabilities {
 	PT_CAP_max_subleaf = 0,
 	PT_CAP_cr3_filtering,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 947f232..1a1568e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -782,6 +782,24 @@  static inline int pi_test_sn(struct pi_desc *pi_desc)
 			(unsigned long *)&pi_desc->control);
 }
 
+struct pt_ctx {
+	u64 ctl;
+	u64 status;
+	u64 output_base;
+	u64 output_mask;
+	u64 cr3_match;
+	u64 addr_a[RTIT_ADDR_RANGE];
+	u64 addr_b[RTIT_ADDR_RANGE];
+};
+
+struct pt_desc {
+	u64 ctl_bitmask;
+	u32 addr_range;
+	u32 caps[PT_CPUID_REGS_NUM * PT_CPUID_LEAVES];
+	struct pt_ctx host;
+	struct pt_ctx guest;
+};
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -878,6 +896,8 @@  struct vcpu_vmx {
 	u64 msr_ia32_feature_control;
 	u64 msr_ia32_feature_control_valid_bits;
 	u64 ept_pointer;
+
+	struct pt_desc pt_desc;
 };
 
 enum segment_cache_field {
@@ -2634,6 +2654,69 @@  static unsigned long segment_base(u16 selector)
 }
 #endif
 
+static inline void pt_load_msr(struct pt_ctx *ctx, u32 addr_range)
+{
+	u32 i;
+
+	wrmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+	wrmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+	wrmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+	wrmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+	for (i = 0; i < addr_range; i++) {
+		wrmsrl(MSR_IA32_RTIT_ADDR0_A + i * 2, ctx->addr_a[i]);
+		wrmsrl(MSR_IA32_RTIT_ADDR0_B + i * 2, ctx->addr_b[i]);
+	}
+}
+
+static inline void pt_save_msr(struct pt_ctx *ctx, u32 addr_range)
+{
+	u32 i;
+
+	rdmsrl(MSR_IA32_RTIT_STATUS, ctx->status);
+	rdmsrl(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base);
+	rdmsrl(MSR_IA32_RTIT_OUTPUT_MASK, ctx->output_mask);
+	rdmsrl(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match);
+	for (i = 0; i < addr_range; i++) {
+		rdmsrl(MSR_IA32_RTIT_ADDR0_A + i * 2, ctx->addr_a[i]);
+		rdmsrl(MSR_IA32_RTIT_ADDR0_B + i * 2, ctx->addr_b[i]);
+	}
+}
+
+static void pt_guest_enter(struct vcpu_vmx *vmx)
+{
+	if (pt_mode == PT_MODE_SYSTEM)
+		return;
+
+	/* Save host state before VM entry */
+	rdmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+
+	/*
+	 * Set guest state of MSR_IA32_RTIT_CTL MSR (PT will be disabled
+	 * on VM entry when it has been disabled in guest before).
+	 */
+	vmcs_write64(GUEST_IA32_RTIT_CTL, vmx->pt_desc.guest.ctl);
+
+	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+		wrmsrl(MSR_IA32_RTIT_CTL, 0);
+		pt_save_msr(&vmx->pt_desc.host, vmx->pt_desc.addr_range);
+		pt_load_msr(&vmx->pt_desc.guest, vmx->pt_desc.addr_range);
+	}
+}
+
+static void pt_guest_exit(struct vcpu_vmx *vmx)
+{
+	if (pt_mode == PT_MODE_SYSTEM)
+		return;
+
+	if (vmx->pt_desc.guest.ctl & RTIT_CTL_TRACEEN) {
+		pt_save_msr(&vmx->pt_desc.guest, vmx->pt_desc.addr_range);
+		pt_load_msr(&vmx->pt_desc.host, vmx->pt_desc.addr_range);
+	}
+
+	/* Reload host state (IA32_RTIT_CTL will be cleared on VM exit). */
+	wrmsrl(MSR_IA32_RTIT_CTL, vmx->pt_desc.host.ctl);
+}
+
 static void vmx_save_host_state(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -6466,6 +6549,13 @@  static void vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		vmcs_write64(PML_ADDRESS, page_to_phys(vmx->pml_pg));
 		vmcs_write16(GUEST_PML_INDEX, PML_ENTITY_NUM - 1);
 	}
+
+	if (pt_mode == PT_MODE_HOST_GUEST) {
+		memset(&vmx->pt_desc, 0, sizeof(vmx->pt_desc));
+		/* Bit[6~0] are forced to 1, writes are ignored. */
+		vmx->pt_desc.guest.output_mask = 0x7F;
+		vmcs_write64(GUEST_IA32_RTIT_CTL, 0);
+	}
 }
 
 static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
@@ -10388,6 +10478,8 @@  static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	    vcpu->arch.pkru != vmx->host_pkru)
 		__write_pkru(vcpu->arch.pkru);
 
+	pt_guest_enter(vmx);
+
 	atomic_switch_perf_msrs(vmx);
 
 	vmx_arm_hv_timer(vcpu);
@@ -10580,6 +10672,8 @@  static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 				  | (1 << VCPU_EXREG_CR3));
 	vcpu->arch.regs_dirty = 0;
 
+	pt_guest_exit(vmx);
+
 	/*
 	 * eager fpu is enabled if PKEY is supported and CR4 is switched
 	 * back on host, so it is safe to read guest PKRU from current