[BISECT,4.1.y] regression: kvm: migration hangs guest
diff mbox

Message ID alpine.LRH.2.11.1701251535220.13235@mail.ewheeler.net
State New
Headers show

Commit Message

Eric Wheeler Jan. 25, 2017, 11:50 p.m. UTC
Hello All,

We discovered guests hanging when using the 4.1.y kernel after 4.1.16 with 
live migration after CentOS applied this patch to the latest version of 
qemu-kvm user space:
	https://git.centos.org/blob/rpms!!qemu-kvm.git/34b32196890e2c41b0aee042e600ba422f29db17/SOURCES!kvm-target-i386-get-put-MSR_TSC_AUX-across-reset-and-mig.patch

Please also see this Bugzilla entry: 
	https://bugzilla.redhat.com/show_bug.cgi?id=1408333

After a bisect, We found this commit to be causing the hang: 
	8a3185c54d650a86dafc8d8bcafa124b50944315 KVM: x86: expose MSR_TSC_AUX to userspace

It turns out that this is not actually the problem, but rather these 
commits need to be pulled in as well to support the 8a31 commit:
	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX

Thus, we need to either:
	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
		- or - 
	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.

If you choose the latter options, then please see below for the backport
patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).

What do you believe would be best?

--
Eric Wheeler



From 788baceebe8a7bbfab4da82caebbca8cdf188e1a Mon Sep 17 00:00:00 2001
From: Haozhong Zhang <haozhong.zhang@intel.com>
Date: Mon, 14 Dec 2015 23:13:38 +0800
Subject: [PATCH] KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX

The current handling of accesses to guest MSR_TSC_AUX returns error if
vcpu does not support rdtscp, though those accesses are initiated by
host. This can result in the reboot failure of some versions of
QEMU. This patch fixes this issue by passing those host initiated
accesses for further handling instead.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Conflicts:
	arch/x86/kvm/vmx.c
---
 arch/x86/kvm/vmx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Paolo Bonzini Jan. 25, 2017, 11:58 p.m. UTC | #1
On 26/01/2017 00:50, Eric Wheeler wrote:
> 	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
> 	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
> 
> Thus, we need to either:
> 	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
> 		- or - 
> 	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
> 
> If you choose the latter options, then please see below for the backport
> patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
> 
> What do you believe would be best?

The latter is better.

Paolo
Eric Wheeler Jan. 26, 2017, 11:29 p.m. UTC | #2
On Thu, 26 Jan 2017, Paolo Bonzini wrote:
> On 26/01/2017 00:50, Eric Wheeler wrote:
> > 	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
> > 	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
> > 
> > Thus, we need to either:
> > 	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
> > 		- or - 
> > 	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
> > 
> > If you choose the latter options, then please see below for the backport
> > patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
> > 
> > What do you believe would be best?
> 
> The latter is better.
> 
> Paolo

Hi Greg,

Paolo suggests cherry-picking the dependent commits with the backport 
patch instead of reverting the original patch.  Do you need anything more 
in this thread to bring this into v4.1.y? 

Its an "Option 3" according to Documentation/stable_kernel_rules.txt with 
dependent commits so I'm not sure if you want this formatted any 
differently to make your merge easy---let me know.

Thanks!


--
Eric Wheeler
Philipp Hahn Jan. 27, 2017, 6:21 a.m. UTC | #3
Hello,

Am 27.01.2017 um 00:29 schrieb Eric Wheeler:
> On Thu, 26 Jan 2017, Paolo Bonzini wrote:
>> On 26/01/2017 00:50, Eric Wheeler wrote:
>>> 	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
>>> 	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
>>>
>>> Thus, we need to either:
>>> 	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
>>> 		- or - 
>>> 	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
>>>
>>> If you choose the latter options, then please see below for the backport
>>> patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
>>>
>>> What do you believe would be best?
>>
>> The latter is better.
>>
>> Paolo
> 
> Hi Greg,

4.1 is maintained by Sasha Levin <alexander.levin@verizon.com>, cc-ed.

> Paolo suggests cherry-picking the dependent commits with the backport 
> patch instead of reverting the original patch.  Do you need anything more 
> in this thread to bring this into v4.1.y? 
> 
> Its an "Option 3" according to Documentation/stable_kernel_rules.txt with 
> dependent commits so I'm not sure if you want this formatted any 
> differently to make your merge easy---let me know.

Philipp
Eric Wheeler Feb. 9, 2017, 9:34 p.m. UTC | #4
On Fri, 27 Jan 2017, Philipp Hahn wrote:

> Hello,
> 
> Am 27.01.2017 um 00:29 schrieb Eric Wheeler:
> > On Thu, 26 Jan 2017, Paolo Bonzini wrote:
> >> On 26/01/2017 00:50, Eric Wheeler wrote:
> >>> 	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
> >>> 	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
> >>>
> >>> Thus, we need to either:
> >>> 	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
> >>> 		- or - 
> >>> 	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
> >>>
> >>> If you choose the latter options, then please see below for the backport
> >>> patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
> >>>
> >>> What do you believe would be best?
> >>
> >> The latter is better.
> >>
> >> Paolo
> > 
> > Hi Greg,
> 
> 4.1 is maintained by Sasha Levin <alexander.levin@verizon.com>, cc-ed.

Thank you Philipp!

> > Paolo suggests cherry-picking the dependent commits with the backport 
> > patch instead of reverting the original patch.  Do you need anything more 
> > in this thread to bring this into v4.1.y? 

Sasha, do you need anything more to pull this into 4.1?

Thanks!


--
Eric Wheeler


> > 
> > Its an "Option 3" according to Documentation/stable_kernel_rules.txt with 
> > dependent commits so I'm not sure if you want this formatted any 
> > differently to make your merge easy---let me know.
> 
> Philipp
> 
>
Eric Wheeler Feb. 15, 2017, 6:36 p.m. UTC | #5
On Thu, 9 Feb 2017, Eric Wheeler wrote:

> On Fri, 27 Jan 2017, Philipp Hahn wrote:
> 
> > Hello,
> > 
> > Am 27.01.2017 um 00:29 schrieb Eric Wheeler:
> > > On Thu, 26 Jan 2017, Paolo Bonzini wrote:
> > >> On 26/01/2017 00:50, Eric Wheeler wrote:
> > >>> 	609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
> > >>> 	81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
> > >>>
> > >>> Thus, we need to either:
> > >>> 	1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
> > >>> 		- or - 
> > >>> 	2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
> > >>>
> > >>> If you choose the latter options, then please see below for the backport
> > >>> patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
> > >>>
> > >>> What do you believe would be best?
> > >>
> > >> The latter is better.

Continuing on the Redhat Bugzilla thread, Dr. David Alan Gilbert noted 
that we also need to cherry-pick this commit: b0996ae48
	https://bugzilla.redhat.com/show_bug.cgi?id=1408333

To re-summerize, these commits need pulled into 4.1 for KVM migration to 
be stable:

1. 609e36d372a
2. 81b1b9ca6d5 [see backport patch in the first message of this thread]
3. b0996ae48

-Eric

> > >> Paolo
> > > 
> > > Hi Greg,
> > 
> > 4.1 is maintained by Sasha Levin <alexander.levin@verizon.com>, cc-ed.
> 
> Thank you Philipp!
> 
> > > Paolo suggests cherry-picking the dependent commits with the backport 
> > > patch instead of reverting the original patch.  Do you need anything more 
> > > in this thread to bring this into v4.1.y? 
> 
> Sasha, do you need anything more to pull this into 4.1?
> 
> Thanks!
> 
> 
> --
> Eric Wheeler
> 
> 
> > > 
> > > Its an "Option 3" according to Documentation/stable_kernel_rules.txt with 
> > > dependent commits so I'm not sure if you want this formatted any 
> > > differently to make your merge easy---let me know.
> > 
> > Philipp
> > 
> > 
>
Ben Hutchings March 6, 2017, 7:11 p.m. UTC | #6
On Wed, 2017-01-25 at 15:50 -0800, Eric Wheeler wrote:
> Hello All,
> 
> We discovered guests hanging when using the 4.1.y kernel after 4.1.16 with 
> live migration after CentOS applied this patch to the latest version of 
> qemu-kvm user space:
>         https://git.centos.org/blob/rpms!!qemu-kvm.git/34b32196890e2c41b0aee042e600ba422f29db17/SOURCES!kvm-target-i386-get-put-MSR_TSC_AUX-across-reset-and-mig.patch
> 
> Please also see this Bugzilla entry: 
>         https://bugzilla.redhat.com/show_bug.cgi?id=1408333
> 
> After a bisect, We found this commit to be causing the hang: 
>         8a3185c54d650a86dafc8d8bcafa124b50944315 KVM: x86: expose MSR_TSC_AUX to userspace
> 
> It turns out that this is not actually the problem, but rather these 
> commits need to be pulled in as well to support the 8a31 commit:
>         609e36d372a KVM: x86: pass host_initiated to functions that read MSRs
>         81b1b9ca6d5 KVM: VMX: Fix host initiated access to guest MSR_TSC_AUX
> 
> Thus, we need to either:
>         1. Revert commit 8a3185c54d650a86dafc8d8bcafa124b50944315
>                 - or - 
>         2. Merge commits 609e36d372a and 81b1b9ca6d5 into 4.1.y.
> 
> If you choose the latter options, then please see below for the backport
> patch of 81b1b9ca6d5 (609e36d372a cherry-picks just fine).
> 
> What do you believe would be best?

For 3.2 and 3.16, I will revert the change.  Thanks for tracking this down.

Ben.

Patch
diff mbox

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 341ea55..6b219a7 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2697,7 +2697,7 @@  static int vmx_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
 		data = vcpu->arch.ia32_xss;
 		break;
 	case MSR_TSC_AUX:
-		if (!to_vmx(vcpu)->rdtscp_enabled)
+		if (!to_vmx(vcpu)->rdtscp_enabled && !msr_info->host_initiated)
 			return 1;
 		/* Otherwise falls through */
 	default:
@@ -2804,7 +2804,7 @@  static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 			clear_atomic_switch_msr(vmx, MSR_IA32_XSS);
 		break;
 	case MSR_TSC_AUX:
-		if (!vmx->rdtscp_enabled)
+		if (!vmx->rdtscp_enabled && !msr_info->host_initiated)
 			return 1;
 		/* Check reserved bit, higher 32 bits should be zero */
 		if ((data >> 32) != 0)