[RFC] KVM: Synthesize G bit for all segments.

Message ID 1404729492.30313.803.camel@akataria-dtop.eng.vmware.com (mailing list archive)
State New, archived

Commit Message

Alok Kataria July 7, 2014, 10:38 a.m. UTC
From: Jim Mattson <jmattson@vmware.com>

We have noticed that qemu-kvm hangs early in the BIOS when running nested
under some versions of VMware ESXi.

We believe the problem is that KVM assumes that the platform preserves
the 'G' bit for any segment register. The SVM specification itemizes the
segment attribute bits that are observed by the CPU, but the (G)ranularity bit
is not one of the bits itemized, for any segment. Though current AMD CPUs keep
track of the (G)ranularity bit for all segment registers other than CS, the
specification does not require it. VMware's virtual CPU may not track the
(G)ranularity bit for any segment register.

Since KVM already synthesizes the (G)ranularity bit for the CS segment, it
should do so for all segments. The patch below does that, and helps get rid of
the hangs. The patch applies on top of Linus' tree.

Signed-off-by: Jim Mattson <jmattson@vmware.com>
Signed-off-by: Alok N Kataria <akataria@vmware.com>
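
For illustration only (not part of the patch): a minimal standalone sketch of
the synthesis described above, assuming the limit stored in the VMCB is the
expanded 32-bit byte limit, as the comparison against 0xfffff in the patch
implies. The helper names expand_limit() and synthesize_g() are hypothetical
and do not exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: expand a 20-bit descriptor limit into a 32-bit
 * byte limit. With G set, the limit is counted in 4 KiB pages and the
 * low 12 bits of the expanded limit are all ones.
 */
static uint32_t expand_limit(uint32_t raw_limit20, int g)
{
	assert(raw_limit20 <= 0xfffff);	/* descriptor limit field is 20 bits */
	return g ? (raw_limit20 << 12) | 0xfff : raw_limit20;
}

/*
 * Same expression the patch uses for var->g: a limit that does not fit
 * in 20 bits can only have come from a page-granular descriptor.
 */
static int synthesize_g(uint32_t limit)
{
	return limit > 0xfffff;
}
```

A limit of 0xfffff or less yields G = 0 even if the guest originally loaded a
page-granular descriptor, but such a limit is still exactly representable
with byte granularity, so the synthesized value describes the same limit.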



--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jan Kiszka July 7, 2014, 10:52 a.m. UTC | #1
Thanks for pushing this. I already tried to analyze the spec in this
regard in [1].

But even if it turns out we could read the bit on real HW, I think this
patch is fine in order to be compatible with ESXi.

Jan

[1] http://thread.gmane.org/gmane.comp.emulators.kvm.devel/124252
Paolo Bonzini July 7, 2014, 1:04 p.m. UTC | #2
Looks good, but please add a comment in svm_set_segment.

Paolo

Patch

Index: linux-2.6/arch/x86/kvm/svm.c
===================================================================
--- linux-2.6.orig/arch/x86/kvm/svm.c	2014-07-07 15:32:52.724368183 +0530
+++ linux-2.6/arch/x86/kvm/svm.c	2014-07-07 15:34:19.664748841 +0530
@@ -1415,7 +1415,7 @@ 
 	var->avl = (s->attrib >> SVM_SELECTOR_AVL_SHIFT) & 1;
 	var->l = (s->attrib >> SVM_SELECTOR_L_SHIFT) & 1;
 	var->db = (s->attrib >> SVM_SELECTOR_DB_SHIFT) & 1;
-	var->g = (s->attrib >> SVM_SELECTOR_G_SHIFT) & 1;
+	var->g = s->limit > 0xfffff;
 
 	/*
 	 * AMD's VMCB does not have an explicit unusable field, so emulate it
@@ -1424,14 +1424,6 @@ 
 	var->unusable = !var->present || (var->type == 0);
 
 	switch (seg) {
-	case VCPU_SREG_CS:
-		/*
-		 * SVM always stores 0 for the 'G' bit in the CS selector in
-		 * the VMCB on a VMEXIT. This hurts cross-vendor migration:
-		 * Intel's VMENTRY has a check on the 'G' bit.
-		 */
-		var->g = s->limit > 0xfffff;
-		break;
 	case VCPU_SREG_TR:
 		/*
 		 * Work around a bug where the busy flag in the tr selector