diff mbox

[4/6] KVM: x86: Fix determining flat mode in recalculate_apic_map

Message ID 1412099359-5316-5-git-send-email-namit@cs.technion.ac.il (mailing list archive)
State New, archived
Headers show

Commit Message

Nadav Amit Sept. 30, 2014, 5:49 p.m. UTC
Determining flat mode according to cid_mask is wrong, since currently KVM
supports zero clusters in x2apic mode. Use ldr_bits instead.

Since we recalculate the apic map whenever the DFR is changed, the bug appears
to have no effect, and perhaps the entire check can be removed.

Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
---
 arch/x86/kvm/lapic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Radim Krčmář Oct. 1, 2014, 4:04 p.m. UTC | #1
2014-09-30 20:49+0300, Nadav Amit:
> Determining flat mode according to cid_mask is wrong, since currently KVM
> supports zero clusters in x2apic mode. Use ldr_bits instead.

No, it is in else branch of 'if (apic_x2apic_mode(apic))', so it works
as intended now, and ldr_bits == 8 is always true.

> Since we recalculate the apic map whenever the DFR is changed, the bug appears
> to have no effect, and perhaps the entire check can be removed.

It could, for the check is just an optimization.
(This whole code deserves a rewrite though ...)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nadav Amit Oct. 1, 2014, 5:30 p.m. UTC | #2
On Oct 1, 2014, at 7:04 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:

> 2014-09-30 20:49+0300, Nadav Amit:
>> Determining flat mode according to cid_mask is wrong, since currently KVM
>> supports zero clusters in x2apic mode. Use ldr_bits instead.
> 
> No, it is in else branch of 'if (apic_x2apic_mode(apic))', so it works
> as intended now, and ldr_bits == 8 is always true.
The condition “!new->cid_mask” is certainly always true (unless we already set the cid/lid according to cluster-mode settings).
I did not feel comfortable removing it without replacing it with something “equivalent”.

> 
>> Since we recalculate the apic map whenever the DFR is changed, the bug appears
>> to have no effect, and perhaps the entire check can be removed.
> 
> It could, for the check is just an optimization.
> (This whole code deserves a rewrite though ...)

I did not hit such bug, but IMO the code is buggy.
The APIC mode is determined by processors that enabled their apic enabled (in the spurious vector), and all the enabled one should configure the same mode.
Nonetheless, as stated in the SDM 10.4.7.2 "Local APIC State After It Has Been Software Disabled APICs” - the disabled APICs should still respond to NMIs, INIT, etc.
So it appears the loop should be broken into two loops - first determine the apic mode, according to the first enabled APIC. Then, iterate again over vCPUs and build the logical map.

Regards,
Nadav
Radim Krčmář Oct. 1, 2014, 6:27 p.m. UTC | #3
2014-10-01 20:30+0300, Nadav Amit:
> On Oct 1, 2014, at 7:04 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:
> > 2014-09-30 20:49+0300, Nadav Amit:
> The condition “!new->cid_mask” is certainly always true (unless we already set the cid/lid according to cluster-mode settings).

Exactly, it "optimizes" the switch to a non-default clustered mode.

> I did not feel comfortable removing it without replacing it with something “equivalent”.
> 
> > 
> >> Since we recalculate the apic map whenever the DFR is changed, the bug appears
> >> to have no effect, and perhaps the entire check can be removed.
> > 
> > It could, for the check is just an optimization.
> > (This whole code deserves a rewrite though ...)
> 
> I did not hit such bug, but IMO the code is buggy.
> The APIC mode is determined by processors that enabled their apic enabled (in the spurious vector), and all the enabled one should configure the same mode.
> Nonetheless, as stated in the SDM 10.4.7.2 "Local APIC State After It Has Been Software Disabled APICs” - the disabled APICs should still respond to NMIs, INIT, etc.
> So it appears the loop should be broken into two loops - first determine the apic mode, according to the first enabled APIC. Then, iterate again over vCPUs and build the logical map.

Our assumption that all have the same mode is horrible.
(Do they all have be the same?)

The only thing we allow out of your scenario that I can see is software
disabled x2apic after enabled clustered xapic processors and that
doesn't need two loops, just a sw check at x2apic.
Practically, it is a harmless bug :)
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nadav Amit Oct. 1, 2014, 7:16 p.m. UTC | #4
On Oct 1, 2014, at 9:27 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:

> 2014-10-01 20:30+0300, Nadav Amit:
>> On Oct 1, 2014, at 7:04 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:
>>> 2014-09-30 20:49+0300, Nadav Amit:
>> The condition “!new->cid_mask” is certainly always true (unless we already set the cid/lid according to cluster-mode settings).
> 
> Exactly, it "optimizes" the switch to a non-default clustered mode.
> 
>> I did not feel comfortable removing it without replacing it with something “equivalent”.
>> 
>>> 
>>>> Since we recalculate the apic map whenever the DFR is changed, the bug appears
>>>> to have no effect, and perhaps the entire check can be removed.
>>> 
>>> It could, for the check is just an optimization.
>>> (This whole code deserves a rewrite though ...)
>> 
>> I did not hit such bug, but IMO the code is buggy.
>> The APIC mode is determined by processors that enabled their apic enabled (in the spurious vector), and all the enabled one should configure the same mode.
>> Nonetheless, as stated in the SDM 10.4.7.2 "Local APIC State After It Has Been Software Disabled APICs” - the disabled APICs should still respond to NMIs, INIT, etc.
>> So it appears the loop should be broken into two loops - first determine the apic mode, according to the first enabled APIC. Then, iterate again over vCPUs and build the logical map.
> 
> Our assumption that all have the same mode is horrible.
> (Do they all have be the same?)
Yes: "All processors that have their APIC software enabled (using the spurious vector enable/disable bit) must have their DFRs (Destination Format Registers) programmed identically."
> 
> The only thing we allow out of your scenario that I can see is software
> disabled x2apic after enabled clustered xapic processors and that
> doesn't need two loops, just a sw check at x2apic.
> Practically, it is a harmless bug :)
So does xsa-108... ;-)
Now seriously: First, the bug may affect certain cases of cpu hot-plug, etc.
Second, there are additional implications. Consider a situation in which the first VCPUs have lapic disabled, and they do not have the same DFR/x2apic mode as the rest of the VCPUs.
This is ok according to the SDM, but in such case, the logical map they would have would not match the spic-mode. Therefore, they may not receive NMIs, INIT, etc. - which they should regardless to the fact their LAPIC is disabled.

Regards,
Nadav
Radim Krčmář Oct. 1, 2014, 8:58 p.m. UTC | #5
2014-10-01 22:16+0300, Nadav Amit:
> On Oct 1, 2014, at 9:27 PM, Radim Kr?má? <rkrcmar@redhat.com> wrote:
> > Our assumption that all have the same mode is horrible.
> > (Do they all have be the same?)
> Yes: "All processors that have their APIC software enabled (using the spurious vector enable/disable bit) must have their DFRs (Destination Format Registers) programmed identically."

Thanks.

> > The only thing we allow out of your scenario that I can see is software
> > disabled x2apic after enabled clustered xapic processors and that
> > doesn't need two loops, just a sw check at x2apic.
> > Practically, it is a harmless bug :)
> So does xsa-108... ;-)
> Now seriously: First, the bug may affect certain cases of cpu hot-plug, etc.
> Second, there are additional implications. Consider a situation in which the first VCPUs have lapic disabled, and they do not have the same DFR/x2apic mode as the rest of the VCPUs.

(I coudn't find anything that would affect the host.)

> This is ok according to the SDM, but in such case, the logical map they would have would not match the spic-mode. Therefore, they may not receive NMIs, INIT, etc. - which they should regardless to the fact their LAPIC is disabled.

The guest wouldn't work, more suprising is that it usually does ;)
I'll rewrite it later if you are pressed for other stuff.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Gleb Natapov Oct. 4, 2014, 6:50 a.m. UTC | #6
On Tue, Sep 30, 2014 at 08:49:17PM +0300, Nadav Amit wrote:
> Determining flat mode according to cid_mask is wrong, since currently KVM
> supports zero clusters in x2apic mode. Use ldr_bits instead.
As a comment above the 'if' you are fixing says the code assumes all APICs are in
the same mode (if they are not that's an OS bug). In that case the code never gets
here with x2apic mode.

> 
> Since we recalculate the apic map whenever the DFR is changed, the bug appears
> to have no effect, and perhaps the entire check can be removed.
> 
> Signed-off-by: Nadav Amit <namit@cs.technion.ac.il>
> ---
>  arch/x86/kvm/lapic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b8345dd..cfce429 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -171,7 +171,7 @@ static void recalculate_apic_map(struct kvm *kvm)
>  			new->cid_mask = (1 << KVM_X2APIC_CID_BITS) - 1;
>  			new->lid_mask = 0xffff;
>  		} else if (kvm_apic_sw_enabled(apic) &&
> -				!new->cid_mask /* flat mode */ &&
> +				new->ldr_bits == 8 /* flat mode */ &&
>  				kvm_apic_get_reg(apic, APIC_DFR) == APIC_DFR_CLUSTER) {
>  			new->cid_shift = 4;
>  			new->cid_mask = 0xf;
> -- 
> 1.9.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b8345dd..cfce429 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -171,7 +171,7 @@  static void recalculate_apic_map(struct kvm *kvm)
 			new->cid_mask = (1 << KVM_X2APIC_CID_BITS) - 1;
 			new->lid_mask = 0xffff;
 		} else if (kvm_apic_sw_enabled(apic) &&
-				!new->cid_mask /* flat mode */ &&
+				new->ldr_bits == 8 /* flat mode */ &&
 				kvm_apic_get_reg(apic, APIC_DFR) == APIC_DFR_CLUSTER) {
 			new->cid_shift = 4;
 			new->cid_mask = 0xf;