diff mbox series

[for-6.1,v2] i386: do not call cpudef-only models functions for max, host, base

Message ID 20210723112921.12637-1-cfontana@suse.de (mailing list archive)
State New, archived
Headers show
Series [for-6.1,v2] i386: do not call cpudef-only models functions for max, host, base | expand

Commit Message

Claudio Fontana July 23, 2021, 11:29 a.m. UTC
Some cpu properties have to be set only for cpu models in builtin_x86_defs,
registered with x86_register_cpu_model_type, and not for
cpu models "base", "max", and the subclass "host".

These properties are the ones set by function x86_cpu_apply_props,
(also including kvm_default_props, tcg_default_props),
and the "vendor" property for the KVM and HVF accelerators.

After recent refactoring of cpu, which also affected these properties,
they were instead set unconditionally for all x86 cpus.

This has been detected as a bug with Nested on AMD with cpu "host",
as svm was not turned on by default, due to the wrongful setting of
kvm_default_props via x86_cpu_apply_props, which set svm to "off".

Rectify the bug introduced in commit "i386: split cpu accelerators"
and document the functions that are builtin_x86_defs-only.

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: f5cc5a5c ("i386: split cpu accelerators from cpu.c,"...)
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/477
---
 target/i386/cpu.c         |  19 ++++++-
 target/i386/host-cpu.c    |  13 +++--
 target/i386/kvm/kvm-cpu.c | 105 ++++++++++++++++++++------------------
 target/i386/tcg/tcg-cpu.c |  11 ++--
 4 files changed, 89 insertions(+), 59 deletions(-)

Comments

Woodhouse, David Nov. 29, 2021, 11:39 a.m. UTC | #1
On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
>  static void kvm_cpu_instance_init(CPUState *cs)
>  {
>      X86CPU *cpu = X86_CPU(cs);
> +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>  
>      host_cpu_instance_init(cpu);
>  
> -    if (!kvm_irqchip_in_kernel()) {
> -        x86_cpu_change_kvm_default("x2apic", "off");
> -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> -    }
> -
> -    /* Special cases not set in the X86CPUDefinition structs: */
> +    if (xcc->model) {
> +        /* only applies to builtin_x86_defs cpus */
> +        if (!kvm_irqchip_in_kernel()) {
> +            x86_cpu_change_kvm_default("x2apic", "off");
> +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> +        }
>  
> -    x86_cpu_apply_props(cpu, kvm_default_props);
> +        /* Special cases not set in the X86CPUDefinition structs: */
> +        x86_cpu_apply_props(cpu, kvm_default_props);
> +    }
>  

I think this causes a regression in x2apic and kvm-msi-ext-dest-id
support. If you start qemu thus:

qemu-system-x86_64 -machine q35,accel=kvm,usb=off,kernel_irqchip=split -cpu host -smp 288,sockets=9,cores=16,threads=2

The guest now sees those features, but we don't actually call
kvm_enable_x2apic() so the APIC broadcast quirk doesn't get disabled,
and interrupts targeted at APIC ID 255 are interpreted as broadcasts:

[ 73.198504] __common_interrupt: 0.34 No irq handler for vector
[ 73.198515] __common_interrupt: 11.34 No irq handler for vector
[ 73.198517] __common_interrupt: 12.34 No irq handler for vector
[ 73.198521] __common_interrupt: 15.34 No irq handler for vector
[ 73.198524] __common_interrupt: 17.34 No irq handler for vector
[ 73.198528] __common_interrupt: 34.34 No irq handler for vector
[ 73.198529] __common_interrupt: 20.34 No irq handler for vector
[ 73.198533] __common_interrupt: 41.34 No irq handler for vector
[ 73.198539] __common_interrupt: 27.34 No irq handler for vector
[ 73.198542] __common_interrupt: 28.34 No irq handler for vector



Amazon Development Centre (London) Ltd. Registered in England and Wales with registration number 04543232 with its registered office at 1 Principal Place, Worship Street, London EC2A 2FA, United Kingdom.
Claudio Fontana Nov. 29, 2021, 2:14 p.m. UTC | #2
On 11/29/21 12:39 PM, Woodhouse, David wrote:
> On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
>>  static void kvm_cpu_instance_init(CPUState *cs)
>>  {
>>      X86CPU *cpu = X86_CPU(cs);
>> +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>>  
>>      host_cpu_instance_init(cpu);
>>  
>> -    if (!kvm_irqchip_in_kernel()) {
>> -        x86_cpu_change_kvm_default("x2apic", "off");
>> -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>> -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>> -    }
>> -
>> -    /* Special cases not set in the X86CPUDefinition structs: */
>> +    if (xcc->model) {
>> +        /* only applies to builtin_x86_defs cpus */
>> +        if (!kvm_irqchip_in_kernel()) {
>> +            x86_cpu_change_kvm_default("x2apic", "off");
>> +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>> +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>> +        }
>>  
>> -    x86_cpu_apply_props(cpu, kvm_default_props);
>> +        /* Special cases not set in the X86CPUDefinition structs: */
>> +        x86_cpu_apply_props(cpu, kvm_default_props);
>> +    }
>>  
> 
> I think this causes a regression in x2apic and kvm-msi-ext-dest-id
> support. If you start qemu thus:

If I recall correctly, this change just tries to restore the behavior prior to  
commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,

fixing the issue introduced with the refactoring at that time.

Can you try bisecting prior to f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual breakage comes from somewhere else?

> 
> qemu-system-x86_64 -machine q35,accel=kvm,usb=off,kernel_irqchip=split -cpu host -smp 288,sockets=9,cores=16,threads=2
> 
> The guest now sees those features, but we don't actually call
> kvm_enable_x2apic() so the APIC broadcast quirk doesn't get disabled,
> and interrupts targeted at APIC ID 255 are interpreted as broadcasts:
> 
> [ 73.198504] __common_interrupt: 0.34 No irq handler for vector
> [ 73.198515] __common_interrupt: 11.34 No irq handler for vector
> [ 73.198517] __common_interrupt: 12.34 No irq handler for vector
> [ 73.198521] __common_interrupt: 15.34 No irq handler for vector
> [ 73.198524] __common_interrupt: 17.34 No irq handler for vector
> [ 73.198528] __common_interrupt: 34.34 No irq handler for vector
> [ 73.198529] __common_interrupt: 20.34 No irq handler for vector
> [ 73.198533] __common_interrupt: 41.34 No irq handler for vector
> [ 73.198539] __common_interrupt: 27.34 No irq handler for vector
> [ 73.198542] __common_interrupt: 28.34 No irq handler for vector
> 
> 

Any image to specifically test out? Would an actual 9 sockets machine be required to reproduce this?

Thanks,

Claudio
David Woodhouse Nov. 29, 2021, 3:11 p.m. UTC | #3
On Mon, 2021-11-29 at 15:14 +0100, Claudio Fontana wrote:
> On 11/29/21 12:39 PM, Woodhouse, David wrote:
> > On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
> > >  static void kvm_cpu_instance_init(CPUState *cs)
> > >  {
> > >      X86CPU *cpu = X86_CPU(cs);
> > > +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
> > > 
> > >      host_cpu_instance_init(cpu);
> > > 
> > > -    if (!kvm_irqchip_in_kernel()) {
> > > -        x86_cpu_change_kvm_default("x2apic", "off");
> > > -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> > > -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> > > -    }
> > > -
> > > -    /* Special cases not set in the X86CPUDefinition structs: */
> > > +    if (xcc->model) {
> > > +        /* only applies to builtin_x86_defs cpus */
> > > +        if (!kvm_irqchip_in_kernel()) {
> > > +            x86_cpu_change_kvm_default("x2apic", "off");
> > > +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> > > +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> > > +        }
> > > 
> > > -    x86_cpu_apply_props(cpu, kvm_default_props);
> > > +        /* Special cases not set in the X86CPUDefinition structs: */
> > > +        x86_cpu_apply_props(cpu, kvm_default_props);
> > > +    }
> > > 
> > 
> > I think this causes a regression in x2apic and kvm-msi-ext-dest-id
> > support. If you start qemu thus:
> 
> If I recall correctly, this change just tries to restore the behavior prior to
> commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,
> 
> fixing the issue introduced with the refactoring at that time.
> 
> Can you try bisecting prior to
> f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual
> breakage comes from somewhere else?

Hm, so it looks like it never worked for '-cpu host' *until* commit
f5cc5a5c16.

It didn't matter before c1bb5418e3 because you couldn't enable that
many vCPUs without an IOMMU, and the *IOMMU* setup would call
kvm_enable_x2apic().

But after that, nothing ever called kvm_enable_x2apic() in the '-cpu
host' case until commit f5cc5a5c16, which fixed it... until you
restored the previous behaviour :)

This "works" to fix this case, but presumably isn't correct:

--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -161,7 +161,7 @@ static void kvm_cpu_instance_init(CPUState *cs)
 
     host_cpu_instance_init(cpu);
 
-    if (xcc->model) {
+    if (1 || xcc->model) {
         /* only applies to builtin_x86_defs cpus */
         if (!kvm_irqchip_in_kernel()) {
             x86_cpu_change_kvm_default("x2apic", "off");


> > Any image to specifically test out? Would an actual 9 sockets machine be required to reproduce this?

No, but the more CPUs you have in the host the less you have to wait
for 288 vCPUs to spin up :)

My test is:

./qemu-system-x86_64 -machine q35,accel=kvm,usb=off,kernel_irqchip=split -cpu host -m 2G -smp sockets=9,cores=16,threads=2 -drive file=/var/lib/libvirt/images/fedora.qcow2,if=virtio -serial mon:stdio -display none  -kernel ~/git/linux/arch/x86/boot/bzImage  -append "console=ttyS0,115200 root=/dev/vda1" 


I then play with the affinity of the AHCI MSI. Pointing it at CPU 255
should show the problem. 

[root@localhost ~]# cd /proc/irq/313
[root@localhost 313]# echo 255 > smp_affinity_list 
[root@localhost 313]#
[   65.365821] Composed MSI for APIC 255 vector 0x22: 0/feeff000 22
[root@localhost 313]# grep ahci /proc/interrupts 


I also added some debugging into host and guest kernels to be a little
more explicit:

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b70344bf6600..53191db5145d 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1866,6 +1866,7 @@ static __init void try_to_enable_x2apic(int remap_mode)
 		 * used for non-remapped IRQ domains.
 		 */
 		if (x86_init.hyper.msi_ext_dest_id()) {
+			pr_info("x2apic: support extended destination ID\n");
 			virt_ext_dest_id = 1;
 			apic_limit = 32767;
 		}
@@ -2539,6 +2540,7 @@ void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
 		msg->arch_addr_lo.virt_destid_8_14 = cfg->dest_apicid >> 8;
 	else
 		WARN_ON_ONCE(cfg->dest_apicid > 0xFF);
+	printk("Composed MSI for APIC %d vector 0x%x: %x/%x %x\n", cfg->dest_apicid, cfg->vector, msg->address_hi, msg->address_lo, msg->data);
 }
 
 u32 x86_msi_msg_get_destid(struct msi_msg *msg, bool extid)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 59abbdad7729..f0a7715763a2 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -856,6 +856,8 @@ static void __init kvm_apic_init(void)
 
 static bool __init kvm_msi_ext_dest_id(void)
 {
+	printk("dest id? %d (%x)\n", kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID),
+	       kvm_arch_para_features());
 	return kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID);
 }
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 759952dd1222..defe6a843780 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -894,15 +894,21 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
 {
 	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
 		if ((irq->dest_id == APIC_BROADCAST &&
-				map->mode != KVM_APIC_MODE_X2APIC))
+		     map->mode != KVM_APIC_MODE_X2APIC)) {
+			printk("dest %d mode %d makes bcast\n", irq->dest_id, map->mode);
 			return true;
-		if (irq->dest_id == X2APIC_BROADCAST)
+		}
+		if (irq->dest_id == X2APIC_BROADCAST)  {
+			printk("Sent to X2APIC bcast\n");
 			return true;
+		}
 	} else {
 		bool x2apic_ipi = src && *src && apic_x2apic_mode(*src);
 		if (irq->dest_id == (x2apic_ipi ?
-		                     X2APIC_BROADCAST : APIC_BROADCAST))
+		                     X2APIC_BROADCAST : APIC_BROADCAST)) {
+			printk("no quirk dest %x\n", irq->dest_id);
 			return true;
+		}
 	}
 
 	return false;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d8f1d2169b45..5b0fd6d37a7e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5714,6 +5714,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
 			kvm->arch.x2apic_broadcast_quirk_disabled = true;
 
+		printk("X2APIC API: %x\n", cap->args[0]);
 		r = 0;
 		break;
 	case KVM_CAP_X86_DISABLE_EXITS:
Claudio Fontana Nov. 29, 2021, 4:57 p.m. UTC | #4
On 11/29/21 4:11 PM, David Woodhouse wrote:
> On Mon, 2021-11-29 at 15:14 +0100, Claudio Fontana wrote:
>> On 11/29/21 12:39 PM, Woodhouse, David wrote:
>>> On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
>>>>  static void kvm_cpu_instance_init(CPUState *cs)
>>>>  {
>>>>      X86CPU *cpu = X86_CPU(cs);
>>>> +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>>>>
>>>>      host_cpu_instance_init(cpu);
>>>>
>>>> -    if (!kvm_irqchip_in_kernel()) {
>>>> -        x86_cpu_change_kvm_default("x2apic", "off");
>>>> -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>> -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>> -    }
>>>> -
>>>> -    /* Special cases not set in the X86CPUDefinition structs: */
>>>> +    if (xcc->model) {
>>>> +        /* only applies to builtin_x86_defs cpus */
>>>> +        if (!kvm_irqchip_in_kernel()) {
>>>> +            x86_cpu_change_kvm_default("x2apic", "off");
>>>> +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>> +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>> +        }
>>>>
>>>> -    x86_cpu_apply_props(cpu, kvm_default_props);
>>>> +        /* Special cases not set in the X86CPUDefinition structs: */
>>>> +        x86_cpu_apply_props(cpu, kvm_default_props);
>>>> +    }
>>>>
>>>
>>> I think this causes a regression in x2apic and kvm-msi-ext-dest-id
>>> support. If you start qemu thus:
>>
>> If I recall correctly, this change just tries to restore the behavior prior to
>> commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,
>>
>> fixing the issue introduced with the refactoring at that time.
>>
>> Can you try bisecting prior to
>> f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual
>> breakage comes from somewhere else?
> 
> Hm, so it looks like it never worked for '-cpu host' *until* commit
> f5cc5a5c16.

Right, so here we are talking about properly supporting this for the first time.

The fact that it works with f5cc5a5c16 is more an accident than anything else, that commit was clearly broken
(exemplified by reports of failed boots).

So we need to find the proper solution, ie, exactly which features should be enabled for which cpu classes and models.

> 
> It didn't matter before c1bb5418e3 because you couldn't enable that
> many vCPUs without an IOMMU, and the *IOMMU* setup would call
> kvm_enable_x2apic().
> 
> But after that, nothing ever called kvm_enable_x2apic() in the '-cpu
> host' case until commit f5cc5a5c16, which fixed it... until you
> restored the previous behaviour :)
> 
> This "works" to fix this case, but presumably isn't correct:

Right, we cannot just enable all this code, or the original refactor would have been right.

These kvm default properties have been as far as I know intended for the cpu actual models (builtin_x86_defs),
and not for the special cpu classes max, host and base. This is what the revert addresses.

I suspect what we actually need here is to review exactly in which specific cases kvm_enable_x2apic() should be called in the end.

The code there is mixing changes to the kvm_default_props that are then applied using x86_cpu_apply_props (and that part should be only for xcc->model != NULL),
with the actual enablement of the kvm x2apic using kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags) via kvm_enable_x2apic().

One way is to ignore this detail and just move out those checks, since changes to kvm_default_props are harmless once we skip the x86_cpu_apply_props call,
as such: 

------

static void kvm_cpu_instance_init(CPUState *cs)
{
    X86CPU *cpu = X86_CPU(cs);
    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);

    host_cpu_instance_init(cpu);

    /* only applies to builtin_x86_defs cpus */
    if (!kvm_irqchip_in_kernel()) {
        x86_cpu_change_kvm_default("x2apic", "off");
    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
    }

    if (xcc->model) {
        /* Special cases not set in the X86CPUDefinition structs: */
        x86_cpu_apply_props(cpu, kvm_default_props);
    }

    if (cpu->max_features) {
        kvm_cpu_max_instance_init(cpu);
    }

    kvm_cpu_xsave_init();
}

------

this might however cause further confusion later on, and I wonder if this is actually correct, should we _always_ enable x2apic when kvm_irqchip_is_split() returns true?
Even for cpu class "base"? I am not too sure.

Another option that comes to mind is to add a call to enable x2apic for max features cpus only ("host", "max") and not for base.

Thoughts? Paolo, Edoardo, anything comes to mind from your side?

Ciao,

Claudio


> 
> --- a/target/i386/kvm/kvm-cpu.c
> +++ b/target/i386/kvm/kvm-cpu.c
> @@ -161,7 +161,7 @@ static void kvm_cpu_instance_init(CPUState *cs)
>  
>      host_cpu_instance_init(cpu);
>  
> -    if (xcc->model) {
> +    if (1 || xcc->model) {
>          /* only applies to builtin_x86_defs cpus */
>          if (!kvm_irqchip_in_kernel()) {
>              x86_cpu_change_kvm_default("x2apic", "off");
> 
> 
>>> Any image to specifically test out? Would an actual 9 sockets machine be required to reproduce this?
> 
> No, but the more CPUs you have in the host the less you have to wait
> for 288 vCPUs to spin up :)
> 
> My test is:
> 
> ./qemu-system-x86_64 -machine q35,accel=kvm,usb=off,kernel_irqchip=split -cpu host -m 2G -smp sockets=9,cores=16,threads=2 -drive file=/var/lib/libvirt/images/fedora.qcow2,if=virtio -serial mon:stdio -display none  -kernel ~/git/linux/arch/x86/boot/bzImage  -append "console=ttyS0,115200 root=/dev/vda1" 
> 
> 
> I then play with the affinity of the AHCI MSI. Pointing it at CPU 255
> should show the problem. 
> 
> [root@localhost ~]# cd /proc/irq/313
> [root@localhost 313]# echo 255 > smp_affinity_list 
> [root@localhost 313]#
> [   65.365821] Composed MSI for APIC 255 vector 0x22: 0/feeff000 22
> [root@localhost 313]# grep ahci /proc/interrupts 
> 
> 
> I also added some debugging into host and guest kernels to be a little
> more explicit:
> 
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index b70344bf6600..53191db5145d 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -1866,6 +1866,7 @@ static __init void try_to_enable_x2apic(int remap_mode)
>  		 * used for non-remapped IRQ domains.
>  		 */
>  		if (x86_init.hyper.msi_ext_dest_id()) {
> +			pr_info("x2apic: support extended destination ID\n");
>  			virt_ext_dest_id = 1;
>  			apic_limit = 32767;
>  		}
> @@ -2539,6 +2540,7 @@ void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
>  		msg->arch_addr_lo.virt_destid_8_14 = cfg->dest_apicid >> 8;
>  	else
>  		WARN_ON_ONCE(cfg->dest_apicid > 0xFF);
> +	printk("Composed MSI for APIC %d vector 0x%x: %x/%x %x\n", cfg->dest_apicid, cfg->vector, msg->address_hi, msg->address_lo, msg->data);
>  }
>  
>  u32 x86_msi_msg_get_destid(struct msi_msg *msg, bool extid)
> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
> index 59abbdad7729..f0a7715763a2 100644
> --- a/arch/x86/kernel/kvm.c
> +++ b/arch/x86/kernel/kvm.c
> @@ -856,6 +856,8 @@ static void __init kvm_apic_init(void)
>  
>  static bool __init kvm_msi_ext_dest_id(void)
>  {
> +	printk("dest id? %d (%x)\n", kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID),
> +	       kvm_arch_para_features());
>  	return kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID);
>  }
>  
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 759952dd1222..defe6a843780 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -894,15 +894,21 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
>  {
>  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
>  		if ((irq->dest_id == APIC_BROADCAST &&
> -				map->mode != KVM_APIC_MODE_X2APIC))
> +		     map->mode != KVM_APIC_MODE_X2APIC)) {
> +			printk("dest %d mode %d makes bcast\n", irq->dest_id, map->mode);
>  			return true;
> -		if (irq->dest_id == X2APIC_BROADCAST)
> +		}
> +		if (irq->dest_id == X2APIC_BROADCAST)  {
> +			printk("Sent to X2APIC bcast\n");
>  			return true;
> +		}
>  	} else {
>  		bool x2apic_ipi = src && *src && apic_x2apic_mode(*src);
>  		if (irq->dest_id == (x2apic_ipi ?
> -		                     X2APIC_BROADCAST : APIC_BROADCAST))
> +		                     X2APIC_BROADCAST : APIC_BROADCAST)) {
> +			printk("no quirk dest %x\n", irq->dest_id);
>  			return true;
> +		}
>  	}
>  
>  	return false;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index d8f1d2169b45..5b0fd6d37a7e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -5714,6 +5714,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  		if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
>  			kvm->arch.x2apic_broadcast_quirk_disabled = true;
>  
> +		printk("X2APIC API: %x\n", cap->args[0]);
>  		r = 0;
>  		break;
>  	case KVM_CAP_X86_DISABLE_EXITS:
>
Claudio Fontana Nov. 29, 2021, 5:03 p.m. UTC | #5
On 11/29/21 5:57 PM, Claudio Fontana wrote:
> On 11/29/21 4:11 PM, David Woodhouse wrote:
>> On Mon, 2021-11-29 at 15:14 +0100, Claudio Fontana wrote:
>>> On 11/29/21 12:39 PM, Woodhouse, David wrote:
>>>> On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
>>>>>  static void kvm_cpu_instance_init(CPUState *cs)
>>>>>  {
>>>>>      X86CPU *cpu = X86_CPU(cs);
>>>>> +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>>>>>
>>>>>      host_cpu_instance_init(cpu);
>>>>>
>>>>> -    if (!kvm_irqchip_in_kernel()) {
>>>>> -        x86_cpu_change_kvm_default("x2apic", "off");
>>>>> -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>>> -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>>> -    }
>>>>> -
>>>>> -    /* Special cases not set in the X86CPUDefinition structs: */
>>>>> +    if (xcc->model) {
>>>>> +        /* only applies to builtin_x86_defs cpus */
>>>>> +        if (!kvm_irqchip_in_kernel()) {
>>>>> +            x86_cpu_change_kvm_default("x2apic", "off");
>>>>> +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>>> +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>>> +        }
>>>>>
>>>>> -    x86_cpu_apply_props(cpu, kvm_default_props);
>>>>> +        /* Special cases not set in the X86CPUDefinition structs: */
>>>>> +        x86_cpu_apply_props(cpu, kvm_default_props);
>>>>> +    }
>>>>>
>>>>
>>>> I think this causes a regression in x2apic and kvm-msi-ext-dest-id
>>>> support. If you start qemu thus:
>>>
>>> If I recall correctly, this change just tries to restore the behavior prior to
>>> commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,
>>>
>>> fixing the issue introduced with the refactoring at that time.
>>>
>>> Can you try bisecting prior to
>>> f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual
>>> breakage comes from somewhere else?
>>
>> Hm, so it looks like it never worked for '-cpu host' *until* commit
>> f5cc5a5c16.
> 
> Right, so here we are talking about properly supporting this for the first time.
> 
> The fact that it works with f5cc5a5c16 is more an accident than anything else, that commit was clearly broken
> (exemplified by reports of failed boots).
> 
> So we need to find the proper solution, ie, exactly which features should be enabled for which cpu classes and models.
> 
>>
>> It didn't matter before c1bb5418e3 because you couldn't enable that
>> many vCPUs without an IOMMU, and the *IOMMU* setup would call
>> kvm_enable_x2apic().
>>
>> But after that, nothing ever called kvm_enable_x2apic() in the '-cpu
>> host' case until commit f5cc5a5c16, which fixed it... until you
>> restored the previous behaviour :)
>>
>> This "works" to fix this case, but presumably isn't correct:
> 
> Right, we cannot just enable all this code, or the original refactor would have been right.
> 
> These kvm default properties have been as far as I know intended for the cpu actual models (builtin_x86_defs),
> and not for the special cpu classes max, host and base. This is what the revert addresses.
> 
> I suspect what we actually need here is to review exactly in which specific cases kvm_enable_x2apic() should be called in the end.
> 
> The code there is mixing changes to the kvm_default_props that are then applied using x86_cpu_apply_props (and that part should be only for xcc->model != NULL),
> with the actual enablement of the kvm x2apic using kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags) via kvm_enable_x2apic().
> 
> One way is to ignore this detail and just move out those checks, since changes to kvm_default_props are harmless once we skip the x86_cpu_apply_props call,
> as such: 
> 
> ------
> 
> static void kvm_cpu_instance_init(CPUState *cs)
> {
>     X86CPU *cpu = X86_CPU(cs);
>     X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
> 
>     host_cpu_instance_init(cpu);
> 
>     /* only applies to builtin_x86_defs cpus */
>     if (!kvm_irqchip_in_kernel()) {
>         x86_cpu_change_kvm_default("x2apic", "off");
>     } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>         x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>     }
> 
>     if (xcc->model) {
>         /* Special cases not set in the X86CPUDefinition structs: */
>         x86_cpu_apply_props(cpu, kvm_default_props);
>     }
> 
>     if (cpu->max_features) {
>         kvm_cpu_max_instance_init(cpu);
>     }
> 
>     kvm_cpu_xsave_init();
> }
> 
> ------
> 
> this might however cause further confusion later on, and I wonder if this is actually correct, should we _always_ enable x2apic when kvm_irqchip_is_split() returns true?

... and only when kvm_irqchip_is_split() ?

> Even for cpu class "base"? I am not too sure.
> 
> Another option that comes to mind is to add a call to enable x2apic for max features cpus only ("host", "max") and not for base.
> 
> Thoughts? Paolo, Edoardo, anything comes to mind from your side?
> 
> Ciao,
> 
> Claudio
> 
> 
>>
>> --- a/target/i386/kvm/kvm-cpu.c
>> +++ b/target/i386/kvm/kvm-cpu.c
>> @@ -161,7 +161,7 @@ static void kvm_cpu_instance_init(CPUState *cs)
>>  
>>      host_cpu_instance_init(cpu);
>>  
>> -    if (xcc->model) {
>> +    if (1 || xcc->model) {
>>          /* only applies to builtin_x86_defs cpus */
>>          if (!kvm_irqchip_in_kernel()) {
>>              x86_cpu_change_kvm_default("x2apic", "off");
>>
>>
>>>> Any image to specifically test out? Would an actual 9 sockets machine be required to reproduce this?
>>
>> No, but the more CPUs you have in the host the less you have to wait
>> for 288 vCPUs to spin up :)
>>
>> My test is:
>>
>> ./qemu-system-x86_64 -machine q35,accel=kvm,usb=off,kernel_irqchip=split -cpu host -m 2G -smp sockets=9,cores=16,threads=2 -drive file=/var/lib/libvirt/images/fedora.qcow2,if=virtio -serial mon:stdio -display none  -kernel ~/git/linux/arch/x86/boot/bzImage  -append "console=ttyS0,115200 root=/dev/vda1" 
>>
>>
>> I then play with the affinity of the AHCI MSI. Pointing it at CPU 255
>> should show the problem. 
>>
>> [root@localhost ~]# cd /proc/irq/313
>> [root@localhost 313]# echo 255 > smp_affinity_list 
>> [root@localhost 313]#
>> [   65.365821] Composed MSI for APIC 255 vector 0x22: 0/feeff000 22
>> [root@localhost 313]# grep ahci /proc/interrupts 
>>
>>
>> I also added some debugging into host and guest kernels to be a little
>> more explicit:
>>
>> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
>> index b70344bf6600..53191db5145d 100644
>> --- a/arch/x86/kernel/apic/apic.c
>> +++ b/arch/x86/kernel/apic/apic.c
>> @@ -1866,6 +1866,7 @@ static __init void try_to_enable_x2apic(int remap_mode)
>>  		 * used for non-remapped IRQ domains.
>>  		 */
>>  		if (x86_init.hyper.msi_ext_dest_id()) {
>> +			pr_info("x2apic: support extended destination ID\n");
>>  			virt_ext_dest_id = 1;
>>  			apic_limit = 32767;
>>  		}
>> @@ -2539,6 +2540,7 @@ void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg,
>>  		msg->arch_addr_lo.virt_destid_8_14 = cfg->dest_apicid >> 8;
>>  	else
>>  		WARN_ON_ONCE(cfg->dest_apicid > 0xFF);
>> +	printk("Composed MSI for APIC %d vector 0x%x: %x/%x %x\n", cfg->dest_apicid, cfg->vector, msg->address_hi, msg->address_lo, msg->data);
>>  }
>>  
>>  u32 x86_msi_msg_get_destid(struct msi_msg *msg, bool extid)
>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> index 59abbdad7729..f0a7715763a2 100644
>> --- a/arch/x86/kernel/kvm.c
>> +++ b/arch/x86/kernel/kvm.c
>> @@ -856,6 +856,8 @@ static void __init kvm_apic_init(void)
>>  
>>  static bool __init kvm_msi_ext_dest_id(void)
>>  {
>> +	printk("dest id? %d (%x)\n", kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID),
>> +	       kvm_arch_para_features());
>>  	return kvm_para_has_feature(KVM_FEATURE_MSI_EXT_DEST_ID);
>>  }
>>  
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 759952dd1222..defe6a843780 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -894,15 +894,21 @@ static bool kvm_apic_is_broadcast_dest(struct kvm *kvm, struct kvm_lapic **src,
>>  {
>>  	if (kvm->arch.x2apic_broadcast_quirk_disabled) {
>>  		if ((irq->dest_id == APIC_BROADCAST &&
>> -				map->mode != KVM_APIC_MODE_X2APIC))
>> +		     map->mode != KVM_APIC_MODE_X2APIC)) {
>> +			printk("dest %d mode %d makes bcast\n", irq->dest_id, map->mode);
>>  			return true;
>> -		if (irq->dest_id == X2APIC_BROADCAST)
>> +		}
>> +		if (irq->dest_id == X2APIC_BROADCAST)  {
>> +			printk("Sent to X2APIC bcast\n");
>>  			return true;
>> +		}
>>  	} else {
>>  		bool x2apic_ipi = src && *src && apic_x2apic_mode(*src);
>>  		if (irq->dest_id == (x2apic_ipi ?
>> -		                     X2APIC_BROADCAST : APIC_BROADCAST))
>> +		                     X2APIC_BROADCAST : APIC_BROADCAST)) {
>> +			printk("no quirk dest %x\n", irq->dest_id);
>>  			return true;
>> +		}
>>  	}
>>  
>>  	return false;
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index d8f1d2169b45..5b0fd6d37a7e 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -5714,6 +5714,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>>  		if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK)
>>  			kvm->arch.x2apic_broadcast_quirk_disabled = true;
>>  
>> +		printk("X2APIC API: %x\n", cap->args[0]);
>>  		r = 0;
>>  		break;
>>  	case KVM_CAP_X86_DISABLE_EXITS:
>>
>
David Woodhouse Nov. 29, 2021, 5:17 p.m. UTC | #6
On Mon, 2021-11-29 at 17:57 +0100, Claudio Fontana wrote:
> On 11/29/21 4:11 PM, David Woodhouse wrote:
> > On Mon, 2021-11-29 at 15:14 +0100, Claudio Fontana wrote:
> > > On 11/29/21 12:39 PM, Woodhouse, David wrote:
> > > > On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
> > > > >  static void kvm_cpu_instance_init(CPUState *cs)
> > > > >  {
> > > > >      X86CPU *cpu = X86_CPU(cs);
> > > > > +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
> > > > > 
> > > > >      host_cpu_instance_init(cpu);
> > > > > 
> > > > > -    if (!kvm_irqchip_in_kernel()) {
> > > > > -        x86_cpu_change_kvm_default("x2apic", "off");
> > > > > -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> > > > > -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> > > > > -    }
> > > > > -
> > > > > -    /* Special cases not set in the X86CPUDefinition structs: */
> > > > > +    if (xcc->model) {
> > > > > +        /* only applies to builtin_x86_defs cpus */
> > > > > +        if (!kvm_irqchip_in_kernel()) {
> > > > > +            x86_cpu_change_kvm_default("x2apic", "off");
> > > > > +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> > > > > +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> > > > > +        }
> > > > > 
> > > > > -    x86_cpu_apply_props(cpu, kvm_default_props);
> > > > > +        /* Special cases not set in the X86CPUDefinition structs: */
> > > > > +        x86_cpu_apply_props(cpu, kvm_default_props);
> > > > > +    }
> > > > > 
> > > > 
> > > > I think this causes a regression in x2apic and kvm-msi-ext-dest-id
> > > > support. If you start qemu thus:
> > > 
> > > If I recall correctly, this change just tries to restore the behavior prior to
> > > commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,
> > > 
> > > fixing the issue introduced with the refactoring at that time.
> > > 
> > > Can you try bisecting prior to
> > > f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual
> > > breakage comes from somewhere else?
> > 
> > Hm, so it looks like it never worked for '-cpu host' *until* commit
> > f5cc5a5c16.
> 
> Right, so here we are talking about properly supporting this for the first time.
> 
> The fact that it works with f5cc5a5c16 is more an accident than anything else, that commit was clearly broken
> (exemplified by reports of failed boots).
> 
> So we need to find the proper solution, ie, exactly which features should be enabled for which cpu classes and models.
> 
> > 
> > It didn't matter before c1bb5418e3 because you couldn't enable that
> > many vCPUs without an IOMMU, and the *IOMMU* setup would call
> > kvm_enable_x2apic().
> > 
> > But after that, nothing ever called kvm_enable_x2apic() in the '-cpu
> > host' case until commit f5cc5a5c16, which fixed it... until you
> > restored the previous behaviour :)
> > 
> > This "works" to fix this case, but presumably isn't correct:
> 
> Right, we cannot just enable all this code, or the original refactor would have been right.
> 
> These kvm default properties have been as far as I know intended for the cpu actual models (builtin_x86_defs),
> and not for the special cpu classes max, host and base. This is what the revert addresses.
> 
> I suspect what we actually need here is to review exactly in which specific cases kvm_enable_x2apic() should be called in the end.
> 
> The code there is mixing changes to the kvm_default_props that are then applied using x86_cpu_apply_props (and that part should be only for xcc->model != NULL),
> with the actual enablement of the kvm x2apic using kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags) via kvm_enable_x2apic().
> 
> One way is to ignore this detail and just move out those checks, since changes to kvm_default_props are harmless once we skip the x86_cpu_apply_props call,
> as such: 
> 
> ------
> 
> static void kvm_cpu_instance_init(CPUState *cs)
> {
>     X86CPU *cpu = X86_CPU(cs);
>     X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
> 
>     host_cpu_instance_init(cpu);
> 
>     /* only applies to builtin_x86_defs cpus */
>     if (!kvm_irqchip_in_kernel()) {
>         x86_cpu_change_kvm_default("x2apic", "off");
>     } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>         x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>     }
> 
>     if (xcc->model) {
>         /* Special cases not set in the X86CPUDefinition structs: */
>         x86_cpu_apply_props(cpu, kvm_default_props);
>     }
> 

I don't believe that works in the case when kvm_enable_x2apic() fails
on an older kernel. Although it sets the defaults, it still doesn't
then *apply* them so it makes no difference.

How about this:

--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -739,9 +739,9 @@ void pc_machine_done(Notifier *notifier, void *data)
 
 
     if (x86ms->apic_id_limit > 255 && !xen_enabled() &&
-        !kvm_irqchip_in_kernel()) {
+        (!kvm_irqchip_in_kernel() || !kvm_enable_x2apic())) {
         error_report("current -smp configuration requires kernel "
-                     "irqchip support.");
+                     "irqchip and X2APIC API support.");
         exit(EXIT_FAILURE);
     }
 }
Claudio Fontana Nov. 29, 2021, 7:10 p.m. UTC | #7
On 11/29/21 6:17 PM, David Woodhouse wrote:
> On Mon, 2021-11-29 at 17:57 +0100, Claudio Fontana wrote:
>> On 11/29/21 4:11 PM, David Woodhouse wrote:
>>> On Mon, 2021-11-29 at 15:14 +0100, Claudio Fontana wrote:
>>>> On 11/29/21 12:39 PM, Woodhouse, David wrote:
>>>>> On Fri, 2021-07-23 at 13:29 +0200, Claudio Fontana wrote:
>>>>>>  static void kvm_cpu_instance_init(CPUState *cs)
>>>>>>  {
>>>>>>      X86CPU *cpu = X86_CPU(cs);
>>>>>> +    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>>>>>>
>>>>>>      host_cpu_instance_init(cpu);
>>>>>>
>>>>>> -    if (!kvm_irqchip_in_kernel()) {
>>>>>> -        x86_cpu_change_kvm_default("x2apic", "off");
>>>>>> -    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>>>> -        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>>>> -    }
>>>>>> -
>>>>>> -    /* Special cases not set in the X86CPUDefinition structs: */
>>>>>> +    if (xcc->model) {
>>>>>> +        /* only applies to builtin_x86_defs cpus */
>>>>>> +        if (!kvm_irqchip_in_kernel()) {
>>>>>> +            x86_cpu_change_kvm_default("x2apic", "off");
>>>>>> +        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>>>>> +            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>>>>> +        }
>>>>>>
>>>>>> -    x86_cpu_apply_props(cpu, kvm_default_props);
>>>>>> +        /* Special cases not set in the X86CPUDefinition structs: */
>>>>>> +        x86_cpu_apply_props(cpu, kvm_default_props);
>>>>>> +    }
>>>>>>
>>>>>
>>>>> I think this causes a regression in x2apic and kvm-msi-ext-dest-id
>>>>> support. If you start qemu thus:
>>>>
>>>> If I recall correctly, this change just tries to restore the behavior prior to
>>>> commit f5cc5a5c168674f84bf061cdb307c2d25fba5448 ,
>>>>
>>>> fixing the issue introduced with the refactoring at that time.
>>>>
>>>> Can you try bisecting prior to
>>>> f5cc5a5c168674f84bf061cdb307c2d25fba5448 , to see if the actual
>>>> breakage comes from somewhere else?
>>>
>>> Hm, so it looks like it never worked for '-cpu host' *until* commit
>>> f5cc5a5c16.
>>
>> Right, so here we are talking about properly supporting this for the first time.
>>
>> The fact that it works with f5cc5a5c16 is more an accident than anything else, that commit was clearly broken
>> (exemplified by reports of failed boots).
>>
>> So we need to find the proper solution, ie, exactly which features should be enabled for which cpu classes and models.
>>
>>>
>>> It didn't matter before c1bb5418e3 because you couldn't enable that
>>> many vCPUs without an IOMMU, and the *IOMMU* setup would call
>>> kvm_enable_x2apic().
>>>
>>> But after that, nothing ever called kvm_enable_x2apic() in the '-cpu
>>> host' case until commit f5cc5a5c16, which fixed it... until you
>>> restored the previous behaviour :)
>>>
>>> This "works" to fix this case, but presumably isn't correct:
>>
>> Right, we cannot just enable all this code, or the original refactor would have been right.
>>
>> These kvm default properties have been as far as I know intended for the cpu actual models (builtin_x86_defs),
>> and not for the special cpu classes max, host and base. This is what the revert addresses.
>>
>> I suspect what we actually need here is to review exactly in which specific cases kvm_enable_x2apic() should be called in the end.
>>
>> The code there is mixing changes to the kvm_default_props that are then applied using x86_cpu_apply_props (and that part should be only for xcc->model != NULL),
>> with the actual enablement of the kvm x2apic using kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags) via kvm_enable_x2apic().
>>
>> One way is to ignore this detail and just move out those checks, since changes to kvm_default_props are harmless once we skip the x86_cpu_apply_props call,
>> as such: 
>>
>> ------
>>
>> static void kvm_cpu_instance_init(CPUState *cs)
>> {
>>     X86CPU *cpu = X86_CPU(cs);
>>     X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
>>
>>     host_cpu_instance_init(cpu);
>>
>>     /* only applies to builtin_x86_defs cpus */
>>     if (!kvm_irqchip_in_kernel()) {
>>         x86_cpu_change_kvm_default("x2apic", "off");
>>     } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
>>         x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>>     }
>>
>>     if (xcc->model) {
>>         /* Special cases not set in the X86CPUDefinition structs: */
>>         x86_cpu_apply_props(cpu, kvm_default_props);
>>     }
>>
> 
> I don't believe that works in the case when kvm_enable_x2apic() fails
> on an older kernel. Although it sets the defaults, it still doesn't
> then *apply* them so it makes no difference.

Hmm I thought what you actually care for, for cpu "host", is just the kvm_enable_x2apic() call, not the kvm_default_props.

Do you also expect the kvm_default_prop "kvm-msi-ext-dest-id" to be switch to "on" and applied?

kvm_default_props were never applied to cpus without an x86 model definition (except for that brief time when I did it by mistake).


> 
> How about this:
> 
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -739,9 +739,9 @@ void pc_machine_done(Notifier *notifier, void *data)
>  
>  
>      if (x86ms->apic_id_limit > 255 && !xen_enabled() &&
> -        !kvm_irqchip_in_kernel()) {
> +        (!kvm_irqchip_in_kernel() || !kvm_enable_x2apic())) {
>          error_report("current -smp configuration requires kernel "
> -                     "irqchip support.");
> +                     "irqchip and X2APIC API support.");
>          exit(EXIT_FAILURE);
>      }
>  }
> 

Interesting. This would leave things like microvm out right? But maybe it's ok?

Ciao,

Claudio
David Woodhouse Nov. 29, 2021, 7:19 p.m. UTC | #8
On Mon, 2021-11-29 at 20:10 +0100, Claudio Fontana wrote:
> 
> Hmm I thought what you actually care for, for cpu "host", is just the kvm_enable_x2apic() call, not the kvm_default_props.
> 
> 
> 
> Do you also expect the kvm_default_prop "kvm-msi-ext-dest-id" to be switch to "on" and applied?

It's already on today. It just isn't *true* because QEMU never called
kvm_enable_x2apic().

So what I care about (in case ∃ APIC IDs >= 255) is two things:

 1. Qemu needs to call kvm_enable_x2apic().
 2. If that *fails* qemu needs to *stop* advertising X2APIC and ext-dest-id.


That last patch snippet in pc_machine_done() should suffice to achieve
that, I think. Because if kvm_enable_x2apic() fails and qemu has been
asked for that many CPUs, it aborts completely. Which seems right.
Claudio Fontana Nov. 29, 2021, 7:55 p.m. UTC | #9
On 11/29/21 8:19 PM, David Woodhouse wrote:
> On Mon, 2021-11-29 at 20:10 +0100, Claudio Fontana wrote:
>>
>> Hmm I thought what you actually care for, for cpu "host", is just the kvm_enable_x2apic() call, not the kvm_default_props.
>>
>>
>>
>> Do you also expect the kvm_default_prop "kvm-msi-ext-dest-id" to be switch to "on" and applied?
> 
> It's already on today. It just isn't *true* because QEMU never called
> kvm_enable_x2apic().


property should be on, but not by setting in kvm_default_prop / applied via kvm_default_prop, that mechanism is for the versioned cpu models,
which use X86CPUModel / X86CPUDefinition , and "host" isn't one of them.

Out of curiosity, does my previous snippet actually work? Not that I am sure it is the best solution,
just for my understanding. It would be surprising to me that the need to actually manually apply "kvm-msi-ext-dest-id" to "on" there.
 
> 
> So what I care about (in case ∃ APIC IDs >= 255) is two things:
> 
>  1. Qemu needs to call kvm_enable_x2apic().
>  2. If that *fails* qemu needs to *stop* advertising X2APIC and ext-dest-id.
> 
> 
> That last patch snippet in pc_machine_done() should suffice to achieve
> that, I think. Because if kvm_enable_x2apic() fails and qemu has been
> asked for that many CPUs, it aborts completely. Which seems right.
> 

seems right to abort if requesting > 255 APIC IDs cannot be satisfied, I agree.

So I think in the end, we want to:

1) make sure that when accel=kvm and smp > 255 for i386, using cpu "host", kvm_enable_x2apic() is called and successful.

2) in addressing requirement 1), we do not break something else (other machines, other cpu classes/models, TCG, ...).

3) as a plus we might want to cleanup and determine once and for all where kvm_enable_x2apic() should be called:
   we have calls in intel_iommu.c and in the kvm cpu class instance initialization here in kvm-cpu.c today:
   before adding a third call we should really ask ourselves where the proper initialization of this should happen.

Let me know about the previous snippet, and I'd really look for other comments from Eduardo or Paolo at this point, regarding the "what should be" question.


Ciao,

Claudio
David Woodhouse Nov. 29, 2021, 8:29 p.m. UTC | #10
On Mon, 2021-11-29 at 20:55 +0100, Claudio Fontana wrote:
> On 11/29/21 8:19 PM, David Woodhouse wrote:
> > On Mon, 2021-11-29 at 20:10 +0100, Claudio Fontana wrote:
> > > 
> > > Hmm I thought what you actually care for, for cpu "host", is just the kvm_enable_x2apic() call, not the kvm_default_props.
> > > 
> > > 
> > > 
> > > Do you also expect the kvm_default_prop "kvm-msi-ext-dest-id" to be switch to "on" and applied?
> > 
> > It's already on today. It just isn't *true* because QEMU never called
> > kvm_enable_x2apic().
> 
> 
> property should be on, but not by setting in kvm_default_prop / applied via kvm_default_prop, that mechanism is for the versioned cpu models,
> which use X86CPUModel / X86CPUDefinition , and "host" isn't one of them.
> 
> Out of curiosity, does my previous snippet actually work? Not that I am sure it is the best solution,
> just for my understanding. It would be surprising to me that the need to actually manually apply "kvm-msi-ext-dest-id" to "on" there.
> 

This one?

--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -161,14 +161,14 @@ static void kvm_cpu_instance_init(CPUState *cs)
 
     host_cpu_instance_init(cpu);
 
-    if (xcc->model) {
         /* only applies to builtin_x86_defs cpus */
         if (!kvm_irqchip_in_kernel()) {
             x86_cpu_change_kvm_default("x2apic", "off");
         } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
-            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
+               x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
         }
 
+    if (xcc->model) {
         /* Special cases not set in the X86CPUDefinition structs: */
         x86_cpu_apply_props(cpu, kvm_default_props);
     }

Note that in today's HEAD we already advertise X2APIC and ext-dest-id
to the '-cpu host' guest; it's just not *true* because we never call
kvm_enable_x2apic().

So yes, the above works on a modern kernel where kvm_enable_x2apic()
succeeds. But that's the easy case.

Where your snippet *won't* work is in the case of running on an old
kernel where kvm_enable_x2apic() fails.

In that case it needs to turn x2apic support *off*. But simply calling
(or not calling) x86_cpu_change_kvm_default() makes absolutely no
difference unless those defaults are *applied* by calling
x86_cpu_apply_props() or making the same change by some other means.


> > So what I care about (in case ∃ APIC IDs >= 255) is two things:
> > 
> >  1. Qemu needs to call kvm_enable_x2apic().
> >  2. If that *fails* qemu needs to *stop* advertising X2APIC and ext-dest-id.
> > 
> > 
> > That last patch snippet in pc_machine_done() should suffice to achieve
> > that, I think. Because if kvm_enable_x2apic() fails and qemu has been
> > asked for that many CPUs, it aborts completely. Which seems right.
> > 
> 
> seems right to abort if requesting > 255 APIC IDs cannot be satisfied, I agree.
> 
> So I think in the end, we want to:
> 
> 1) make sure that when accel=kvm and smp > 255 for i386, using cpu "host", kvm_enable_x2apic() is called and successful.
> 
> 2) in addressing requirement 1), we do not break something else (other machines, other cpu classes/models, TCG, ...).
> 
> 3) as a plus we might want to cleanup and determine once and for all where kvm_enable_x2apic() should be called:
>    we have calls in intel_iommu.c and in the kvm cpu class instance initialization here in kvm-cpu.c today:
>    before adding a third call we should really ask ourselves where the proper initialization of this should happen.
> 

I think the existing two calls to kvm_enable_x2apic() become mostly
redundant. Because in fact the vtd_decide_config() and
kvm_cpu_instance_init() callers would both by perfectly OK without
kvm_enable_x2apic() if there isn't a CPU with an APIC ID >= 255
anyway. 

And that means that with my patch, pc_machine_done() will have
*aborted* if their conditions aren't met.

But then again, if since kvm_enable_x2apic() is both the initial
initialisation *and* a cached sanity check that it has indeed been
enabled successfully, there perhaps isn't any *harm* in having them do
the check for themselves?
Claudio Fontana Nov. 30, 2021, 9 a.m. UTC | #11
On 11/29/21 9:29 PM, David Woodhouse wrote:
> On Mon, 2021-11-29 at 20:55 +0100, Claudio Fontana wrote:
>> On 11/29/21 8:19 PM, David Woodhouse wrote:
>>> On Mon, 2021-11-29 at 20:10 +0100, Claudio Fontana wrote:
>>>>
>>>> Hmm I thought what you actually care for, for cpu "host", is just the kvm_enable_x2apic() call, not the kvm_default_props.
>>>>
>>>>
>>>>
>>>> Do you also expect the kvm_default_prop "kvm-msi-ext-dest-id" to be switch to "on" and applied?
>>>
>>> It's already on today. It just isn't *true* because QEMU never called
>>> kvm_enable_x2apic().
>>
>>
>> property should be on, but not by setting in kvm_default_prop / applied via kvm_default_prop, that mechanism is for the versioned cpu models,
>> which use X86CPUModel / X86CPUDefinition , and "host" isn't one of them.
>>
>> Out of curiosity, does my previous snippet actually work? Not that I am sure it is the best solution,
>> just for my understanding. It would be surprising to me that the need to actually manually apply "kvm-msi-ext-dest-id" to "on" there.
>>
> 
> This one?
> 
> --- a/target/i386/kvm/kvm-cpu.c
> +++ b/target/i386/kvm/kvm-cpu.c
> @@ -161,14 +161,14 @@ static void kvm_cpu_instance_init(CPUState *cs)
>  
>      host_cpu_instance_init(cpu);
>  
> -    if (xcc->model) {
>          /* only applies to builtin_x86_defs cpus */
>          if (!kvm_irqchip_in_kernel()) {
>              x86_cpu_change_kvm_default("x2apic", "off");
>          } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
> -            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
> +               x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
>          }
>  
> +    if (xcc->model) {
>          /* Special cases not set in the X86CPUDefinition structs: */
>          x86_cpu_apply_props(cpu, kvm_default_props);
>      }
> 
> Note that in today's HEAD we already advertise X2APIC and ext-dest-id
> to the '-cpu host' guest; it's just not *true* because we never call
> kvm_enable_x2apic().

This is clear to me. The move of the code there is simply to:

1) make sure that, for non-host, versioned cpu models, we continue to set the kvm_default_props and apply them in the same way.

2) for the host cpu, make sure that kvm_enable_x2apic() is called. x86_cpu_change_kvm_default is completely irrelevant for host cpu, as you have noted.

You are right that we continue to not handle the error path correctly on kvm_enable_x2apic failure, for both smp > 255 and <= 255.  

> 
> So yes, the above works on a modern kernel where kvm_enable_x2apic()
> succeeds. But that's the easy case.
> 
> Where your snippet *won't* work is in the case of running on an old
> kernel where kvm_enable_x2apic() fails.


I tend to agree that what we want if kvm_enable_x2apic fails is to abort QEMU if we have been requesting smp > 255,
and if we did not request smp > 255 cpus, we want to not advertise the feature.

This is not accomplished, neither by my snippet above, not by the existing code at any point in git history, and not by yours in itself.

Your change seems to accomplish the call to kvm_enable_x2apic, and abort if requested smp > 255,
but it does not stop advertising X2APIC and ext-dest-id on kvm_enable_x2apic failure for the case < 255, so we'd need to add that.

Do I understand it right? Do we need to wrap all of this logic in a if (kvm_enabled()) ?

> 
> In that case it needs to turn x2apic support *off*. But simply calling
> (or not calling) x86_cpu_change_kvm_default() makes absolutely no
> difference unless those defaults are *applied* by calling
> x86_cpu_apply_props()

right, it makes absolutely no difference, and we cannot use kvm_default_props, as they are for something else entirely.

> or making the same change by some other means.

right, we need to change it by other means.

It is still unclear to me for which cpu classes and versioned models we should behave like this. Thoughts?

"max"? "base"? versioned models: depending on the model features?

> 
> 
>>> So what I care about (in case ∃ APIC IDs >= 255) is two things:
>>>
>>>  1. Qemu needs to call kvm_enable_x2apic().
>>>  2. If that *fails* qemu needs to *stop* advertising X2APIC and ext-dest-id.

Understand. We also need though:

3. Not call kvm_enable_x2apic() when it should not be called (non-KVM accelerator, which cpu classes and models)
4. Not stop advertising X2APIC and ext-dest-id when we shouldn't stop advertising it.

>>>
>>>
>>> That last patch snippet in pc_machine_done() should suffice to achieve
>>> that, I think. Because if kvm_enable_x2apic() fails and qemu has been
>>> asked for that many CPUs, it aborts completely. Which seems right.

see comments above, and should we limit that code to when kvm is enabled?

>>>
>>
>> seems right to abort if requesting > 255 APIC IDs cannot be satisfied, I agree.
>>
>> So I think in the end, we want to:
>>
>> 1) make sure that when accel=kvm and smp > 255 for i386, using cpu "host", kvm_enable_x2apic() is called and successful.
>>
>> 2) in addressing requirement 1), we do not break something else (other machines, other cpu classes/models, TCG, ...).
>>
>> 3) as a plus we might want to cleanup and determine once and for all where kvm_enable_x2apic() should be called:
>>    we have calls in intel_iommu.c and in the kvm cpu class instance initialization here in kvm-cpu.c today:
>>    before adding a third call we should really ask ourselves where the proper initialization of this should happen.
>>
> 
> I think the existing two calls to kvm_enable_x2apic() become mostly
> redundant. Because in fact the vtd_decide_config() and
> kvm_cpu_instance_init() callers would both by perfectly OK without
> kvm_enable_x2apic() if there isn't a CPU with an APIC ID >= 255
> anyway. 
> 
> And that means that with my patch, pc_machine_done() will have
> *aborted* if their conditions aren't met.

I think it is good to abort early if we figure out that the user request of APIC ID >= 255 cannot be satisfied. 

> 
> But then again, if since kvm_enable_x2apic() is both the initial
> initialisation *and* a cached sanity check that it has indeed been
> enabled successfully, there perhaps isn't any *harm* in having them do
> the check for themselves?
> 

Well the harm in my mind is, do we need to handle the error condition correctly at each single place? 
Do we risk slightly different behavior and advertised features depending on where the call happens?

Seems that we can reduce the complexity and long term risk by handling things in one single place, if we definitely find what that place should be.

Thanks,

Claudio
David Woodhouse Nov. 30, 2021, 12:13 p.m. UTC | #12
On Tue, 2021-11-30 at 10:00 +0100, Claudio Fontana wrote:
> I tend to agree that what we want if kvm_enable_x2apic fails is to abort QEMU if we have been requesting smp > 255,
> and if we did not request smp > 255 cpus, we want to not advertise the feature.
> 
> This is not accomplished, neither by my snippet above, not by the existing code at any point in git history, and not by yours in itself.
> 
> Your change seems to accomplish the call to kvm_enable_x2apic, and abort if requested smp > 255,
> but it does not stop advertising X2APIC and ext-dest-id on kvm_enable_x2apic failure for the case < 255, so we'd need to add that.

We don't need kvm_enable_x2apic() at all if all APIC IDs are <255.

The kvm_enable_x2apic() call sets two flags. The first (USE_32BIT_IDS)
makes KVM take the high bits of the target APIC ID from the high bits
of the MSI address. So is only relevant if we ever want to deliver
interrupts to CPUs above 255.

The second (DISABLE_BROADCAST_QUIRK) prevents APIC ID 255 from being
interpreted as a broadcast. This is relevant if you ever want to
deliver interrupts to APIC #255.

So we need kvm_enable_x2apic() *if* we have APICs >= 255, but we don't
care about it if we have fewer CPUs.

Note the condition is '>= 255'. Not '> 255'.

With APIC IDs < 256 it also doesn't matter whether we advertise the
ext-dest-id feature to the guest or not, since that only tells them
where to put the upper bits... and there aren't any upper bits.

> Do I understand it right? Do we need to wrap all of this logic in a if (kvm_enabled()) ?

For Xen because we aren't really emulating CPUs or APICs at all, it
doesn't matter and you can have as many CPUs as you like.

For TCG or (KVM && !kvm_irqchip_in_kernel()) we are limited to 254
because our userspace lapic doesn't emulate x2apic at all. But in the
TCG case, even if KVM isn't *built*, kvm_irqchip_in_kernel() will
return false. So all we need to check is kvm_irqchip_in_kernel().

I think the correct check is what I had in pc_machine_done() with the
off-by-one error fixed:

    if (x86ms->apic_id_limit >= 255 && !xen_enabled() &&
        (!kvm_irqchip_in_kernel() || !kvm_enable_x2apic())) {

But that doesn't help microvm unless we copy it there too. 

Let's take a look at the code we were already looking at in
kvm_cpu_instance_init():

        /* only applies to builtin_x86_defs cpus */
        if (!kvm_irqchip_in_kernel()) {
            x86_cpu_change_kvm_default("x2apic", "off");

That part is purely cosmetic because kvm_arch_get_supported_cpuid() is
*already* going to mask out the X2APIC bit if(!kvm_irqchip_in_kernel())
anyway.

        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
		x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
        }

That part makes sense but there's a check missing here because even
*without* ext-dest-id we can't support APIC ID 255 without using
kvm_enable_x2apic().

And by the time we're in kvm_cpu_instance_init() for the CPU with APIC
ID #255, it's too late to disable x2apic support for all the previous
CPUs.

So... we either do the check in pc_machine_done() and also a similar
check for microvm, or we actually abort in kvm_cpu_instance_init() if
this CPU's APIC ID is >=255 and kvm_enable_x2apic() has failed.

Either way, the call in intel_iommu can die.




> > 
> > In that case it needs to turn x2apic support *off*. But simply calling
> > (or not calling) x86_cpu_change_kvm_default() makes absolutely no
> > difference unless those defaults are *applied* by calling
> > x86_cpu_apply_props()
> 
> right, it makes absolutely no difference, and we cannot use kvm_default_props, as they are for something else entirely.
> 
> > or making the same change by some other means.
> 
> right, we need to change it by other means.
> 
> It is still unclear to me for which cpu classes and versioned models we should behave like this. Thoughts?
> 
> "max"? "base"? versioned models: depending on the model features?
> 
> > 
> > 
> > > > So what I care about (in case ∃ APIC IDs >= 255) is two things:
> > > > 
> > > >  1. Qemu needs to call kvm_enable_x2apic().
> > > >  2. If that *fails* qemu needs to *stop* advertising X2APIC and ext-dest-id.
> 
> Understand. We also need though:
> 
> 3. Not call kvm_enable_x2apic() when it should not be called (non-KVM accelerator, which cpu classes and models)
> 4. Not stop advertising X2APIC and ext-dest-id when we shouldn't stop advertising it.
> 
> > > > 
> > > > 
> > > > That last patch snippet in pc_machine_done() should suffice to achieve
> > > > that, I think. Because if kvm_enable_x2apic() fails and qemu has been
> > > > asked for that many CPUs, it aborts completely. Which seems right.
> 
> see comments above, and should we limit that code to when kvm is enabled?
> 
> > > > 
> > > 
> > > seems right to abort if requesting > 255 APIC IDs cannot be satisfied, I agree.
> > > 
> > > So I think in the end, we want to:
> > > 
> > > 1) make sure that when accel=kvm and smp > 255 for i386, using cpu "host", kvm_enable_x2apic() is called and successful.
> > > 
> > > 2) in addressing requirement 1), we do not break something else (other machines, other cpu classes/models, TCG, ...).
> > > 
> > > 3) as a plus we might want to cleanup and determine once and for all where kvm_enable_x2apic() should be called:
> > >    we have calls in intel_iommu.c and in the kvm cpu class instance initialization here in kvm-cpu.c today:
> > >    before adding a third call we should really ask ourselves where the proper initialization of this should happen.
> > > 
> > 
> > I think the existing two calls to kvm_enable_x2apic() become mostly
> > redundant. Because in fact the vtd_decide_config() and
> > kvm_cpu_instance_init() callers would both by perfectly OK without
> > kvm_enable_x2apic() if there isn't a CPU with an APIC ID >= 255
> > anyway. 
> > 
> > And that means that with my patch, pc_machine_done() will have
> > *aborted* if their conditions aren't met.
> 
> I think it is good to abort early if we figure out that the user request of APIC ID >= 255 cannot be satisfied. 
> 
> > 
> > But then again, if since kvm_enable_x2apic() is both the initial
> > initialisation *and* a cached sanity check that it has indeed been
> > enabled successfully, there perhaps isn't any *harm* in having them do
> > the check for themselves?
> > 
> 
> Well the harm in my mind is, do we need to handle the error condition correctly at each single place? 
> Do we risk slightly different behavior and advertised features depending on where the call happens?
> 
> Seems that we can reduce the complexity and long term risk by handling things in one single place, if we definitely find what that place should be.
> 
> Thanks,
> 
> Claudio
>
diff mbox series

Patch

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 48b55ebd0a..edb97ebbbe 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -4919,6 +4919,9 @@  static uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
     return r;
 }
 
+/*
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
+ */
 void x86_cpu_apply_props(X86CPU *cpu, PropValue *props)
 {
     PropValue *pv;
@@ -4931,7 +4934,11 @@  void x86_cpu_apply_props(X86CPU *cpu, PropValue *props)
     }
 }
 
-/* Apply properties for the CPU model version specified in model */
+/*
+ * Apply properties for the CPU model version specified in model.
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
+ */
+
 static void x86_cpu_apply_version_props(X86CPU *cpu, X86CPUModel *model)
 {
     const X86CPUVersionDefinition *vdef;
@@ -4960,7 +4967,9 @@  static void x86_cpu_apply_version_props(X86CPU *cpu, X86CPUModel *model)
     assert(vdef->version == version);
 }
 
-/* Load data from X86CPUDefinition into a X86CPU object
+/*
+ * Load data from X86CPUDefinition into a X86CPU object.
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
  */
 static void x86_cpu_load_model(X86CPU *cpu, X86CPUModel *model)
 {
@@ -5051,6 +5060,12 @@  static void x86_register_cpu_model_type(const char *name, X86CPUModel *model)
     type_register(&ti);
 }
 
+
+/*
+ * register builtin_x86_defs;
+ * "max", "base" and subclasses ("host") are not registered here.
+ * See x86_cpu_register_types for all model registrations.
+ */
 static void x86_register_cpudef_types(const X86CPUDefinition *def)
 {
     X86CPUModel *m;
diff --git a/target/i386/host-cpu.c b/target/i386/host-cpu.c
index 4ea9e354ea..10f8aba86e 100644
--- a/target/i386/host-cpu.c
+++ b/target/i386/host-cpu.c
@@ -150,13 +150,16 @@  void host_cpu_vendor_fms(char *vendor, int *family, int *model, int *stepping)
 
 void host_cpu_instance_init(X86CPU *cpu)
 {
-    uint32_t ebx = 0, ecx = 0, edx = 0;
-    char vendor[CPUID_VENDOR_SZ + 1];
+    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
 
-    host_cpuid(0, 0, NULL, &ebx, &ecx, &edx);
-    x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
+    if (xcc->model) {
+        uint32_t ebx = 0, ecx = 0, edx = 0;
+        char vendor[CPUID_VENDOR_SZ + 1];
 
-    object_property_set_str(OBJECT(cpu), "vendor", vendor, &error_abort);
+        host_cpuid(0, 0, NULL, &ebx, &ecx, &edx);
+        x86_cpu_vendor_words2str(vendor, ebx, edx, ecx);
+        object_property_set_str(OBJECT(cpu), "vendor", vendor, &error_abort);
+    }
 }
 
 void host_cpu_max_instance_init(X86CPU *cpu)
diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index bbe817764d..d95028018e 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -52,47 +52,6 @@  static bool kvm_cpu_realizefn(CPUState *cs, Error **errp)
     return host_cpu_realizefn(cs, errp);
 }
 
-/*
- * KVM-specific features that are automatically added/removed
- * from all CPU models when KVM is enabled.
- *
- * NOTE: features can be enabled by default only if they were
- *       already available in the oldest kernel version supported
- *       by the KVM accelerator (see "OS requirements" section at
- *       docs/system/target-i386.rst)
- */
-static PropValue kvm_default_props[] = {
-    { "kvmclock", "on" },
-    { "kvm-nopiodelay", "on" },
-    { "kvm-asyncpf", "on" },
-    { "kvm-steal-time", "on" },
-    { "kvm-pv-eoi", "on" },
-    { "kvmclock-stable-bit", "on" },
-    { "x2apic", "on" },
-    { "kvm-msi-ext-dest-id", "off" },
-    { "acpi", "off" },
-    { "monitor", "off" },
-    { "svm", "off" },
-    { NULL, NULL },
-};
-
-void x86_cpu_change_kvm_default(const char *prop, const char *value)
-{
-    PropValue *pv;
-    for (pv = kvm_default_props; pv->prop; pv++) {
-        if (!strcmp(pv->prop, prop)) {
-            pv->value = value;
-            break;
-        }
-    }
-
-    /*
-     * It is valid to call this function only for properties that
-     * are already present in the kvm_default_props table.
-     */
-    assert(pv->prop);
-}
-
 static bool lmce_supported(void)
 {
     uint64_t mce_cap = 0;
@@ -150,21 +109,69 @@  static void kvm_cpu_xsave_init(void)
     }
 }
 
+/*
+ * KVM-specific features that are automatically added/removed
+ * from cpudef models when KVM is enabled.
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
+ *
+ * NOTE: features can be enabled by default only if they were
+ *       already available in the oldest kernel version supported
+ *       by the KVM accelerator (see "OS requirements" section at
+ *       docs/system/target-i386.rst)
+ */
+static PropValue kvm_default_props[] = {
+    { "kvmclock", "on" },
+    { "kvm-nopiodelay", "on" },
+    { "kvm-asyncpf", "on" },
+    { "kvm-steal-time", "on" },
+    { "kvm-pv-eoi", "on" },
+    { "kvmclock-stable-bit", "on" },
+    { "x2apic", "on" },
+    { "kvm-msi-ext-dest-id", "off" },
+    { "acpi", "off" },
+    { "monitor", "off" },
+    { "svm", "off" },
+    { NULL, NULL },
+};
+
+/*
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
+ */
+void x86_cpu_change_kvm_default(const char *prop, const char *value)
+{
+    PropValue *pv;
+    for (pv = kvm_default_props; pv->prop; pv++) {
+        if (!strcmp(pv->prop, prop)) {
+            pv->value = value;
+            break;
+        }
+    }
+
+    /*
+     * It is valid to call this function only for properties that
+     * are already present in the kvm_default_props table.
+     */
+    assert(pv->prop);
+}
+
 static void kvm_cpu_instance_init(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
+    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
 
     host_cpu_instance_init(cpu);
 
-    if (!kvm_irqchip_in_kernel()) {
-        x86_cpu_change_kvm_default("x2apic", "off");
-    } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
-        x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
-    }
-
-    /* Special cases not set in the X86CPUDefinition structs: */
+    if (xcc->model) {
+        /* only applies to builtin_x86_defs cpus */
+        if (!kvm_irqchip_in_kernel()) {
+            x86_cpu_change_kvm_default("x2apic", "off");
+        } else if (kvm_irqchip_is_split() && kvm_enable_x2apic()) {
+            x86_cpu_change_kvm_default("kvm-msi-ext-dest-id", "on");
+        }
 
-    x86_cpu_apply_props(cpu, kvm_default_props);
+        /* Special cases not set in the X86CPUDefinition structs: */
+        x86_cpu_apply_props(cpu, kvm_default_props);
+    }
 
     if (cpu->max_features) {
         kvm_cpu_max_instance_init(cpu);
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index e96ec9bbcc..e86bc93384 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -99,7 +99,8 @@  static void tcg_cpu_xsave_init(void)
 }
 
 /*
- * TCG-specific defaults that override all CPU models when using TCG
+ * TCG-specific defaults that override cpudef models when using TCG.
+ * Only for builtin_x86_defs models initialized with x86_register_cpudef_types.
  */
 static PropValue tcg_default_props[] = {
     { "vme", "off" },
@@ -109,8 +110,12 @@  static PropValue tcg_default_props[] = {
 static void tcg_cpu_instance_init(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
-    /* Special cases not set in the X86CPUDefinition structs: */
-    x86_cpu_apply_props(cpu, tcg_default_props);
+    X86CPUClass *xcc = X86_CPU_GET_CLASS(cpu);
+
+    if (xcc->model) {
+        /* Special cases not set in the X86CPUDefinition structs: */
+        x86_cpu_apply_props(cpu, tcg_default_props);
+    }
 
     tcg_cpu_xsave_init();
 }