Message ID | 20180109070112.30806-1-vincent@bernat.im (mailing list archive) |
---|---|
State | New, archived |
[CCing Jiri, Kashyap]

On Tue, Jan 09, 2018 at 08:01:12AM +0100, Vincent Bernat wrote:
> PCID has been introduced in Westmere and, since Linux 3.6
> (ad756a1603c5), KVM exposes PCID flag if host has it. Update CPU model
> for Westmere, Sandy Bridge and Ivy Bridge accordingly.
>
> Ensure compat 2.11 keeps PCID disabled by default for those models and
> document the new requirement for host kernel.
>
> Signed-off-by: Vincent Bernat <vincent@bernat.im>
> ---
>  include/hw/i386/pc.h | 14 +++++++++++++-
>  qemu-doc.texi        | 11 +++++++++++
>  target/i386/cpu.c    |  7 ++++---
>  3 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index bb49165fe0a4..f4ccbfdc4ac2 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -327,6 +327,18 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
>          .driver = "Skylake-Server" "-" TYPE_X86_CPU,\
>          .property = "clflushopt",\
>          .value = "off",\
> +    },{\
> +        .driver = "Westmere-" TYPE_X86_CPU,\
> +        .property = "pcid",\
> +        .value = "off",\
> +    },{\

It looks like there are Westmere CPUs out there without PCID
(e.g. Core i5 650). The QEMU CPU model is not described as Core i5,
though: it's described as "E56xx/L56xx/X56xx".

Should we add PCID anyway and let people running Core i5 hosts deal
with it manually when updating the machine-type? Or should we add a
"Westmere-PCID" (maybe Westmere-EP?) CPU model?

Adding Westmere-PCID would require adding a Westmere-PCID-IBRS CPU
model too, so this is starting to look a bit ridiculous. Sane VM
management systems would know how to use "-cpu Westmere,+pcid" without
requiring new CPU model entries in QEMU. What's missing in existing
management stacks to allow that to happen?

> +        .driver = "SandyBridge-" TYPE_X86_CPU,\
> +        .property = "pcid",\
> +        .value = "off",\
> +    },{\
> +        .driver = "IvyBridge-" TYPE_X86_CPU,\
> +        .property = "pcid",\
> +        .value = "off",\

Now, proving a negative is more difficult: how can we be sure that
there are no SandyBridge and IvyBridge CPUs out there without PCID?

>      },
>
>  #define PC_COMPAT_2_10 \
> @@ -351,7 +363,7 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
>          .driver = "mch",\
>          .property = "extended-tseg-mbytes",\
>          .value = stringify(0),\
> -    },\
> +    },
>
>  #define PC_COMPAT_2_8 \
>      HW_COMPAT_2_8 \
> diff --git a/qemu-doc.texi b/qemu-doc.texi
> index 8d0c809ad5cf..9e1a03181427 100644
> --- a/qemu-doc.texi
> +++ b/qemu-doc.texi
> @@ -37,6 +37,7 @@
>  * QEMU System emulator for non PC targets::
>  * QEMU Guest Agent::
>  * QEMU User space emulator::
> +* System requirements::
>  * Implementation notes::
>  * Deprecated features::
>  * License::
> @@ -2565,6 +2566,16 @@ Act as if the host page size was 'pagesize' bytes
>  Run the emulation in single step mode.
>  @end table
>
> +@node System requirements
> +@chapter System requirements
> +
> +@section KVM kernel module
> +
> +On x86_64 hosts, the default set of CPU features enabled by the KVM
> +accelerator require the host to be running Linux v3.6 or newer. If the
> +minimum requirement is not met, the guest will not be runnable,
> +depending on the selected CPU model. Older emulated machines, like
> +``pc-q35-2.10'', may work with older kernels.
>
>  @include qemu-tech.texi
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 65f785c7e739..873c0151ef57 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1081,7 +1081,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
>          .features[FEAT_1_ECX] =
>              CPUID_EXT_AES | CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
>              CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
> -            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
> +            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 | CPUID_EXT_PCID,
>          .features[FEAT_8000_0001_EDX] =
>              CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
>          .features[FEAT_8000_0001_ECX] =
> @@ -1109,7 +1109,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
>              CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_POPCNT |
>              CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>              CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ |
> -            CPUID_EXT_SSE3,
> +            CPUID_EXT_SSE3 | CPUID_EXT_PCID,
>          .features[FEAT_8000_0001_EDX] =
>              CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
>              CPUID_EXT2_SYSCALL,
> @@ -1140,7 +1140,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
>              CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_POPCNT |
>              CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
>              CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ |
> -            CPUID_EXT_SSE3 | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +            CPUID_EXT_SSE3 | CPUID_EXT_F16C | CPUID_EXT_RDRAND |
> +            CPUID_EXT_PCID,
>          .features[FEAT_7_0_EBX] =
>              CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_SMEP |
>              CPUID_7_0_EBX_ERMS,
> --
> 2.15.1
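For reference, the "-cpu Westmere,+pcid" approach discussed above can be
tried by hand without any management stack; the "enforce" flag makes QEMU
refuse to start when KVM cannot actually provide a requested feature. A
minimal sketch (the disk image name is only a placeholder):

    # Fails on hosts whose kernel/KVM cannot expose PCID, starts otherwise.
    qemu-system-x86_64 -enable-kvm -m 1G \
        -cpu Westmere,+pcid,enforce \
        -drive file=guest.img,format=qcow2 \
        -display none -serial mon:stdio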
❦ 12 janvier 2018 16:47 -0200, Eduardo Habkost <ehabkost@redhat.com> :

> Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
> CPU model too, so this is starting to look a bit ridiculous.
> Sane VM management systems would know how to use
> "-cpu Westmere,+pcid" without requiring new CPU model entries in
> QEMU. What's missing in existing management stacks to allow that
> to happen?

That's what I actually do. So, I am fine with the solution of doing
nothing. However, it would be nice for unaware people to get the speedup
of pcid without knowing about it. Maybe we can just forget about
Westmere and still apply it to Sandy Bridge and Ivy Bridge.
On Sat, Jan 13, 2018 at 08:22:31AM +0100, Vincent Bernat wrote:
> ❦ 12 janvier 2018 16:47 -0200, Eduardo Habkost <ehabkost@redhat.com> :
>
> > Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
> > CPU model too, so this is starting to look a bit ridiculous.
> > Sane VM management systems would know how to use
> > "-cpu Westmere,+pcid" without requiring new CPU model entries in
> > QEMU. What's missing in existing management stacks to allow that
> > to happen?
>
> That's what I actually do. So, I am fine with the solution of doing
> nothing. However, it would be nice for unaware people to get the speedup
> of pcid without knowing about it. Maybe we can just forget about
> Westmere and still apply it to Sandy Bridge and Ivy Bridge.

If management stacks today don't let the user choose
"Westmere,+pcid", we probably have no other choice than adding a
Westmere-PCID CPU model. But our management stacks need to be
fixed so we won't need similar hacks in the future.
❦ 16 janvier 2018 10:41 -0200, Eduardo Habkost <ehabkost@redhat.com> :

>> > Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
>> > CPU model too, so this is starting to look a bit ridiculous.
>> > Sane VM management systems would know how to use
>> > "-cpu Westmere,+pcid" without requiring new CPU model entries in
>> > QEMU. What's missing in existing management stacks to allow that
>> > to happen?
>>
>> That's what I actually do. So, I am fine with the solution of doing
>> nothing. However, it would be nice for unaware people to get the speedup
>> of pcid without knowing about it. Maybe we can just forget about
>> Westmere and still apply it to Sandy Bridge and Ivy Bridge.
>
> If management stacks today don't let the user choose
> "Westmere,+pcid", we probably have no other choice than adding a
> Westmere-PCID CPU model. But our management stacks need to be
> fixed so we won't need similar hacks in the future.

With libvirt:

  <cpu mode='custom' match='exact'>
    <model>Westmere</model>
    <feature policy='require' name='pcid'/>
  </cpu>

We are using CloudStack on top of that and it's also an available
option. However, looking at OpenStack, it doesn't seem possible:
https://github.com/openstack/nova/blob/6b248518da794a4c82665c22abf7bee5aa527a47/nova/conf/libvirt.py#L506
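Whichever layer requests the flag, it is straightforward to confirm that
libvirt accepted it; a quick sketch, assuming a domain named "vm1"
defined with the XML above (the name is only an example):

    # The generated definition should carry the required feature.
    virsh dumpxml vm1 | grep -A3 '<cpu '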
On Tue, Jan 16, 2018 at 01:55:22PM +0100, Vincent Bernat wrote:
> ❦ 16 janvier 2018 10:41 -0200, Eduardo Habkost <ehabkost@redhat.com> :
>
> >> > Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
> >> > CPU model too, so this is starting to look a bit ridiculous.
> >> > Sane VM management systems would know how to use
> >> > "-cpu Westmere,+pcid" without requiring new CPU model entries in
> >> > QEMU. What's missing in existing management stacks to allow that
> >> > to happen?
> >>
> >> That's what I actually do. So, I am fine with the solution of doing
> >> nothing. However, it would be nice for unaware people to get the speedup
> >> of pcid without knowing about it. Maybe we can just forget about
> >> Westmere and still apply it to Sandy Bridge and Ivy Bridge.
> >
> > If management stacks today don't let the user choose
> > "Westmere,+pcid", we probably have no other choice than adding a
> > Westmere-PCID CPU model. But our management stacks need to be
> > fixed so we won't need similar hacks in the future.

True; I'm aware of the limitation here in Nova.

> With libvirt:
>
>   <cpu mode='custom' match='exact'>
>     <model>Westmere</model>
>     <feature policy='require' name='pcid'/>
>   </cpu>

Yep, libvirt upstream allows it.

> We are using CloudStack on top of that and it's also an available
> option. However, looking at OpenStack, it doesn't seem possible:
> https://github.com/openstack/nova/blob/6b248518da794a4c82665c22abf7bee5aa527a47/nova/conf/libvirt.py#L506

That's correct, upstream OpenStack Nova doesn't yet have facility to
specify granular CPU feature names. Nova just ought to wire up the
facility libvirt already provides.
[CCing Daniel]

On Tue, Jan 16, 2018 at 04:33:00PM +0100, Kashyap Chamarthy wrote:
> On Tue, Jan 16, 2018 at 01:55:22PM +0100, Vincent Bernat wrote:
> > ❦ 16 janvier 2018 10:41 -0200, Eduardo Habkost <ehabkost@redhat.com> :
> >
> > >> > Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
> > >> > CPU model too, so this is starting to look a bit ridiculous.
> > >> > Sane VM management systems would know how to use
> > >> > "-cpu Westmere,+pcid" without requiring new CPU model entries in
> > >> > QEMU. What's missing in existing management stacks to allow that
> > >> > to happen?
> > >>
> > >> That's what I actually do. So, I am fine with the solution of doing
> > >> nothing. However, it would be nice for unaware people to get the speedup
> > >> of pcid without knowing about it. Maybe we can just forget about
> > >> Westmere and still apply it to Sandy Bridge and Ivy Bridge.
> > >
> > > If management stacks today don't let the user choose
> > > "Westmere,+pcid", we probably have no other choice than adding a
> > > Westmere-PCID CPU model. But our management stacks need to be
> > > fixed so we won't need similar hacks in the future.
>
> True; I'm aware of the limitation here in Nova.
>
> > With libvirt:
> >
> >   <cpu mode='custom' match='exact'>
> >     <model>Westmere</model>
> >     <feature policy='require' name='pcid'/>
> >   </cpu>
>
> Yep, libvirt upstream allows it.
>
> > We are using CloudStack on top of that and it's also an available
> > option. However, looking at OpenStack, it doesn't seem possible:
> > https://github.com/openstack/nova/blob/6b248518da794a4c82665c22abf7bee5aa527a47/nova/conf/libvirt.py#L506
>
> That's correct, upstream OpenStack Nova doesn't yet have facility to
> specify granular CPU feature names. Nova just ought to wire up the
> facility libvirt already provides.

I still don't understand why OpenStack doesn't let users add or
modify elements on the domain XML. This isn't the first time I
see this preventing users from fixing problems or optimizing
their systems.

Is there a summary of the reasons behind this limitation
somewhere?
On Tue, Jan 16, 2018 at 03:08:15PM -0200, Eduardo Habkost wrote:
> [CCing Daniel]
>
> On Tue, Jan 16, 2018 at 04:33:00PM +0100, Kashyap Chamarthy wrote:
> > On Tue, Jan 16, 2018 at 01:55:22PM +0100, Vincent Bernat wrote:
> > > ❦ 16 janvier 2018 10:41 -0200, Eduardo Habkost <ehabkost@redhat.com> :
> > >
> > > >> > Adding Westmere-PCID would require adding a Westmere-PCID-IBRS
> > > >> > CPU model too, so this is starting to look a bit ridiculous.
> > > >> > Sane VM management systems would know how to use
> > > >> > "-cpu Westmere,+pcid" without requiring new CPU model entries in
> > > >> > QEMU. What's missing in existing management stacks to allow that
> > > >> > to happen?
> > > >>
> > > >> That's what I actually do. So, I am fine with the solution of doing
> > > >> nothing. However, it would be nice for unaware people to get the speedup
> > > >> of pcid without knowing about it. Maybe we can just forget about
> > > >> Westmere and still apply it to Sandy Bridge and Ivy Bridge.
> > > >
> > > > If management stacks today don't let the user choose
> > > > "Westmere,+pcid", we probably have no other choice than adding a
> > > > Westmere-PCID CPU model. But our management stacks need to be
> > > > fixed so we won't need similar hacks in the future.
> >
> > True; I'm aware of the limitation here in Nova.
> >
> > > With libvirt:
> > >
> > >   <cpu mode='custom' match='exact'>
> > >     <model>Westmere</model>
> > >     <feature policy='require' name='pcid'/>
> > >   </cpu>
> >
> > Yep, libvirt upstream allows it.
> >
> > > We are using CloudStack on top of that and it's also an available
> > > option. However, looking at OpenStack, it doesn't seem possible:
> > > https://github.com/openstack/nova/blob/6b248518da794a4c82665c22abf7bee5aa527a47/nova/conf/libvirt.py#L506
> >
> > That's correct, upstream OpenStack Nova doesn't yet have facility to
> > specify granular CPU feature names. Nova just ought to wire up the
> > facility libvirt already provides.
>
> I still don't understand why OpenStack doesn't let users add or
> modify elements on the domain XML. This isn't the first time I
> see this preventing users from fixing problems or optimizing
> their systems.
>
> Is there a summary of the reasons behind this limitation
> somewhere?

Exposing ability to control every aspect of Libvirt XML is a non-goal of
Nova. A great many of the features require different modelling and/or
explicit handling by Nova to work well in the context of OpenStack's
architecture. The domain XML is automatically generated on the fly by
Nova based on the info it gets from various inputs, so there's nothing
that can be edited directly to add custom elements. The only way that
would allow modification is for Nova to send the XML it generates to
an external plugin script and read back modified XML. Historically Nova
did have a lot of plugin points that allowed arbitrary admin hacks like
this, but they end up being a support burden in themselves, as they
end up being black boxes which change Nova behaviour in unpredictable
ways. Thus Nova has actually worked to remove as many of the plugins
as possible.

In this case there is a clear benefit to being able to add extra CPU
features, over the base named model. It is easy for Nova to wire this
up and it should do so as a priority.

Regards,
Daniel
On Tue, Jan 16, 2018 at 05:43:44PM +0000, Daniel P. Berrange wrote:
> On Tue, Jan 16, 2018 at 03:08:15PM -0200, Eduardo Habkost wrote:
> > [CCing Daniel]

[...]

> > I still don't understand why OpenStack doesn't let users add or
> > modify elements on the domain XML. This isn't the first time I
> > see this preventing users from fixing problems or optimizing
> > their systems.
> >
> > Is there a summary of the reasons behind this limitation
> > somewhere?
>
> Exposing ability to control every aspect of Libvirt XML is a non-goal of
> Nova. A great many of the features require different modelling and/or
> explicit handling by Nova to work well in the context of OpenStack's
> architecture. The domain XML is automatically generated on the fly by
> Nova based on the info it gets from various inputs, so there's nothing
> that can be edited directly to add custom elements. The only way that
> would allow modification is for Nova to send the XML it generates to
> an external plugin script and read back modified XML. Historically Nova
> did have a lot of plugin points that allowed arbitrary admin hacks like
> this, but they end up being a support burden in themselves, as they
> end up being black boxes which change Nova behaviour in unpredictable
> ways. Thus Nova has actually worked to remove as many of the plugins
> as possible.
>
> In this case there is a clear benefit to being able to add extra CPU
> features, over the base named model. It is easy for Nova to wire this
> up and it should do so as a priority.

Agreed, it has long been pending in Nova; I also recall you've
identified other use cases for it (e.g. ability to mention 1G huge
pages with the CPU flag "pdpe1gb").

So I began a quick WIP Nova change to allow it to specify CPU feature
flags. I haven't worked out all the details yet, and still addressing
the quick comment you (DanPB) made on IRC.

https://review.openstack.org/#/c/534384 -- [WIP] libvirt: Allow to
specify granular CPU feature flags

PS: Thursday and Friday I'll be a bit sporadic in my availability, but
this change is on top of my TODO list.
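The review linked above was still work in progress at the time of this
thread; purely as an illustration of the kind of knob being discussed,
the libvirt section of nova.conf would end up looking roughly like this
(cpu_mode and cpu_model are existing Nova options; the extra-flags line
is hypothetical here, standing in for whatever interface the WIP change
settles on):

    [libvirt]
    cpu_mode = custom
    cpu_model = Westmere
    # Illustrative option name; see the WIP review above for the actual interface.
    cpu_model_extra_flags = pcid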
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index bb49165fe0a4..f4ccbfdc4ac2 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -327,6 +327,18 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
         .driver = "Skylake-Server" "-" TYPE_X86_CPU,\
         .property = "clflushopt",\
         .value = "off",\
+    },{\
+        .driver = "Westmere-" TYPE_X86_CPU,\
+        .property = "pcid",\
+        .value = "off",\
+    },{\
+        .driver = "SandyBridge-" TYPE_X86_CPU,\
+        .property = "pcid",\
+        .value = "off",\
+    },{\
+        .driver = "IvyBridge-" TYPE_X86_CPU,\
+        .property = "pcid",\
+        .value = "off",\
     },

 #define PC_COMPAT_2_10 \
@@ -351,7 +363,7 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
         .driver = "mch",\
         .property = "extended-tseg-mbytes",\
         .value = stringify(0),\
-    },\
+    },

 #define PC_COMPAT_2_8 \
     HW_COMPAT_2_8 \
diff --git a/qemu-doc.texi b/qemu-doc.texi
index 8d0c809ad5cf..9e1a03181427 100644
--- a/qemu-doc.texi
+++ b/qemu-doc.texi
@@ -37,6 +37,7 @@
 * QEMU System emulator for non PC targets::
 * QEMU Guest Agent::
 * QEMU User space emulator::
+* System requirements::
 * Implementation notes::
 * Deprecated features::
 * License::
@@ -2565,6 +2566,16 @@ Act as if the host page size was 'pagesize' bytes
 Run the emulation in single step mode.
 @end table

+@node System requirements
+@chapter System requirements
+
+@section KVM kernel module
+
+On x86_64 hosts, the default set of CPU features enabled by the KVM
+accelerator require the host to be running Linux v3.6 or newer. If the
+minimum requirement is not met, the guest will not be runnable,
+depending on the selected CPU model. Older emulated machines, like
+``pc-q35-2.10'', may work with older kernels.

 @include qemu-tech.texi

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 65f785c7e739..873c0151ef57 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1081,7 +1081,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
         .features[FEAT_1_ECX] =
             CPUID_EXT_AES | CPUID_EXT_POPCNT | CPUID_EXT_SSE42 |
             CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
-            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3,
+            CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 | CPUID_EXT_PCID,
         .features[FEAT_8000_0001_EDX] =
             CPUID_EXT2_LM | CPUID_EXT2_SYSCALL | CPUID_EXT2_NX,
         .features[FEAT_8000_0001_ECX] =
@@ -1109,7 +1109,7 @@ static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_POPCNT |
             CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
             CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ |
-            CPUID_EXT_SSE3,
+            CPUID_EXT_SSE3 | CPUID_EXT_PCID,
         .features[FEAT_8000_0001_EDX] =
             CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
             CPUID_EXT2_SYSCALL,
@@ -1140,7 +1140,8 @@ static X86CPUDefinition builtin_x86_defs[] = {
             CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_POPCNT |
             CPUID_EXT_X2APIC | CPUID_EXT_SSE42 | CPUID_EXT_SSE41 |
             CPUID_EXT_CX16 | CPUID_EXT_SSSE3 | CPUID_EXT_PCLMULQDQ |
-            CPUID_EXT_SSE3 | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+            CPUID_EXT_SSE3 | CPUID_EXT_F16C | CPUID_EXT_RDRAND |
+            CPUID_EXT_PCID,
         .features[FEAT_7_0_EBX] =
             CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_SMEP |
             CPUID_7_0_EBX_ERMS,
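The PC_COMPAT_2_11 hunk above is what keeps existing machine types
migration-compatible: guests started with a 2.11 (or older) machine type
keep pcid off for these models, while newer machine types pick up the new
default. One way to observe the effective setting is through QOM; a rough
sketch (on pc machines the first CPU is usually /machine/unattached/device[0],
so the object path may need adjusting):

    # Start a throwaway instance with a QMP monitor on stdio.
    qemu-system-x86_64 -enable-kvm -machine pc-i440fx-2.11 -cpu Westmere \
        -display none -qmp stdio
    # Then paste, one line at a time:
    #   { "execute": "qmp_capabilities" }
    #   { "execute": "qom-get",
    #     "arguments": { "path": "/machine/unattached/device[0]",
    #                    "property": "pcid" } }
    # Expected with the 2.11 compat entry: { "return": false }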
PCID has been introduced in Westmere and, since Linux 3.6
(ad756a1603c5), KVM exposes PCID flag if host has it. Update CPU model
for Westmere, Sandy Bridge and Ivy Bridge accordingly.

Ensure compat 2.11 keeps PCID disabled by default for those models and
document the new requirement for host kernel.

Signed-off-by: Vincent Bernat <vincent@bernat.im>
---
 include/hw/i386/pc.h | 14 +++++++++++++-
 qemu-doc.texi        | 11 +++++++++++
 target/i386/cpu.c    |  7 ++++---
 3 files changed, 28 insertions(+), 4 deletions(-)
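The host-side requirement called out in the commit message is easy to
check before relying on the new defaults; a small sketch:

    # KVM only passes PCID through on Linux 3.6 or newer (ad756a1603c5)...
    uname -r
    # ...and only if the host CPU has the feature in the first place.
    grep -qw pcid /proc/cpuinfo && echo "host CPU advertises pcid"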