Message ID | 1467265812-5872-1-git-send-email-luwei.kang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
>>> On 30.06.16 at 07:50, <luwei.kang@intel.com> wrote: > --- a/xen/tools/gen-cpuid.py > +++ b/xen/tools/gen-cpuid.py > @@ -243,6 +243,17 @@ def crunch_numbers(state): > # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the > # standard 3DNow in the earlier K6 processors. > _3DNOW: [_3DNOWEXT], > + > + # AVX2 is an extension to AVX, providing mainly new integer instructions. > + # In principle, AVX512 only takes YMM register state as a prerequisite > + # and many AVX2 instructions are extended by AVX512F to 512-bit forms. > + AVX2: [AVX512F], I think this was meant to be "but" or "yet" instead of "and". And then I'm not particularly happy about the asymmetry with AVX (which is not dependent on SSE or any of its successors, despite extending many of them to 256-bit forms). Plus e.g. FMA gets extended too, but isn't being made a dependency. And quite opposite to that, at least some AVX2 instructions get extended only by e.g. AVX512BW. Nor are any of the extended forms listed to require other than the AVX512* bits checked in the doc. Jan
What about remove the dependency between AVX2 and AVX512F ( AVX2: [AVX512F], ) ? -----Original Message----- From: Jan Beulich [mailto:JBeulich@suse.com] Sent: Friday, July 1, 2016 3:56 PM To: andrew.cooper3@citrix.com; Kang, Luwei <luwei.kang@intel.com> Cc: chao.p.peng@linux.intel.com; xen-devel@lists.xen.org Subject: Re: [PATCH v4] x86/cpuid: AVX-512 Feature Detection >>> On 30.06.16 at 07:50, <luwei.kang@intel.com> wrote: > --- a/xen/tools/gen-cpuid.py > +++ b/xen/tools/gen-cpuid.py > @@ -243,6 +243,17 @@ def crunch_numbers(state): > # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the > # standard 3DNow in the earlier K6 processors. > _3DNOW: [_3DNOWEXT], > + > + # AVX2 is an extension to AVX, providing mainly new integer instructions. > + # In principle, AVX512 only takes YMM register state as a prerequisite > + # and many AVX2 instructions are extended by AVX512F to 512-bit forms. > + AVX2: [AVX512F], I think this was meant to be "but" or "yet" instead of "and". And then I'm not particularly happy about the asymmetry with AVX (which is not dependent on SSE or any of its successors, despite extending many of them to 256-bit forms). Plus e.g. FMA gets extended too, but isn't being made a dependency. And quite opposite to that, at least some AVX2 instructions get extended only by e.g. AVX512BW. Nor are any of the extended forms listed to require other than the AVX512* bits checked in the doc. Jan
>>> On 05.07.16 at 04:31, <luwei.kang@intel.com> wrote: First of all - please don't top post. > What about remove the dependency between AVX2 and AVX512F ( AVX2: [AVX512F], ) ? Yes, that's what I think we want, but we need Andrew's agreement here. > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Friday, July 1, 2016 3:56 PM >>>> On 30.06.16 at 07:50, <luwei.kang@intel.com> wrote: >> --- a/xen/tools/gen-cpuid.py >> +++ b/xen/tools/gen-cpuid.py >> @@ -243,6 +243,17 @@ def crunch_numbers(state): >> # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the >> # standard 3DNow in the earlier K6 processors. >> _3DNOW: [_3DNOWEXT], >> + >> + # AVX2 is an extension to AVX, providing mainly new integer instructions. >> + # In principle, AVX512 only takes YMM register state as a prerequisite >> + # and many AVX2 instructions are extended by AVX512F to 512-bit forms. >> + AVX2: [AVX512F], > > I think this was meant to be "but" or "yet" instead of "and". And then I'm > not particularly happy about the asymmetry with AVX (which is not dependent > on SSE or any of its successors, despite extending many of them to 256-bit > forms). Plus e.g. FMA gets extended too, but isn't being made a dependency. > And quite opposite to that, at least some AVX2 instructions get extended only > by e.g. AVX512BW. > Nor are any of the extended forms listed to require other than the > AVX512* bits checked in the doc. > > Jan
> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@suse.com] > Sent: Tuesday, July 5, 2016 3:03 PM > To: Kang, Luwei <luwei.kang@intel.com> > Cc: andrew.cooper3@citrix.com; chao.p.peng@linux.intel.com; xen- > devel@lists.xen.org > Subject: RE: [PATCH v4] x86/cpuid: AVX-512 Feature Detection > > >>> On 05.07.16 at 04:31, <luwei.kang@intel.com> wrote: > > First of all - please don't top post. > > > What about remove the dependency between AVX2 and AVX512F ( AVX2: > [AVX512F], ) ? > > Yes, that's what I think we want, but we need Andrew's agreement here. > Hi Andrew, what is your opinion ? > > From: Jan Beulich [mailto:JBeulich@suse.com] > > Sent: Friday, July 1, 2016 3:56 PM > >>>> On 30.06.16 at 07:50, <luwei.kang@intel.com> wrote: > >> --- a/xen/tools/gen-cpuid.py > >> +++ b/xen/tools/gen-cpuid.py > >> @@ -243,6 +243,17 @@ def crunch_numbers(state): > >> # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond > the > >> # standard 3DNow in the earlier K6 processors. > >> _3DNOW: [_3DNOWEXT], > >> + > >> + # AVX2 is an extension to AVX, providing mainly new integer > instructions. > >> + # In principle, AVX512 only takes YMM register state as a > prerequisite > >> + # and many AVX2 instructions are extended by AVX512F to 512-bit > forms. > >> + AVX2: [AVX512F], > > > > I think this was meant to be "but" or "yet" instead of "and". And then > > I'm not particularly happy about the asymmetry with AVX (which is not > > dependent on SSE or any of its successors, despite extending many of > > them to 256-bit forms). Plus e.g. FMA gets extended too, but isn't being > made a dependency. > > And quite opposite to that, at least some AVX2 instructions get > > extended only by e.g. AVX512BW. > > Nor are any of the extended forms listed to require other than the > > AVX512* bits checked in the doc. > > > > Jan > >
On 07/07/16 03:42, Kang, Luwei wrote: > >> -----Original Message----- >> From: Jan Beulich [mailto:JBeulich@suse.com] >> Sent: Tuesday, July 5, 2016 3:03 PM >> To: Kang, Luwei <luwei.kang@intel.com> >> Cc: andrew.cooper3@citrix.com; chao.p.peng@linux.intel.com; xen- >> devel@lists.xen.org >> Subject: RE: [PATCH v4] x86/cpuid: AVX-512 Feature Detection >> >>>>> On 05.07.16 at 04:31, <luwei.kang@intel.com> wrote: >> First of all - please don't top post. >> >>> What about remove the dependency between AVX2 and AVX512F ( AVX2: >> [AVX512F], ) ? >> >> Yes, that's what I think we want, but we need Andrew's agreement here. >> > Hi Andrew, what is your opinion ? You are in a better position to answer than me. For a specific instruction which may be VEX and EVEX encoded, is the circuitry for a specific instruction shared, or discrete? Is there a plausible future where you might support only the EVEX variant of an instruction, and not the VEX variant? These dependences are about what may be reasonably assumed about the way the environment is structured. It doesn't look reasonable to advertise an AVX512 environment to guests while at the same time stating that AVX2 is not present. If this is correct, then the dependency should stay. If Intel plausibly things it might release hardware with !AVX2 but AVX512, then the dependency should be dropped. ~Andrew
> >> First of all - please don't top post. > >> > >>> What about remove the dependency between AVX2 and AVX512F > ( AVX2: > >> [AVX512F], ) ? > >> > >> Yes, that's what I think we want, but we need Andrew's agreement here. > >> > > Hi Andrew, what is your opinion ? > > You are in a better position to answer than me. > > For a specific instruction which may be VEX and EVEX encoded, is the circuitry > for a specific instruction shared, or discrete? Is there a plausible future > where you might support only the EVEX variant of an instruction, and not the > VEX variant? > > These dependences are about what may be reasonably assumed about the > way the environment is structured. It doesn't look reasonable to advertise > an AVX512 environment to guests while at the same time stating that AVX2 is > not present. If this is correct, then the dependency should stay. > If Intel plausibly things it might release hardware with !AVX2 but AVX512, > then the dependency should be dropped. Regarding the dependency between AVX2 and AVX512F, I have ask some hardware architecture engineer. AVX512 is a superset of AVX2, the most important item being the state. If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't function. So if we want to use AVX512, AVX2 must be supported and enabled. The dependence between AVX512F and AVX2 may need be reserved. > > ~Andrew
On 25/07/16 07:09, Kang, Luwei wrote: >>>> First of all - please don't top post. >>>> >>>>> What about remove the dependency between AVX2 and AVX512F >> ( AVX2: >>>> [AVX512F], ) ? >>>> >>>> Yes, that's what I think we want, but we need Andrew's agreement here. >>>> >>> Hi Andrew, what is your opinion ? >> You are in a better position to answer than me. >> >> For a specific instruction which may be VEX and EVEX encoded, is the circuitry >> for a specific instruction shared, or discrete? Is there a plausible future >> where you might support only the EVEX variant of an instruction, and not the >> VEX variant? >> >> These dependences are about what may be reasonably assumed about the >> way the environment is structured. It doesn't look reasonable to advertise >> an AVX512 environment to guests while at the same time stating that AVX2 is >> not present. If this is correct, then the dependency should stay. >> If Intel plausibly things it might release hardware with !AVX2 but AVX512, >> then the dependency should be dropped. > Regarding the dependency between AVX2 and AVX512F, I have ask some hardware architecture engineer. > > AVX512 is a superset of AVX2, the most important item being the state. If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't function. > > So if we want to use AVX512, AVX2 must be supported and enabled. The dependence between AVX512F and AVX2 may need be reserved. Ok, so AVX512 strictly depends on AVX2. In which case, the python code was correct. The meaning of the key/value relationship is "if the feature in the key is not present, all features in the value must also be disabled". ~Andrew
> On 25/07/16 07:09, Kang, Luwei wrote: > >>>> First of all - please don't top post. > >>>> > >>>>> What about remove the dependency between AVX2 and AVX512F > >> ( AVX2: > >>>> [AVX512F], ) ? > >>>> > >>>> Yes, that's what I think we want, but we need Andrew's agreement here. > >>>> > >>> Hi Andrew, what is your opinion ? > >> You are in a better position to answer than me. > >> > >> For a specific instruction which may be VEX and EVEX encoded, is the > >> circuitry for a specific instruction shared, or discrete? Is there a > >> plausible future where you might support only the EVEX variant of an > >> instruction, and not the VEX variant? > >> > >> These dependences are about what may be reasonably assumed about the > >> way the environment is structured. It doesn't look reasonable to > >> advertise an AVX512 environment to guests while at the same time > >> stating that AVX2 is not present. If this is correct, then the dependency should stay. > >> If Intel plausibly things it might release hardware with !AVX2 but > >> AVX512, then the dependency should be dropped. > > Regarding the dependency between AVX2 and AVX512F, I have ask some hardware architecture engineer. > > > > AVX512 is a superset of AVX2, the most important item being the state. If the state for AVX2 isn't enabled (in XCR0), then AVX512 > also can't function. > > > > So if we want to use AVX512, AVX2 must be supported and enabled. The dependence between AVX512F and AVX2 may need be > reserved. > > Ok, so AVX512 strictly depends on AVX2. > > In which case, the python code was correct. The meaning of the key/value relationship is "if the feature in the key is not present, all > features in the value must also be disabled". Hi jan, what is your opinion? > > ~Andrew
>>> On 03.08.16 at 03:32, <luwei.kang@intel.com> wrote: >> On 25/07/16 07:09, Kang, Luwei wrote: >> >>>> First of all - please don't top post. >> >>>> >> >>>>> What about remove the dependency between AVX2 and AVX512F >> >> ( AVX2: >> >>>> [AVX512F], ) ? >> >>>> >> >>>> Yes, that's what I think we want, but we need Andrew's agreement here. >> >>>> >> >>> Hi Andrew, what is your opinion ? >> >> You are in a better position to answer than me. >> >> >> >> For a specific instruction which may be VEX and EVEX encoded, is the >> >> circuitry for a specific instruction shared, or discrete? Is there a >> >> plausible future where you might support only the EVEX variant of an >> >> instruction, and not the VEX variant? >> >> >> >> These dependences are about what may be reasonably assumed about the >> >> way the environment is structured. It doesn't look reasonable to >> >> advertise an AVX512 environment to guests while at the same time >> >> stating that AVX2 is not present. If this is correct, then the dependency > should stay. >> >> If Intel plausibly things it might release hardware with !AVX2 but >> >> AVX512, then the dependency should be dropped. >> > Regarding the dependency between AVX2 and AVX512F, I have ask some hardware > architecture engineer. >> > >> > AVX512 is a superset of AVX2, the most important item being the state. If > the state for AVX2 isn't enabled (in XCR0), then AVX512 >> also can't function. >> > >> > So if we want to use AVX512, AVX2 must be supported and enabled. The > dependence between AVX512F and AVX2 may need be >> reserved. >> >> Ok, so AVX512 strictly depends on AVX2. >> >> In which case, the python code was correct. The meaning of the key/value > relationship is "if the feature in the key is not present, all >> features in the value must also be disabled". > > Hi jan, what is your opinion? There's no opinion to be had here, according to your earlier reply. I do, however, continue to question that model: State and instruction set are independent items. Of course YMM state is a prereq to ZMM state, but I do not buy AVX2 insn support being a prereq to AVX-512 insns. Yet to me the AVX2 and AVX-512F feature flags solely represent the instructions, while the XSTATE leaf bits represent the states. Jan
> >>> On 03.08.16 at 03:32, <luwei.kang@intel.com> wrote: > >> On 25/07/16 07:09, Kang, Luwei wrote: > >> >>>> First of all - please don't top post. > >> >>>> > >> >>>>> What about remove the dependency between AVX2 and AVX512F > >> >> ( AVX2: > >> >>>> [AVX512F], ) ? > >> >>>> > >> >>>> Yes, that's what I think we want, but we need Andrew's agreement here. > >> >>>> > >> >>> Hi Andrew, what is your opinion ? > >> >> You are in a better position to answer than me. > >> >> > >> >> For a specific instruction which may be VEX and EVEX encoded, is > >> >> the circuitry for a specific instruction shared, or discrete? Is > >> >> there a plausible future where you might support only the EVEX > >> >> variant of an instruction, and not the VEX variant? > >> >> > >> >> These dependences are about what may be reasonably assumed about > >> >> the way the environment is structured. It doesn't look reasonable > >> >> to advertise an AVX512 environment to guests while at the same > >> >> time stating that AVX2 is not present. If this is correct, then > >> >> the dependency > > should stay. > >> >> If Intel plausibly things it might release hardware with !AVX2 but > >> >> AVX512, then the dependency should be dropped. > >> > Regarding the dependency between AVX2 and AVX512F, I have ask some > >> > hardware > > architecture engineer. > >> > > >> > AVX512 is a superset of AVX2, the most important item being the > >> > state. If > > the state for AVX2 isn't enabled (in XCR0), then AVX512 > >> also can't function. > >> > > >> > So if we want to use AVX512, AVX2 must be supported and enabled. > >> > The > > dependence between AVX512F and AVX2 may need be > >> reserved. > >> > >> Ok, so AVX512 strictly depends on AVX2. > >> > >> In which case, the python code was correct. The meaning of the > >> key/value > > relationship is "if the feature in the key is not present, all > >> features in the value must also be disabled". > > > > Hi jan, what is your opinion? > > There's no opinion to be had here, according to your earlier reply. I do, however, continue to question that model: State and > instruction set are independent items. Of course YMM state is a prereq to ZMM state, but I do not buy AVX2 insn support being a > prereq to AVX-512 insns. Yet to me the AVX2 and AVX-512F feature flags solely represent the instructions, while the XSTATE leaf bits > represent the states. > Hi jan, I get some information from hardware colleague. There is no dependence about AVX512 instructions and AVX2 instructions. It is also not states in the official document. But I want to know the meaning of the dependence "AVX2: [AVX512F]" in "gen-cpuid.py" file. If it is the feature dependence, I think the dependence between AVX512 and AVX2 may need to add. As we talk before, Zmm is part of AVX512 feature. If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't function. Apart from that, there is no processors with AVX512 without AVX2 currently and it is safe to treat AVX2 as part of the thermometer leading to AVX512. Regarding if will have a cpu support avx512 without avx2 in future, it is really hard to say. If it is the instructions dependence, I have no idea to delete this dependence in "gen-cpuid.py" file. So, I want to know your advice. Thanks, Luwei Kang
>>> On 10.08.16 at 11:58, <luwei.kang@intel.com> wrote: >> >>> On 03.08.16 at 03:32, <luwei.kang@intel.com> wrote: >> >> On 25/07/16 07:09, Kang, Luwei wrote: >> >> >>>> First of all - please don't top post. >> >> >>>> >> >> >>>>> What about remove the dependency between AVX2 and AVX512F >> >> >> ( AVX2: >> >> >>>> [AVX512F], ) ? >> >> >>>> >> >> >>>> Yes, that's what I think we want, but we need Andrew's agreement here. >> >> >>>> >> >> >>> Hi Andrew, what is your opinion ? >> >> >> You are in a better position to answer than me. >> >> >> >> >> >> For a specific instruction which may be VEX and EVEX encoded, is >> >> >> the circuitry for a specific instruction shared, or discrete? Is >> >> >> there a plausible future where you might support only the EVEX >> >> >> variant of an instruction, and not the VEX variant? >> >> >> >> >> >> These dependences are about what may be reasonably assumed about >> >> >> the way the environment is structured. It doesn't look reasonable >> >> >> to advertise an AVX512 environment to guests while at the same >> >> >> time stating that AVX2 is not present. If this is correct, then >> >> >> the dependency >> > should stay. >> >> >> If Intel plausibly things it might release hardware with !AVX2 but >> >> >> AVX512, then the dependency should be dropped. >> >> > Regarding the dependency between AVX2 and AVX512F, I have ask some >> >> > hardware >> > architecture engineer. >> >> > >> >> > AVX512 is a superset of AVX2, the most important item being the >> >> > state. If >> > the state for AVX2 isn't enabled (in XCR0), then AVX512 >> >> also can't function. >> >> > >> >> > So if we want to use AVX512, AVX2 must be supported and enabled. >> >> > The >> > dependence between AVX512F and AVX2 may need be >> >> reserved. >> >> >> >> Ok, so AVX512 strictly depends on AVX2. >> >> >> >> In which case, the python code was correct. The meaning of the >> >> key/value >> > relationship is "if the feature in the key is not present, all >> >> features in the value must also be disabled". >> > >> > Hi jan, what is your opinion? >> >> There's no opinion to be had here, according to your earlier reply. I do, > however, continue to question that model: State and >> instruction set are independent items. Of course YMM state is a prereq to > ZMM state, but I do not buy AVX2 insn support being a >> prereq to AVX-512 insns. Yet to me the AVX2 and AVX-512F feature flags solely > represent the instructions, while the XSTATE leaf bits >> represent the states. >> > > Hi jan, > I get some information from hardware colleague. There is no dependence > about AVX512 instructions and AVX2 instructions. It is also not states in the > official document. As I had assumed. > But I want to know the meaning of the dependence "AVX2: [AVX512F]" in > "gen-cpuid.py" file. > If it is the feature dependence, I think the dependence between AVX512 > and AVX2 may need to add. As we talk before, Zmm is part of AVX512 feature. > If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't > function. Apart from that, there is no processors with AVX512 without AVX2 > currently and it is safe to treat AVX2 as part of the thermometer leading to > AVX512. Regarding if will have a cpu support avx512 without avx2 in future, > it is really hard to say. > If it is the instructions dependence, I have no idea to delete this > dependence in "gen-cpuid.py" file. > So, I want to know your advice. Well, the main issue here is that we have no feature flag representation for the xstate bits. If we had, the relevant parts of the dependencies should look like XSTATE_YMM: [AVX, XSTATE_ZMM] AVX: [AVX2] XSTATE_ZMM: [AVX512F] AVX512F: [AVX512CD, ...] But in their absence I guess keeping the AVX2 dependency as you have it is the only reasonable approach. Just that I'd like it to be accompanied by a comment explaining that this isn't the actual dependency (so that if XSTATE feature flags got introduced later, it would be clear how to adjust those parts). Andrew - I keep forgetting why you think having such XSTATE_* feature flags is not appropriate. Jan
On 10/08/16 11:29, Jan Beulich wrote: >>>> On 10.08.16 at 11:58, <luwei.kang@intel.com> wrote: >>>>>> On 03.08.16 at 03:32, <luwei.kang@intel.com> wrote: >>>>> On 25/07/16 07:09, Kang, Luwei wrote: >>>>>>>>> First of all - please don't top post. >>>>>>>>> >>>>>>>>>> What about remove the dependency between AVX2 and AVX512F >>>>>>> ( AVX2: >>>>>>>>> [AVX512F], ) ? >>>>>>>>> >>>>>>>>> Yes, that's what I think we want, but we need Andrew's agreement here. >>>>>>>>> >>>>>>>> Hi Andrew, what is your opinion ? >>>>>>> You are in a better position to answer than me. >>>>>>> >>>>>>> For a specific instruction which may be VEX and EVEX encoded, is >>>>>>> the circuitry for a specific instruction shared, or discrete? Is >>>>>>> there a plausible future where you might support only the EVEX >>>>>>> variant of an instruction, and not the VEX variant? >>>>>>> >>>>>>> These dependences are about what may be reasonably assumed about >>>>>>> the way the environment is structured. It doesn't look reasonable >>>>>>> to advertise an AVX512 environment to guests while at the same >>>>>>> time stating that AVX2 is not present. If this is correct, then >>>>>>> the dependency >>>> should stay. >>>>>>> If Intel plausibly things it might release hardware with !AVX2 but >>>>>>> AVX512, then the dependency should be dropped. >>>>>> Regarding the dependency between AVX2 and AVX512F, I have ask some >>>>>> hardware >>>> architecture engineer. >>>>>> AVX512 is a superset of AVX2, the most important item being the >>>>>> state. If >>>> the state for AVX2 isn't enabled (in XCR0), then AVX512 >>>>> also can't function. >>>>>> So if we want to use AVX512, AVX2 must be supported and enabled. >>>>>> The >>>> dependence between AVX512F and AVX2 may need be >>>>> reserved. >>>>> >>>>> Ok, so AVX512 strictly depends on AVX2. >>>>> >>>>> In which case, the python code was correct. The meaning of the >>>>> key/value >>>> relationship is "if the feature in the key is not present, all >>>>> features in the value must also be disabled". >>>> Hi jan, what is your opinion? >>> There's no opinion to be had here, according to your earlier reply. I do, >> however, continue to question that model: State and >>> instruction set are independent items. Of course YMM state is a prereq to >> ZMM state, but I do not buy AVX2 insn support being a >>> prereq to AVX-512 insns. Yet to me the AVX2 and AVX-512F feature flags solely >> represent the instructions, while the XSTATE leaf bits >>> represent the states. >>> >> Hi jan, >> I get some information from hardware colleague. There is no dependence >> about AVX512 instructions and AVX2 instructions. It is also not states in the >> official document. > As I had assumed. > >> But I want to know the meaning of the dependence "AVX2: [AVX512F]" in >> "gen-cpuid.py" file. >> If it is the feature dependence, I think the dependence between AVX512 >> and AVX2 may need to add. As we talk before, Zmm is part of AVX512 feature. >> If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't >> function. Apart from that, there is no processors with AVX512 without AVX2 >> currently and it is safe to treat AVX2 as part of the thermometer leading to >> AVX512. Regarding if will have a cpu support avx512 without avx2 in future, >> it is really hard to say. >> If it is the instructions dependence, I have no idea to delete this >> dependence in "gen-cpuid.py" file. >> So, I want to know your advice. > Well, the main issue here is that we have no feature flag representation > for the xstate bits. If we had, the relevant parts of the dependencies > should look like > > XSTATE_YMM: [AVX, XSTATE_ZMM] > AVX: [AVX2] > XSTATE_ZMM: [AVX512F] > AVX512F: [AVX512CD, ...] > > But in their absence I guess keeping the AVX2 dependency as you have > it is the only reasonable approach. Just that I'd like it to be accompanied > by a comment explaining that this isn't the actual dependency (so that if > XSTATE feature flags got introduced later, it would be clear how to > adjust those parts). > > Andrew - I keep forgetting why you think having such XSTATE_* feature > flags is not appropriate. This is all about providing a plausible cpuid layout to a guest. It is not plausible that we will ever see a processor with e.g. AVX512F but not XSTATE_ZMM. Therefore, to avoid that misconfiguration (or the inverse misconfiguration) from ever occurring, the XSTATE is derived from the plain feature flags. ~Andrew
>>> On 10.08.16 at 14:10, <andrew.cooper3@citrix.com> wrote: > On 10/08/16 11:29, Jan Beulich wrote: >>>>> On 10.08.16 at 11:58, <luwei.kang@intel.com> wrote: >>>>>>> On 03.08.16 at 03:32, <luwei.kang@intel.com> wrote: >>>>>> On 25/07/16 07:09, Kang, Luwei wrote: >>>>>>>>>> First of all - please don't top post. >>>>>>>>>> >>>>>>>>>>> What about remove the dependency between AVX2 and AVX512F >>>>>>>> ( AVX2: >>>>>>>>>> [AVX512F], ) ? >>>>>>>>>> >>>>>>>>>> Yes, that's what I think we want, but we need Andrew's agreement here. >>>>>>>>>> >>>>>>>>> Hi Andrew, what is your opinion ? >>>>>>>> You are in a better position to answer than me. >>>>>>>> >>>>>>>> For a specific instruction which may be VEX and EVEX encoded, is >>>>>>>> the circuitry for a specific instruction shared, or discrete? Is >>>>>>>> there a plausible future where you might support only the EVEX >>>>>>>> variant of an instruction, and not the VEX variant? >>>>>>>> >>>>>>>> These dependences are about what may be reasonably assumed about >>>>>>>> the way the environment is structured. It doesn't look reasonable >>>>>>>> to advertise an AVX512 environment to guests while at the same >>>>>>>> time stating that AVX2 is not present. If this is correct, then >>>>>>>> the dependency >>>>> should stay. >>>>>>>> If Intel plausibly things it might release hardware with !AVX2 but >>>>>>>> AVX512, then the dependency should be dropped. >>>>>>> Regarding the dependency between AVX2 and AVX512F, I have ask some >>>>>>> hardware >>>>> architecture engineer. >>>>>>> AVX512 is a superset of AVX2, the most important item being the >>>>>>> state. If >>>>> the state for AVX2 isn't enabled (in XCR0), then AVX512 >>>>>> also can't function. >>>>>>> So if we want to use AVX512, AVX2 must be supported and enabled. >>>>>>> The >>>>> dependence between AVX512F and AVX2 may need be >>>>>> reserved. >>>>>> >>>>>> Ok, so AVX512 strictly depends on AVX2. >>>>>> >>>>>> In which case, the python code was correct. The meaning of the >>>>>> key/value >>>>> relationship is "if the feature in the key is not present, all >>>>>> features in the value must also be disabled". >>>>> Hi jan, what is your opinion? >>>> There's no opinion to be had here, according to your earlier reply. I do, >>> however, continue to question that model: State and >>>> instruction set are independent items. Of course YMM state is a prereq to >>> ZMM state, but I do not buy AVX2 insn support being a >>>> prereq to AVX-512 insns. Yet to me the AVX2 and AVX-512F feature flags solely >>> represent the instructions, while the XSTATE leaf bits >>>> represent the states. >>>> >>> Hi jan, >>> I get some information from hardware colleague. There is no dependence >>> about AVX512 instructions and AVX2 instructions. It is also not states in > the >>> official document. >> As I had assumed. >> >>> But I want to know the meaning of the dependence "AVX2: [AVX512F]" in >>> "gen-cpuid.py" file. >>> If it is the feature dependence, I think the dependence between AVX512 >>> and AVX2 may need to add. As we talk before, Zmm is part of AVX512 feature. > >>> If the state for AVX2 isn't enabled (in XCR0), then AVX512 also can't >>> function. Apart from that, there is no processors with AVX512 without AVX2 >>> currently and it is safe to treat AVX2 as part of the thermometer leading to > >>> AVX512. Regarding if will have a cpu support avx512 without avx2 in future, >>> it is really hard to say. >>> If it is the instructions dependence, I have no idea to delete this >>> dependence in "gen-cpuid.py" file. >>> So, I want to know your advice. >> Well, the main issue here is that we have no feature flag representation >> for the xstate bits. If we had, the relevant parts of the dependencies >> should look like >> >> XSTATE_YMM: [AVX, XSTATE_ZMM] >> AVX: [AVX2] >> XSTATE_ZMM: [AVX512F] >> AVX512F: [AVX512CD, ...] >> >> But in their absence I guess keeping the AVX2 dependency as you have >> it is the only reasonable approach. Just that I'd like it to be accompanied >> by a comment explaining that this isn't the actual dependency (so that if >> XSTATE feature flags got introduced later, it would be clear how to >> adjust those parts). >> >> Andrew - I keep forgetting why you think having such XSTATE_* feature >> flags is not appropriate. > > This is all about providing a plausible cpuid layout to a guest. > > It is not plausible that we will ever see a processor with e.g. AVX512F > but not XSTATE_ZMM. This is a somewhat bogus argument considering we have FXSR: [FFXSR, SSE], which, as you certainly realize, is slightly wrong nowadays: XSTATE_XMM would suffice as a prereq, without any FXSR. (But I certainly realize that lack of FXSR is unlikely, as that's considered part of the base x86-64 architecture, and I also realize that expressing alternative prereqs would make the deep dependency generation logic more complicated.) Jan
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c index c89ab6e..7696b1e 100644 --- a/xen/arch/x86/hvm/hvm.c +++ b/xen/arch/x86/hvm/hvm.c @@ -3474,6 +3474,20 @@ void hvm_cpuid(unsigned int input, unsigned int *eax, unsigned int *ebx, xstate_sizes[_XSTATE_BNDCSR]); } + if ( _ebx & cpufeat_mask(X86_FEATURE_AVX512F) ) + { + xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM; + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_OPMASK] + + xstate_sizes[_XSTATE_OPMASK]); + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_ZMM] + + xstate_sizes[_XSTATE_ZMM]); + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_HI_ZMM] + + xstate_sizes[_XSTATE_HI_ZMM]); + } + if ( _ecx & cpufeat_mask(X86_FEATURE_PKU) ) { xfeature_mask |= XSTATE_PKRU; diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c index 767d0b0..8fb697b 100644 --- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -975,7 +975,7 @@ void pv_cpuid(struct cpu_user_regs *regs) switch ( leaf ) { - uint32_t tmp, _ecx; + uint32_t tmp, _ecx, _ebx; case 0x00000001: c &= pv_featureset[FEATURESET_1c]; @@ -1157,6 +1157,26 @@ void pv_cpuid(struct cpu_user_regs *regs) xstate_sizes[_XSTATE_YMM]); } + if ( !is_control_domain(currd) && !is_hardware_domain(currd) ) + domain_cpuid(currd, 7, 0, &tmp, &_ebx, &tmp, &tmp); + else + cpuid_count(7, 0, &tmp, &_ebx, &tmp, &tmp); + _ebx &= pv_featureset[FEATURESET_7b0]; + + if ( _ebx & cpufeat_mask(X86_FEATURE_AVX512F) ) + { + xfeature_mask |= XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM; + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_OPMASK] + + xstate_sizes[_XSTATE_OPMASK]); + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_ZMM] + + xstate_sizes[_XSTATE_ZMM]); + xstate_size = max(xstate_size, + xstate_offsets[_XSTATE_HI_ZMM] + + xstate_sizes[_XSTATE_HI_ZMM]); + } + a = (uint32_t)xfeature_mask; d = (uint32_t)(xfeature_mask >> 32); c = xstate_size; diff --git a/xen/include/public/arch-x86/cpufeatureset.h b/xen/include/public/arch-x86/cpufeatureset.h index 39acf8c..9320c9e 100644 --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -206,15 +206,24 @@ XEN_CPUFEATURE(PQM, 5*32+12) /* Platform QoS Monitoring */ XEN_CPUFEATURE(NO_FPU_SEL, 5*32+13) /*! FPU CS/DS stored as zero */ XEN_CPUFEATURE(MPX, 5*32+14) /*S Memory Protection Extensions */ XEN_CPUFEATURE(PQE, 5*32+15) /* Platform QoS Enforcement */ +XEN_CPUFEATURE(AVX512F, 5*32+16) /*A AVX-512 Foundation Instructions */ +XEN_CPUFEATURE(AVX512DQ, 5*32+17) /*A AVX-512 Doubleword & Quadword Instrs */ XEN_CPUFEATURE(RDSEED, 5*32+18) /*A RDSEED instruction */ XEN_CPUFEATURE(ADX, 5*32+19) /*A ADCX, ADOX instructions */ XEN_CPUFEATURE(SMAP, 5*32+20) /*S Supervisor Mode Access Prevention */ +XEN_CPUFEATURE(AVX512IFMA, 5*32+21) /*A AVX-512 Integer Fused Multiply Add */ XEN_CPUFEATURE(CLFLUSHOPT, 5*32+23) /*A CLFLUSHOPT instruction */ XEN_CPUFEATURE(CLWB, 5*32+24) /*A CLWB instruction */ +XEN_CPUFEATURE(AVX512PF, 5*32+26) /*A AVX-512 Prefetch Instructions */ +XEN_CPUFEATURE(AVX512ER, 5*32+27) /*A AVX-512 Exponent & Reciprocal Instrs */ +XEN_CPUFEATURE(AVX512CD, 5*32+28) /*A AVX-512 Conflict Detection Instrs */ XEN_CPUFEATURE(SHA, 5*32+29) /*A SHA1 & SHA256 instructions */ +XEN_CPUFEATURE(AVX512BW, 5*32+30) /*A AVX-512 Byte and Word Instructions */ +XEN_CPUFEATURE(AVX512VL, 5*32+31) /*A AVX-512 Vector Length Extensions */ /* Intel-defined CPU features, CPUID level 0x00000007:0.ecx, word 6 */ XEN_CPUFEATURE(PREFETCHWT1, 6*32+ 0) /*A PREFETCHWT1 instruction */ +XEN_CPUFEATURE(AVX512VBMI, 6*32+ 1) /*A AVX-512 Vector Byte Manipulation Instrs */ XEN_CPUFEATURE(PKU, 6*32+ 3) /*H Protection Keys for Userspace */ XEN_CPUFEATURE(OSPKE, 6*32+ 4) /*! OS Protection Keys Enable */ diff --git a/xen/tools/gen-cpuid.py b/xen/tools/gen-cpuid.py index 7c45eca..e8b64ce 100755 --- a/xen/tools/gen-cpuid.py +++ b/xen/tools/gen-cpuid.py @@ -243,6 +243,17 @@ def crunch_numbers(state): # AMD K6-2+ and K6-III processors shipped with 3DNow+, beyond the # standard 3DNow in the earlier K6 processors. _3DNOW: [_3DNOWEXT], + + # AVX2 is an extension to AVX, providing mainly new integer instructions. + # In principle, AVX512 only takes YMM register state as a prerequisite + # and many AVX2 instructions are extended by AVX512F to 512-bit forms. + AVX2: [AVX512F], + + # AVX512F is taken to mean hardware support for EVEX encoded instructions, + # 512bit registers, and the instructions themselves. All further AVX512 features + # are built on top of AVX512F + AVX512F: [AVX512DQ, AVX512IFMA, AVX512PF, AVX512ER, AVX512CD, + AVX512BW, AVX512VL, AVX512VBMI], } deep_features = tuple(sorted(deps.keys()))
AVX-512 is an extention of AVX2. Its spec can be found at: https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdf This patch detects AVX-512 features by CPUID. Signed-off-by: Luwei Kang <luwei.kang@intel.com> --- [V4]: Update the description about features dependency. [V3]: Adjust dependencies between features. [V2]: 1.get max xstate_size from each bit. 2.change get cpuid function parameter from 0x07 to 7. 3.add dependencies between features in xen/tools/gen-cpuid.py. 4.split the cpuid call just like the way the hvm_cpuid() side works. --- xen/arch/x86/hvm/hvm.c | 14 ++++++++++++++ xen/arch/x86/traps.c | 22 +++++++++++++++++++++- xen/include/public/arch-x86/cpufeatureset.h | 9 +++++++++ xen/tools/gen-cpuid.py | 11 +++++++++++ 4 files changed, 55 insertions(+), 1 deletion(-)