Message ID | 20170531193434.6918-3-david@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 05/31/2017 03:34 PM, David Hildenbrand wrote: > Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no > IBC value is provided. QEMU will simply take the last model of that HW > generation, which happens to be the BC version. > > Let's improve our search for that case by selecting the latest CPU > definition that matches the CPU type. This might still detect e.g. a > GA2 version on a GA1 system, but as we don't have further information at > hand, there isn't too much we can do about it. > Isn't it better to choose the earliest machine possible in this case? Systems are backwards compatible far more often than they are forwards compatible. Said in a more verbose way, I'd rather pretend to be a GA1 on a GA2 (ignoring some new features) than pretend to be a GA2 on a GA1 and attempt to use something that does not exist. I'm sure you've thought of all of this though, so what am I missing? > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > target/s390x/cpu_models.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c > index 9e23535..b6220c8 100644 > --- a/target/s390x/cpu_models.c > +++ b/target/s390x/cpu_models.c > @@ -184,6 +184,7 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, > S390FeatBitmap features) > { > const S390CPUDef *last_compatible = NULL; > + const S390CPUDef *matching_cpu_type = NULL; > int i; > > if (!gen) { > @@ -218,8 +219,16 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, > if (def->type == type && def->ec_ga == ec_ga) { > return def; > } > + /* remember if we've at least seen one with the same cpu type */ > + if (def->type == type) { > + matching_cpu_type = def; > + } > last_compatible = def; > } > + /* prefer the model with the same cpu type, esp. don't take the BC for EC */ > + if (matching_cpu_type) { > + return matching_cpu_type; > + } > return last_compatible; > } >
On 01.06.2017 19:06, Jason J. Herne wrote: > On 05/31/2017 03:34 PM, David Hildenbrand wrote: >> Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no >> IBC value is provided. QEMU will simply take the last model of that HW >> generation, which happens to be the BC version. >> >> Let's improve our search for that case by selecting the latest CPU >> definition that matches the CPU type. This might still detect e.g. a >> GA2 version on a GA1 system, but as we don't have further information at >> hand, there isn't too much we can do about it. >> > > Isn't it better to choose the earliest machine possible in this case? > Systems > are backwards compatible far more often than they are forwards compatible. > > Said in a more verbose way, I'd rather pretend to be a GA1 on a GA2 > (ignoring > some new features) than pretend to be a GA2 on a GA1 and attempt to use > something that does not exist. > > I'm sure you've thought of all of this though, so what am I missing? > Hi Jason, excellent question. And of course I thought of it: Fact 1: All GAX models share the same base feature set. A GAX++ might support "more features". Fact 2: Without an IBC, the guest can't detect the GA version. a) If we have no IBC (esp. unblocked_ibc == 0), the IBC we will present to the guest in read_SCP_info() will be 0. The guest will not know which GA version it has. The problem of missing IBC propagates. b) If we don't have a feature of the GA++ version, also our guest won't have it. So in summary, the guest also has no idea of its GA version. So this gives more "flexibility" and is backwards compatible. Think about it like this: You're running on 0x2827 GA2. Old QEMU version indicated "0x2828 GA1 == 0x2827 GA2". After you updated QEMU, you suddenly detect "0x2827 GA1". You're previous libvirt guest might suddenly refuse to run. So it boils down to "it's just a QEMU representation that is more flexible, the guest can't see the difference". Thanks! > >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> target/s390x/cpu_models.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c >> index 9e23535..b6220c8 100644 >> --- a/target/s390x/cpu_models.c >> +++ b/target/s390x/cpu_models.c >> @@ -184,6 +184,7 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, >> S390FeatBitmap features) >> { >> const S390CPUDef *last_compatible = NULL; >> + const S390CPUDef *matching_cpu_type = NULL; >> int i; >> >> if (!gen) { >> @@ -218,8 +219,16 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, >> if (def->type == type && def->ec_ga == ec_ga) { >> return def; >> } >> + /* remember if we've at least seen one with the same cpu type */ >> + if (def->type == type) { >> + matching_cpu_type = def; >> + } >> last_compatible = def; >> } >> + /* prefer the model with the same cpu type, esp. don't take the BC for EC */ >> + if (matching_cpu_type) { >> + return matching_cpu_type; >> + } >> return last_compatible; >> } >> >
On 06/01/2017 07:42 PM, David Hildenbrand wrote: > On 01.06.2017 19:06, Jason J. Herne wrote: >> On 05/31/2017 03:34 PM, David Hildenbrand wrote: >>> Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no >>> IBC value is provided. QEMU will simply take the last model of that HW >>> generation, which happens to be the BC version. >>> >>> Let's improve our search for that case by selecting the latest CPU >>> definition that matches the CPU type. This might still detect e.g. a >>> GA2 version on a GA1 system, but as we don't have further information at >>> hand, there isn't too much we can do about it. >>> >> >> Isn't it better to choose the earliest machine possible in this case? >> Systems >> are backwards compatible far more often than they are forwards compatible. >> >> Said in a more verbose way, I'd rather pretend to be a GA1 on a GA2 >> (ignoring >> some new features) than pretend to be a GA2 on a GA1 and attempt to use >> something that does not exist. >> >> I'm sure you've thought of all of this though, so what am I missing? >> > > Hi Jason, > > excellent question. And of course I thought of it: > > Fact 1: All GAX models share the same base feature set. A GAX++ might > support "more features". > > Fact 2: Without an IBC, the guest can't detect the GA version. > > a) If we have no IBC (esp. unblocked_ibc == 0), the IBC we will present > to the guest in read_SCP_info() will be 0. The guest will not know which > GA version it has. The problem of missing IBC propagates. > > b) If we don't have a feature of the GA++ version, also our guest won't > have it. So in summary, the guest also has no idea of its GA version. > > So this gives more "flexibility" and is backwards compatible. > > Think about it like this: You're running on 0x2827 GA2. > > Old QEMU version indicated "0x2828 GA1 == 0x2827 GA2". After you updated > QEMU, you suddenly detect "0x2827 GA1". You're previous libvirt guest > might suddenly refuse to run. > > So it boils down to "it's just a QEMU representation that is more > flexible, the guest can't see the difference". > > Thanks! > Your change makes a lot of sense, but ain't easy to understand IMHO. Maybe integrating some of the discussion from above into the commit message would be helpful. Especially this assumption that the client code is fine with getting a more recent GA version (and anything influenced by it) if the ec_ga parameter is (basically) invalid, because ec_ga invalid means no IBC, and that (somehow) ensures the guest will be unaware of the picked GA (and everything affected by it) I do understand that your patch is clearly superior to what we have now, but I'm sill not sure why are we picking the last and not the firs one with the matching type (and not necessarily matching ec_ga). I guess the solution is to be sought along 'more "flexibility". That would mean the GA does not matter for the guest, but matters for something else (QEMU, libvirt, I do not know). But if it matters, then for some to me unknown reasons newer GA must be safe there too. Fact is (by picking the last one) the algorithm ain't stable against appending a new GA to the list in a new qemu version (that's why I assume it must be safe). Let me add one meta comment too. IMHO what makes the function hard to understand, is that we have a bunch of optional parameters, and that although this seems to be some sort of approximation problem (find the best good enough), the metric ain't obvious for somebody not to familiar with the problem domain (like me). AFAIU you are fixing the following problem domain (me). Seems that the metric implemented was sub-optimal: if parameters present {type}, absent {ec_ga} maybe present {gen, features}, then type is not honored appropriately. One can probably figure out what's intended by understanding all the usages (which makes reviewing it demanding). For instance, you also refer to a particular usage in your title/subject (that is "model->def = s390_find_cpu_def(cpu_type, ibc_gen(unblocked_ibc), ibc_ec_ga(unblocked_ibc), NULL);" in target/s390x/kvm.c). Since I agree it's an improvement: Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> >> >>> Signed-off-by: David Hildenbrand <david@redhat.com> >>> --- >>> target/s390x/cpu_models.c | 9 +++++++++ >>> 1 file changed, 9 insertions(+) >>> >>> diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c >>> index 9e23535..b6220c8 100644 >>> --- a/target/s390x/cpu_models.c >>> +++ b/target/s390x/cpu_models.c >>> @@ -184,6 +184,7 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, >>> S390FeatBitmap features) >>> { >>> const S390CPUDef *last_compatible = NULL; >>> + const S390CPUDef *matching_cpu_type = NULL; >>> int i; >>> >>> if (!gen) { >>> @@ -218,8 +219,16 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, >>> if (def->type == type && def->ec_ga == ec_ga) { >>> return def; >>> } >>> + /* remember if we've at least seen one with the same cpu type */ >>> + if (def->type == type) { >>> + matching_cpu_type = def; >>> + } >>> last_compatible = def; >>> } >>> + /* prefer the model with the same cpu type, esp. don't take the BC for EC */ >>> + if (matching_cpu_type) { >>> + return matching_cpu_type; >>> + } >>> return last_compatible; >>> } >>> >> > >
On 06/02/2017 07:28 PM, Halil Pasic wrote: [...] > Maybe integrating some of the discussion from above into the commit > message would be helpful. applied with the following patch description Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no IBC value is provided. QEMU will simply take the last model of that HW generation, which happens to be the BC version. Let's improve our search for that case by selecting the latest CPU definition that matches the CPU type. This for example will avoid detecting an z13 as a z13s. We might still detect a GA2 version on a GA1 system, but as we don't have further information at hand, there isn't too much we can do about it. The alternative of always presenting the oldest GA is not backward compatible, e.g: You're running on 0x2827 GA2. Old QEMU version indicated "0x2828 GA1 == 0x2827 GA2". After you updated QEMU, you suddenly detect "0x2827 GA1". You're previous libvirt guest might suddenly refuse to run. In the end presenting a newer GA level does not matter because: 1: All GAX models share the same base feature set. A GAX++ might support "more features". 2: Without an IBC, the guest can't detect the GA version. If we have no IBC (esp. unblocked_ibc == 0), the IBC we will present to the guest in read_SCP_info() will be 0. The guest will not know which GA version it has. The problem of missing IBC propagates. If we don't have a feature of the GA++ version, also our guest won't have it. So in summary, the guest also has no idea of its GA version. Signed-off-by: David Hildenbrand <david@redhat.com> Message-Id: <20170531193434.6918-3-david@redhat.com> Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> [improve patch description by reusing mailing list discussion]
On 06.06.2017 10:39, Christian Borntraeger wrote: > On 06/02/2017 07:28 PM, Halil Pasic wrote: > [...] >> Maybe integrating some of the discussion from above into the commit >> message would be helpful. > > applied with the following patch description Just wanted to ask you to do that :) Thanks! Looks good to me! > > > Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no > IBC value is provided. QEMU will simply take the last model of that HW > generation, which happens to be the BC version. > > Let's improve our search for that case by selecting the latest CPU > definition that matches the CPU type. This for example will avoid > detecting an z13 as a z13s. > > We might still detect a GA2 version on a GA1 system, but as we don't > have further information at hand, there isn't too much we can do about > it. The alternative of always presenting the oldest GA is not backward > compatible, e.g: > You're running on 0x2827 GA2. > Old QEMU version indicated "0x2828 GA1 == 0x2827 GA2". After you updated > QEMU, you suddenly detect "0x2827 GA1". You're previous libvirt guest > might suddenly refuse to run. > > In the end presenting a newer GA level does not matter because: > > 1: All GAX models share the same base feature set. A GAX++ might > support "more features". > 2: Without an IBC, the guest can't detect the GA version. > > If we have no IBC (esp. unblocked_ibc == 0), the IBC we will present > to the guest in read_SCP_info() will be 0. The guest will not know > which GA version it has. The problem of missing IBC propagates. > > If we don't have a feature of the GA++ version, also our guest won't > have it. So in summary, the guest also has no idea of its GA version. > > Signed-off-by: David Hildenbrand <david@redhat.com> > Message-Id: <20170531193434.6918-3-david@redhat.com> > Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com> > Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com> > Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> > [improve patch description by reusing mailing list discussion] > >
> Your change makes a lot of sense, but ain't easy to understand IMHO. > > Maybe integrating some of the discussion from above into the commit > message would be helpful. Especially this assumption that the client code > is fine with getting a more recent GA version (and anything influenced by > it) if the ec_ga parameter is (basically) invalid, because ec_ga invalid > means no IBC, and that (somehow) ensures the guest will be unaware of the > picked GA (and everything affected by it) > > I do understand that your patch is clearly superior to what we have now, > but I'm sill not sure why are we picking the last and not the firs one > with the matching type (and not necessarily matching ec_ga). I guess the > solution is to be sought along 'more "flexibility". That would mean the > GA does not matter for the guest, but matters for something else (QEMU, > libvirt, I do not know). > > But if it matters, then for some to me unknown reasons newer GA must be > safe there too. Fact is (by picking the last one) the algorithm ain't > stable against appending a new GA to the list in a new qemu version > (that's why I assume it must be safe). Identifying a more recent GA version just allows us to forward more features to the guest (introduced by a new GA version), that's all. Think of a GA version as a firmware update. Sooner or later, most systems will be updated. From a customer point of view, it's better to detect a GA2 on a GA1 than a GA1 on a GA2 ;) The guest will only be able to tell the difference if there are actually new features around - which can only happen if you're really on the next GA version. > > Let me add one meta comment too. IMHO what makes the function hard to > understand, is that we have a bunch of optional parameters, and that > although this seems to be some sort of approximation problem (find the > best good enough), the metric ain't obvious for somebody not to familiar > with the problem domain (like me). AFAIU you are fixing the following > problem domain (me). Seems that the metric implemented was sub-optimal: > if parameters present {type}, absent {ec_ga} maybe present {gen, > features}, then type is not honored appropriately. One can probably > figure out what's intended by understanding all the usages (which makes > reviewing it demanding). For instance, you also refer to a particular > usage in your title/subject (that is "model->def = > s390_find_cpu_def(cpu_type, ibc_gen(unblocked_ibc), > ibc_ec_ga(unblocked_ibc), NULL);" in target/s390x/kvm.c). Actually, I thought that all relevant z/VM systems would properly forward the IBC. By relevant, I am speaking about zEC12+. So I never saw this while working on the CPU model. We could split this function into multiple: find by (cpuid), (cpuid,gen,ec_ga), (features), (gen,ec_ga,features). All without optional parameters. But I doubt that this will result in better code, especially on the caller side. If you have an idea how to improve that piece of code, feel free to send a cleanup. Thanks for having a look! > > Since I agree it's an improvement: > > Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
On 06/06/2017 10:57 AM, David Hildenbrand wrote: >> Let me add one meta comment too. IMHO what makes the function hard to >> understand, is that we have a bunch of optional parameters, and that >> although this seems to be some sort of approximation problem (find the >> best good enough), the metric ain't obvious for somebody not to familiar >> with the problem domain (like me). AFAIU you are fixing the following >> problem domain (me). Seems that the metric implemented was sub-optimal: >> if parameters present {type}, absent {ec_ga} maybe present {gen, >> features}, then type is not honored appropriately. One can probably >> figure out what's intended by understanding all the usages (which makes >> reviewing it demanding). For instance, you also refer to a particular >> usage in your title/subject (that is "model->def = >> s390_find_cpu_def(cpu_type, ibc_gen(unblocked_ibc), >> ibc_ec_ga(unblocked_ibc), NULL);" in target/s390x/kvm.c). > Actually, I thought that all relevant z/VM systems would properly > forward the IBC. By relevant, I am speaking about zEC12+. So I never saw > this while working on the CPU model. > > We could split this function into multiple: find by (cpuid), > (cpuid,gen,ec_ga), (features), (gen,ec_ga,features). All without > optional parameters. But I doubt that this will result in better code, > especially on the caller side. > > If you have an idea how to improve that piece of code, feel free to send > a cleanup. I prefer leaving it as-is. My understanding ain't complete enough anyways. Making the caller side needlessly complicated is definitely not what we want. One approach to make understanding/reviewing the function easier (especially in isolation) would be adding a comment describing it's pre- and postcondition (akin to Hoare logic but in natural language). If that can be one easily, then I do not think we have a problem. My problem is that I can't (that's what I mean by the 'metric' is not obvious). > > Thanks for having a look! > yw Halil
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c index 9e23535..b6220c8 100644 --- a/target/s390x/cpu_models.c +++ b/target/s390x/cpu_models.c @@ -184,6 +184,7 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, S390FeatBitmap features) { const S390CPUDef *last_compatible = NULL; + const S390CPUDef *matching_cpu_type = NULL; int i; if (!gen) { @@ -218,8 +219,16 @@ const S390CPUDef *s390_find_cpu_def(uint16_t type, uint8_t gen, uint8_t ec_ga, if (def->type == type && def->ec_ga == ec_ga) { return def; } + /* remember if we've at least seen one with the same cpu type */ + if (def->type == type) { + matching_cpu_type = def; + } last_compatible = def; } + /* prefer the model with the same cpu type, esp. don't take the BC for EC */ + if (matching_cpu_type) { + return matching_cpu_type; + } return last_compatible; }
Currently, under z/VM on a 0x2827, QEMU will detect a 0x2828 if no IBC value is provided. QEMU will simply take the last model of that HW generation, which happens to be the BC version. Let's improve our search for that case by selecting the latest CPU definition that matches the CPU type. This might still detect e.g. a GA2 version on a GA1 system, but as we don't have further information at hand, there isn't too much we can do about it. Signed-off-by: David Hildenbrand <david@redhat.com> --- target/s390x/cpu_models.c | 9 +++++++++ 1 file changed, 9 insertions(+)