Message ID | 20240820145514.63046-1-nabiev.arman13@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | ppc: fixed incorrect name filed in vmstate_tlbemb_entry | expand |
On Tue, 20 Aug 2024 at 17:03, <nabiev.arman13@gmail.com> wrote: > > From: armanincredible <nabiev.arman13@gmail.com> > > Signed-off-by: armanincredible <nabiev.arman13@gmail.com> [cc'd the ppc maintainers and list] > --- > target/ppc/machine.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c > index 731dd8df35..d433fd45fc 100644 > --- a/target/ppc/machine.c > +++ b/target/ppc/machine.c > @@ -621,7 +621,7 @@ static bool tlbemb_needed(void *opaque) > } > > static const VMStateDescription vmstate_tlbemb = { > - .name = "cpu/tlb6xx", > + .name = "cpu/tlbemb", > .version_id = 1, > .minimum_version_id = 1, > .needed = tlbemb_needed, This does look clearly a mistake, but on the other hand the name field in a VMStateDescription is part of the on-the-wire format, so changing it breaks migration compatibility. Before we make this change we need to confirm that it is not used on any machine types where we care about cross version migration compat. Alternatively if we need to keep the compatibility across versions we could leave it as is and add a comment about why. (I don't think we'll have a problem with incorrectly interpreting a tlbemb as a tlb6xx, it will mismatch for other reasons.) thanks -- PMM
Sorry for not providing enough argumentation for my patch. I found a configuration where this error occurs. Please take a look at https://gitlab.com/qemu-project/qemu/-/issues/2522. вт, 20 авг. 2024 г. в 19:20, Peter Maydell <peter.maydell@linaro.org>: > On Tue, 20 Aug 2024 at 17:03, <nabiev.arman13@gmail.com> wrote: > > > > From: armanincredible <nabiev.arman13@gmail.com> > > > > Signed-off-by: armanincredible <nabiev.arman13@gmail.com> > > [cc'd the ppc maintainers and list] > > > > > --- > > target/ppc/machine.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/target/ppc/machine.c b/target/ppc/machine.c > > index 731dd8df35..d433fd45fc 100644 > > --- a/target/ppc/machine.c > > +++ b/target/ppc/machine.c > > @@ -621,7 +621,7 @@ static bool tlbemb_needed(void *opaque) > > } > > > > static const VMStateDescription vmstate_tlbemb = { > > - .name = "cpu/tlb6xx", > > + .name = "cpu/tlbemb", > > .version_id = 1, > > .minimum_version_id = 1, > > .needed = tlbemb_needed, > > This does look clearly a mistake, but on the other hand the > name field in a VMStateDescription is part of the on-the-wire > format, so changing it breaks migration compatibility. > > Before we make this change we need to confirm that it is > not used on any machine types where we care about cross > version migration compat. > > Alternatively if we need to keep the compatibility across > versions we could leave it as is and add a comment about > why. (I don't think we'll have a problem with incorrectly > interpreting a tlbemb as a tlb6xx, it will mismatch for > other reasons.) > > thanks > -- PMM >
On Wed, 21 Aug 2024 at 15:08, Physics Набиев <nabiev.arman13@gmail.com> wrote: > > Sorry for not providing enough argumentation for my patch. I found a configuration where this error occurs. Please take a look at https://gitlab.com/qemu-project/qemu/-/issues/2522. Hmm. I don't understand why fixing this field name would fix record-and-replay, because I would have thought that all that should matter is that the name that we write out in the record matches the one we read in in the replay... -- PMM
In my example in https://gitlab.com/qemu-project/qemu/-/issues/2522 the .needed function returns true for vmstate_tlbemb, but not for vmstate_tlb6xx. I tried to do some tests without fixing the typo. When I changed the .fields in the two structures to the same value so that the size of the data they stored matched, everything worked. I also changed the order of vmstate_tlb6xx and vmstate_tlbemb in the subsections field of vmstate_ppc_cpu, everything worked as well. According to https://www.qemu.org/docs/master/devel/migration/main.html#:~:text=On%20the%20receiving%20side%2C%20if,that%20didn%E2%80%99t%20send%20the%20subsection and on my own tests I think the problem is that when reading saved data, qemu uses the device name to determine an object that extracts a certain size of data. Since the names are the same for vmstate_tlb6xx and vmstate_tlbemb, it uses the functions for the first one due to a certain order, which leads to an error, since the data from the second one was saved. On Wed, 21 Aug 2024 at 18:45, Peter Maydell <peter.maydell@linaro.org> wrote: > On Wed, 21 Aug 2024 at 15:08, Physics Набиев <nabiev.arman13@gmail.com> > wrote: > > > > Sorry for not providing enough argumentation for my patch. I found a > configuration where this error occurs. Please take a look at > https://gitlab.com/qemu-project/qemu/-/issues/2522. > > Hmm. I don't understand why fixing this field name would > fix record-and-replay, because I would have thought that > all that should matter is that the name that we write out > in the record matches the one we read in in the replay... > > -- PMM >
On Wed, 21 Aug 2024 at 19:56, Arman Nabiev <nabiev.arman13@gmail.com> wrote: > > In my example in https://gitlab.com/qemu-project/qemu/-/issues/2522 the .needed function returns true for vmstate_tlbemb, but not for vmstate_tlb6xx. I tried to do some tests without fixing the typo. When I changed the .fields in the two structures to the same value so that the size of the data they stored matched, everything worked. I also changed the order of vmstate_tlb6xx and vmstate_tlbemb in the subsections field of vmstate_ppc_cpu, everything worked as well. > According to https://www.qemu.org/docs/master/devel/migration/main.html#:~:text=On%20the%20receiving%20side%2C%20if,that%20didn%E2%80%99t%20send%20the%20subsection and on my own tests I think the problem is that when reading saved data, qemu uses the device name to determine an object that extracts a certain size of data. Since the names are the same for vmstate_tlb6xx and vmstate_tlbemb, it uses the functions for the first one due to a certain order, which leads to an error, since the data from the second one was saved. Aha, yes, that would explain it -- the PPC CPU has both subsections in its subsection list, but they have the same name, so we pick the wrong one when we see the name in the incoming data. In that case we can take this fix without worrying about a migration compat break, because clearly migration has never worked for this CPU type... -- PMM
On Wed, 21 Aug 2024 at 20:33, Peter Maydell <peter.maydell@linaro.org> wrote: > > On Wed, 21 Aug 2024 at 19:56, Arman Nabiev <nabiev.arman13@gmail.com> wrote: > > > > In my example in https://gitlab.com/qemu-project/qemu/-/issues/2522 the .needed function returns true for vmstate_tlbemb, but not for vmstate_tlb6xx. I tried to do some tests without fixing the typo. When I changed the .fields in the two structures to the same value so that the size of the data they stored matched, everything worked. I also changed the order of vmstate_tlb6xx and vmstate_tlbemb in the subsections field of vmstate_ppc_cpu, everything worked as well. > > According to https://www.qemu.org/docs/master/devel/migration/main.html#:~:text=On%20the%20receiving%20side%2C%20if,that%20didn%E2%80%99t%20send%20the%20subsection and on my own tests I think the problem is that when reading saved data, qemu uses the device name to determine an object that extracts a certain size of data. Since the names are the same for vmstate_tlb6xx and vmstate_tlbemb, it uses the functions for the first one due to a certain order, which leads to an error, since the data from the second one was saved. > > Aha, yes, that would explain it -- the PPC CPU has both > subsections in its subsection list, but they have the > same name, so we pick the wrong one when we see the > name in the incoming data. > > In that case we can take this fix without worrying > about a migration compat break, because clearly > migration has never worked for this CPU type... I did a quick test and indeed migration doesn't work, not just record-and-replay: $ ./build/ppc/qemu-system-ppc -drive if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2 -monitor stdio -M bamboo QEMU 9.0.92 monitor - type 'help' for more information (qemu) savevm foo (qemu) loadvm foo Missing section footer for cpu Error: Error -22 while loading VM state So I'm happy that this patch is the right fix, and it can have my Reviewed-by: Peter Maydell <peter.maydell@linaro.org> provided that we fix a couple of things in the commit message: (1) For QEMU, Signed-off-by lines should generally be your full name, not a pseudonym (as I assume "armanincredible" is). (2) We should give in the commit message details of what has been fixed and also a Resolves: line for the gitlab issue, something like: ===begin== target/ppc: Fix migration of CPUs with TLB_EMB TLB type In vmstate_tlbemb a cut-and-paste error meant we gave this vmstate subsection the same "cpu/tlb6xx" name as the vmstate_tlb6xx subsection. This breaks migration load for any CPU using the TLB_EMB CPU type, because when we see the "tlb6xx" name in the incoming data we try to interpret it as a vmstate_tlb6xx subsection, which it isn't the right format for: $ qemu-system-ppc -drive if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2 -monitor stdio -M bamboo QEMU 9.0.92 monitor - type 'help' for more information (qemu) savevm foo (qemu) loadvm foo Missing section footer for cpu Error: Error -22 while loading VM state Correct the incorrect vmstate section name. Since migration for these CPU types was completely broken before, we don't need to care that this is a migration compatibility break. This affects the PPC 405, 440, 460 and e200 CPU families. Cc: qemu-stable@nongnu.org Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522 Signed-off-by: [your name here] <nabiev.arman13@gmail.com> ===endit=== thanks -- PMM
Thank you. Do I need to resend the patch with the specified corrections, or will you do it and all I need is the name in signed-off? In the second case, you need to specify Arman Nabiev. Just in case I'll attach the fixed patch here. On Thu, 22 Aug 2024 at 16:12, Peter Maydell <peter.maydell@linaro.org> wrote: > On Wed, 21 Aug 2024 at 20:33, Peter Maydell <peter.maydell@linaro.org> > wrote: > > > > On Wed, 21 Aug 2024 at 19:56, Arman Nabiev <nabiev.arman13@gmail.com> > wrote: > > > > > > In my example in https://gitlab.com/qemu-project/qemu/-/issues/2522 > the .needed function returns true for vmstate_tlbemb, but not for > vmstate_tlb6xx. I tried to do some tests without fixing the typo. When I > changed the .fields in the two structures to the same value so that the > size of the data they stored matched, everything worked. I also changed the > order of vmstate_tlb6xx and vmstate_tlbemb in the subsections field of > vmstate_ppc_cpu, everything worked as well. > > > According to > https://www.qemu.org/docs/master/devel/migration/main.html#:~:text=On%20the%20receiving%20side%2C%20if,that%20didn%E2%80%99t%20send%20the%20subsection > and on my own tests I think the problem is that when reading saved data, > qemu uses the device name to determine an object that extracts a certain > size of data. Since the names are the same for vmstate_tlb6xx and > vmstate_tlbemb, it uses the functions for the first one due to a certain > order, which leads to an error, since the data from the second one was > saved. > > > > Aha, yes, that would explain it -- the PPC CPU has both > > subsections in its subsection list, but they have the > > same name, so we pick the wrong one when we see the > > name in the incoming data. > > > > In that case we can take this fix without worrying > > about a migration compat break, because clearly > > migration has never worked for this CPU type... > > I did a quick test and indeed migration doesn't work, not > just record-and-replay: > > $ ./build/ppc/qemu-system-ppc -drive > if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2 > -monitor stdio -M bamboo > QEMU 9.0.92 monitor - type 'help' for more information > (qemu) savevm foo > (qemu) loadvm foo > Missing section footer for cpu > Error: Error -22 while loading VM state > > So I'm happy that this patch is the right fix, and > it can have my > > Reviewed-by: Peter Maydell <peter.maydell@linaro.org> > > provided that we fix a couple of things in the commit message: > > (1) For QEMU, Signed-off-by lines should generally be your > full name, not a pseudonym (as I assume "armanincredible" is). > > (2) We should give in the commit message details of > what has been fixed and also a Resolves: line for the > gitlab issue, something like: > > ===begin== > target/ppc: Fix migration of CPUs with TLB_EMB TLB type > > In vmstate_tlbemb a cut-and-paste error meant we gave > this vmstate subsection the same "cpu/tlb6xx" name as > the vmstate_tlb6xx subsection. This breaks migration load > for any CPU using the TLB_EMB CPU type, because when we > see the "tlb6xx" name in the incoming data we try to > interpret it as a vmstate_tlb6xx subsection, which it > isn't the right format for: > > $ qemu-system-ppc -drive > if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2 > -monitor stdio -M bamboo > QEMU 9.0.92 monitor - type 'help' for more information > (qemu) savevm foo > (qemu) loadvm foo > Missing section footer for cpu > Error: Error -22 while loading VM state > > Correct the incorrect vmstate section name. Since migration > for these CPU types was completely broken before, we don't > need to care that this is a migration compatibility break. > > This affects the PPC 405, 440, 460 and e200 CPU families. > > Cc: qemu-stable@nongnu.org > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2522 > Signed-off-by: [your name here] <nabiev.arman13@gmail.com> > ===endit=== > > thanks > -- PMM >
diff --git a/target/ppc/machine.c b/target/ppc/machine.c index 731dd8df35..d433fd45fc 100644 --- a/target/ppc/machine.c +++ b/target/ppc/machine.c @@ -621,7 +621,7 @@ static bool tlbemb_needed(void *opaque) } static const VMStateDescription vmstate_tlbemb = { - .name = "cpu/tlb6xx", + .name = "cpu/tlbemb", .version_id = 1, .minimum_version_id = 1, .needed = tlbemb_needed,