diff mbox series

target/loongarch/kvm: Fix VM recovery from disk failures

Message ID 20240430012356.2620763-1-gaosong@loongson.cn (mailing list archive)
State New, archived
Headers show
Series target/loongarch/kvm: Fix VM recovery from disk failures | expand

Commit Message

gaosong April 30, 2024, 1:23 a.m. UTC
vmstate does not save kvm_state_conter,
which can cause VM recovery from disk to fail.

Signed-off-by: Song Gao <gaosong@loongson.cn>
---
 target/loongarch/machine.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Philippe Mathieu-Daudé April 30, 2024, 8:53 a.m. UTC | #1
(Cc'ing migration maintainers)

On 30/4/24 03:23, Song Gao wrote:
> vmstate does not save kvm_state_conter,
> which can cause VM recovery from disk to fail.

Cc: qemu-stable@nongnu.org
Fixes: d11681c94f ("target/loongarch: Implement kvm_arch_init_vcpu")

> Signed-off-by: Song Gao <gaosong@loongson.cn>
> ---
>   target/loongarch/machine.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
> index c7029fb9b4..4cd1bf06ff 100644
> --- a/target/loongarch/machine.c
> +++ b/target/loongarch/machine.c
> @@ -191,6 +191,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
>           VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
>                                0, vmstate_tlb, LoongArchTLB),
>   
> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
> +
>           VMSTATE_END_OF_LIST()
>       },
>       .subsections = (const VMStateDescription * const []) {

The migration stream is versioned, so you should increase it,
but this field is only relevant for KVM (it shouldn't be there
in non-KVM builds). IMHO the correct migration way to fix that
is (untested):

-- >8 --
diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
index c7029fb9b4..08032c6d71 100644
--- a/target/loongarch/machine.c
+++ b/target/loongarch/machine.c
@@ -8,8 +8,27 @@
  #include "qemu/osdep.h"
  #include "cpu.h"
  #include "migration/cpu.h"
+#include "sysemu/kvm.h"
  #include "vec.h"

+#ifdef CONFIG_KVM
+static bool kvmcpu_needed(void *opaque)
+{
+    return kvm_enabled();
+}
+
+static const VMStateDescription vmstate_kvmtimer = {
+    .name = "cpu/kvmtimer",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = kvmcpu_needed,
+    .fields = (const VMStateField[]) {
+        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
+        VMSTATE_END_OF_LIST()
+    }
+};
+#endif /* CONFIG_KVM */
+
  static const VMStateDescription vmstate_fpu_reg = {
      .name = "fpu_reg",
      .version_id = 1,
@@ -194,6 +213,9 @@ const VMStateDescription vmstate_loongarch_cpu = {
          VMSTATE_END_OF_LIST()
      },
      .subsections = (const VMStateDescription * const []) {
+#ifdef CONFIG_KVM
+        &vmstate_kvmcpu,
+#endif
          &vmstate_fpu,
          &vmstate_lsx,
          &vmstate_lasx,
---
Fabiano Rosas April 30, 2024, 2 p.m. UTC | #2
Philippe Mathieu-Daudé <philmd@linaro.org> writes:

> (Cc'ing migration maintainers)
>
> On 30/4/24 03:23, Song Gao wrote:
>> vmstate does not save kvm_state_conter,
>> which can cause VM recovery from disk to fail.
>
> Cc: qemu-stable@nongnu.org
> Fixes: d11681c94f ("target/loongarch: Implement kvm_arch_init_vcpu")
>
>> Signed-off-by: Song Gao <gaosong@loongson.cn>
>> ---
>>   target/loongarch/machine.c | 2 ++
>>   1 file changed, 2 insertions(+)
>> 
>> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
>> index c7029fb9b4..4cd1bf06ff 100644
>> --- a/target/loongarch/machine.c
>> +++ b/target/loongarch/machine.c
>> @@ -191,6 +191,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
>>           VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
>>                                0, vmstate_tlb, LoongArchTLB),
>>   
>> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
>> +
>>           VMSTATE_END_OF_LIST()
>>       },
>>       .subsections = (const VMStateDescription * const []) {
>
> The migration stream is versioned, so you should increase it,
> but this field is only relevant for KVM (it shouldn't be there
> in non-KVM builds). IMHO the correct migration way to fix that
> is (untested):
>
> -- >8 --
> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
> index c7029fb9b4..08032c6d71 100644
> --- a/target/loongarch/machine.c
> +++ b/target/loongarch/machine.c
> @@ -8,8 +8,27 @@
>   #include "qemu/osdep.h"
>   #include "cpu.h"
>   #include "migration/cpu.h"
> +#include "sysemu/kvm.h"
>   #include "vec.h"
>
> +#ifdef CONFIG_KVM
> +static bool kvmcpu_needed(void *opaque)
> +{
> +    return kvm_enabled();
> +}
> +
> +static const VMStateDescription vmstate_kvmtimer = {
> +    .name = "cpu/kvmtimer",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = kvmcpu_needed,
> +    .fields = (const VMStateField[]) {
> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +#endif /* CONFIG_KVM */
> +
>   static const VMStateDescription vmstate_fpu_reg = {
>       .name = "fpu_reg",
>       .version_id = 1,
> @@ -194,6 +213,9 @@ const VMStateDescription vmstate_loongarch_cpu = {
>           VMSTATE_END_OF_LIST()
>       },
>       .subsections = (const VMStateDescription * const []) {
> +#ifdef CONFIG_KVM
> +        &vmstate_kvmcpu,
> +#endif
>           &vmstate_fpu,
>           &vmstate_lsx,
>           &vmstate_lasx,
> ---

LGTM, I'd just leave only the .needed function under CONFIG_KVM instead
of the whole subsection.
Peter Xu May 1, 2024, 3:45 p.m. UTC | #3
On Tue, Apr 30, 2024 at 11:00:24AM -0300, Fabiano Rosas wrote:
> Philippe Mathieu-Daudé <philmd@linaro.org> writes:
> 
> > (Cc'ing migration maintainers)
> >
> > On 30/4/24 03:23, Song Gao wrote:
> >> vmstate does not save kvm_state_conter,
> >> which can cause VM recovery from disk to fail.
> >
> > Cc: qemu-stable@nongnu.org
> > Fixes: d11681c94f ("target/loongarch: Implement kvm_arch_init_vcpu")
> >
> >> Signed-off-by: Song Gao <gaosong@loongson.cn>
> >> ---
> >>   target/loongarch/machine.c | 2 ++
> >>   1 file changed, 2 insertions(+)
> >> 
> >> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
> >> index c7029fb9b4..4cd1bf06ff 100644
> >> --- a/target/loongarch/machine.c
> >> +++ b/target/loongarch/machine.c
> >> @@ -191,6 +191,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
> >>           VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
> >>                                0, vmstate_tlb, LoongArchTLB),
> >>   
> >> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
> >> +
> >>           VMSTATE_END_OF_LIST()
> >>       },
> >>       .subsections = (const VMStateDescription * const []) {
> >
> > The migration stream is versioned, so you should increase it,
> > but this field is only relevant for KVM (it shouldn't be there
> > in non-KVM builds). IMHO the correct migration way to fix that
> > is (untested):
> >
> > -- >8 --
> > diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
> > index c7029fb9b4..08032c6d71 100644
> > --- a/target/loongarch/machine.c
> > +++ b/target/loongarch/machine.c
> > @@ -8,8 +8,27 @@
> >   #include "qemu/osdep.h"
> >   #include "cpu.h"
> >   #include "migration/cpu.h"
> > +#include "sysemu/kvm.h"
> >   #include "vec.h"
> >
> > +#ifdef CONFIG_KVM
> > +static bool kvmcpu_needed(void *opaque)
> > +{
> > +    return kvm_enabled();
> > +}
> > +
> > +static const VMStateDescription vmstate_kvmtimer = {
> > +    .name = "cpu/kvmtimer",
> > +    .version_id = 1,
> > +    .minimum_version_id = 1,
> > +    .needed = kvmcpu_needed,
> > +    .fields = (const VMStateField[]) {
> > +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
> > +        VMSTATE_END_OF_LIST()
> > +    }
> > +};
> > +#endif /* CONFIG_KVM */
> > +
> >   static const VMStateDescription vmstate_fpu_reg = {
> >       .name = "fpu_reg",
> >       .version_id = 1,
> > @@ -194,6 +213,9 @@ const VMStateDescription vmstate_loongarch_cpu = {
> >           VMSTATE_END_OF_LIST()
> >       },
> >       .subsections = (const VMStateDescription * const []) {
> > +#ifdef CONFIG_KVM
> > +        &vmstate_kvmcpu,
> > +#endif
> >           &vmstate_fpu,
> >           &vmstate_lsx,
> >           &vmstate_lasx,
> > ---
> 
> LGTM, I'd just leave only the .needed function under CONFIG_KVM instead
> of the whole subsection.

But when !KVM it means there's no ".needed" and it'll still be migrated?

IMHO it depends on whether loongarch is in the state already of trying to
keep its ABI at all.  I think we should still try to enjoy the time when
that ABI is not required, then we can simply add whatever fields, and let
things break with no big deal.

Note that if with CONFIG_KVM it means it can break migration between kvm
v.s. tcg when only one qemu enabled kvm when compile.  Considering the
patch is from the maintainer (which seems to say "breaking that is all
fine!") I'd say the original patch looks good actually, as it allows kvm /
tcg migrations too as a baseline.

Thanks,
Fabiano Rosas May 2, 2024, 12:45 p.m. UTC | #4
Peter Xu <peterx@redhat.com> writes:

> On Tue, Apr 30, 2024 at 11:00:24AM -0300, Fabiano Rosas wrote:
>> Philippe Mathieu-Daudé <philmd@linaro.org> writes:
>> 
>> > (Cc'ing migration maintainers)
>> >
>> > On 30/4/24 03:23, Song Gao wrote:
>> >> vmstate does not save kvm_state_conter,
>> >> which can cause VM recovery from disk to fail.
>> >
>> > Cc: qemu-stable@nongnu.org
>> > Fixes: d11681c94f ("target/loongarch: Implement kvm_arch_init_vcpu")
>> >
>> >> Signed-off-by: Song Gao <gaosong@loongson.cn>
>> >> ---
>> >>   target/loongarch/machine.c | 2 ++
>> >>   1 file changed, 2 insertions(+)
>> >> 
>> >> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
>> >> index c7029fb9b4..4cd1bf06ff 100644
>> >> --- a/target/loongarch/machine.c
>> >> +++ b/target/loongarch/machine.c
>> >> @@ -191,6 +191,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
>> >>           VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
>> >>                                0, vmstate_tlb, LoongArchTLB),
>> >>   
>> >> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
>> >> +
>> >>           VMSTATE_END_OF_LIST()
>> >>       },
>> >>       .subsections = (const VMStateDescription * const []) {
>> >
>> > The migration stream is versioned, so you should increase it,
>> > but this field is only relevant for KVM (it shouldn't be there
>> > in non-KVM builds). IMHO the correct migration way to fix that
>> > is (untested):
>> >
>> > -- >8 --
>> > diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
>> > index c7029fb9b4..08032c6d71 100644
>> > --- a/target/loongarch/machine.c
>> > +++ b/target/loongarch/machine.c
>> > @@ -8,8 +8,27 @@
>> >   #include "qemu/osdep.h"
>> >   #include "cpu.h"
>> >   #include "migration/cpu.h"
>> > +#include "sysemu/kvm.h"
>> >   #include "vec.h"
>> >
>> > +#ifdef CONFIG_KVM
>> > +static bool kvmcpu_needed(void *opaque)
>> > +{
>> > +    return kvm_enabled();
>> > +}
>> > +
>> > +static const VMStateDescription vmstate_kvmtimer = {
>> > +    .name = "cpu/kvmtimer",
>> > +    .version_id = 1,
>> > +    .minimum_version_id = 1,
>> > +    .needed = kvmcpu_needed,
>> > +    .fields = (const VMStateField[]) {
>> > +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
>> > +        VMSTATE_END_OF_LIST()
>> > +    }
>> > +};
>> > +#endif /* CONFIG_KVM */
>> > +
>> >   static const VMStateDescription vmstate_fpu_reg = {
>> >       .name = "fpu_reg",
>> >       .version_id = 1,
>> > @@ -194,6 +213,9 @@ const VMStateDescription vmstate_loongarch_cpu = {
>> >           VMSTATE_END_OF_LIST()
>> >       },
>> >       .subsections = (const VMStateDescription * const []) {
>> > +#ifdef CONFIG_KVM
>> > +        &vmstate_kvmcpu,
>> > +#endif
>> >           &vmstate_fpu,
>> >           &vmstate_lsx,
>> >           &vmstate_lasx,
>> > ---
>> 
>> LGTM, I'd just leave only the .needed function under CONFIG_KVM instead
>> of the whole subsection.
>
> But when !KVM it means there's no ".needed" and it'll still be migrated?

I expressed myself poorly, I meant put the return from .needed under
CONFIG_KVM. But that is not even necessary, kvm_enabled() is enough.

>
> IMHO it depends on whether loongarch is in the state already of trying to
> keep its ABI at all.  I think we should still try to enjoy the time when
> that ABI is not required, then we can simply add whatever fields, and let
> things break with no big deal.
>
> Note that if with CONFIG_KVM it means it can break migration between kvm
> v.s. tcg when only one qemu enabled kvm when compile.  Considering the
> patch is from the maintainer (which seems to say "breaking that is all
> fine!") I'd say the original patch looks good actually, as it allows kvm /
> tcg migrations too as a baseline.

I'm fine with this approach as well.

>
> Thanks,
gaosong May 7, 2024, 8:12 a.m. UTC | #5
Thanks for the comments !

在 2024/5/2 下午8:45, Fabiano Rosas 写道:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Tue, Apr 30, 2024 at 11:00:24AM -0300, Fabiano Rosas wrote:
>>> Philippe Mathieu-Daudé <philmd@linaro.org> writes:
>>>
>>>> (Cc'ing migration maintainers)
>>>>
>>>> On 30/4/24 03:23, Song Gao wrote:
>>>>> vmstate does not save kvm_state_conter,
>>>>> which can cause VM recovery from disk to fail.
>>>> Cc: qemu-stable@nongnu.org
>>>> Fixes: d11681c94f ("target/loongarch: Implement kvm_arch_init_vcpu")
>>>>
>>>>> Signed-off-by: Song Gao <gaosong@loongson.cn>
>>>>> ---
>>>>>    target/loongarch/machine.c | 2 ++
>>>>>    1 file changed, 2 insertions(+)
>>>>>
>>>>> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
>>>>> index c7029fb9b4..4cd1bf06ff 100644
>>>>> --- a/target/loongarch/machine.c
>>>>> +++ b/target/loongarch/machine.c
>>>>> @@ -191,6 +191,8 @@ const VMStateDescription vmstate_loongarch_cpu = {
>>>>>            VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
>>>>>                                 0, vmstate_tlb, LoongArchTLB),
>>>>>    
>>>>> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
>>>>> +
>>>>>            VMSTATE_END_OF_LIST()
>>>>>        },
>>>>>        .subsections = (const VMStateDescription * const []) {
>>>> The migration stream is versioned, so you should increase it,
>>>> but this field is only relevant for KVM (it shouldn't be there
>>>> in non-KVM builds). IMHO the correct migration way to fix that
>>>> is (untested):
>>>>
>>>> -- >8 --
>>>> diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
>>>> index c7029fb9b4..08032c6d71 100644
>>>> --- a/target/loongarch/machine.c
>>>> +++ b/target/loongarch/machine.c
>>>> @@ -8,8 +8,27 @@
>>>>    #include "qemu/osdep.h"
>>>>    #include "cpu.h"
>>>>    #include "migration/cpu.h"
>>>> +#include "sysemu/kvm.h"
>>>>    #include "vec.h"
>>>>
>>>> +#ifdef CONFIG_KVM
>>>> +static bool kvmcpu_needed(void *opaque)
>>>> +{
>>>> +    return kvm_enabled();
>>>> +}
>>>> +
>>>> +static const VMStateDescription vmstate_kvmtimer = {
>>>> +    .name = "cpu/kvmtimer",
>>>> +    .version_id = 1,
>>>> +    .minimum_version_id = 1,
>>>> +    .needed = kvmcpu_needed,
>>>> +    .fields = (const VMStateField[]) {
>>>> +        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
>>>> +        VMSTATE_END_OF_LIST()
>>>> +    }
>>>> +};
>>>> +#endif /* CONFIG_KVM */
>>>> +
>>>>    static const VMStateDescription vmstate_fpu_reg = {
>>>>        .name = "fpu_reg",
>>>>        .version_id = 1,
>>>> @@ -194,6 +213,9 @@ const VMStateDescription vmstate_loongarch_cpu = {
>>>>            VMSTATE_END_OF_LIST()
>>>>        },
>>>>        .subsections = (const VMStateDescription * const []) {
>>>> +#ifdef CONFIG_KVM
>>>> +        &vmstate_kvmcpu,
>>>> +#endif
>>>>            &vmstate_fpu,
>>>>            &vmstate_lsx,
>>>>            &vmstate_lasx,
>>>> ---
>>> LGTM, I'd just leave only the .needed function under CONFIG_KVM instead
>>> of the whole subsection.
>> But when !KVM it means there's no ".needed" and it'll still be migrated?
> I expressed myself poorly, I meant put the return from .needed under
> CONFIG_KVM. But that is not even necessary, kvm_enabled() is enough.
>
>> IMHO it depends on whether loongarch is in the state already of trying to
>> keep its ABI at all.  I think we should still try to enjoy the time when
>> that ABI is not required, then we can simply add whatever fields, and let
>> things break with no big deal.
>>
>> Note that if with CONFIG_KVM it means it can break migration between kvm
>> v.s. tcg when only one qemu enabled kvm when compile.
Just remove CONIFG_KVM  would be OK?

Thanks.
Song Gao
>>    Considering the
>> patch is from the maintainer (which seems to say "breaking that is all
>> fine!") I'd say the original patch looks good actually, as it allows kvm /
>> tcg migrations too as a baseline.
> I'm fine with this approach as well.
>
>> Thanks,
Peter Xu May 7, 2024, 3:47 p.m. UTC | #6
On Tue, May 07, 2024 at 04:12:34PM +0800, gaosong wrote:
> Just remove CONIFG_KVM  would be OK?

You're the loongarch maintainer so I'd say your call. :)

If you're not yet sure, IMHO we should go with the simplest, which is the
original oneliner patch.

Thanks,
Peter Maydell May 7, 2024, 3:52 p.m. UTC | #7
On Tue, 7 May 2024 at 16:47, Peter Xu <peterx@redhat.com> wrote:
>
> On Tue, May 07, 2024 at 04:12:34PM +0800, gaosong wrote:
> > Just remove CONIFG_KVM  would be OK?
>
> You're the loongarch maintainer so I'd say your call. :)
>
> If you're not yet sure, IMHO we should go with the simplest, which is the
> original oneliner patch.

The original patch needs to also bump the version numbers when
it adds the new field.

Even when we do not wish to maintain migration compatibility,
bumping the version number means that users get a (more or less)
helpful error message if they try an unsupported cross-version
migration, rather than weird behaviour.

thanks
-- PMM
diff mbox series

Patch

diff --git a/target/loongarch/machine.c b/target/loongarch/machine.c
index c7029fb9b4..4cd1bf06ff 100644
--- a/target/loongarch/machine.c
+++ b/target/loongarch/machine.c
@@ -191,6 +191,8 @@  const VMStateDescription vmstate_loongarch_cpu = {
         VMSTATE_STRUCT_ARRAY(env.tlb, LoongArchCPU, LOONGARCH_TLB_MAX,
                              0, vmstate_tlb, LoongArchTLB),
 
+        VMSTATE_UINT64(kvm_state_counter, LoongArchCPU),
+
         VMSTATE_END_OF_LIST()
     },
     .subsections = (const VMStateDescription * const []) {