diff mbox

migration: Update docs to discourage version bumps

Message ID 20170209181746.29896-1-dgilbert@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Dr. David Alan Gilbert Feb. 9, 2017, 6:17 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Version bumps break backwards migration; update the docs
to explain to people that's bad and how to avoid it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 docs/migration.txt | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

Comments

Markus Armbruster Feb. 10, 2017, 8:01 a.m. UTC | #1
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes:

> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Version bumps break backwards migration; update the docs
> to explain to people that's bad and how to avoid it.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  docs/migration.txt | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>
> diff --git a/docs/migration.txt b/docs/migration.txt
> index b462ead..f375f4b 100644
> --- a/docs/migration.txt
> +++ b/docs/migration.txt
> @@ -161,6 +161,11 @@ include/hw/hw.h.
>  
>  === More about versions ===
>  
> +Version numbers are intended for major incompatible changes to the
> +migration of a device, and using them breaks backwards-migration
> +compatibility; in general most changes can be made by adding Subsections
> +(see below) or _TEST macros (see below) which won't break compatibility.
> +
>  You can see that there are several version fields:
>  
>  - version_id: the maximum version_id supported by VMState for that device.
> @@ -175,6 +180,9 @@ version_id.  And the function load_state_old() (if present) is able to
>  load state from minimum_version_id_old to minimum_version_id.  This
>  function is deprecated and will be removed when no more users are left.
>  
> +The VMState with the 'version_id' value will always be generated and thus
> +can't be loaded by any older QEMU.
> +

Suggest active voice: "Saving state will always create ...".

>  ===  Massaging functions ===
>  
>  Sometimes, it is not enough to be able to save the state directly
> @@ -292,6 +300,40 @@ save/send this state when we are in the middle of a pio operation
>  not enabled, the values on that fields are garbage and don't need to
>  be sent.
>  
> +Using a condition function that checks a 'property' to determine whether
> +to send a subsection allows backwards migration compatibility.
> +For example;
> +   a) Add a new property using DEFINE_PROP_BOOL - e.g. support-foo and
> +      default it to true.
> +   b) Add an entry to the HW_COMPAT_ for the previous version
> +      that sets the property to false.
> +   c) Add a static bool  support_foo function

Let's add "... that tests the property"

> +   d) Add a subsection with a .needed set to the support_foo function
> +   e) (potentially) Add a pre_load that sets up a default value for 'foo'
> +      to be used if the subsection isn't loaded.
> +
> +Now that subsection will not be generated when using an older
> +machine type and the migration stream will be accepted by older
> +QEMU versions.

Suppressing a subsection for older machine types is obviously fine when
the device state transmitted in the subsection is unused with these
machine types.  This isn't usually the case, however.

If it's used, then it resets to a default state on migration.  Not
visible when it already is in the default state.  When it isn't,
migration has a side effect on the device, which can range from from
benign to disastrous.  Trade-off: ability to migrate vs. side effect.  I
think we should point it out explicitly.

When the side effect is too serious to accept, but non-default state is
sufficiently rare, we can choose to still enable backward migration in
default state, by having .needed() return "is in non-default state".
Improves "can't migrate backwards" to "occasionally can't migrate
backwards".  I think this technique should be mentioned as well.  I know
you dislike the "random" failures it brings; feel free to add dire
warnings.

> +
> += Not sending existing elements =
> +
> +Sometimes members of the VMState are no longer needed;
> +  removing them will break migration compatibility
> +  making them version dependent and bumping the version will break backwards
> +   migration compatibility.
> +
> +The best way is to:
> +  a) Add a new property/compatibility/function in the same way for subsections
> +    above.
> +  b) replace the VMSTATE macro with the _TEST version of the macro, e.g.:
> +     VMSTATE_UINT32(foo, barstruct)
> +   becomes
> +     VMSTATE_UINT32_TEST(foo, barstruct, pre_version_baz)
> +
> +  Sometime in the future when we no longer care about the ancient
> +versions these can be killed off.
> +
>  = Return path =
>  
>  In most migration scenarios there is only a single data path that runs

Lovely improvement, thanks!
Dr. David Alan Gilbert Feb. 10, 2017, 11:04 a.m. UTC | #2
* Markus Armbruster (armbru@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes:
> 
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > Version bumps break backwards migration; update the docs
> > to explain to people that's bad and how to avoid it.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  docs/migration.txt | 42 ++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 42 insertions(+)
> >
> > diff --git a/docs/migration.txt b/docs/migration.txt
> > index b462ead..f375f4b 100644
> > --- a/docs/migration.txt
> > +++ b/docs/migration.txt
> > @@ -161,6 +161,11 @@ include/hw/hw.h.
> >  
> >  === More about versions ===
> >  
> > +Version numbers are intended for major incompatible changes to the
> > +migration of a device, and using them breaks backwards-migration
> > +compatibility; in general most changes can be made by adding Subsections
> > +(see below) or _TEST macros (see below) which won't break compatibility.
> > +
> >  You can see that there are several version fields:
> >  
> >  - version_id: the maximum version_id supported by VMState for that device.
> > @@ -175,6 +180,9 @@ version_id.  And the function load_state_old() (if present) is able to
> >  load state from minimum_version_id_old to minimum_version_id.  This
> >  function is deprecated and will be removed when no more users are left.
> >  
> > +The VMState with the 'version_id' value will always be generated and thus
> > +can't be loaded by any older QEMU.
> > +
> 
> Suggest active voice: "Saving state will always create ...".
> 
> >  ===  Massaging functions ===
> >  
> >  Sometimes, it is not enough to be able to save the state directly
> > @@ -292,6 +300,40 @@ save/send this state when we are in the middle of a pio operation
> >  not enabled, the values on that fields are garbage and don't need to
> >  be sent.
> >  
> > +Using a condition function that checks a 'property' to determine whether
> > +to send a subsection allows backwards migration compatibility.
> > +For example;
> > +   a) Add a new property using DEFINE_PROP_BOOL - e.g. support-foo and
> > +      default it to true.
> > +   b) Add an entry to the HW_COMPAT_ for the previous version
> > +      that sets the property to false.
> > +   c) Add a static bool  support_foo function
> 
> Let's add "... that tests the property"
> 
> > +   d) Add a subsection with a .needed set to the support_foo function
> > +   e) (potentially) Add a pre_load that sets up a default value for 'foo'
> > +      to be used if the subsection isn't loaded.
> > +
> > +Now that subsection will not be generated when using an older
> > +machine type and the migration stream will be accepted by older
> > +QEMU versions.
> 
> Suppressing a subsection for older machine types is obviously fine when
> the device state transmitted in the subsection is unused with these
> machine types.  This isn't usually the case, however.
> 
> If it's used, then it resets to a default state on migration.  Not
> visible when it already is in the default state.  When it isn't,
> migration has a side effect on the device, which can range from from
> benign to disastrous.  Trade-off: ability to migrate vs. side effect.  I
> think we should point it out explicitly.
> 
> When the side effect is too serious to accept, but non-default state is
> sufficiently rare, we can choose to still enable backward migration in
> default state, by having .needed() return "is in non-default state".
> Improves "can't migrate backwards" to "occasionally can't migrate
> backwards".  I think this technique should be mentioned as well.  I know
> you dislike the "random" failures it brings; feel free to add dire
> warnings.

OK, I've added an extra paragraph for that in the v2 I just posted.

Dave

> > +
> > += Not sending existing elements =
> > +
> > +Sometimes members of the VMState are no longer needed;
> > +  removing them will break migration compatibility
> > +  making them version dependent and bumping the version will break backwards
> > +   migration compatibility.
> > +
> > +The best way is to:
> > +  a) Add a new property/compatibility/function in the same way for subsections
> > +    above.
> > +  b) replace the VMSTATE macro with the _TEST version of the macro, e.g.:
> > +     VMSTATE_UINT32(foo, barstruct)
> > +   becomes
> > +     VMSTATE_UINT32_TEST(foo, barstruct, pre_version_baz)
> > +
> > +  Sometime in the future when we no longer care about the ancient
> > +versions these can be killed off.
> > +
> >  = Return path =
> >  
> >  In most migration scenarios there is only a single data path that runs
> 
> Lovely improvement, thanks!
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff mbox

Patch

diff --git a/docs/migration.txt b/docs/migration.txt
index b462ead..f375f4b 100644
--- a/docs/migration.txt
+++ b/docs/migration.txt
@@ -161,6 +161,11 @@  include/hw/hw.h.
 
 === More about versions ===
 
+Version numbers are intended for major incompatible changes to the
+migration of a device, and using them breaks backwards-migration
+compatibility; in general most changes can be made by adding Subsections
+(see below) or _TEST macros (see below) which won't break compatibility.
+
 You can see that there are several version fields:
 
 - version_id: the maximum version_id supported by VMState for that device.
@@ -175,6 +180,9 @@  version_id.  And the function load_state_old() (if present) is able to
 load state from minimum_version_id_old to minimum_version_id.  This
 function is deprecated and will be removed when no more users are left.
 
+The VMState with the 'version_id' value will always be generated and thus
+can't be loaded by any older QEMU.
+
 ===  Massaging functions ===
 
 Sometimes, it is not enough to be able to save the state directly
@@ -292,6 +300,40 @@  save/send this state when we are in the middle of a pio operation
 not enabled, the values on that fields are garbage and don't need to
 be sent.
 
+Using a condition function that checks a 'property' to determine whether
+to send a subsection allows backwards migration compatibility.
+For example;
+   a) Add a new property using DEFINE_PROP_BOOL - e.g. support-foo and
+      default it to true.
+   b) Add an entry to the HW_COMPAT_ for the previous version
+      that sets the property to false.
+   c) Add a static bool  support_foo function
+   d) Add a subsection with a .needed set to the support_foo function
+   e) (potentially) Add a pre_load that sets up a default value for 'foo'
+      to be used if the subsection isn't loaded.
+
+Now that subsection will not be generated when using an older
+machine type and the migration stream will be accepted by older
+QEMU versions.
+
+= Not sending existing elements =
+
+Sometimes members of the VMState are no longer needed;
+  removing them will break migration compatibility
+  making them version dependent and bumping the version will break backwards
+   migration compatibility.
+
+The best way is to:
+  a) Add a new property/compatibility/function in the same way for subsections
+    above.
+  b) replace the VMSTATE macro with the _TEST version of the macro, e.g.:
+     VMSTATE_UINT32(foo, barstruct)
+   becomes
+     VMSTATE_UINT32_TEST(foo, barstruct, pre_version_baz)
+
+  Sometime in the future when we no longer care about the ancient
+versions these can be killed off.
+
 = Return path =
 
 In most migration scenarios there is only a single data path that runs