diff mbox series

[v3] failover: unregister ROM on unplug

Message ID 20210721093955.225759-1-lvivier@redhat.com (mailing list archive)
State New, archived
Headers show
Series [v3] failover: unregister ROM on unplug | expand

Commit Message

Laurent Vivier July 21, 2021, 9:39 a.m. UTC
The intend of failover is to allow to migrate a VM with a VFIO
networking card without disrupting the network operation by switching
to a virtio-net device during the migration.

This simple change allows to test failover with a simulated device
like e1000e rather than a vfio device, even if it's useless in real
life it can help to debug failover.

This is interesting to developers that want to test failover on
a system with no vfio device. Moreover it simplifies host networking
configuration as we can use the same bridge for virtio-net and
the other failover networking device.

Without this change the migration of a system configured with failover
fails with:

  ...
  -device virtio-net-pci,id=virtionet0,failover=on,...  \
  -device e1000,failover_pair_id=virtionet0,... \
  ...

  (qemu) migrate ...

  Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
  error while loading state for instance 0x0 of device 'ram'
  load of migration failed: Invalid argument

This happens because QEMU correctly unregisters the interface vmstate but
not the ROM one. This patch fixes that.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---

Notes:
    v3:
      remove useless space before comma
    
    v2:
      reset has_rom to false
      update commit log message

 hw/net/virtio-net.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Dr. David Alan Gilbert July 21, 2021, 11:10 a.m. UTC | #1
* Laurent Vivier (lvivier@redhat.com) wrote:
> The intend of failover is to allow to migrate a VM with a VFIO
> networking card without disrupting the network operation by switching
> to a virtio-net device during the migration.
> 
> This simple change allows to test failover with a simulated device
> like e1000e rather than a vfio device, even if it's useless in real
> life it can help to debug failover.
> 
> This is interesting to developers that want to test failover on
> a system with no vfio device. Moreover it simplifies host networking
> configuration as we can use the same bridge for virtio-net and
> the other failover networking device.
> 
> Without this change the migration of a system configured with failover
> fails with:
> 
>   ...
>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
>   -device e1000,failover_pair_id=virtionet0,... \
>   ...
> 
>   (qemu) migrate ...
> 
>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
>   error while loading state for instance 0x0 of device 'ram'
>   load of migration failed: Invalid argument
> 
> This happens because QEMU correctly unregisters the interface vmstate but
> not the ROM one. This patch fixes that.
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> ---
> 
> Notes:
>     v3:
>       remove useless space before comma
>     
>     v2:
>       reset has_rom to false
>       update commit log message
> 
>  hw/net/virtio-net.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 16d20cdee52a..c0c2ec1ebb98 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3256,6 +3256,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
>      if (migration_in_setup(s) && !should_be_hidden) {
>          if (failover_unplug_primary(n, dev)) {
>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
> +            if (PCI_DEVICE(dev)->has_rom) {
> +                PCI_DEVICE(dev)->has_rom = false;
> +                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
> +            }

Not actually originated by your fix, but....

Why doesn't failover_replug_primary re-add the vmstates?

(I did wonder if passing rom-file="" to the e1000 would help in your
testing case, but it still creates the RAM image).

Dave

>              qapi_event_send_unplug_primary(dev->id);
>              qatomic_set(&n->failover_primary_hidden, true);
>          } else {
> -- 
> 2.31.1
>
Laurent Vivier July 21, 2021, 11:22 a.m. UTC | #2
On 21/07/2021 13:10, Dr. David Alan Gilbert wrote:
> * Laurent Vivier (lvivier@redhat.com) wrote:
>> The intend of failover is to allow to migrate a VM with a VFIO
>> networking card without disrupting the network operation by switching
>> to a virtio-net device during the migration.
>>
>> This simple change allows to test failover with a simulated device
>> like e1000e rather than a vfio device, even if it's useless in real
>> life it can help to debug failover.
>>
>> This is interesting to developers that want to test failover on
>> a system with no vfio device. Moreover it simplifies host networking
>> configuration as we can use the same bridge for virtio-net and
>> the other failover networking device.
>>
>> Without this change the migration of a system configured with failover
>> fails with:
>>
>>   ...
>>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
>>   -device e1000,failover_pair_id=virtionet0,... \
>>   ...
>>
>>   (qemu) migrate ...
>>
>>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
>>   error while loading state for instance 0x0 of device 'ram'
>>   load of migration failed: Invalid argument
>>
>> This happens because QEMU correctly unregisters the interface vmstate but
>> not the ROM one. This patch fixes that.
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>>
>> Notes:
>>     v3:
>>       remove useless space before comma
>>     
>>     v2:
>>       reset has_rom to false
>>       update commit log message
>>
>>  hw/net/virtio-net.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 16d20cdee52a..c0c2ec1ebb98 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -3256,6 +3256,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
>>      if (migration_in_setup(s) && !should_be_hidden) {
>>          if (failover_unplug_primary(n, dev)) {
>>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>> +            if (PCI_DEVICE(dev)->has_rom) {
>> +                PCI_DEVICE(dev)->has_rom = false;
>> +                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
>> +            }
> 
> Not actually originated by your fix, but....
> 
> Why doesn't failover_replug_primary re-add the vmstates?

Good point.

I'm going to check but I think the vmstates are re-added by the hotplug operations that
are used in failover_replug_primary().

Thanks,
Laurent
Juan Quintela July 21, 2021, 1:41 p.m. UTC | #3
"Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> * Laurent Vivier (lvivier@redhat.com) wrote:
>> The intend of failover is to allow to migrate a VM with a VFIO
>> networking card without disrupting the network operation by switching
>> to a virtio-net device during the migration.
>> 
>> This simple change allows to test failover with a simulated device
>> like e1000e rather than a vfio device, even if it's useless in real
>> life it can help to debug failover.
>> 
>> This is interesting to developers that want to test failover on
>> a system with no vfio device. Moreover it simplifies host networking
>> configuration as we can use the same bridge for virtio-net and
>> the other failover networking device.
>> 
>> Without this change the migration of a system configured with failover
>> fails with:
>> 
>>   ...
>>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
>>   -device e1000,failover_pair_id=virtionet0,... \
>>   ...
>> 
>>   (qemu) migrate ...
>> 
>>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
>>   error while loading state for instance 0x0 of device 'ram'
>>   load of migration failed: Invalid argument
>> 
>> This happens because QEMU correctly unregisters the interface vmstate but
>> not the ROM one. This patch fixes that.
>> 
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> ---
>> 
>> Notes:
>>     v3:
>>       remove useless space before comma
>>     
>>     v2:
>>       reset has_rom to false
>>       update commit log message
>> 
>>  hw/net/virtio-net.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>> 
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 16d20cdee52a..c0c2ec1ebb98 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -3256,6 +3256,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
>>      if (migration_in_setup(s) && !should_be_hidden) {
>>          if (failover_unplug_primary(n, dev)) {
>>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>> +            if (PCI_DEVICE(dev)->has_rom) {
>> +                PCI_DEVICE(dev)->has_rom = false;
>> +                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
>> +            }
>
> Not actually originated by your fix, but....
>
> Why doesn't failover_replug_primary re-add the vmstates?

Because we can't migrate until the "unplug" has happened.
Yes, it is a mess.

I think this is the saner patch that I can think of for that
functionality.

What I wonder is why we register rom as ram, but I guess that the rom
can be updated from userspace, or who knows.

Later, Juan.

> (I did wonder if passing rom-file="" to the e1000 would help in your
> testing case, but it still creates the RAM image).
>
> Dave
>
>>              qapi_event_send_unplug_primary(dev->id);
>>              qatomic_set(&n->failover_primary_hidden, true);
>>          } else {
>> -- 
>> 2.31.1
>>
Dr. David Alan Gilbert July 21, 2021, 1:45 p.m. UTC | #4
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > * Laurent Vivier (lvivier@redhat.com) wrote:
> >> The intend of failover is to allow to migrate a VM with a VFIO
> >> networking card without disrupting the network operation by switching
> >> to a virtio-net device during the migration.
> >> 
> >> This simple change allows to test failover with a simulated device
> >> like e1000e rather than a vfio device, even if it's useless in real
> >> life it can help to debug failover.
> >> 
> >> This is interesting to developers that want to test failover on
> >> a system with no vfio device. Moreover it simplifies host networking
> >> configuration as we can use the same bridge for virtio-net and
> >> the other failover networking device.
> >> 
> >> Without this change the migration of a system configured with failover
> >> fails with:
> >> 
> >>   ...
> >>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
> >>   -device e1000,failover_pair_id=virtionet0,... \
> >>   ...
> >> 
> >>   (qemu) migrate ...
> >> 
> >>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
> >>   error while loading state for instance 0x0 of device 'ram'
> >>   load of migration failed: Invalid argument
> >> 
> >> This happens because QEMU correctly unregisters the interface vmstate but
> >> not the ROM one. This patch fixes that.
> >> 
> >> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >> ---
> >> 
> >> Notes:
> >>     v3:
> >>       remove useless space before comma
> >>     
> >>     v2:
> >>       reset has_rom to false
> >>       update commit log message
> >> 
> >>  hw/net/virtio-net.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >> 
> >> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >> index 16d20cdee52a..c0c2ec1ebb98 100644
> >> --- a/hw/net/virtio-net.c
> >> +++ b/hw/net/virtio-net.c
> >> @@ -3256,6 +3256,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
> >>      if (migration_in_setup(s) && !should_be_hidden) {
> >>          if (failover_unplug_primary(n, dev)) {
> >>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
> >> +            if (PCI_DEVICE(dev)->has_rom) {
> >> +                PCI_DEVICE(dev)->has_rom = false;
> >> +                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
> >> +            }
> >
> > Not actually originated by your fix, but....
> >
> > Why doesn't failover_replug_primary re-add the vmstates?
> 
> Because we can't migrate until the "unplug" has happened.
> Yes, it is a mess.

But if the migrate fails, shouldn't it add it back?

Dave

> I think this is the saner patch that I can think of for that
> functionality.
> 
> What I wonder is why we register rom as ram, but I guess that the rom
> can be updated from userspace, or who knows.
> 
> Later, Juan.
> 
> > (I did wonder if passing rom-file="" to the e1000 would help in your
> > testing case, but it still creates the RAM image).
> >
> > Dave
> >
> >>              qapi_event_send_unplug_primary(dev->id);
> >>              qatomic_set(&n->failover_primary_hidden, true);
> >>          } else {
> >> -- 
> >> 2.31.1
> >> 
>
Juan Quintela July 21, 2021, 1:56 p.m. UTC | #5
Laurent Vivier <lvivier@redhat.com> wrote:
> The intend of failover is to allow to migrate a VM with a VFIO
> networking card without disrupting the network operation by switching
> to a virtio-net device during the migration.
>
> This simple change allows to test failover with a simulated device
> like e1000e rather than a vfio device, even if it's useless in real
> life it can help to debug failover.
>
> This is interesting to developers that want to test failover on
> a system with no vfio device. Moreover it simplifies host networking
> configuration as we can use the same bridge for virtio-net and
> the other failover networking device.
>
> Without this change the migration of a system configured with failover
> fails with:
>
>   ...
>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
>   -device e1000,failover_pair_id=virtionet0,... \
>   ...
>
>   (qemu) migrate ...
>
>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
>   error while loading state for instance 0x0 of device 'ram'
>   load of migration failed: Invalid argument
>
> This happens because QEMU correctly unregisters the interface vmstate but
> not the ROM one. This patch fixes that.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>


Reviewed-by: Juan Quintela <quintela@redhat.com>

As this is only for testing.
Philippe Mathieu-Daudé July 21, 2021, 2:01 p.m. UTC | #6
On 7/21/21 3:45 PM, Dr. David Alan Gilbert wrote:
> * Juan Quintela (quintela@redhat.com) wrote:
>> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>>> * Laurent Vivier (lvivier@redhat.com) wrote:
>>>> The intend of failover is to allow to migrate a VM with a VFIO
>>>> networking card without disrupting the network operation by switching
>>>> to a virtio-net device during the migration.
>>>>
>>>> This simple change allows to test failover with a simulated device
>>>> like e1000e rather than a vfio device, even if it's useless in real
>>>> life it can help to debug failover.
>>>>
>>>> This is interesting to developers that want to test failover on
>>>> a system with no vfio device. Moreover it simplifies host networking
>>>> configuration as we can use the same bridge for virtio-net and
>>>> the other failover networking device.
>>>>
>>>> Without this change the migration of a system configured with failover
>>>> fails with:
>>>>
>>>>   ...
>>>>   -device virtio-net-pci,id=virtionet0,failover=on,...  \
>>>>   -device e1000,failover_pair_id=virtionet0,... \
>>>>   ...
>>>>
>>>>   (qemu) migrate ...
>>>>
>>>>   Unknown ramblock "0000:00:01.1:00.0/e1000e.rom", cannot accept migration
>>>>   error while loading state for instance 0x0 of device 'ram'
>>>>   load of migration failed: Invalid argument
>>>>
>>>> This happens because QEMU correctly unregisters the interface vmstate but
>>>> not the ROM one. This patch fixes that.
>>>>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> ---
>>>>
>>>> Notes:
>>>>     v3:
>>>>       remove useless space before comma
>>>>     
>>>>     v2:
>>>>       reset has_rom to false
>>>>       update commit log message
>>>>
>>>>  hw/net/virtio-net.c | 4 ++++
>>>>  1 file changed, 4 insertions(+)
>>>>
>>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>>> index 16d20cdee52a..c0c2ec1ebb98 100644
>>>> --- a/hw/net/virtio-net.c
>>>> +++ b/hw/net/virtio-net.c
>>>> @@ -3256,6 +3256,10 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
>>>>      if (migration_in_setup(s) && !should_be_hidden) {
>>>>          if (failover_unplug_primary(n, dev)) {
>>>>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>>>> +            if (PCI_DEVICE(dev)->has_rom) {
>>>> +                PCI_DEVICE(dev)->has_rom = false;
>>>> +                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
>>>> +            }
>>>
>>> Not actually originated by your fix, but....
>>>
>>> Why doesn't failover_replug_primary re-add the vmstates?
>>
>> Because we can't migrate until the "unplug" has happened.
>> Yes, it is a mess.
> 
> But if the migrate fails, shouldn't it add it back?
> 
> Dave
> 
>> I think this is the saner patch that I can think of for that
>> functionality.
>>
>> What I wonder is why we register rom as ram, but I guess that the rom
>> can be updated from userspace, or who knows.

Unlikely, and if we can, this is a bug.

We call memory_region_init_rom() in pci_add_option_rom().

 * memory_region_init_rom: Initialize a ROM memory region.
 *
 * This has the same effect as calling memory_region_init_ram()
 * and then marking the resulting region read-only with
 * memory_region_set_readonly(). This includes arranging for the
 * contents to be migrated.

I agree it would be clearer to have a vmstate_unregister_rom()
function internally calling vmstate_unregister_ram().

>>
>> Later, Juan.
>>
>>> (I did wonder if passing rom-file="" to the e1000 would help in your
>>> testing case, but it still creates the RAM image).
>>>
>>> Dave
>>>
>>>>              qapi_event_send_unplug_primary(dev->id);
>>>>              qatomic_set(&n->failover_primary_hidden, true);
>>>>          } else {
>>>> -- 
>>>> 2.31.1
>>>>
>>
diff mbox series

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 16d20cdee52a..c0c2ec1ebb98 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3256,6 +3256,10 @@  static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
     if (migration_in_setup(s) && !should_be_hidden) {
         if (failover_unplug_primary(n, dev)) {
             vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
+            if (PCI_DEVICE(dev)->has_rom) {
+                PCI_DEVICE(dev)->has_rom = false;
+                vmstate_unregister_ram(&PCI_DEVICE(dev)->rom, dev);
+            }
             qapi_event_send_unplug_primary(dev->id);
             qatomic_set(&n->failover_primary_hidden, true);
         } else {