diff mbox series

virtio-net: failover: add missing remove_migration_state_change_notifier()

Message ID 20210427135147.111218-1-lvivier@redhat.com (mailing list archive)
State New, archived
Headers show
Series virtio-net: failover: add missing remove_migration_state_change_notifier() | expand

Commit Message

Laurent Vivier April 27, 2021, 1:51 p.m. UTC
In the failover case configuration, virtio_net_device_realize() uses an
add_migration_state_change_notifier() to add a state notifier, but this
notifier is not removed by the unrealize function when the virtio-net
card is unplugged.

If the card is unplugged and a migration is started, the notifier is
called and as it is not valid anymore QEMU crashes.

This patch fixes the problem by adding the
remove_migration_state_change_notifier() in virtio_net_device_unrealize().

The problem can be reproduced with:

  $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
    -device pcie-root-port,slot=4,id=root1 \
    -device pcie-root-port,slot=5,id=root2 \
    -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
    -monitor stdio disk.qcow2
  (qemu) device_del net1
  (qemu) migrate "exec:gzip -c > STATEFILE.gz"

  Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
  0x0000000000000000 in ?? ()
  (gdb) bt
  #0  0x0000000000000000 in  ()
  #1  0x0000555555d726d7 in notifier_list_notify (...)
      at .../util/notify.c:39
  #2  0x0000555555842c1a in migrate_fd_connect (...)
      at .../migration/migration.c:3975
  #3  0x0000555555950f7d in migration_channel_connect (...)
      error@entry=0x0) at .../migration/channel.c:107
  #4  0x0000555555910922 in exec_start_outgoing_migration (...)
      at .../migration/exec.c:42

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 hw/net/virtio-net.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Dr. David Alan Gilbert April 27, 2021, 2:02 p.m. UTC | #1
* Laurent Vivier (lvivier@redhat.com) wrote:
> In the failover case configuration, virtio_net_device_realize() uses an
> add_migration_state_change_notifier() to add a state notifier, but this
> notifier is not removed by the unrealize function when the virtio-net
> card is unplugged.
> 
> If the card is unplugged and a migration is started, the notifier is
> called and as it is not valid anymore QEMU crashes.
> 
> This patch fixes the problem by adding the
> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
> 
> The problem can be reproduced with:
> 
>   $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>     -device pcie-root-port,slot=4,id=root1 \
>     -device pcie-root-port,slot=5,id=root2 \
>     -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>     -monitor stdio disk.qcow2
>   (qemu) device_del net1
>   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> 
>   Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>   0x0000000000000000 in ?? ()
>   (gdb) bt
>   #0  0x0000000000000000 in  ()
>   #1  0x0000555555d726d7 in notifier_list_notify (...)
>       at .../util/notify.c:39
>   #2  0x0000555555842c1a in migrate_fd_connect (...)
>       at .../migration/migration.c:3975
>   #3  0x0000555555950f7d in migration_channel_connect (...)
>       error@entry=0x0) at .../migration/channel.c:107
>   #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>       at .../migration/exec.c:42
> 
> Reported-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Yep, I think that's OK.


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  hw/net/virtio-net.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 66b9ff451185..914051feb75b 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
>  
>      if (n->failover) {
>          device_listener_unregister(&n->primary_listener);
> +        remove_migration_state_change_notifier(&n->migration_state);
>      }
>  
>      max_queues = n->multiqueue ? n->max_queues : 1;
> -- 
> 2.30.2
>
Philippe Mathieu-Daudé April 27, 2021, 2:16 p.m. UTC | #2
On 4/27/21 4:02 PM, Dr. David Alan Gilbert wrote:
> * Laurent Vivier (lvivier@redhat.com) wrote:
>> In the failover case configuration, virtio_net_device_realize() uses an
>> add_migration_state_change_notifier() to add a state notifier, but this
>> notifier is not removed by the unrealize function when the virtio-net
>> card is unplugged.
>>
>> If the card is unplugged and a migration is started, the notifier is
>> called and as it is not valid anymore QEMU crashes.
>>
>> This patch fixes the problem by adding the
>> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
>>
>> The problem can be reproduced with:
>>
>>   $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>>     -device pcie-root-port,slot=4,id=root1 \
>>     -device pcie-root-port,slot=5,id=root2 \
>>     -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>>     -monitor stdio disk.qcow2
>>   (qemu) device_del net1
>>   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>>
>>   Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>>   0x0000000000000000 in ?? ()
>>   (gdb) bt
>>   #0  0x0000000000000000 in  ()
>>   #1  0x0000555555d726d7 in notifier_list_notify (...)
>>       at .../util/notify.c:39
>>   #2  0x0000555555842c1a in migrate_fd_connect (...)
>>       at .../migration/migration.c:3975
>>   #3  0x0000555555950f7d in migration_channel_connect (...)
>>       error@entry=0x0) at .../migration/channel.c:107
>>   #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>>       at .../migration/exec.c:42
>>
>> Reported-by: Igor Mammedov <imammedo@redhat.com>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> 
> Yep, I think that's OK.

IIUC HMP device_del ends calling device_finalize(), and per Igor's
explanation of qdev transition states:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg661853.html
I'd expect this to be done in instance_init/instance_finalize,
but since 'failover' is a virtio_net_properties, the callbacks
has to be registered in the realize() function, thus unregistered
in unrealize().

Mentioning it is in realize/unrealize due to 'failover' being a property:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
>> ---
>>  hw/net/virtio-net.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index 66b9ff451185..914051feb75b 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
>>  
>>      if (n->failover) {
>>          device_listener_unregister(&n->primary_listener);
>> +        remove_migration_state_change_notifier(&n->migration_state);
>>      }
>>  
>>      max_queues = n->multiqueue ? n->max_queues : 1;
>> -- 
>> 2.30.2
>>
Jason Wang April 28, 2021, 2:28 a.m. UTC | #3
在 2021/4/27 下午9:51, Laurent Vivier 写道:
> In the failover case configuration, virtio_net_device_realize() uses an
> add_migration_state_change_notifier() to add a state notifier, but this
> notifier is not removed by the unrealize function when the virtio-net
> card is unplugged.
>
> If the card is unplugged and a migration is started, the notifier is
> called and as it is not valid anymore QEMU crashes.
>
> This patch fixes the problem by adding the
> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
>
> The problem can be reproduced with:
>
>    $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>      -device pcie-root-port,slot=4,id=root1 \
>      -device pcie-root-port,slot=5,id=root2 \
>      -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>      -monitor stdio disk.qcow2
>    (qemu) device_del net1
>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>
>    Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>    0x0000000000000000 in ?? ()
>    (gdb) bt
>    #0  0x0000000000000000 in  ()
>    #1  0x0000555555d726d7 in notifier_list_notify (...)
>        at .../util/notify.c:39
>    #2  0x0000555555842c1a in migrate_fd_connect (...)
>        at .../migration/migration.c:3975
>    #3  0x0000555555950f7d in migration_channel_connect (...)
>        error@entry=0x0) at .../migration/channel.c:107
>    #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>        at .../migration/exec.c:42
>
> Reported-by: Igor Mammedov <imammedo@redhat.com>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>


Acked-by: Jason Wang <jasowang@redhat.com>

This should be added to stable I guess.

Thanks


> ---
>   hw/net/virtio-net.c | 1 +
>   1 file changed, 1 insertion(+)
>
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 66b9ff451185..914051feb75b 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
>   
>       if (n->failover) {
>           device_listener_unregister(&n->primary_listener);
> +        remove_migration_state_change_notifier(&n->migration_state);
>       }
>   
>       max_queues = n->multiqueue ? n->max_queues : 1;
Michael S. Tsirkin April 28, 2021, 10:14 a.m. UTC | #4
On Tue, Apr 27, 2021 at 03:02:34PM +0100, Dr. David Alan Gilbert wrote:
> * Laurent Vivier (lvivier@redhat.com) wrote:
> > In the failover case configuration, virtio_net_device_realize() uses an
> > add_migration_state_change_notifier() to add a state notifier, but this
> > notifier is not removed by the unrealize function when the virtio-net
> > card is unplugged.
> > 
> > If the card is unplugged and a migration is started, the notifier is
> > called and as it is not valid anymore QEMU crashes.
> > 
> > This patch fixes the problem by adding the
> > remove_migration_state_change_notifier() in virtio_net_device_unrealize().
> > 
> > The problem can be reproduced with:
> > 
> >   $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
> >     -device pcie-root-port,slot=4,id=root1 \
> >     -device pcie-root-port,slot=5,id=root2 \
> >     -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
> >     -monitor stdio disk.qcow2
> >   (qemu) device_del net1
> >   (qemu) migrate "exec:gzip -c > STATEFILE.gz"
> > 
> >   Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
> >   0x0000000000000000 in ?? ()
> >   (gdb) bt
> >   #0  0x0000000000000000 in  ()
> >   #1  0x0000555555d726d7 in notifier_list_notify (...)
> >       at .../util/notify.c:39
> >   #2  0x0000555555842c1a in migrate_fd_connect (...)
> >       at .../migration/migration.c:3975
> >   #3  0x0000555555950f7d in migration_channel_connect (...)
> >       error@entry=0x0) at .../migration/channel.c:107
> >   #4  0x0000555555910922 in exec_start_outgoing_migration (...)
> >       at .../migration/exec.c:42
> > 
> > Reported-by: Igor Mammedov <imammedo@redhat.com>
> > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> 
> Yep, I think that's OK.
> 
> 
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

net stuff so I expect Jason will merge this ...

> > ---
> >  hw/net/virtio-net.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > index 66b9ff451185..914051feb75b 100644
> > --- a/hw/net/virtio-net.c
> > +++ b/hw/net/virtio-net.c
> > @@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
> >  
> >      if (n->failover) {
> >          device_listener_unregister(&n->primary_listener);
> > +        remove_migration_state_change_notifier(&n->migration_state);
> >      }
> >  
> >      max_queues = n->multiqueue ? n->max_queues : 1;
> > -- 
> > 2.30.2
> > 
> -- 
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Jason Wang April 29, 2021, 7:26 a.m. UTC | #5
在 2021/4/28 下午6:14, Michael S. Tsirkin 写道:
> On Tue, Apr 27, 2021 at 03:02:34PM +0100, Dr. David Alan Gilbert wrote:
>> * Laurent Vivier (lvivier@redhat.com) wrote:
>>> In the failover case configuration, virtio_net_device_realize() uses an
>>> add_migration_state_change_notifier() to add a state notifier, but this
>>> notifier is not removed by the unrealize function when the virtio-net
>>> card is unplugged.
>>>
>>> If the card is unplugged and a migration is started, the notifier is
>>> called and as it is not valid anymore QEMU crashes.
>>>
>>> This patch fixes the problem by adding the
>>> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
>>>
>>> The problem can be reproduced with:
>>>
>>>    $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>>>      -device pcie-root-port,slot=4,id=root1 \
>>>      -device pcie-root-port,slot=5,id=root2 \
>>>      -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>>>      -monitor stdio disk.qcow2
>>>    (qemu) device_del net1
>>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>>>
>>>    Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>>>    0x0000000000000000 in ?? ()
>>>    (gdb) bt
>>>    #0  0x0000000000000000 in  ()
>>>    #1  0x0000555555d726d7 in notifier_list_notify (...)
>>>        at .../util/notify.c:39
>>>    #2  0x0000555555842c1a in migrate_fd_connect (...)
>>>        at .../migration/migration.c:3975
>>>    #3  0x0000555555950f7d in migration_channel_connect (...)
>>>        error@entry=0x0) at .../migration/channel.c:107
>>>    #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>>>        at .../migration/exec.c:42
>>>
>>> Reported-by: Igor Mammedov <imammedo@redhat.com>
>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> Yep, I think that's OK.
>>
>>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
> net stuff so I expect Jason will merge this ...


Ok, I've queued this.

Thanks


>
>>> ---
>>>   hw/net/virtio-net.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>>> index 66b9ff451185..914051feb75b 100644
>>> --- a/hw/net/virtio-net.c
>>> +++ b/hw/net/virtio-net.c
>>> @@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
>>>   
>>>       if (n->failover) {
>>>           device_listener_unregister(&n->primary_listener);
>>> +        remove_migration_state_change_notifier(&n->migration_state);
>>>       }
>>>   
>>>       max_queues = n->multiqueue ? n->max_queues : 1;
>>> -- 
>>> 2.30.2
>>>
>> -- 
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
Laurent Vivier May 5, 2021, 7:46 a.m. UTC | #6
On 29/04/2021 09:26, Jason Wang wrote:
> 
> 在 2021/4/28 下午6:14, Michael S. Tsirkin 写道:
>> On Tue, Apr 27, 2021 at 03:02:34PM +0100, Dr. David Alan Gilbert wrote:
>>> * Laurent Vivier (lvivier@redhat.com) wrote:
>>>> In the failover case configuration, virtio_net_device_realize() uses an
>>>> add_migration_state_change_notifier() to add a state notifier, but this
>>>> notifier is not removed by the unrealize function when the virtio-net
>>>> card is unplugged.
>>>>
>>>> If the card is unplugged and a migration is started, the notifier is
>>>> called and as it is not valid anymore QEMU crashes.
>>>>
>>>> This patch fixes the problem by adding the
>>>> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
>>>>
>>>> The problem can be reproduced with:
>>>>
>>>>    $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>>>>      -device pcie-root-port,slot=4,id=root1 \
>>>>      -device pcie-root-port,slot=5,id=root2 \
>>>>      -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>>>>      -monitor stdio disk.qcow2
>>>>    (qemu) device_del net1
>>>>    (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>>>>
>>>>    Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>>>>    0x0000000000000000 in ?? ()
>>>>    (gdb) bt
>>>>    #0  0x0000000000000000 in  ()
>>>>    #1  0x0000555555d726d7 in notifier_list_notify (...)
>>>>        at .../util/notify.c:39
>>>>    #2  0x0000555555842c1a in migrate_fd_connect (...)
>>>>        at .../migration/migration.c:3975
>>>>    #3  0x0000555555950f7d in migration_channel_connect (...)
>>>>        error@entry=0x0) at .../migration/channel.c:107
>>>>    #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>>>>        at .../migration/exec.c:42
>>>>
>>>> Reported-by: Igor Mammedov <imammedo@redhat.com>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>> Yep, I think that's OK.
>>>
>>>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> net stuff so I expect Jason will merge this ...
> 
> 
> Ok, I've queued this.

Thank you.

Any idea when the PR will be sent?

Thanks,
Laurent
Jason Wang May 6, 2021, 3:30 a.m. UTC | #7
在 2021/5/5 下午3:46, Laurent Vivier 写道:
> On 29/04/2021 09:26, Jason Wang wrote:
>> 在 2021/4/28 下午6:14, Michael S. Tsirkin 写道:
>>> On Tue, Apr 27, 2021 at 03:02:34PM +0100, Dr. David Alan Gilbert wrote:
>>>> * Laurent Vivier (lvivier@redhat.com) wrote:
>>>>> In the failover case configuration, virtio_net_device_realize() uses an
>>>>> add_migration_state_change_notifier() to add a state notifier, but this
>>>>> notifier is not removed by the unrealize function when the virtio-net
>>>>> card is unplugged.
>>>>>
>>>>> If the card is unplugged and a migration is started, the notifier is
>>>>> called and as it is not valid anymore QEMU crashes.
>>>>>
>>>>> This patch fixes the problem by adding the
>>>>> remove_migration_state_change_notifier() in virtio_net_device_unrealize().
>>>>>
>>>>> The problem can be reproduced with:
>>>>>
>>>>>     $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
>>>>>       -device pcie-root-port,slot=4,id=root1 \
>>>>>       -device pcie-root-port,slot=5,id=root2 \
>>>>>       -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
>>>>>       -monitor stdio disk.qcow2
>>>>>     (qemu) device_del net1
>>>>>     (qemu) migrate "exec:gzip -c > STATEFILE.gz"
>>>>>
>>>>>     Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
>>>>>     0x0000000000000000 in ?? ()
>>>>>     (gdb) bt
>>>>>     #0  0x0000000000000000 in  ()
>>>>>     #1  0x0000555555d726d7 in notifier_list_notify (...)
>>>>>         at .../util/notify.c:39
>>>>>     #2  0x0000555555842c1a in migrate_fd_connect (...)
>>>>>         at .../migration/migration.c:3975
>>>>>     #3  0x0000555555950f7d in migration_channel_connect (...)
>>>>>         error@entry=0x0) at .../migration/channel.c:107
>>>>>     #4  0x0000555555910922 in exec_start_outgoing_migration (...)
>>>>>         at .../migration/exec.c:42
>>>>>
>>>>> Reported-by: Igor Mammedov <imammedo@redhat.com>
>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>> Yep, I think that's OK.
>>>>
>>>>
>>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>
>>> net stuff so I expect Jason will merge this ...
>>
>> Ok, I've queued this.
> Thank you.
>
> Any idea when the PR will be sent?


I would be no later than the end of this week. I plan to merge RSS so I 
want to have some basic for that.

Thanks


>
> Thanks,
> Laurent
>
diff mbox series

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 66b9ff451185..914051feb75b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3373,6 +3373,7 @@  static void virtio_net_device_unrealize(DeviceState *dev)
 
     if (n->failover) {
         device_listener_unregister(&n->primary_listener);
+        remove_migration_state_change_notifier(&n->migration_state);
     }
 
     max_queues = n->multiqueue ? n->max_queues : 1;