Message ID | 20210924090427.9218-10-kwolf@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | qdev: Add JSON -device and fix QMP device_add |
On Fri, Sep 24, 2021 at 11:04:25AM +0200, Kevin Wolf wrote:
> Directly call qdev_device_add_from_qdict() for QMP device_add instead of
> first going through QemuOpts and converting back to QDict.
>
> Note that this changes the behaviour of device_add, though in ways that
> should be considered bug fixes:
>
> QemuOpts ignores differences between data types, so you could
> successfully pass a string "123" for an integer property, or a string
> "on" for a boolean property (and vice versa). After this change, the
> correct data type for the property must be used in the JSON input.
>
> qemu_opts_from_qdict() also silently ignores any options whose value is
> a QDict, QList or QNull.
>
> To illustrate, the following QMP command was accepted before and is now
> rejected for both reasons:
>
> { "execute": "device_add",
>   "arguments": { "driver": "scsi-cd",
>                  "drive": { "completely": "invalid" },
>                  "physical_block_size": "4096" } }
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  softmmu/qdev-monitor.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>

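As a stand-alone illustration of the two behaviours the commit message describes, the sketch below contrasts a lossy "everything becomes a string" path with a strict type check. It is not QEMU code: `Arg`, `ValueKind`, `flatten_to_strings()` and `strict_accepts()` are hypothetical names, and rendering booleans as "on"/"off" is an assumption made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a JSON value; this is not QEMU's QObject. */
typedef enum { V_STRING, V_INT, V_BOOL, V_DICT, V_LIST, V_NULL } ValueKind;

typedef struct {
    const char *key;
    ValueKind kind;
    const char *s;   /* used when kind == V_STRING */
    long long i;     /* used when kind == V_INT */
    bool b;          /* used when kind == V_BOOL */
} Arg;

/*
 * Lossy path: scalars are rendered as text, so type information is gone,
 * and dict/list/null values are skipped without reporting an error.
 * Returns the number of options that survived.
 */
static size_t flatten_to_strings(const Arg *args, size_t n,
                                 char out[][32], const char *keys[])
{
    size_t kept = 0;

    for (size_t k = 0; k < n; k++) {
        switch (args[k].kind) {
        case V_STRING:
            snprintf(out[kept], 32, "%s", args[k].s);
            break;
        case V_INT:
            snprintf(out[kept], 32, "%lld", args[k].i);
            break;
        case V_BOOL:
            snprintf(out[kept], 32, "%s", args[k].b ? "on" : "off");
            break;
        default:
            continue;   /* dict, list, null: silently dropped */
        }
        keys[kept++] = args[k].key;
    }
    return kept;
}

/* Strict path: a property of a given kind accepts only that kind. */
static bool strict_accepts(ValueKind expected, const Arg *arg)
{
    return arg->kind == expected;
}
```

Fed the arguments of the example command, the lossy path keeps "driver" and "physical_block_size" and drops "drive" without complaint, while the strict check refuses the string "4096" for an integer property.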
On 9/24/21 11:04, Kevin Wolf wrote:
> Directly call qdev_device_add_from_qdict() for QMP device_add instead of
> first going through QemuOpts and converting back to QDict.
>
> Note that this changes the behaviour of device_add, though in ways that
> should be considered bug fixes:
>
> QemuOpts ignores differences between data types, so you could
> successfully pass a string "123" for an integer property, or a string
> "on" for a boolean property (and vice versa). After this change, the
> correct data type for the property must be used in the JSON input.
>
> qemu_opts_from_qdict() also silently ignores any options whose value is
> a QDict, QList or QNull.
>
> To illustrate, the following QMP command was accepted before and is now
> rejected for both reasons:
>
> { "execute": "device_add",
>   "arguments": { "driver": "scsi-cd",
>                  "drive": { "completely": "invalid" },
>                  "physical_block_size": "4096" } }
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  softmmu/qdev-monitor.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
> index c09b7430eb..8622ccade6 100644
> --- a/softmmu/qdev-monitor.c
> +++ b/softmmu/qdev-monitor.c
> @@ -812,7 +812,8 @@ void hmp_info_qdm(Monitor *mon, const QDict *qdict)
>      qdev_print_devinfos(true);
>  }
>
> -void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
> +static void monitor_device_add(QDict *qdict, QObject **ret_data,
> +                               bool from_json, Error **errp)
>  {
>      QemuOpts *opts;
>      DeviceState *dev;
> @@ -825,7 +826,9 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
>          qemu_opts_del(opts);
>          return;
>      }
> -    dev = qdev_device_add(opts, errp);
> +    qemu_opts_del(opts);
> +
> +    dev = qdev_device_add_from_qdict(qdict, from_json, errp);
>

Hi Kevin,

I'm wondering whether deleting the opts (which removes it from the "device"
opts list) is really a no-op?

The opts list is, e.g., traversed in hw/net/virtio-net.c in the function
failover_find_primary_device_id(), which may be called during
virtio_net_set_features() (a TYPE_VIRTIO_NET method).
I do not have the knowledge to tell when this method is called, but if it
is after we create the devices, then the list will now be empty at that
point.

It seems there are two other call sites of
"qemu_opts_foreach(qemu_find_opts("device"), [...]", in net/vhost-user.c and
net/vhost-vdpa.c.

--
Damien

On 27.09.2021 at 13:06, Damien Hedde wrote:
> On 9/24/21 11:04, Kevin Wolf wrote:
> > @@ -825,7 +826,9 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
> >          qemu_opts_del(opts);
> >          return;
> >      }
> > -    dev = qdev_device_add(opts, errp);
> > +    qemu_opts_del(opts);
> > +
> > +    dev = qdev_device_add_from_qdict(qdict, from_json, errp);
>
> Hi Kevin,
>
> I'm wondering whether deleting the opts (which removes it from the "device"
> opts list) is really a no-op?

It's not exactly a no-op. Previously, the QemuOpts would only be freed
when the device was destroyed; now we delete it immediately after
creating the device. This could matter in some cases.

The one case I was aware of is that QemuOpts used to be responsible for
checking for duplicate IDs. Obviously, it can't do this job any more
when we call qemu_opts_del() right after creating the device. This is
the reason for patch 6.

> The opts list is, e.g., traversed in hw/net/virtio-net.c in the function
> failover_find_primary_device_id(), which may be called during
> virtio_net_set_features() (a TYPE_VIRTIO_NET method).
> I do not have the knowledge to tell when this method is called, but if it
> is after we create the devices, then the list will now be empty at that
> point.
>
> It seems there are two other call sites of
> "qemu_opts_foreach(qemu_find_opts("device"), [...]", in net/vhost-user.c and
> net/vhost-vdpa.c.

Yes, you are right. These callers probably need to be changed. Going
through the command line options rather than looking at the actual
device objects that exist doesn't feel entirely clean anyway.

Kevin

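The duplicate-ID point can be pictured with a small stand-alone sketch: once the options entry no longer lives as long as the device, something else has to remember which IDs are in use. The thread does not show how patch 6 does this, so the registry below is purely hypothetical (`IdRegistry`, `register_id()` and `unregister_id()` are invented names, not QEMU APIs).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEVICES 16

/* Hypothetical registry of live device IDs; not a QEMU data structure. */
typedef struct {
    const char *ids[MAX_DEVICES];
    size_t count;
} IdRegistry;

static bool id_in_use(const IdRegistry *reg, const char *id)
{
    for (size_t k = 0; k < reg->count; k++) {
        if (strcmp(reg->ids[k], id) == 0) {
            return true;
        }
    }
    return false;
}

/* Returns false if the ID is already taken or the registry is full. */
static bool register_id(IdRegistry *reg, const char *id)
{
    if (reg->count == MAX_DEVICES || id_in_use(reg, id)) {
        return false;
    }
    reg->ids[reg->count++] = id;
    return true;
}

/* Frees the ID again, e.g. when the device goes away. */
static void unregister_id(IdRegistry *reg, const char *id)
{
    for (size_t k = 0; k < reg->count; k++) {
        if (strcmp(reg->ids[k], id) == 0) {
            reg->ids[k] = reg->ids[--reg->count];
            return;
        }
    }
}
```

The point of the sketch is only that the check is tied to the lifetime of the device rather than to the lifetime of its options.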
On Fri, Sep 24, 2021 at 11:04:25 +0200, Kevin Wolf wrote:
> Directly call qdev_device_add_from_qdict() for QMP device_add instead of
> first going through QemuOpts and converting back to QDict.
>
> Note that this changes the behaviour of device_add, though in ways that
> should be considered bug fixes:
>
> QemuOpts ignores differences between data types, so you could
> successfully pass a string "123" for an integer property, or a string
> "on" for a boolean property (and vice versa). After this change, the
> correct data type for the property must be used in the JSON input.
>
> qemu_opts_from_qdict() also silently ignores any options whose value is
> a QDict, QList or QNull.

Sorry for chiming in a bit late, but preferably this commit should be
postponed to at least the next release so that we decrease the number of
libvirt users broken by this.

Granted, users are supposed to use new libvirt with new qemu, but that's
not always the case.

Anyway, libvirt is currently mangling all the types to strings with
device_add. I'm currently working on fixing it and it will hopefully be
done by next month's release, but preferably we increase the window
of working combinations by postponing this until the next release.

On 10/1/21 16:42, Peter Krempa wrote:
> On Fri, Sep 24, 2021 at 11:04:25 +0200, Kevin Wolf wrote:
>> Directly call qdev_device_add_from_qdict() for QMP device_add instead of
>> first going through QemuOpts and converting back to QDict.
>>
>> Note that this changes the behaviour of device_add, though in ways that
>> should be considered bug fixes:
>>
>> QemuOpts ignores differences between data types, so you could
>> successfully pass a string "123" for an integer property, or a string
>> "on" for a boolean property (and vice versa). After this change, the
>> correct data type for the property must be used in the JSON input.
>>
>> qemu_opts_from_qdict() also silently ignores any options whose value is
>> a QDict, QList or QNull.
>
> Sorry for chiming in a bit late, but preferably this commit should be
> postponed to at least the next release so that we decrease the number of
> libvirt users broken by this.
>
> Granted, users are supposed to use new libvirt with new qemu, but that's
> not always the case.
>
> Anyway, libvirt is currently mangling all the types to strings with
> device_add. I'm currently working on fixing it and it will hopefully be
> done by next month's release, but preferably we increase the window
> of working combinations by postponing this until the next release.
>

Switching to qdict is really an improvement, I think.

If we choose to keep legacy behavior working for now, I think we should
find a way to still do this switch. Maybe we can temporarily keep the
str_to_int and str_to_bool conversion when converting the qdict to the
qdev properties afterward?

Damien

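A compatibility shim of the kind suggested here could look roughly like the following stand-alone sketch. The helper names `str_to_int()` and `str_to_bool()` are borrowed from the message above, but their signatures and the accepted spellings ("on"/"off" only, any `strtoll()` base-0 integer) are assumptions for illustration, not existing QEMU functions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical legacy conversion: accept the string spellings that a
 * string-only client would send for a boolean property.
 * Returns true and stores the result in *out on success.
 */
static bool str_to_bool(const char *s, bool *out)
{
    if (strcmp(s, "on") == 0) {
        *out = true;
        return true;
    }
    if (strcmp(s, "off") == 0) {
        *out = false;
        return true;
    }
    return false;
}

/*
 * Hypothetical legacy conversion for integer properties. The whole
 * string must be a valid number; trailing garbage is rejected.
 */
static bool str_to_int(const char *s, long long *out)
{
    char *end;
    long long value;

    errno = 0;
    value = strtoll(s, &end, 0);
    if (errno != 0 || end == s || *end != '\0') {
        return false;
    }
    *out = value;
    return true;
}
```

With such a shim, a client that still sends "4096" for an integer property would keep working, while a malformed value such as "4k" would still be rejected.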
On 04.10.2021 at 14:18, Damien Hedde wrote:
> On 10/1/21 16:42, Peter Krempa wrote:
> > Sorry for chiming in a bit late, but preferably this commit should be
> > postponed to at least the next release so that we decrease the number of
> > libvirt users broken by this.
> >
> > Granted, users are supposed to use new libvirt with new qemu, but that's
> > not always the case.
> >
> > Anyway, libvirt is currently mangling all the types to strings with
> > device_add. I'm currently working on fixing it and it will hopefully be
> > done by next month's release, but preferably we increase the window
> > of working combinations by postponing this until the next release.
>
> Switching to qdict is really an improvement, I think.
>
> If we choose to keep legacy behavior working for now, I think we
> should find a way to still do this switch. Maybe we can temporarily
> keep the str_to_int and str_to_bool conversion when converting the
> qdict to the qdev properties afterward?

I guess we can keep the detour through QemuOpts for QMP for now, and
make sure that the command line code bypasses this bit and still
requires correct types for JSON input.

It's only this patch that breaks compatibility with libvirt; patch 8
should still be okay.

Kevin

On 27.09.2021 at 13:39, Kevin Wolf wrote:
> On 27.09.2021 at 13:06, Damien Hedde wrote:
> > I'm wondering whether deleting the opts (which removes it from the
> > "device" opts list) is really a no-op?
>
> It's not exactly a no-op. Previously, the QemuOpts would only be freed
> when the device was destroyed; now we delete it immediately after
> creating the device. This could matter in some cases.
>
> The one case I was aware of is that QemuOpts used to be responsible for
> checking for duplicate IDs. Obviously, it can't do this job any more
> when we call qemu_opts_del() right after creating the device. This is
> the reason for patch 6.
>
> > The opts list is, e.g., traversed in hw/net/virtio-net.c in the function
> > failover_find_primary_device_id(), which may be called during
> > virtio_net_set_features() (a TYPE_VIRTIO_NET method).
> > I do not have the knowledge to tell when this method is called, but if it
> > is after we create the devices, then the list will now be empty at that
> > point.
> >
> > It seems there are two other call sites of
> > "qemu_opts_foreach(qemu_find_opts("device"), [...]", in net/vhost-user.c and
> > net/vhost-vdpa.c.
>
> Yes, you are right. These callers probably need to be changed. Going
> through the command line options rather than looking at the actual
> device objects that exist doesn't feel entirely clean anyway.

So I tried to have a look at the virtio-net case, and ended up very
confused.

Obviously looking at command line options (even of a different device)
from within a device is very unclean. With a non-broken, i.e. type safe,
device-add (as well as with the JSON CLI option introduced by this
series), we can't have a QemuOpts any more that is by definition unsafe.
So this code needs a replacement.

My naive idea was that we just need to look at runtime state instead.
Don't search the options for a device with a matching 'failover_pair_id'
(which, by the way, would fail as soon as any other device introduces a
property with the same name), but search for actual PCIDevices in qdev
that have pci_dev->failover_pair_id set accordingly.

However, the logic in failover_add_primary() suggests that we can have a
state where QemuOpts for a device exist, but the device doesn't, and
then it hotplugs the device from the command line options. How would we
ever get into such an inconsistent state where QemuOpts contains a
device that doesn't exist? Normally devices get their QemuOpts when they
are created and device_finalize() deletes the QemuOpts again.

Any suggestions on how to get rid of the QemuOpts abuse in the failover
code?

If this is a device that we previously managed to rip out without
deleting its QemuOpts, can we store its dev->opts (which is a type safe
QDict after this series) somewhere locally instead of looking at global
state? Preferably I would even like to get rid of dev->opts because we
really should look at live state rather than command line options after
device creation, but I guess one step at a time.

(Actually, I'm half tempted to just break it because no test cases seem
to exist, so apparently nobody is really interested in it.)

Kevin

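The "look at runtime state instead" idea can be sketched in a few lines of stand-alone C: walk the devices that actually exist and compare their `failover_pair_id` with the ID of the standby device. The `Dev` record and `find_primary_for()` are hypothetical simplifications for this illustration; the real PCIDevice and the qdev tree walk are considerably more involved.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified device record. */
typedef struct {
    const char *id;
    const char *failover_pair_id;   /* NULL if not a failover primary */
} Dev;

/*
 * Searches the live devices (not the option list) for the primary that
 * names standby_id as its failover partner. Returns NULL if no such
 * device currently exists.
 */
static const Dev *find_primary_for(const Dev *devs, size_t n,
                                   const char *standby_id)
{
    for (size_t k = 0; k < n; k++) {
        if (devs[k].failover_pair_id != NULL &&
            strcmp(devs[k].failover_pair_id, standby_id) == 0) {
            return &devs[k];
        }
    }
    return NULL;
}
```

As the message goes on to explain, this only works for primaries that have already been created, which is exactly the case the failover code does not limit itself to.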
On 10/5/21 16:37, Kevin Wolf wrote:
> So I tried to have a look at the virtio-net case, and ended up very
> confused.
>
> Obviously looking at command line options (even of a different device)
> from within a device is very unclean. With a non-broken, i.e. type safe,
> device-add (as well as with the JSON CLI option introduced by this
> series), we can't have a QemuOpts any more that is by definition unsafe.
> So this code needs a replacement.
>
> My naive idea was that we just need to look at runtime state instead.
> Don't search the options for a device with a matching 'failover_pair_id'
> (which, by the way, would fail as soon as any other device introduces a
> property with the same name), but search for actual PCIDevices in qdev
> that have pci_dev->failover_pair_id set accordingly.
>
> However, the logic in failover_add_primary() suggests that we can have a
> state where QemuOpts for a device exist, but the device doesn't, and
> then it hotplugs the device from the command line options. How would we
> ever get into such an inconsistent state where QemuOpts contains a
> device that doesn't exist? Normally devices get their QemuOpts when they
> are created and device_finalize() deletes the QemuOpts again.
>

I just read the following in docs/system/virtio-net-failover.rst:

> Usage
> -----
>
> The primary device can be hotplugged or be part of the startup
> configuration
>
>   -device virtio-net-pci,netdev=hostnet1,id=net1,
>           mac=52:54:00:6f:55:cc,bus=root2,failover=on
>
> With the parameter failover=on the VIRTIO_NET_F_STANDBY feature
> will be enabled.
>
>   -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1,
>           failover_pair_id=net1
>
> failover_pair_id references the id of the virtio-net standby device.
> This is only for pairing the devices within QEMU. The guest kernel
> module net_failover will match devices with identical MAC addresses.
>
> Hotplug
> -------
>
> Both primary and standby device can be hotplugged via the QEMU
> monitor. Note that if the virtio-net device is plugged first a
> warning will be issued that it couldn't find the primary device.

So maybe this whole primary device lookup can happen during the -device CLI
option creation loop, and we can indeed have un-created devices still in the
list?

Damien

On 05.10.2021 at 17:52, Damien Hedde wrote:
> I just read the following in docs/system/virtio-net-failover.rst:
>
> > Usage
> > -----
> >
> > The primary device can be hotplugged or be part of the startup
> > configuration
> >
> >   -device virtio-net-pci,netdev=hostnet1,id=net1,
> >           mac=52:54:00:6f:55:cc,bus=root2,failover=on
> >
> > With the parameter failover=on the VIRTIO_NET_F_STANDBY feature
> > will be enabled.
> >
> >   -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1,
> >           failover_pair_id=net1
> >
> > failover_pair_id references the id of the virtio-net standby device.
> > This is only for pairing the devices within QEMU. The guest kernel
> > module net_failover will match devices with identical MAC addresses.
> >
> > Hotplug
> > -------
> >
> > Both primary and standby device can be hotplugged via the QEMU
> > monitor. Note that if the virtio-net device is plugged first a
> > warning will be issued that it couldn't find the primary device.
>
> So maybe this whole primary device lookup can happen during the -device CLI
> option creation loop, and we can indeed have un-created devices still in the
> list?

Yes, that's the only case I could imagine for an inconsistency
between the qdev tree and QemuOpts, but failover_add_primary() is only
called after feature negotiation with the guest driver, so we can be
sure that the -device loop has completed long ago.

And even if it hadn't completed yet, the paragraph also says that even
hotplugging the device later is supported, so creating devices in the
wrong order should still succeed.

I hope that some of the people I added to CC have some more hints.

Kevin

Kevin Wolf <kwolf@redhat.com> wrote: > Am 05.10.2021 um 17:52 hat Damien Hedde geschrieben: Hi >> > Usage >> > ----- >> > >> > The primary device can be hotplugged or be part of the startup >> > configuration >> > >> > -device virtio-net-pci,netdev=hostnet1,id=net1, >> > mac=52:54:00:6f:55:cc,bus=root2,failover=on >> > >> > With the parameter failover=on the VIRTIO_NET_F_STANDBY feature >> > will be enabled. >> > >> > -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1, >> > failover_pair_id=net1 >> > >> > failover_pair_id references the id of the virtio-net standby device. >> > This is only for pairing the devices within QEMU. The guest kernel >> > module net_failover will match devices with identical MAC addresses. >> > >> > Hotplug >> > ------- >> > >> > Both primary and standby device can be hotplugged via the QEMU >> > monitor. Note that if the virtio-net device is plugged first a >> > warning will be issued that it couldn't find the primary device. >> >> So maybe this whole primary device lookup can happen during the -device CLI >> option creation loop. And we can indeed have un-created devices still in the >> list ? > > Yes, that's the only case for which I could imagine for an inconsistency > between the qdev tree and QemuOpts, but failover_add_primary() is only > called after feature negotiation with the guest driver, so we can be > sure that the -device loop has completed long ago. > > And even if it hadn't completed yet, the paragraph also says that even > hotplugging the device later is supported, so creating devices in the > wrong order should still succeed. > > I hope that some of the people I added to CC have some more hints. Failover is ... interesting. You have two devices: primary and seconday. seconday is virtio-net, primary can be vfio and some other emulated devices. In the command line, devices can appear on any order, primary then secondary, secondary then primary, or only one of them. You can add (any of them) later in the toplevel. 
And now, what all this mess is about. We only enable the primary if the
guest knows about failover. Otherwise we use only the virtio device
(*). The important bit here is that we need to wait until the guest is
booted and the virtio-net driver is loaded, and then it tells us if it
understands failover (or not). At that point we decide if we want to
"really" create the primary.

I know that it abuses device_add() as much as it can, but I can't see
any better way to handle it. We need to be able to "create" a device
without showing it to the guest. And later, when we create a different
device, and depending on driver support in the guest, we "finish" the
creation of the primary device.

Any good ideas?

Later, Juan.

*: This changed recently and we can now have only the "primary" and not
   the virtio one, but it doesn't matter for this discussion.
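The deferred decision described above can be sketched in a few lines of C. This is a toy model under stated assumptions, not QEMU code: the enum and both function names are hypothetical, and the only outside fact used is that VIRTIO_NET_F_STANDBY is feature bit 62 in the virtio-net specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* VIRTIO_NET_F_STANDBY is feature bit 62 in the virtio-net specification. */
#define TOY_VIRTIO_NET_F_STANDBY 62

/* Hypothetical states of the primary device in this toy model. */
enum toy_primary_state {
    TOY_PRIMARY_HIDDEN,  /* configured, but never shown to the guest */
    TOY_PRIMARY_PLUGGED  /* creation finished, visible to the guest */
};

/* True if the guest driver acknowledged the standby (failover) feature. */
bool toy_guest_acked_standby(uint64_t acked_features)
{
    return (acked_features >> TOY_VIRTIO_NET_F_STANDBY) & 1;
}

/*
 * Run once feature negotiation has finished: only a guest that
 * understands failover gets the primary; otherwise it stays hidden
 * and the guest keeps using the virtio-net device alone.
 */
enum toy_primary_state toy_primary_after_negotiation(uint64_t acked_features)
{
    return toy_guest_acked_standby(acked_features) ? TOY_PRIMARY_PLUGGED
                                                   : TOY_PRIMARY_HIDDEN;
}
```

The point of the sketch is only the ordering: the decision cannot be taken at -device time, because the acknowledged feature bits do not exist until the guest driver has loaded.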
On 06/10/2021 10:21, Juan Quintela wrote: > Kevin Wolf <kwolf@redhat.com> wrote: >> Am 05.10.2021 um 17:52 hat Damien Hedde geschrieben: > > Hi > >>>> Usage >>>> ----- >>>> >>>> The primary device can be hotplugged or be part of the startup >>>> configuration >>>> >>>> -device virtio-net-pci,netdev=hostnet1,id=net1, >>>> mac=52:54:00:6f:55:cc,bus=root2,failover=on >>>> >>>> With the parameter failover=on the VIRTIO_NET_F_STANDBY feature >>>> will be enabled. >>>> >>>> -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1, >>>> failover_pair_id=net1 >>>> >>>> failover_pair_id references the id of the virtio-net standby device. >>>> This is only for pairing the devices within QEMU. The guest kernel >>>> module net_failover will match devices with identical MAC addresses. >>>> >>>> Hotplug >>>> ------- >>>> >>>> Both primary and standby device can be hotplugged via the QEMU >>>> monitor. Note that if the virtio-net device is plugged first a >>>> warning will be issued that it couldn't find the primary device. >>> >>> So maybe this whole primary device lookup can happen during the -device CLI >>> option creation loop. And we can indeed have un-created devices still in the >>> list ? >> >> Yes, that's the only case for which I could imagine for an inconsistency >> between the qdev tree and QemuOpts, but failover_add_primary() is only >> called after feature negotiation with the guest driver, so we can be >> sure that the -device loop has completed long ago. >> >> And even if it hadn't completed yet, the paragraph also says that even >> hotplugging the device later is supported, so creating devices in the >> wrong order should still succeed. >> >> I hope that some of the people I added to CC have some more hints. > > Failover is ... interesting. > > You have two devices: primary and seconday. > seconday is virtio-net, primary can be vfio and some other emulated > devices. 
>
> In the command line, devices can appear in any order, primary then
> secondary, secondary then primary, or only one of them.
> You can add (any of them) later in the toplevel.
>
> And now, what all this mess is about. We only enable the primary if the
> guest knows about failover. Otherwise we use only the virtio device
> (*). The important bit here is that we need to wait until the guest is
> booted, and the virtio-net driver is loaded, and then it tells us if it
> understands failover (or not). At that point we decide if we want to
> "really" create the primary.
>
> I know that it abuses device_add() as much as it can, but I can't see
> any better way to handle it. We need to be able to "create" a device
> without showing it to the guest. And later, when we create a different
> device, and depending on driver support on the guest, we "finish" the
> creation of the primary device.
>
> Any good idea?

I don't know if it helps the discussion, but I'm refactoring the
failover code to move all the PCI stuff to pci files.

There are a lot of inconsistencies around device_add and the --device
option, so in the end I added a list of hidden devices rather than
relying on the command line.

See PATCH 8 of series "[RFC PATCH v2 0/8] virtio-net failover cleanup and new features"

https://patchew.org/QEMU/20210820142002.152994-1-lvivier@redhat.com/

Thanks,
Laurent
Am 06.10.2021 um 11:20 hat Laurent Vivier geschrieben: > On 06/10/2021 10:21, Juan Quintela wrote: > > Kevin Wolf <kwolf@redhat.com> wrote: > > > Am 05.10.2021 um 17:52 hat Damien Hedde geschrieben: > > > > Hi > > > > > > > Usage > > > > > ----- > > > > > > > > > > The primary device can be hotplugged or be part of the startup > > > > > configuration > > > > > > > > > > -device virtio-net-pci,netdev=hostnet1,id=net1, > > > > > mac=52:54:00:6f:55:cc,bus=root2,failover=on > > > > > > > > > > With the parameter failover=on the VIRTIO_NET_F_STANDBY feature > > > > > will be enabled. > > > > > > > > > > -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1, > > > > > failover_pair_id=net1 > > > > > > > > > > failover_pair_id references the id of the virtio-net standby device. > > > > > This is only for pairing the devices within QEMU. The guest kernel > > > > > module net_failover will match devices with identical MAC addresses. > > > > > > > > > > Hotplug > > > > > ------- > > > > > > > > > > Both primary and standby device can be hotplugged via the QEMU > > > > > monitor. Note that if the virtio-net device is plugged first a > > > > > warning will be issued that it couldn't find the primary device. > > > > > > > > So maybe this whole primary device lookup can happen during the -device CLI > > > > option creation loop. And we can indeed have un-created devices still in the > > > > list ? > > > > > > Yes, that's the only case for which I could imagine for an inconsistency > > > between the qdev tree and QemuOpts, but failover_add_primary() is only > > > called after feature negotiation with the guest driver, so we can be > > > sure that the -device loop has completed long ago. > > > > > > And even if it hadn't completed yet, the paragraph also says that even > > > hotplugging the device later is supported, so creating devices in the > > > wrong order should still succeed. > > > > > > I hope that some of the people I added to CC have some more hints. 
> >
> > Failover is ... interesting.
> >
> > You have two devices: primary and secondary.
> > secondary is virtio-net, primary can be vfio and some other emulated
> > devices.
> >
> > In the command line, devices can appear in any order, primary then
> > secondary, secondary then primary, or only one of them.
> > You can add (any of them) later in the toplevel.
> >
> > And now, what all this mess is about. We only enable the primary if the
> > guest knows about failover. Otherwise we use only the virtio device
> > (*). The important bit here is that we need to wait until the guest is
> > booted, and the virtio-net driver is loaded, and then it tells us if it
> > understands failover (or not). At that point we decide if we want to
> > "really" create the primary.
> >
> > I know that it abuses device_add() as much as it can, but I can't see
> > any better way to handle it. We need to be able to "create" a device
> > without showing it to the guest. And later, when we create a different
> > device, and depending on driver support on the guest, we "finish" the
> > creation of the primary device.
> >
> > Any good idea?

Hm, the naive idea would be creating the device without attaching it to
any bus. But I suppose qdev doesn't let you do that.

Anyway, the part that I missed yesterday is that qdev_device_add()
already skips creating the device if qdev_should_hide_device(), which
explains how the inconsistency is created.

(As an aside, it then returns NULL without setting an error to
indicate success, which is an awkward interface, and sure enough,
qmp_device_add() gets it wrong and deletes the QemuOpts again. So
hotplugging the virtio-net standby device doesn't even seem to work?)

Could we just save the configuration in the .hide_device callback (i.e.
failover_hide_primary_device() in virtio-net) to a new field in
VirtIONet and then use that when actually creating the device instead of
accessing the command line state in the QemuOptsList?
It seems that we can currently add two primary devices that are then
both hidden. failover_add_primary() adds only one of them, leaving the
other one hidden. Is this a bug and we should reject such a
configuration or do we need to support keeping configurations for
multiple primary devices in a single standby device?

This would still be ugly because the configuration is only really
validated when the primary device is actually added instead of
immediately on -device/device_add, but at least it would keep the
ugliness more local and wouldn't block the move away from QemuOpts (the
config would just be stored as a QDict after my patches).

> I don't know if it helps the discussion, but I'm refactoring the
> failover code to move all the PCI stuff to pci files.
>
> There are a lot of inconsistencies around device_add and the --device
> option, so in the end I added a list of hidden devices rather than
> relying on the command line.
>
> See PATCH 8 of series "[RFC PATCH v2 0/8] virtio-net failover cleanup and new features"
>
> https://patchew.org/QEMU/20210820142002.152994-1-lvivier@redhat.com/

While it's certainly an improvement over the current state, we really
should move away from QemuOpts and I think using global state for this
is wrong anyway. So it feels like it's not the change we need here, but
more a step sideways.

But thanks for mentioning this series here, we might get some merge
conflicts there. I'll try to remember to CC you for v2 of this series.

Kevin
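The idea in this mail of saving the hidden primary's configuration in the standby device, rather than looking it up in a global option list, could look roughly like the sketch below. It is a self-contained toy, not QEMU code: every type and function name is hypothetical, the fixed-size struct stands in for the QDict that would hold the configuration, and rejecting a second primary is only one possible answer to the open question raised above, not a decision taken in this thread.

```c
#include <stdbool.h>
#include <string.h>

/* Toy stand-in for a saved device configuration (a QDict in the series). */
struct toy_config {
    char driver[32];
    char id[32];
    char failover_pair_id[32];
};

/* Toy stand-in for the virtio-net standby device; names are hypothetical. */
struct toy_standby {
    char id[32];
    bool has_primary_cfg;
    struct toy_config primary_cfg;  /* kept locally, no global option list */
    bool primary_plugged;
};

enum toy_hide_result {
    TOY_NOT_MY_PRIMARY,    /* device is unrelated: create it normally */
    TOY_HIDDEN,            /* configuration saved, creation deferred */
    TOY_DUPLICATE_PRIMARY  /* a primary is already pending: reject */
};

/* Analogue of a .hide_device callback that also remembers the config. */
enum toy_hide_result toy_hide_primary(struct toy_standby *sb,
                                      const struct toy_config *cfg)
{
    if (strcmp(cfg->failover_pair_id, sb->id) != 0) {
        return TOY_NOT_MY_PRIMARY;
    }
    if (sb->has_primary_cfg) {
        return TOY_DUPLICATE_PRIMARY;
    }
    sb->primary_cfg = *cfg;
    sb->has_primary_cfg = true;
    return TOY_HIDDEN;
}

/* Finish creating the primary from the saved copy once the guest opted in. */
bool toy_add_primary(struct toy_standby *sb, bool guest_acked_standby)
{
    if (!guest_acked_standby || !sb->has_primary_cfg || sb->primary_plugged) {
        return false;
    }
    sb->primary_plugged = true;  /* real code would create the device here */
    return true;
}
```

An explicit result enum also avoids the "NULL without error means success" convention mentioned earlier in the mail, since the caller can tell a deferred device from a failed one.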
On 06/10/2021 12:53, Kevin Wolf wrote: > Am 06.10.2021 um 11:20 hat Laurent Vivier geschrieben: >> On 06/10/2021 10:21, Juan Quintela wrote: >>> Kevin Wolf <kwolf@redhat.com> wrote: >>>> Am 05.10.2021 um 17:52 hat Damien Hedde geschrieben: >>> >>> Hi >>> >>>>>> Usage >>>>>> ----- >>>>>> >>>>>> The primary device can be hotplugged or be part of the startup >>>>>> configuration >>>>>> >>>>>> -device virtio-net-pci,netdev=hostnet1,id=net1, >>>>>> mac=52:54:00:6f:55:cc,bus=root2,failover=on >>>>>> >>>>>> With the parameter failover=on the VIRTIO_NET_F_STANDBY feature >>>>>> will be enabled. >>>>>> >>>>>> -device vfio-pci,host=5e:00.2,id=hostdev0,bus=root1, >>>>>> failover_pair_id=net1 >>>>>> >>>>>> failover_pair_id references the id of the virtio-net standby device. >>>>>> This is only for pairing the devices within QEMU. The guest kernel >>>>>> module net_failover will match devices with identical MAC addresses. >>>>>> >>>>>> Hotplug >>>>>> ------- >>>>>> >>>>>> Both primary and standby device can be hotplugged via the QEMU >>>>>> monitor. Note that if the virtio-net device is plugged first a >>>>>> warning will be issued that it couldn't find the primary device. >>>>> >>>>> So maybe this whole primary device lookup can happen during the -device CLI >>>>> option creation loop. And we can indeed have un-created devices still in the >>>>> list ? >>>> >>>> Yes, that's the only case for which I could imagine for an inconsistency >>>> between the qdev tree and QemuOpts, but failover_add_primary() is only >>>> called after feature negotiation with the guest driver, so we can be >>>> sure that the -device loop has completed long ago. >>>> >>>> And even if it hadn't completed yet, the paragraph also says that even >>>> hotplugging the device later is supported, so creating devices in the >>>> wrong order should still succeed. >>>> >>>> I hope that some of the people I added to CC have some more hints. >>> >>> Failover is ... interesting. 
>>> >>> You have two devices: primary and seconday. >>> seconday is virtio-net, primary can be vfio and some other emulated >>> devices. >>> >>> In the command line, devices can appear on any order, primary then >>> secondary, secondary then primary, or only one of them. >>> You can add (any of them) later in the toplevel. >>> >>> And now, what all this mess is about. We only enable the primary if the >>> guest knows about failover. Otherwise we use only the virtio device >>> (*). The important bit here is that we need to wait until the guest is >>> booted, and the virtio-net driver is loaded, and then it tells us if it >>> understands failover (or not). At that point we decide if we want to >>> "really" create the primary. >>> >>> I know that it abuses device_add() as much as it can be, but I can't see >>> any better way to handle it. We need to be able to "create" a device >>> without showing it to the guest. And later, when we create a different >>> device, and depending of driver support on the guest, we "finish" the >>> creation of the primary device. >>> >>> Any good idea? > > Hm, the naive idea would be creating the device without attaching it to > any bus. But I suppose qdev doesn't let you do that. > > Anyway, the part that I missed yesterday is that qdev_device_add() > already skips creating the device if qdev_should_hide_device(), which > explains how the inconsistency is created. > > (As an aside, it then returns NULL without setting an error to > indicate success, which is an awkward interface, and sure enough, > qmp_device_add() gets it wrong and deletes the QemuOpts again. So > hotplugging the virtio-net standby device doesn't even seem to work?) > > Could we just save the configuration in the .hide_device callback (i.e. > failover_hide_primary_device() in virtio-net) to a new field in > VirtIONet and then use that when actually creating the device instead of > accessing the command line state in the QemuOptsList? 
>
> It seems that we can currently add two primary devices that are then
> both hidden. failover_add_primary() adds only one of them, leaving the
> other one hidden. Is this a bug and we should reject such a
> configuration or do we need to support keeping configurations for
> multiple primary devices in a single standby device?
>
> This would still be ugly because the configuration is only really
> validated when the primary device is actually added instead of
> immediately on -device/device_add, but at least it would keep the
> ugliness more local and wouldn't block the move away from QemuOpts (the
> config would just be stored as a QDict after my patches).
>
>> I don't know if it helps the discussion, but I'm refactoring the
>> failover code to move all the PCI stuff to pci files.
>>
>> There are a lot of inconsistencies around device_add and the --device
>> option, so in the end I added a list of hidden devices rather than
>> relying on the command line.
>>
>> See PATCH 8 of series "[RFC PATCH v2 0/8] virtio-net failover cleanup and new features"
>>
>> https://patchew.org/QEMU/20210820142002.152994-1-lvivier@redhat.com/
>
> While it's certainly an improvement over the current state, we really
> should move away from QemuOpts and I think using global state for this

I totally agree with that.

> is wrong anyway. So it feels like it's not the change we need here, but
> more a step sideways.

Yes, I wanted to fix the problem without modifying the existing code
too much.

> But thanks for mentioning this series here, we might get some merge
> conflicts there. I'll try to remember to CC you for v2 of this series.

Thank you.

I'll try to find a better solution based on your series.

Laurent
diff --git a/softmmu/qdev-monitor.c b/softmmu/qdev-monitor.c
index c09b7430eb..8622ccade6 100644
--- a/softmmu/qdev-monitor.c
+++ b/softmmu/qdev-monitor.c
@@ -812,7 +812,8 @@ void hmp_info_qdm(Monitor *mon, const QDict *qdict)
     qdev_print_devinfos(true);
 }

-void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
+static void monitor_device_add(QDict *qdict, QObject **ret_data,
+                               bool from_json, Error **errp)
 {
     QemuOpts *opts;
     DeviceState *dev;
@@ -825,7 +826,9 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
         qemu_opts_del(opts);
         return;
     }
-    dev = qdev_device_add(opts, errp);
+    qemu_opts_del(opts);
+
+    dev = qdev_device_add_from_qdict(qdict, from_json, errp);

     /*
      * Drain all pending RCU callbacks. This is done because
@@ -838,13 +841,14 @@ void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
      */
     drain_call_rcu();

-    if (!dev) {
-        qemu_opts_del(opts);
-        return;
-    }
     object_unref(OBJECT(dev));
 }

+void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
+{
+    monitor_device_add(qdict, ret_data, true, errp);
+}
+
 static DeviceState *find_device_state(const char *id, Error **errp)
 {
     Object *obj;
@@ -936,7 +940,7 @@ void hmp_device_add(Monitor *mon, const QDict *qdict)
 {
     Error *err = NULL;

-    qmp_device_add((QDict *)qdict, NULL, &err);
+    monitor_device_add((QDict *)qdict, NULL, false, &err);
     hmp_handle_error(mon, err);
 }
Directly call qdev_device_add_from_qdict() for QMP device_add instead of
first going through QemuOpts and converting back to QDict.

Note that this changes the behaviour of device_add, though in ways that
should be considered bug fixes:

QemuOpts ignores differences between data types, so you could
successfully pass a string "123" for an integer property, or a string
"on" for a boolean property (and vice versa). After this change, the
correct data type for the property must be used in the JSON input.

qemu_opts_from_qdict() also silently ignores any options whose value is
a QDict, QList or QNull.

To illustrate, the following QMP command was accepted before and is now
rejected for both reasons:

{ "execute": "device_add",
  "arguments": { "driver": "scsi-cd",
                 "drive": { "completely": "invalid" },
                 "physical_block_size": "4096" } }

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 softmmu/qdev-monitor.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
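The two behaviour changes described in the commit message can be illustrated with a small stand-alone C model. It is deliberately not QEMU's QemuOpts or visitor code: `toy_value`, `toy_set_int_lenient()` and `toy_set_int_strict()` are hypothetical names that only mimic the difference between a string-based path and a type-checked one for a single integer property.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy JSON value: just enough types to mirror the example command. */
enum toy_type { TOY_STRING, TOY_INT, TOY_DICT };

struct toy_value {
    enum toy_type type;
    const char *str;  /* used when type == TOY_STRING */
    int64_t num;      /* used when type == TOY_INT */
};

enum toy_result { TOY_SET, TOY_IGNORED, TOY_REJECTED };

/* Old-style path: everything is squeezed through strings first. */
enum toy_result toy_set_int_lenient(const struct toy_value *v, int64_t *out)
{
    char *end;
    long long n;

    switch (v->type) {
    case TOY_INT:
        *out = v->num;
        return TOY_SET;
    case TOY_STRING:
        errno = 0;
        n = strtoll(v->str, &end, 0);
        if (errno != 0 || end == v->str || *end != '\0') {
            return TOY_REJECTED;
        }
        *out = n;
        return TOY_SET;
    default:
        return TOY_IGNORED;  /* nested values are silently dropped */
    }
}

/* New-style path: the JSON type has to match the property type. */
enum toy_result toy_set_int_strict(const struct toy_value *v, int64_t *out)
{
    if (v->type != TOY_INT) {
        return TOY_REJECTED;
    }
    *out = v->num;
    return TOY_SET;
}
```

In this model the string "4096" is accepted and a nested dict is silently ignored on the lenient path, while the strict path rejects both and only accepts the JSON number 4096.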