Message ID | 20190410092652.22616-1-yury-kotov@yandex-team.ru (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | migration: Fix handling fd protocol | expand |
Patchew URL: https://patchew.org/QEMU/20190410092652.22616-1-yury-kotov@yandex-team.ru/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash time make docker-test-mingw@fedora SHOW_ENV=1 J=14 NETWORK=1 === TEST SCRIPT END === CC migration/xbzrle.o CC migration/postcopy-ram.o /tmp/qemu-test/src/migration/fd.c: In function 'fd_start_outgoing_migration': /tmp/qemu-test/src/migration/fd.c:40:14: error: implicit declaration of function 'qemu_dup'; did you mean 'qemu_hexdump'? [-Werror=implicit-function-declaration] dup_fd = qemu_dup(fd); ^~~~~~~~ qemu_hexdump /tmp/qemu-test/src/migration/fd.c:40:14: error: nested extern declaration of 'qemu_dup' [-Werror=nested-externs] cc1: all warnings being treated as errors make: *** [/tmp/qemu-test/src/rules.mak:69: migration/fd.o] Error 1 make: *** Waiting for unfinished jobs.... The full log is available at http://patchew.org/logs/20190410092652.22616-1-yury-kotov@yandex-team.ru/testing.docker-mingw@fedora/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-devel@redhat.com
* Yury Kotov (yury-kotov@yandex-team.ru) wrote: > Currently such case is possible for incoming: > QMP: add-fd (fdset = 0, fd of some file): > adds fd to fdset 0 and returns QEMU's fd (e.g. 33) > QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc > ... > Incoming migration completes and unrefs ioc -> close(33) > QMP: remove-fd (fdset = 0, fd = 33): > removes fd from fdset 0 and qemu_close() -> close(33) => double close Well spotted! That would very rarely cause a problem, but is a race. > For outgoing migration the case is the same but getfd instead of add-fd. > Fix it by duping client's fd. > > Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> Note your patch hit a problem building on windows; I don't think we have a qemu_dup for windows. However, I think this problem is wider than just migration. For example, I see that dump.c also uses monitor_get_fd, and it's dump_cleanup also does a close on the fd. So I guess it hits the same problem? Also, qmp.c in qmp_add_client does a close on the fd in some error cases (I've not followed the normal case). So perhaps all the users of monitor_get_fd are hitting this problem. Should we be doing the dup in monitor_get_fd? Dave > --- > migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- > migration/trace-events | 4 ++-- > 2 files changed, 33 insertions(+), 10 deletions(-) > > diff --git a/migration/fd.c b/migration/fd.c > index a7c13df4ad..c9ff07ac41 100644 > --- a/migration/fd.c > +++ b/migration/fd.c > @@ -15,6 +15,7 @@ > */ > > #include "qemu/osdep.h" > +#include "qapi/error.h" > #include "channel.h" > #include "fd.h" > #include "migration.h" > @@ -26,15 +27,27 @@ > void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) > { > QIOChannel *ioc; > - int fd = monitor_get_fd(cur_mon, fdname, errp); > + int fd, dup_fd; > + > + fd = monitor_get_fd(cur_mon, fdname, errp); > if (fd == -1) { > return; > } > > - trace_migration_fd_outgoing(fd); > - ioc = qio_channel_new_fd(fd, errp); > + /* fd is previously created by qmp command 'getfd', > + * so client is responsible to close it. Dup it to save original value from > + * QIOChannel's destructor */ > + dup_fd = qemu_dup(fd); > + if (dup_fd == -1) { > + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), > + errno); > + return; > + } > + > + trace_migration_fd_outgoing(fd, dup_fd); > + ioc = qio_channel_new_fd(dup_fd, errp); > if (!ioc) { > - close(fd); > + close(dup_fd); > return; > } > > @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc, > void fd_start_incoming_migration(const char *infd, Error **errp) > { > QIOChannel *ioc; > - int fd; > + int fd, dup_fd; > > fd = strtol(infd, NULL, 0); > - trace_migration_fd_incoming(fd); > > - ioc = qio_channel_new_fd(fd, errp); > + /* fd is previously created by qmp command 'add-fd' or something else, > + * so client is responsible to close it. Dup it to save original value from > + * QIOChannel's destructor */ > + dup_fd = qemu_dup(fd); > + if (dup_fd == -1) { > + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), > + errno); > + return; > + } > + > + trace_migration_fd_incoming(fd, dup_fd); > + ioc = qio_channel_new_fd(dup_fd, errp); > if (!ioc) { > - close(fd); > + close(dup_fd); > return; > } > > diff --git a/migration/trace-events b/migration/trace-events > index de2e136e57..d2d30a6b3c 100644 > --- a/migration/trace-events > +++ b/migration/trace-events > @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" > migration_exec_incoming(const char *cmd) "cmd=%s" > > # fd.c > -migration_fd_outgoing(int fd) "fd=%d" > -migration_fd_incoming(int fd) "fd=%d" > +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" > +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" > > # socket.c > migration_socket_incoming_accepted(void) "" > -- > 2.21.0 > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Hi, 10.04.2019, 16:58, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> Currently such case is possible for incoming: >> QMP: add-fd (fdset = 0, fd of some file): >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >> ... >> Incoming migration completes and unrefs ioc -> close(33) >> QMP: remove-fd (fdset = 0, fd = 33): >> removes fd from fdset 0 and qemu_close() -> close(33) => double close > > Well spotted! That would very rarely cause a problem, but is a race. > Actually, it hits for incoming case on 2 of 50 VMs on our production... >> For outgoing migration the case is the same but getfd instead of add-fd. >> Fix it by duping client's fd. >> >> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> > > Note your patch hit a problem building on windows; I don't think we have > a qemu_dup for windows. > Oh, I see... I'll add an ifdef for this in v2. > However, I think this problem is wider than just migration. > For example, I see that dump.c also uses monitor_get_fd, and it's > dump_cleanup also does a close on the fd. So I guess it hits the same > problem? > Also, qmp.c in qmp_add_client does a close on the fd in some error cases > (I've not followed the normal case). > > So perhaps all the users of monitor_get_fd are hitting this problem. > > Should we be doing the dup in monitor_get_fd? > Hmm, it sounds reasonable but Windows's case will remain broken. And also using fd from fdset without qemu_open will remain broken. Another way to fix them: 1) Add searching of monitor's fds and not duped fdset's fds to qemu_close 2) Replace broken closes to qemu_close Regards, Yury Kotov > Dave > >> --- >> migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- >> migration/trace-events | 4 ++-- >> 2 files changed, 33 insertions(+), 10 deletions(-) >> >> diff --git a/migration/fd.c b/migration/fd.c >> index a7c13df4ad..c9ff07ac41 100644 >> --- a/migration/fd.c >> +++ b/migration/fd.c >> @@ -15,6 +15,7 @@ >> */ >> >> #include "qemu/osdep.h" >> +#include "qapi/error.h" >> #include "channel.h" >> #include "fd.h" >> #include "migration.h" >> @@ -26,15 +27,27 @@ >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) >> { >> QIOChannel *ioc; >> - int fd = monitor_get_fd(cur_mon, fdname, errp); >> + int fd, dup_fd; >> + >> + fd = monitor_get_fd(cur_mon, fdname, errp); >> if (fd == -1) { >> return; >> } >> >> - trace_migration_fd_outgoing(fd); >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'getfd', >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_outgoing(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc, >> void fd_start_incoming_migration(const char *infd, Error **errp) >> { >> QIOChannel *ioc; >> - int fd; >> + int fd, dup_fd; >> >> fd = strtol(infd, NULL, 0); >> - trace_migration_fd_incoming(fd); >> >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'add-fd' or something else, >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_incoming(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> diff --git a/migration/trace-events b/migration/trace-events >> index de2e136e57..d2d30a6b3c 100644 >> --- a/migration/trace-events >> +++ b/migration/trace-events >> @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" >> migration_exec_incoming(const char *cmd) "cmd=%s" >> >> # fd.c >> -migration_fd_outgoing(int fd) "fd=%d" >> -migration_fd_incoming(int fd) "fd=%d" >> +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" >> +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" >> >> # socket.c >> migration_socket_incoming_accepted(void) "" >> -- >> 2.21.0 > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: > Currently such case is possible for incoming: > QMP: add-fd (fdset = 0, fd of some file): > adds fd to fdset 0 and returns QEMU's fd (e.g. 33) > QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc > ... > Incoming migration completes and unrefs ioc -> close(33) > QMP: remove-fd (fdset = 0, fd = 33): > removes fd from fdset 0 and qemu_close() -> close(33) => double close IMHO this is user error. You've given the FD to QEMU and told it to use it for migration. Once you've done that you should not be calling remove-fd. remove-fd should only be used in some error code path. ie if you have called add-fd, and then some failure means you've decided you can't invoke 'migrate-incoming'. Now you should call "remove-fd". AFAIK, we have never documented that 'add-fd' must be paired with 'remove-fd'. IOW, I think this "fix" is in fact not a fix - it is instead changing the semantics of when "remove-fd" should be used. > For outgoing migration the case is the same but getfd instead of add-fd. > Fix it by duping client's fd. > > Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> > --- > migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- > migration/trace-events | 4 ++-- > 2 files changed, 33 insertions(+), 10 deletions(-) > > diff --git a/migration/fd.c b/migration/fd.c > index a7c13df4ad..c9ff07ac41 100644 > --- a/migration/fd.c > +++ b/migration/fd.c > @@ -15,6 +15,7 @@ > */ > > #include "qemu/osdep.h" > +#include "qapi/error.h" > #include "channel.h" > #include "fd.h" > #include "migration.h" > @@ -26,15 +27,27 @@ > void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) > { > QIOChannel *ioc; > - int fd = monitor_get_fd(cur_mon, fdname, errp); > + int fd, dup_fd; > + > + fd = monitor_get_fd(cur_mon, fdname, errp); > if (fd == -1) { > return; > } > > - trace_migration_fd_outgoing(fd); > - ioc = qio_channel_new_fd(fd, errp); > + /* fd is previously created by qmp command 'getfd', > + * so client is responsible to close it. Dup it to save original value from > + * QIOChannel's destructor */ > + dup_fd = qemu_dup(fd); > + if (dup_fd == -1) { > + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), > + errno); > + return; > + } > + > + trace_migration_fd_outgoing(fd, dup_fd); > + ioc = qio_channel_new_fd(dup_fd, errp); > if (!ioc) { > - close(fd); > + close(dup_fd); > return; > } > > @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc, > void fd_start_incoming_migration(const char *infd, Error **errp) > { > QIOChannel *ioc; > - int fd; > + int fd, dup_fd; > > fd = strtol(infd, NULL, 0); > - trace_migration_fd_incoming(fd); > > - ioc = qio_channel_new_fd(fd, errp); > + /* fd is previously created by qmp command 'add-fd' or something else, > + * so client is responsible to close it. Dup it to save original value from > + * QIOChannel's destructor */ > + dup_fd = qemu_dup(fd); > + if (dup_fd == -1) { > + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), > + errno); > + return; > + } > + > + trace_migration_fd_incoming(fd, dup_fd); > + ioc = qio_channel_new_fd(dup_fd, errp); > if (!ioc) { > - close(fd); > + close(dup_fd); > return; > } > > diff --git a/migration/trace-events b/migration/trace-events > index de2e136e57..d2d30a6b3c 100644 > --- a/migration/trace-events > +++ b/migration/trace-events > @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" > migration_exec_incoming(const char *cmd) "cmd=%s" > > # fd.c > -migration_fd_outgoing(int fd) "fd=%d" > -migration_fd_incoming(int fd) "fd=%d" > +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" > +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" > > # socket.c > migration_socket_incoming_accepted(void) "" > -- > 2.21.0 > > Regards, Daniel
11.04.2019, 15:04, "Daniel P. Berrangé" <berrange@redhat.com>: > On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: >> Currently such case is possible for incoming: >> QMP: add-fd (fdset = 0, fd of some file): >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >> ... >> Incoming migration completes and unrefs ioc -> close(33) >> QMP: remove-fd (fdset = 0, fd = 33): >> removes fd from fdset 0 and qemu_close() -> close(33) => double close > > IMHO this is user error. > > You've given the FD to QEMU and told it to use it for migration. > > Once you've done that you should not be calling remove-fd. > > remove-fd should only be used in some error code path. ie if you > have called add-fd, and then some failure means you've decided you > can't invoke 'migrate-incoming'. Now you should call "remove-fd". > > AFAIK, we have never documented that 'add-fd' must be paired > with 'remove-fd'. > > IOW, I think this "fix" is in fact not a fix - it is instead > changing the semantics of when "remove-fd" should be used. > I think it might be user's error but fd is still in cur_mon->fds (in getfd case) or in mon_fdsets (in add-fd case). So if fd is still exists in the lists why user can't close/remove it? So, there are 2 valid options: 1) fd is still valid after migration (dup) 2) fd is closed but also removed from the appropriate list Regards, Yury >> For outgoing migration the case is the same but getfd instead of add-fd. >> Fix it by duping client's fd. >> >> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> >> --- >> migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- >> migration/trace-events | 4 ++-- >> 2 files changed, 33 insertions(+), 10 deletions(-) >> >> diff --git a/migration/fd.c b/migration/fd.c >> index a7c13df4ad..c9ff07ac41 100644 >> --- a/migration/fd.c >> +++ b/migration/fd.c >> @@ -15,6 +15,7 @@ >> */ >> >> #include "qemu/osdep.h" >> +#include "qapi/error.h" >> #include "channel.h" >> #include "fd.h" >> #include "migration.h" >> @@ -26,15 +27,27 @@ >> void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) >> { >> QIOChannel *ioc; >> - int fd = monitor_get_fd(cur_mon, fdname, errp); >> + int fd, dup_fd; >> + >> + fd = monitor_get_fd(cur_mon, fdname, errp); >> if (fd == -1) { >> return; >> } >> >> - trace_migration_fd_outgoing(fd); >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'getfd', >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_outgoing(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc, >> void fd_start_incoming_migration(const char *infd, Error **errp) >> { >> QIOChannel *ioc; >> - int fd; >> + int fd, dup_fd; >> >> fd = strtol(infd, NULL, 0); >> - trace_migration_fd_incoming(fd); >> >> - ioc = qio_channel_new_fd(fd, errp); >> + /* fd is previously created by qmp command 'add-fd' or something else, >> + * so client is responsible to close it. Dup it to save original value from >> + * QIOChannel's destructor */ >> + dup_fd = qemu_dup(fd); >> + if (dup_fd == -1) { >> + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), >> + errno); >> + return; >> + } >> + >> + trace_migration_fd_incoming(fd, dup_fd); >> + ioc = qio_channel_new_fd(dup_fd, errp); >> if (!ioc) { >> - close(fd); >> + close(dup_fd); >> return; >> } >> >> diff --git a/migration/trace-events b/migration/trace-events >> index de2e136e57..d2d30a6b3c 100644 >> --- a/migration/trace-events >> +++ b/migration/trace-events >> @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" >> migration_exec_incoming(const char *cmd) "cmd=%s" >> >> # fd.c >> -migration_fd_outgoing(int fd) "fd=%d" >> -migration_fd_incoming(int fd) "fd=%d" >> +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" >> +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" >> >> # socket.c >> migration_socket_incoming_accepted(void) "" >> -- >> 2.21.0 > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Thu, Apr 11, 2019 at 03:31:43PM +0300, Yury Kotov wrote: > 11.04.2019, 15:04, "Daniel P. Berrangé" <berrange@redhat.com>: > > On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: > >> Currently such case is possible for incoming: > >> QMP: add-fd (fdset = 0, fd of some file): > >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) > >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc > >> ... > >> Incoming migration completes and unrefs ioc -> close(33) > >> QMP: remove-fd (fdset = 0, fd = 33): > >> removes fd from fdset 0 and qemu_close() -> close(33) => double close > > > > IMHO this is user error. > > > > You've given the FD to QEMU and told it to use it for migration. > > > > Once you've done that you should not be calling remove-fd. > > > > remove-fd should only be used in some error code path. ie if you > > have called add-fd, and then some failure means you've decided you > > can't invoke 'migrate-incoming'. Now you should call "remove-fd". > > > > AFAIK, we have never documented that 'add-fd' must be paired > > with 'remove-fd'. > > > > IOW, I think this "fix" is in fact not a fix - it is instead > > changing the semantics of when "remove-fd" should be used. > > > > I think it might be user's error but fd is still in cur_mon->fds (in getfd case) > or in mon_fdsets (in add-fd case). So if fd is still exists in the lists why > user can't close/remove it? > > So, there are 2 valid options: > 1) fd is still valid after migration (dup) If we do this then existing mgmt apps using QEMU who don't expect to need to call remove-fd are going to be leaking FDs forever. > 2) fd is closed but also removed from the appropriate list monitor_get_fd currently leaves the FD on the list. if all current users of that API are closing the FD themselves, then we could change that method to remove it from the list. Either way the requirements in this area are pooly documented both from QEMU's internal POV and from mgmt app public QMP pov. Regards, Daniel
11.04.2019, 15:39, "Daniel P. Berrangé" <berrange@redhat.com>: > On Thu, Apr 11, 2019 at 03:31:43PM +0300, Yury Kotov wrote: >> 11.04.2019, 15:04, "Daniel P. Berrangé" <berrange@redhat.com>: >> > On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: >> >> Currently such case is possible for incoming: >> >> QMP: add-fd (fdset = 0, fd of some file): >> >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >> >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >> >> ... >> >> Incoming migration completes and unrefs ioc -> close(33) >> >> QMP: remove-fd (fdset = 0, fd = 33): >> >> removes fd from fdset 0 and qemu_close() -> close(33) => double close >> > >> > IMHO this is user error. >> > >> > You've given the FD to QEMU and told it to use it for migration. >> > >> > Once you've done that you should not be calling remove-fd. >> > >> > remove-fd should only be used in some error code path. ie if you >> > have called add-fd, and then some failure means you've decided you >> > can't invoke 'migrate-incoming'. Now you should call "remove-fd". >> > >> > AFAIK, we have never documented that 'add-fd' must be paired >> > with 'remove-fd'. >> > >> > IOW, I think this "fix" is in fact not a fix - it is instead >> > changing the semantics of when "remove-fd" should be used. >> > >> >> I think it might be user's error but fd is still in cur_mon->fds (in getfd case) >> or in mon_fdsets (in add-fd case). So if fd is still exists in the lists why >> user can't close/remove it? >> >> So, there are 2 valid options: >> 1) fd is still valid after migration (dup) > > If we do this then existing mgmt apps using QEMU who don't expect to need > to call remove-fd are going to be leaking FDs forever. > Ok, so what do you think about modifying qemu_close to remove fd from these two lists? Currently it tryes to remove fd only from mon_fdsets->dup_fds. Regards, Yury >> 2) fd is closed but also removed from the appropriate list > > monitor_get_fd currently leaves the FD on the list. > > if all current users of that API are closing the FD > themselves, then we could change that method to remove > it from the list. > > Either way the requirements in this area are pooly documented > both from QEMU's internal POV and from mgmt app public QMP pov. > > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Hi, Just to clarify. I see two possible solutions: 1) Since the migration code doesn't receive fd, it isn't responsible for closing it. So, it may be better to use migrate_fd_param for both incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must close the fd themselves. But existing clients will have a leak. 2) If we don't duplicate fd, then at least we should remove fd from the corresponding list. Therefore, the solution is to fix qemu_close to find the list and remove fd from it. But qemu_close is currently consistent with qemu_open (which opens/dups fd), so adding additional logic might not be a very good idea. I don't see any other solution, but I might miss something. What do you think? Regards, Yury 11.04.2019, 15:58, "Yury Kotov" <yury-kotov@yandex-team.ru>: > 11.04.2019, 15:39, "Daniel P. Berrangé" <berrange@redhat.com>: >> On Thu, Apr 11, 2019 at 03:31:43PM +0300, Yury Kotov wrote: >>> 11.04.2019, 15:04, "Daniel P. Berrangé" <berrange@redhat.com>: >>> > On Wed, Apr 10, 2019 at 12:26:52PM +0300, Yury Kotov wrote: >>> >> Currently such case is possible for incoming: >>> >> QMP: add-fd (fdset = 0, fd of some file): >>> >> adds fd to fdset 0 and returns QEMU's fd (e.g. 33) >>> >> QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc >>> >> ... >>> >> Incoming migration completes and unrefs ioc -> close(33) >>> >> QMP: remove-fd (fdset = 0, fd = 33): >>> >> removes fd from fdset 0 and qemu_close() -> close(33) => double close >>> > >>> > IMHO this is user error. >>> > >>> > You've given the FD to QEMU and told it to use it for migration. >>> > >>> > Once you've done that you should not be calling remove-fd. >>> > >>> > remove-fd should only be used in some error code path. ie if you >>> > have called add-fd, and then some failure means you've decided you >>> > can't invoke 'migrate-incoming'. Now you should call "remove-fd". >>> > >>> > AFAIK, we have never documented that 'add-fd' must be paired >>> > with 'remove-fd'. >>> > >>> > IOW, I think this "fix" is in fact not a fix - it is instead >>> > changing the semantics of when "remove-fd" should be used. >>> > >>> >>> I think it might be user's error but fd is still in cur_mon->fds (in getfd case) >>> or in mon_fdsets (in add-fd case). So if fd is still exists in the lists why >>> user can't close/remove it? >>> >>> So, there are 2 valid options: >>> 1) fd is still valid after migration (dup) >> >> If we do this then existing mgmt apps using QEMU who don't expect to need >> to call remove-fd are going to be leaking FDs forever. > > Ok, so what do you think about modifying qemu_close to remove fd from these two > lists? Currently it tryes to remove fd only from mon_fdsets->dup_fds. > > Regards, > Yury > >>> 2) fd is closed but also removed from the appropriate list >> >> monitor_get_fd currently leaves the FD on the list. >> >> if all current users of that API are closing the FD >> themselves, then we could change that method to remove >> it from the list. >> >> Either way the requirements in this area are pooly documented >> both from QEMU's internal POV and from mgmt app public QMP pov. >> >> Regards, >> Daniel >> -- >> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| >> |: https://libvirt.org -o- https://fstop138.berrange.com :| >> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > Hi, > > Just to clarify. I see two possible solutions: > > 1) Since the migration code doesn't receive fd, it isn't responsible for > closing it. So, it may be better to use migrate_fd_param for both > incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > close the fd themselves. But existing clients will have a leak. We can't break existing clients in this way as they are correctly using the monitor with its current semantics. > 2) If we don't duplicate fd, then at least we should remove fd from > the corresponding list. Therefore, the solution is to fix qemu_close to find > the list and remove fd from it. But qemu_close is currently consistent with > qemu_open (which opens/dups fd), so adding additional logic might not be > a very good idea. qemu_close is not appropriate place to deal with something speciifc to the montor. > I don't see any other solution, but I might miss something. > What do you think? All callers of monitor_get_fd() will close() the FD they get back. Thus monitor_get_fd() should remove it from the list when it returns it, and we should add API docs to monitor_get_fd() to explain this. Regards, Daniel
15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> Hi, >> >> Just to clarify. I see two possible solutions: >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> closing it. So, it may be better to use migrate_fd_param for both >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> close the fd themselves. But existing clients will have a leak. > > We can't break existing clients in this way as they are correctly > using the monitor with its current semantics. > >> 2) If we don't duplicate fd, then at least we should remove fd from >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> the list and remove fd from it. But qemu_close is currently consistent with >> qemu_open (which opens/dups fd), so adding additional logic might not be >> a very good idea. > > qemu_close is not appropriate place to deal with something speciifc > to the montor. > >> I don't see any other solution, but I might miss something. >> What do you think? > > All callers of monitor_get_fd() will close() the FD they get back. > Thus monitor_get_fd() should remove it from the list when it returns > it, and we should add API docs to monitor_get_fd() to explain this. > Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. But what about the incoming migration? It doesn't use monitor_get_fd but just converts input string to int and use it as fd. Regards, Yury
15.04.2019, 13:17, "Yury Kotov" <yury-kotov@yandex-team.ru>: > 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >>> Hi, >>> >>> Just to clarify. I see two possible solutions: >>> >>> 1) Since the migration code doesn't receive fd, it isn't responsible for >>> closing it. So, it may be better to use migrate_fd_param for both >>> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >>> close the fd themselves. But existing clients will have a leak. >> >> We can't break existing clients in this way as they are correctly >> using the monitor with its current semantics. >> >>> 2) If we don't duplicate fd, then at least we should remove fd from >>> the corresponding list. Therefore, the solution is to fix qemu_close to find >>> the list and remove fd from it. But qemu_close is currently consistent with >>> qemu_open (which opens/dups fd), so adding additional logic might not be >>> a very good idea. >> >> qemu_close is not appropriate place to deal with something speciifc >> to the montor. >> But qemu_close is already deal with monitor: It uses monitor_fdset_dup_fd_find & monitor_fdset_dup_fd_remove to find and remove fd from monitor's dup_fds list. >>> I don't see any other solution, but I might miss something. >>> What do you think? >> >> All callers of monitor_get_fd() will close() the FD they get back. >> Thus monitor_get_fd() should remove it from the list when it returns >> it, and we should add API docs to monitor_get_fd() to explain this. > > Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > But what about the incoming migration? It doesn't use monitor_get_fd but just > converts input string to int and use it as fd. > Regards, Yury
On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > >> Hi, > >> > >> Just to clarify. I see two possible solutions: > >> > >> 1) Since the migration code doesn't receive fd, it isn't responsible for > >> closing it. So, it may be better to use migrate_fd_param for both > >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > >> close the fd themselves. But existing clients will have a leak. > > > > We can't break existing clients in this way as they are correctly > > using the monitor with its current semantics. > > > >> 2) If we don't duplicate fd, then at least we should remove fd from > >> the corresponding list. Therefore, the solution is to fix qemu_close to find > >> the list and remove fd from it. But qemu_close is currently consistent with > >> qemu_open (which opens/dups fd), so adding additional logic might not be > >> a very good idea. > > > > qemu_close is not appropriate place to deal with something speciifc > > to the montor. > > > >> I don't see any other solution, but I might miss something. > >> What do you think? > > > > All callers of monitor_get_fd() will close() the FD they get back. > > Thus monitor_get_fd() should remove it from the list when it returns > > it, and we should add API docs to monitor_get_fd() to explain this. > > > Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > But what about the incoming migration? It doesn't use monitor_get_fd but just > converts input string to int and use it as fd. The incoming migration expects the FD to be passed into QEMU by the mgmt app when it is exec'ing the QEMU binary. It doesn't interact with the monitor at all AFAIR. Regards, Daniel
15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> >> Hi, >> >> >> >> Just to clarify. I see two possible solutions: >> >> >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> >> closing it. So, it may be better to use migrate_fd_param for both >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> >> close the fd themselves. But existing clients will have a leak. >> > >> > We can't break existing clients in this way as they are correctly >> > using the monitor with its current semantics. >> > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> >> the list and remove fd from it. But qemu_close is currently consistent with >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> >> a very good idea. >> > >> > qemu_close is not appropriate place to deal with something speciifc >> > to the montor. >> > >> >> I don't see any other solution, but I might miss something. >> >> What do you think? >> > >> > All callers of monitor_get_fd() will close() the FD they get back. >> > Thus monitor_get_fd() should remove it from the list when it returns >> > it, and we should add API docs to monitor_get_fd() to explain this. >> > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> converts input string to int and use it as fd. > > The incoming migration expects the FD to be passed into QEMU by the mgmt > app when it is exec'ing the QEMU binary. It doesn't interact with the > monitor at all AFAIR. > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for migrate-incoming and such way has described problems. Regards, Yury
On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > >> >> Hi, > >> >> > >> >> Just to clarify. I see two possible solutions: > >> >> > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > >> >> closing it. So, it may be better to use migrate_fd_param for both > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > >> >> close the fd themselves. But existing clients will have a leak. > >> > > >> > We can't break existing clients in this way as they are correctly > >> > using the monitor with its current semantics. > >> > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > >> >> the list and remove fd from it. But qemu_close is currently consistent with > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > >> >> a very good idea. > >> > > >> > qemu_close is not appropriate place to deal with something speciifc > >> > to the montor. > >> > > >> >> I don't see any other solution, but I might miss something. > >> >> What do you think? > >> > > >> > All callers of monitor_get_fd() will close() the FD they get back. > >> > Thus monitor_get_fd() should remove it from the list when it returns > >> > it, and we should add API docs to monitor_get_fd() to explain this. > >> > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > >> converts input string to int and use it as fd. > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > > app when it is exec'ing the QEMU binary. It doesn't interact with the > > monitor at all AFAIR. > > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > migrate-incoming and such way has described problems. That's a bug in your usage of QEMU IMHO, as the incoming code is not designed to use add-fd. Regards, Daniel
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > > >> >> Hi, > > >> >> > > >> >> Just to clarify. I see two possible solutions: > > >> >> > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > > >> >> closing it. So, it may be better to use migrate_fd_param for both > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > > >> >> close the fd themselves. But existing clients will have a leak. > > >> > > > >> > We can't break existing clients in this way as they are correctly > > >> > using the monitor with its current semantics. > > >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > > >> >> a very good idea. > > >> > > > >> > qemu_close is not appropriate place to deal with something speciifc > > >> > to the montor. > > >> > > > >> >> I don't see any other solution, but I might miss something. > > >> >> What do you think? > > >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > > >> > Thus monitor_get_fd() should remove it from the list when it returns > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > > >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > > >> converts input string to int and use it as fd. > > > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > > > monitor at all AFAIR. > > > > > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > > migrate-incoming and such way has described problems. > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > designed to use add-fd. Hmm, that's true - although: a) It's very non-obvious b) Unfortunate, since it would go well with -incoming defer Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > > > >> >> Hi, > > > >> >> > > > >> >> Just to clarify. I see two possible solutions: > > > >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > > > >> >> closing it. So, it may be better to use migrate_fd_param for both > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > > > >> >> close the fd themselves. But existing clients will have a leak. > > > >> > > > > >> > We can't break existing clients in this way as they are correctly > > > >> > using the monitor with its current semantics. > > > >> > > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > > > >> >> a very good idea. > > > >> > > > > >> > qemu_close is not appropriate place to deal with something speciifc > > > >> > to the montor. > > > >> > > > > >> >> I don't see any other solution, but I might miss something. > > > >> >> What do you think? > > > >> > > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > > > >> > Thus monitor_get_fd() should remove it from the list when it returns > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > > > >> > > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > > > >> converts input string to int and use it as fd. > > > > > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > > > > monitor at all AFAIR. > > > > > > > > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > > > migrate-incoming and such way has described problems. > > > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > > designed to use add-fd. > > Hmm, that's true - although: > a) It's very non-obvious > b) Unfortunate, since it would go well with -incoming defer Yeah I think this is a screw up on QMEU's part when introducing 'defer'. We should have mandated use of 'add-fd' when using 'defer', since FD inheritance-over-execve() should only be used for command line args, not monitor commands. Not sure how to best fix this is QEMU though without breaking back compat for apps using 'defer' already. Regards, Daniel
* Daniel P. Berrangé (berrange@redhat.com) wrote: > On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > > > > >> >> Hi, > > > > >> >> > > > > >> >> Just to clarify. I see two possible solutions: > > > > >> >> > > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > > > > >> >> closing it. So, it may be better to use migrate_fd_param for both > > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > > > > >> >> close the fd themselves. But existing clients will have a leak. > > > > >> > > > > > >> > We can't break existing clients in this way as they are correctly > > > > >> > using the monitor with its current semantics. > > > > >> > > > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > > > > >> >> a very good idea. > > > > >> > > > > > >> > qemu_close is not appropriate place to deal with something speciifc > > > > >> > to the montor. > > > > >> > > > > > >> >> I don't see any other solution, but I might miss something. > > > > >> >> What do you think? > > > > >> > > > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > > > > >> > Thus monitor_get_fd() should remove it from the list when it returns > > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > > > > >> > > > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > > > > >> converts input string to int and use it as fd. > > > > > > > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > > > > > monitor at all AFAIR. > > > > > > > > > > > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > > > > migrate-incoming and such way has described problems. > > > > > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > > > designed to use add-fd. > > > > Hmm, that's true - although: > > a) It's very non-obvious > > b) Unfortunate, since it would go well with -incoming defer > > Yeah I think this is a screw up on QMEU's part when introducing 'defer'. > > We should have mandated use of 'add-fd' when using 'defer', since FD > inheritance-over-execve() should only be used for command line args, > not monitor commands. > > Not sure how to best fix this is QEMU though without breaking back > compat for apps using 'defer' already. We could add mon-fd: transports that has the same behaviour as now for outgoing, and for incoming uses the add-fd stash. Dave > Regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> > > > >> >> Hi, >> > > > >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >> > > > >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> > > > >> >> close the fd themselves. But existing clients will have a leak. >> > > > >> > >> > > > >> > We can't break existing clients in this way as they are correctly >> > > > >> > using the monitor with its current semantics. >> > > > >> > >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> > > > >> >> a very good idea. >> > > > >> > >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >> > > > >> > to the montor. >> > > > >> > >> > > > >> >> I don't see any other solution, but I might miss something. >> > > > >> >> What do you think? >> > > > >> > >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >> > > > >> > >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> > > > >> converts input string to int and use it as fd. >> > > > > >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >> > > > > monitor at all AFAIR. >> > > > > >> > > > >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >> > > > migrate-incoming and such way has described problems. >> > > >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >> > > designed to use add-fd. >> > >> > Hmm, that's true - although: >> > a) It's very non-obvious >> > b) Unfortunate, since it would go well with -incoming defer >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >> inheritance-over-execve() should only be used for command line args, >> not monitor commands. >> >> Not sure how to best fix this is QEMU though without breaking back >> compat for apps using 'defer' already. > > We could add mon-fd: transports that has the same behaviour as now for > outgoing, and for incoming uses the add-fd stash. > May be it's better to use monitor_fd_param for both incoming/outgoing? So, "migrate" will know fd:<int> semantics and "migrate-incoming" will know fd:<fd_name> semantics. And also modify monitor_get_fd to remove fd from list before return. This is a backwards compatible change. Regards, Yury
15.04.2019, 15:21, "Yury Kotov" <yury-kotov@yandex-team.ru>: > 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> * Daniel P. Berrangé (berrange@redhat.com) wrote: >>> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >>> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >>> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >>> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >>> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >>> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >>> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >>> > > > >> >> Hi, >>> > > > >> >> >>> > > > >> >> Just to clarify. I see two possible solutions: >>> > > > >> >> >>> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >>> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >>> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >>> > > > >> >> close the fd themselves. But existing clients will have a leak. >>> > > > >> > >>> > > > >> > We can't break existing clients in this way as they are correctly >>> > > > >> > using the monitor with its current semantics. >>> > > > >> > >>> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >>> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >>> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >>> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >>> > > > >> >> a very good idea. >>> > > > >> > >>> > > > >> > qemu_close is not appropriate place to deal with something speciifc >>> > > > >> > to the montor. >>> > > > >> > >>> > > > >> >> I don't see any other solution, but I might miss something. >>> > > > >> >> What do you think? >>> > > > >> > >>> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >>> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >>> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >>> > > > >> > >>> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >>> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >>> > > > >> converts input string to int and use it as fd. >>> > > > > >>> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >>> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >>> > > > > monitor at all AFAIR. >>> > > > > >>> > > > >>> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >>> > > > migrate-incoming and such way has described problems. >>> > > >>> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >>> > > designed to use add-fd. >>> > >>> > Hmm, that's true - although: >>> > a) It's very non-obvious >>> > b) Unfortunate, since it would go well with -incoming defer >>> >>> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >>> >>> We should have mandated use of 'add-fd' when using 'defer', since FD >>> inheritance-over-execve() should only be used for command line args, >>> not monitor commands. >>> >>> Not sure how to best fix this is QEMU though without breaking back >>> compat for apps using 'defer' already. >> >> We could add mon-fd: transports that has the same behaviour as now for >> outgoing, and for incoming uses the add-fd stash. > > May be it's better to use monitor_fd_param for both incoming/outgoing? > So, "migrate" will know fd:<int> semantics and "migrate-incoming" will > know fd:<fd_name> semantics. And also modify monitor_get_fd to > remove fd from list before return. > This is a backwards compatible change. > I mean something like this: diff --git a/migration/fd.c b/migration/fd.c index a7c13df4ad..81804455bb 100644 --- a/migration/fd.c +++ b/migration/fd.c @@ -26,7 +26,7 @@ void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) { QIOChannel *ioc; - int fd = monitor_get_fd(cur_mon, fdname, errp); + fd = monitor_fd_param(cur_mon, fdname, errp); if (fd == -1) { return; } @@ -57,7 +57,10 @@ void fd_start_incoming_migration(const char *infd, Error **errp) QIOChannel *ioc; int fd; - fd = strtol(infd, NULL, 0); + fd = monitor_fd_param(cur_mon, infd, errp); + if (fd == -1) { + return; + } trace_migration_fd_incoming(fd); ioc = qio_channel_new_fd(fd, errp); Regards, Yury
15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> > > > >> >> Hi, >> > > > >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >> > > > >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> > > > >> >> close the fd themselves. But existing clients will have a leak. >> > > > >> > >> > > > >> > We can't break existing clients in this way as they are correctly >> > > > >> > using the monitor with its current semantics. >> > > > >> > >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> > > > >> >> a very good idea. >> > > > >> > >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >> > > > >> > to the montor. >> > > > >> > >> > > > >> >> I don't see any other solution, but I might miss something. >> > > > >> >> What do you think? >> > > > >> > >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >> > > > >> > >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> > > > >> converts input string to int and use it as fd. >> > > > > >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >> > > > > monitor at all AFAIR. >> > > > > >> > > > >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >> > > > migrate-incoming and such way has described problems. >> > > >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >> > > designed to use add-fd. >> > >> > Hmm, that's true - although: >> > a) It's very non-obvious >> > b) Unfortunate, since it would go well with -incoming defer >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >> inheritance-over-execve() should only be used for command line args, >> not monitor commands. >> >> Not sure how to best fix this is QEMU though without breaking back >> compat for apps using 'defer' already. > > We could add mon-fd: transports that has the same behaviour as now for > outgoing, and for incoming uses the add-fd stash. > Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use case, should we disallow this? I may add a check to fd_start_incoming_migration if fd is in mon fds list. But I'm afraid there are users like me who are already using this wrong use case. Because currently nothing in QEMU's docs disallow this. So which solution is better in your opinion? 1) Disallow fd's from mon fds list in fd_start_incoming_migration 2) Allow these fds, but dup them or close them correctly And how to migrate-incoming defer through fd correctly? 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands as suggested by Dave 2) My suggestion about monitor_fd_param and make "fd:" for migrate/migrate-incoming consistent. So user will be able to use getfd + migrate-incoming 3) Both of them or something else Regards, Yury Kotov
* Yury Kotov (yury-kotov@yandex-team.ru) wrote: > 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: > >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > >> > > > >> >> Hi, > >> > > > >> >> > >> > > > >> >> Just to clarify. I see two possible solutions: > >> > > > >> >> > >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both > >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > >> > > > >> >> close the fd themselves. But existing clients will have a leak. > >> > > > >> > > >> > > > >> > We can't break existing clients in this way as they are correctly > >> > > > >> > using the monitor with its current semantics. > >> > > > >> > > >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > >> > > > >> >> a very good idea. > >> > > > >> > > >> > > > >> > qemu_close is not appropriate place to deal with something speciifc > >> > > > >> > to the montor. > >> > > > >> > > >> > > > >> >> I don't see any other solution, but I might miss something. > >> > > > >> >> What do you think? > >> > > > >> > > >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns > >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > >> > > > >> > > >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > >> > > > >> converts input string to int and use it as fd. > >> > > > > > >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > >> > > > > monitor at all AFAIR. > >> > > > > > >> > > > > >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > >> > > > migrate-incoming and such way has described problems. > >> > > > >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > >> > > designed to use add-fd. > >> > > >> > Hmm, that's true - although: > >> > a) It's very non-obvious > >> > b) Unfortunate, since it would go well with -incoming defer > >> > >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. > >> > >> We should have mandated use of 'add-fd' when using 'defer', since FD > >> inheritance-over-execve() should only be used for command line args, > >> not monitor commands. > >> > >> Not sure how to best fix this is QEMU though without breaking back > >> compat for apps using 'defer' already. > > > > We could add mon-fd: transports that has the same behaviour as now for > > outgoing, and for incoming uses the add-fd stash. > > > > Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't > relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use > case, should we disallow this? > I may add a check to fd_start_incoming_migration if fd is in mon fds list. > But I'm afraid there are users like me who are already using this wrong use case. > Because currently nothing in QEMU's docs disallow this. > > So which solution is better in your opinion? > 1) Disallow fd's from mon fds list in fd_start_incoming_migration I'm surprised anything could be doing that - how would a user know what the correct fd index was? > 2) Allow these fds, but dup them or close them correctly I think I'd leave the current (confusing) fd: as it is, maybe put a note in the manual. > And how to migrate-incoming defer through fd correctly? > 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands > as suggested by Dave That's my preference; it's explicitly named and consistent, and it doesn't touch the existing fd code. Dave > 2) My suggestion about monitor_fd_param and make "fd:" for > migrate/migrate-incoming consistent. So user will be able to use > getfd + migrate-incoming > 3) Both of them or something else > > Regards, > Yury Kotov -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> >> > > > >> >> Hi, >> >> > > > >> >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >> >> > > > >> >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. >> >> > > > >> > >> >> > > > >> > We can't break existing clients in this way as they are correctly >> >> > > > >> > using the monitor with its current semantics. >> >> > > > >> > >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> >> > > > >> >> a very good idea. >> >> > > > >> > >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >> >> > > > >> > to the montor. >> >> > > > >> > >> >> > > > >> >> I don't see any other solution, but I might miss something. >> >> > > > >> >> What do you think? >> >> > > > >> > >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >> >> > > > >> > >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> >> > > > >> converts input string to int and use it as fd. >> >> > > > > >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >> >> > > > > monitor at all AFAIR. >> >> > > > > >> >> > > > >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >> >> > > > migrate-incoming and such way has described problems. >> >> > > >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >> >> > > designed to use add-fd. >> >> > >> >> > Hmm, that's true - although: >> >> > a) It's very non-obvious >> >> > b) Unfortunate, since it would go well with -incoming defer >> >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >> >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >> >> inheritance-over-execve() should only be used for command line args, >> >> not monitor commands. >> >> >> >> Not sure how to best fix this is QEMU though without breaking back >> >> compat for apps using 'defer' already. >> > >> > We could add mon-fd: transports that has the same behaviour as now for >> > outgoing, and for incoming uses the add-fd stash. >> > >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use >> case, should we disallow this? >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. >> But I'm afraid there are users like me who are already using this wrong use case. >> Because currently nothing in QEMU's docs disallow this. >> >> So which solution is better in your opinion? >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration > > I'm surprised anything could be doing that - how would a user know what > the correct fd index was? > Hmm, add-fd returns correct fd value. Maybe I din't catch you question... >> 2) Allow these fds, but dup them or close them correctly > > I think I'd leave the current (confusing) fd: as it is, maybe put a note > in the manual. > So, using fd from fdset will be an undefined behavior, right? >> And how to migrate-incoming defer through fd correctly? >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands >> as suggested by Dave > > That's my preference; it's explicitly named and consistent, and it > doesn't touch the existing fd code. > Ok, but please tell me what you think of my suggestion (2) about using fd added by the "getfd" command for incoming migration. It doesn't requires introducing new protocol and will be consistent with outgoing migration through fd. > >> 2) My suggestion about monitor_fd_param and make "fd:" for >> migrate/migrate-incoming consistent. So user will be able to use >> getfd + migrate-incoming >> 3) Both of them or something else >> Regards, Yury
* Yury Kotov (yury-kotov@yandex-team.ru) wrote: > 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: > >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: > >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > >> >> > > > >> >> Hi, > >> >> > > > >> >> > >> >> > > > >> >> Just to clarify. I see two possible solutions: > >> >> > > > >> >> > >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both > >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. > >> >> > > > >> > > >> >> > > > >> > We can't break existing clients in this way as they are correctly > >> >> > > > >> > using the monitor with its current semantics. > >> >> > > > >> > > >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > >> >> > > > >> >> a very good idea. > >> >> > > > >> > > >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc > >> >> > > > >> > to the montor. > >> >> > > > >> > > >> >> > > > >> >> I don't see any other solution, but I might miss something. > >> >> > > > >> >> What do you think? > >> >> > > > >> > > >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns > >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > >> >> > > > >> > > >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > >> >> > > > >> converts input string to int and use it as fd. > >> >> > > > > > >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > >> >> > > > > monitor at all AFAIR. > >> >> > > > > > >> >> > > > > >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > >> >> > > > migrate-incoming and such way has described problems. > >> >> > > > >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > >> >> > > designed to use add-fd. > >> >> > > >> >> > Hmm, that's true - although: > >> >> > a) It's very non-obvious > >> >> > b) Unfortunate, since it would go well with -incoming defer > >> >> > >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. > >> >> > >> >> We should have mandated use of 'add-fd' when using 'defer', since FD > >> >> inheritance-over-execve() should only be used for command line args, > >> >> not monitor commands. > >> >> > >> >> Not sure how to best fix this is QEMU though without breaking back > >> >> compat for apps using 'defer' already. > >> > > >> > We could add mon-fd: transports that has the same behaviour as now for > >> > outgoing, and for incoming uses the add-fd stash. > >> > > >> > >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't > >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use > >> case, should we disallow this? > >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. > >> But I'm afraid there are users like me who are already using this wrong use case. > >> Because currently nothing in QEMU's docs disallow this. > >> > >> So which solution is better in your opinion? > >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration > > > > I'm surprised anything could be doing that - how would a user know what > > the correct fd index was? > > > > Hmm, add-fd returns correct fd value. Maybe I din't catch you question... I don't understand, where does it return it? > >> 2) Allow these fds, but dup them or close them correctly > > > > I think I'd leave the current (confusing) fd: as it is, maybe put a note > > in the manual. > > > > So, using fd from fdset will be an undefined behavior, right? For incoming, yes. > >> And how to migrate-incoming defer through fd correctly? > >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands > >> as suggested by Dave > > > > That's my preference; it's explicitly named and consistent, and it > > doesn't touch the existing fd code. > > > > Ok, but please tell me what you think of my suggestion (2) about using fd added > by the "getfd" command for incoming migration. It doesn't requires introducing > new protocol and will be consistent with outgoing migration through fd. I worry how qemu knows whether the command means it comes from the getfd command or is actually a normal fd like now? Can you give an example. Dave > > > >> 2) My suggestion about monitor_fd_param and make "fd:" for > >> migrate/migrate-incoming consistent. So user will be able to use > >> getfd + migrate-incoming > >> 3) Both of them or something else > >> > > Regards, > Yury > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
18.04.2019, 19:03, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >> >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> >> >> > > > >> >> Hi, >> >> >> > > > >> >> >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >> >> >> > > > >> >> >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >> >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. >> >> >> > > > >> > >> >> >> > > > >> > We can't break existing clients in this way as they are correctly >> >> >> > > > >> > using the monitor with its current semantics. >> >> >> > > > >> > >> >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >> >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> >> >> > > > >> >> a very good idea. >> >> >> > > > >> > >> >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >> >> >> > > > >> > to the montor. >> >> >> > > > >> > >> >> >> > > > >> >> I don't see any other solution, but I might miss something. >> >> >> > > > >> >> What do you think? >> >> >> > > > >> > >> >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >> >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >> >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >> >> >> > > > >> > >> >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> >> >> > > > >> converts input string to int and use it as fd. >> >> >> > > > > >> >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >> >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >> >> >> > > > > monitor at all AFAIR. >> >> >> > > > > >> >> >> > > > >> >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >> >> >> > > > migrate-incoming and such way has described problems. >> >> >> > > >> >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >> >> >> > > designed to use add-fd. >> >> >> > >> >> >> > Hmm, that's true - although: >> >> >> > a) It's very non-obvious >> >> >> > b) Unfortunate, since it would go well with -incoming defer >> >> >> >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >> >> >> >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >> >> >> inheritance-over-execve() should only be used for command line args, >> >> >> not monitor commands. >> >> >> >> >> >> Not sure how to best fix this is QEMU though without breaking back >> >> >> compat for apps using 'defer' already. >> >> > >> >> > We could add mon-fd: transports that has the same behaviour as now for >> >> > outgoing, and for incoming uses the add-fd stash. >> >> > >> >> >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't >> >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use >> >> case, should we disallow this? >> >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. >> >> But I'm afraid there are users like me who are already using this wrong use case. >> >> Because currently nothing in QEMU's docs disallow this. >> >> >> >> So which solution is better in your opinion? >> >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration >> > >> > I'm surprised anything could be doing that - how would a user know what >> > the correct fd index was? >> > >> >> Hmm, add-fd returns correct fd value. Maybe I din't catch you question... > > I don't understand, where does it return it? > From misc.json: # Example: # # -> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } # <- { "return": { "fdset-id": 1, "fd": 3 } } # "fd": 3 is a valid fd for migrate-incoming(uri = "fd:3") >> >> 2) Allow these fds, but dup them or close them correctly >> > >> > I think I'd leave the current (confusing) fd: as it is, maybe put a note >> > in the manual. >> > >> >> So, using fd from fdset will be an undefined behavior, right? > > For incoming, yes. > >> >> And how to migrate-incoming defer through fd correctly? >> >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands >> >> as suggested by Dave >> > >> > That's my preference; it's explicitly named and consistent, and it >> > doesn't touch the existing fd code. >> > >> >> Ok, but please tell me what you think of my suggestion (2) about using fd added >> by the "getfd" command for incoming migration. It doesn't requires introducing >> new protocol and will be consistent with outgoing migration through fd. > > I worry how qemu knows whether the command means it comes from the getfd > command or is actually a normal fd like now? > Can you give an example. > getfd manages naming fds list. # -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } So, for migrate (not incoming) is now valid migrate(uri="fd:fd1") I want the same for migrate-incoming. If fdname is parseable int, then it is an old format. Otherwise - it is a name of fd added by addfd. There is a function "monitor_fd_param" which do exactly what I mean: int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) { ... local vars ... if (!qemu_isdigit(fdname[0]) && mon) { fd = monitor_get_fd(mon, fdname, &local_err); } else { fd = qemu_parse_fd(fdname); } ... report err to errp ... } >> > >> >> 2) My suggestion about monitor_fd_param and make "fd:" for >> >> migrate/migrate-incoming consistent. So user will be able to use >> >> getfd + migrate-incoming >> >> 3) Both of them or something else >> >> >> Regards, Yury
* Yury Kotov (yury-kotov@yandex-team.ru) wrote: > 18.04.2019, 19:03, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: > >> 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: > >> >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: > >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: > >> >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: > >> >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: > >> >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: > >> >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: > >> >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: > >> >> >> > > > >> >> Hi, > >> >> >> > > > >> >> > >> >> >> > > > >> >> Just to clarify. I see two possible solutions: > >> >> >> > > > >> >> > >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for > >> >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both > >> >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must > >> >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. > >> >> >> > > > >> > > >> >> >> > > > >> > We can't break existing clients in this way as they are correctly > >> >> >> > > > >> > using the monitor with its current semantics. > >> >> >> > > > >> > > >> >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from > >> >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find > >> >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with > >> >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be > >> >> >> > > > >> >> a very good idea. > >> >> >> > > > >> > > >> >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc > >> >> >> > > > >> > to the montor. > >> >> >> > > > >> > > >> >> >> > > > >> >> I don't see any other solution, but I might miss something. > >> >> >> > > > >> >> What do you think? > >> >> >> > > > >> > > >> >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. > >> >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns > >> >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. > >> >> >> > > > >> > > >> >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. > >> >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just > >> >> >> > > > >> converts input string to int and use it as fd. > >> >> >> > > > > > >> >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt > >> >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the > >> >> >> > > > > monitor at all AFAIR. > >> >> >> > > > > > >> >> >> > > > > >> >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for > >> >> >> > > > migrate-incoming and such way has described problems. > >> >> >> > > > >> >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not > >> >> >> > > designed to use add-fd. > >> >> >> > > >> >> >> > Hmm, that's true - although: > >> >> >> > a) It's very non-obvious > >> >> >> > b) Unfortunate, since it would go well with -incoming defer > >> >> >> > >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. > >> >> >> > >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD > >> >> >> inheritance-over-execve() should only be used for command line args, > >> >> >> not monitor commands. > >> >> >> > >> >> >> Not sure how to best fix this is QEMU though without breaking back > >> >> >> compat for apps using 'defer' already. > >> >> > > >> >> > We could add mon-fd: transports that has the same behaviour as now for > >> >> > outgoing, and for incoming uses the add-fd stash. > >> >> > > >> >> > >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't > >> >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use > >> >> case, should we disallow this? > >> >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. > >> >> But I'm afraid there are users like me who are already using this wrong use case. > >> >> Because currently nothing in QEMU's docs disallow this. > >> >> > >> >> So which solution is better in your opinion? > >> >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration > >> > > >> > I'm surprised anything could be doing that - how would a user know what > >> > the correct fd index was? > >> > > >> > >> Hmm, add-fd returns correct fd value. Maybe I din't catch you question... > > > > I don't understand, where does it return it? > > > > From misc.json: > # Example: > # > # -> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } > # <- { "return": { "fdset-id": 1, "fd": 3 } } > # > > "fd": 3 is a valid fd for migrate-incoming(uri = "fd:3") Ah OK. > >> >> 2) Allow these fds, but dup them or close them correctly > >> > > >> > I think I'd leave the current (confusing) fd: as it is, maybe put a note > >> > in the manual. > >> > > >> > >> So, using fd from fdset will be an undefined behavior, right? > > > > For incoming, yes. > > > >> >> And how to migrate-incoming defer through fd correctly? > >> >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands > >> >> as suggested by Dave > >> > > >> > That's my preference; it's explicitly named and consistent, and it > >> > doesn't touch the existing fd code. > >> > > >> > >> Ok, but please tell me what you think of my suggestion (2) about using fd added > >> by the "getfd" command for incoming migration. It doesn't requires introducing > >> new protocol and will be consistent with outgoing migration through fd. > > > > I worry how qemu knows whether the command means it comes from the getfd > > command or is actually a normal fd like now? > > Can you give an example. > > > > getfd manages naming fds list. > # -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } > So, for migrate (not incoming) is now valid migrate(uri="fd:fd1") > > I want the same for migrate-incoming. If fdname is parseable int, then it is > an old format. Otherwise - it is a name of fd added by addfd. > > There is a function "monitor_fd_param" which do exactly what I mean: > int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) { > ... local vars ... > if (!qemu_isdigit(fdname[0]) && mon) { > fd = monitor_get_fd(mon, fdname, &local_err); > } else { > fd = qemu_parse_fd(fdname); > } > ... report err to errp ... > } OK, if we're already using monitor_fd_param everywhere then I think we're already down the rat-hole of guessing whether we're an add-fd or fd by whether it's an integer, and I agree with you that we should just fix incoming to use that. Now, that means I guess we need to modify monitor_fd_param to tell us which type of fd it got, so we know whether to close it later? Dave P.S. I'm out from tomorrow for a weekish. > >> > > >> >> 2) My suggestion about monitor_fd_param and make "fd:" for > >> >> migrate/migrate-incoming consistent. So user will be able to use > >> >> getfd + migrate-incoming > >> >> 3) Both of them or something else > >> >> > >> > > Regards, > Yury -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
18.04.2019, 20:01, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> 18.04.2019, 19:03, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> >> 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >> >> >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >> >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >> >> >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >> >> >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >> >> >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >> >> >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >> >> >> >> > > > >> >> Hi, >> >> >> >> > > > >> >> >> >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >> >> >> >> > > > >> >> >> >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >> >> >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >> >> >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >> >> >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. >> >> >> >> > > > >> > >> >> >> >> > > > >> > We can't break existing clients in this way as they are correctly >> >> >> >> > > > >> > using the monitor with its current semantics. >> >> >> >> > > > >> > >> >> >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >> >> >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >> >> >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >> >> >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >> >> >> >> > > > >> >> a very good idea. >> >> >> >> > > > >> > >> >> >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >> >> >> >> > > > >> > to the montor. >> >> >> >> > > > >> > >> >> >> >> > > > >> >> I don't see any other solution, but I might miss something. >> >> >> >> > > > >> >> What do you think? >> >> >> >> > > > >> > >> >> >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >> >> >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >> >> >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >> >> >> >> > > > >> > >> >> >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >> >> >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >> >> >> >> > > > >> converts input string to int and use it as fd. >> >> >> >> > > > > >> >> >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >> >> >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >> >> >> >> > > > > monitor at all AFAIR. >> >> >> >> > > > > >> >> >> >> > > > >> >> >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >> >> >> >> > > > migrate-incoming and such way has described problems. >> >> >> >> > > >> >> >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >> >> >> >> > > designed to use add-fd. >> >> >> >> > >> >> >> >> > Hmm, that's true - although: >> >> >> >> > a) It's very non-obvious >> >> >> >> > b) Unfortunate, since it would go well with -incoming defer >> >> >> >> >> >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >> >> >> >> >> >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >> >> >> >> inheritance-over-execve() should only be used for command line args, >> >> >> >> not monitor commands. >> >> >> >> >> >> >> >> Not sure how to best fix this is QEMU though without breaking back >> >> >> >> compat for apps using 'defer' already. >> >> >> > >> >> >> > We could add mon-fd: transports that has the same behaviour as now for >> >> >> > outgoing, and for incoming uses the add-fd stash. >> >> >> > >> >> >> >> >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't >> >> >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use >> >> >> case, should we disallow this? >> >> >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. >> >> >> But I'm afraid there are users like me who are already using this wrong use case. >> >> >> Because currently nothing in QEMU's docs disallow this. >> >> >> >> >> >> So which solution is better in your opinion? >> >> >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration >> >> > >> >> > I'm surprised anything could be doing that - how would a user know what >> >> > the correct fd index was? >> >> > >> >> >> >> Hmm, add-fd returns correct fd value. Maybe I din't catch you question... >> > >> > I don't understand, where does it return it? >> > >> >> From misc.json: >> # Example: >> # >> # -> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } >> # <- { "return": { "fdset-id": 1, "fd": 3 } } >> # >> >> "fd": 3 is a valid fd for migrate-incoming(uri = "fd:3") > > Ah OK. > >> >> >> 2) Allow these fds, but dup them or close them correctly >> >> > >> >> > I think I'd leave the current (confusing) fd: as it is, maybe put a note >> >> > in the manual. >> >> > >> >> >> >> So, using fd from fdset will be an undefined behavior, right? >> > >> > For incoming, yes. >> > >> >> >> And how to migrate-incoming defer through fd correctly? >> >> >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands >> >> >> as suggested by Dave >> >> > >> >> > That's my preference; it's explicitly named and consistent, and it >> >> > doesn't touch the existing fd code. >> >> > >> >> >> >> Ok, but please tell me what you think of my suggestion (2) about using fd added >> >> by the "getfd" command for incoming migration. It doesn't requires introducing >> >> new protocol and will be consistent with outgoing migration through fd. >> > >> > I worry how qemu knows whether the command means it comes from the getfd >> > command or is actually a normal fd like now? >> > Can you give an example. >> > >> >> getfd manages naming fds list. >> # -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } >> So, for migrate (not incoming) is now valid migrate(uri="fd:fd1") >> >> I want the same for migrate-incoming. If fdname is parseable int, then it is >> an old format. Otherwise - it is a name of fd added by addfd. >> >> There is a function "monitor_fd_param" which do exactly what I mean: >> int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) { >> ... local vars ... >> if (!qemu_isdigit(fdname[0]) && mon) { >> fd = monitor_get_fd(mon, fdname, &local_err); >> } else { >> fd = qemu_parse_fd(fdname); >> } >> ... report err to errp ... >> } > > OK, if we're already using monitor_fd_param everywhere then I think > we're already down the rat-hole of guessing whether we're an add-fd or > fd by whether it's an integer, and I agree with you that we should > just fix incoming to use that. > > Now, that means I guess we need to modify monitor_fd_param to tell us > which type of fd it got, so we know whether to close it later? > > Dave > P.S. I'm out from tomorrow for a weekish. > I think the right way is to check whether fd is added by add-fd and if so then return error. Because IIUC the only valid usage of add-fd is to use qemu_open("/dev/fdset/<fdset_id>") which finds suitable fd from fdset. Such behavior is incompatible with fd:<fd_num> at all, as such syntax doesn't imply the using of particular fd. But if so, why add-fd returns the value of added fd?.. But if I'm right it's enought to: 1) Modify monitor_fd_param to check where fd comes from and disallow using fd of "add-fd", 2) As we discussed earlier, modify monitor_get_fd to remove named fd from its list before return, 3) Use monitor_fd_param in migrate_incoming for "fd:" proto. I'm not insist. May be it's ok to use fd from fdset directly and so qemu_close should be modifyed. Just to clarify what I mean: fdset is a struct: struct MonFdset { int64_t id; QLIST_HEAD(, MonFdsetFd) fds; QLIST_HEAD(, MonFdsetFd) dup_fds; QLIST_ENTRY(MonFdset) next; }; * add-fd appends new fd to "->fds" list. * qemu_open("/dev/fdset/X", int perms) is looking for suitable (by perms) fd from fdset with id X, dup it and append "->dup_fds" list. * qemu_close(int fd) tryes to find the fd in all "->dup_fds" lists of all fdsets and remove it. And closes fd anyway. If not to disallow using fds added by add-fd then there are three possible solutions: 1) dup fd in monitor_fd_param it the fd is from some fdset, 2) remove the fd from "->fds" list in qemu_close 3) don't close it in qemu_close, so client is responsible to close it by remove-fd. Regards, Yury >> >> > >> >> >> 2) My suggestion about monitor_fd_param and make "fd:" for >> >> >> migrate/migrate-incoming consistent. So user will be able to use >> >> >> getfd + migrate-incoming >> >> >> 3) Both of them or something else >> >> >> >> >> >>
Ping 18.04.2019, 20:46, "Yury Kotov" <yury-kotov@yandex-team.ru>: > 18.04.2019, 20:01, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >> * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>> 18.04.2019, 19:03, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>> >> 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>> >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>> >> >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >>> >> >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >>> >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >>> >> >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >>> >> >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >>> >> >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >>> >> >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >>> >> >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >>> >> >> >> > > > >> >> Hi, >>> >> >> >> > > > >> >> >>> >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >>> >> >> >> > > > >> >> >>> >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >>> >> >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >>> >> >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >>> >> >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. >>> >> >> >> > > > >> > >>> >> >> >> > > > >> > We can't break existing clients in this way as they are correctly >>> >> >> >> > > > >> > using the monitor with its current semantics. >>> >> >> >> > > > >> > >>> >> >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >>> >> >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >>> >> >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >>> >> >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >>> >> >> >> > > > >> >> a very good idea. >>> >> >> >> > > > >> > >>> >> >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >>> >> >> >> > > > >> > to the montor. >>> >> >> >> > > > >> > >>> >> >> >> > > > >> >> I don't see any other solution, but I might miss something. >>> >> >> >> > > > >> >> What do you think? >>> >> >> >> > > > >> > >>> >> >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >>> >> >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >>> >> >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >>> >> >> >> > > > >> > >>> >> >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >>> >> >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >>> >> >> >> > > > >> converts input string to int and use it as fd. >>> >> >> >> > > > > >>> >> >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >>> >> >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >>> >> >> >> > > > > monitor at all AFAIR. >>> >> >> >> > > > > >>> >> >> >> > > > >>> >> >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >>> >> >> >> > > > migrate-incoming and such way has described problems. >>> >> >> >> > > >>> >> >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >>> >> >> >> > > designed to use add-fd. >>> >> >> >> > >>> >> >> >> > Hmm, that's true - although: >>> >> >> >> > a) It's very non-obvious >>> >> >> >> > b) Unfortunate, since it would go well with -incoming defer >>> >> >> >> >>> >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >>> >> >> >> >>> >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >>> >> >> >> inheritance-over-execve() should only be used for command line args, >>> >> >> >> not monitor commands. >>> >> >> >> >>> >> >> >> Not sure how to best fix this is QEMU though without breaking back >>> >> >> >> compat for apps using 'defer' already. >>> >> >> > >>> >> >> > We could add mon-fd: transports that has the same behaviour as now for >>> >> >> > outgoing, and for incoming uses the add-fd stash. >>> >> >> > >>> >> >> >>> >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't >>> >> >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use >>> >> >> case, should we disallow this? >>> >> >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. >>> >> >> But I'm afraid there are users like me who are already using this wrong use case. >>> >> >> Because currently nothing in QEMU's docs disallow this. >>> >> >> >>> >> >> So which solution is better in your opinion? >>> >> >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration >>> >> > >>> >> > I'm surprised anything could be doing that - how would a user know what >>> >> > the correct fd index was? >>> >> > >>> >> >>> >> Hmm, add-fd returns correct fd value. Maybe I din't catch you question... >>> > >>> > I don't understand, where does it return it? >>> > >>> >>> From misc.json: >>> # Example: >>> # >>> # -> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } >>> # <- { "return": { "fdset-id": 1, "fd": 3 } } >>> # >>> >>> "fd": 3 is a valid fd for migrate-incoming(uri = "fd:3") >> >> Ah OK. >> >>> >> >> 2) Allow these fds, but dup them or close them correctly >>> >> > >>> >> > I think I'd leave the current (confusing) fd: as it is, maybe put a note >>> >> > in the manual. >>> >> > >>> >> >>> >> So, using fd from fdset will be an undefined behavior, right? >>> > >>> > For incoming, yes. >>> > >>> >> >> And how to migrate-incoming defer through fd correctly? >>> >> >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands >>> >> >> as suggested by Dave >>> >> > >>> >> > That's my preference; it's explicitly named and consistent, and it >>> >> > doesn't touch the existing fd code. >>> >> > >>> >> >>> >> Ok, but please tell me what you think of my suggestion (2) about using fd added >>> >> by the "getfd" command for incoming migration. It doesn't requires introducing >>> >> new protocol and will be consistent with outgoing migration through fd. >>> > >>> > I worry how qemu knows whether the command means it comes from the getfd >>> > command or is actually a normal fd like now? >>> > Can you give an example. >>> > >>> >>> getfd manages naming fds list. >>> # -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } >>> So, for migrate (not incoming) is now valid migrate(uri="fd:fd1") >>> >>> I want the same for migrate-incoming. If fdname is parseable int, then it is >>> an old format. Otherwise - it is a name of fd added by addfd. >>> >>> There is a function "monitor_fd_param" which do exactly what I mean: >>> int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) { >>> ... local vars ... >>> if (!qemu_isdigit(fdname[0]) && mon) { >>> fd = monitor_get_fd(mon, fdname, &local_err); >>> } else { >>> fd = qemu_parse_fd(fdname); >>> } >>> ... report err to errp ... >>> } >> >> OK, if we're already using monitor_fd_param everywhere then I think >> we're already down the rat-hole of guessing whether we're an add-fd or >> fd by whether it's an integer, and I agree with you that we should >> just fix incoming to use that. >> >> Now, that means I guess we need to modify monitor_fd_param to tell us >> which type of fd it got, so we know whether to close it later? >> >> Dave >> P.S. I'm out from tomorrow for a weekish. > > I think the right way is to check whether fd is added by add-fd and if so then > return error. Because IIUC the only valid usage of add-fd is to use > qemu_open("/dev/fdset/<fdset_id>") which finds suitable fd from fdset. > Such behavior is incompatible with fd:<fd_num> at all, as such syntax > doesn't imply the using of particular fd. But if so, why add-fd returns > the value of added fd?.. > > But if I'm right it's enough to: > 1) Modify monitor_fd_param to check where fd comes from and disallow using > fd of "add-fd", > 2) As we discussed earlier, modify monitor_get_fd to remove named fd from its > list before return, Omg, monitor_fd_param is already do so... Sorry, so the problem only in incoming case. > 3) Use monitor_fd_param in migrate_incoming for "fd:" proto. > > I'm not insist. May be it's ok to use fd from fdset directly and so qemu_close > should be modifyed. > > Just to clarify what I mean: > fdset is a struct: > struct MonFdset { > int64_t id; > QLIST_HEAD(, MonFdsetFd) fds; > QLIST_HEAD(, MonFdsetFd) dup_fds; > QLIST_ENTRY(MonFdset) next; > }; > > * add-fd appends new fd to "->fds" list. > * qemu_open("/dev/fdset/X", int perms) is looking for suitable (by perms) fd > from fdset with id X, dup it and append "->dup_fds" list. > * qemu_close(int fd) tryes to find the fd in all "->dup_fds" lists > of all fdsets and remove it. And closes fd anyway. > > If not to disallow using fds added by add-fd then there are three > possible solutions: > 1) dup fd in monitor_fd_param it the fd is from some fdset, > 2) remove the fd from "->fds" list in qemu_close > 3) don't close it in qemu_close, so client is responsible to close it by > remove-fd. > >>> >> > >>> >> >> 2) My suggestion about monitor_fd_param and make "fd:" for >>> >> >> migrate/migrate-incoming consistent. So user will be able to use >>> >> >> getfd + migrate-incoming >>> >> >> 3) Both of them or something else >>> >> >> >>> >> Regards, Yury
Ping 14.05.2019, 12:36, "Yury Kotov" <yury-kotov@yandex-team.ru>: > Ping > > 18.04.2019, 20:46, "Yury Kotov" <yury-kotov@yandex-team.ru>: >> 18.04.2019, 20:01, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>> * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>>> 18.04.2019, 19:03, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>>> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>>> >> 18.04.2019, 17:20, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>>> >> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote: >>>> >> >> 15.04.2019, 14:30, "Dr. David Alan Gilbert" <dgilbert@redhat.com>: >>>> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>> >> >> >> On Mon, Apr 15, 2019 at 12:15:12PM +0100, Dr. David Alan Gilbert wrote: >>>> >> >> >> > * Daniel P. Berrangé (berrange@redhat.com) wrote: >>>> >> >> >> > > On Mon, Apr 15, 2019 at 01:33:21PM +0300, Yury Kotov wrote: >>>> >> >> >> > > > 15.04.2019, 13:25, "Daniel P. Berrangé" <berrange@redhat.com>: >>>> >> >> >> > > > > On Mon, Apr 15, 2019 at 01:17:06PM +0300, Yury Kotov wrote: >>>> >> >> >> > > > >> 15.04.2019, 13:11, "Daniel P. Berrangé" <berrange@redhat.com>: >>>> >> >> >> > > > >> > On Mon, Apr 15, 2019 at 12:50:08PM +0300, Yury Kotov wrote: >>>> >> >> >> > > > >> >> Hi, >>>> >> >> >> > > > >> >> >>>> >> >> >> > > > >> >> Just to clarify. I see two possible solutions: >>>> >> >> >> > > > >> >> >>>> >> >> >> > > > >> >> 1) Since the migration code doesn't receive fd, it isn't responsible for >>>> >> >> >> > > > >> >> closing it. So, it may be better to use migrate_fd_param for both >>>> >> >> >> > > > >> >> incoming/outgoing and add dupping for migrate_fd_param. Thus, clients must >>>> >> >> >> > > > >> >> close the fd themselves. But existing clients will have a leak. >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> > We can't break existing clients in this way as they are correctly >>>> >> >> >> > > > >> > using the monitor with its current semantics. >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> >> 2) If we don't duplicate fd, then at least we should remove fd from >>>> >> >> >> > > > >> >> the corresponding list. Therefore, the solution is to fix qemu_close to find >>>> >> >> >> > > > >> >> the list and remove fd from it. But qemu_close is currently consistent with >>>> >> >> >> > > > >> >> qemu_open (which opens/dups fd), so adding additional logic might not be >>>> >> >> >> > > > >> >> a very good idea. >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> > qemu_close is not appropriate place to deal with something speciifc >>>> >> >> >> > > > >> > to the montor. >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> >> I don't see any other solution, but I might miss something. >>>> >> >> >> > > > >> >> What do you think? >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> > All callers of monitor_get_fd() will close() the FD they get back. >>>> >> >> >> > > > >> > Thus monitor_get_fd() should remove it from the list when it returns >>>> >> >> >> > > > >> > it, and we should add API docs to monitor_get_fd() to explain this. >>>> >> >> >> > > > >> > >>>> >> >> >> > > > >> Ok, it sounds reasonable. But monitor_get_fd is only about outgoing migration. >>>> >> >> >> > > > >> But what about the incoming migration? It doesn't use monitor_get_fd but just >>>> >> >> >> > > > >> converts input string to int and use it as fd. >>>> >> >> >> > > > > >>>> >> >> >> > > > > The incoming migration expects the FD to be passed into QEMU by the mgmt >>>> >> >> >> > > > > app when it is exec'ing the QEMU binary. It doesn't interact with the >>>> >> >> >> > > > > monitor at all AFAIR. >>>> >> >> >> > > > > >>>> >> >> >> > > > >>>> >> >> >> > > > Oh, sorry. This use case is not obvious. We used add-fd to pass fd for >>>> >> >> >> > > > migrate-incoming and such way has described problems. >>>> >> >> >> > > >>>> >> >> >> > > That's a bug in your usage of QEMU IMHO, as the incoming code is not >>>> >> >> >> > > designed to use add-fd. >>>> >> >> >> > >>>> >> >> >> > Hmm, that's true - although: >>>> >> >> >> > a) It's very non-obvious >>>> >> >> >> > b) Unfortunate, since it would go well with -incoming defer >>>> >> >> >> >>>> >> >> >> Yeah I think this is a screw up on QMEU's part when introducing 'defer'. >>>> >> >> >> >>>> >> >> >> We should have mandated use of 'add-fd' when using 'defer', since FD >>>> >> >> >> inheritance-over-execve() should only be used for command line args, >>>> >> >> >> not monitor commands. >>>> >> >> >> >>>> >> >> >> Not sure how to best fix this is QEMU though without breaking back >>>> >> >> >> compat for apps using 'defer' already. >>>> >> >> > >>>> >> >> > We could add mon-fd: transports that has the same behaviour as now for >>>> >> >> > outgoing, and for incoming uses the add-fd stash. >>>> >> >> > >>>> >> >> >>>> >> >> Oh, I'm sorry again. I think my suggestion about monitor_fd_param wasn't >>>> >> >> relevant to this issue. If migrate-incoming + "fd:" + add-fd is an invalid use >>>> >> >> case, should we disallow this? >>>> >> >> I may add a check to fd_start_incoming_migration if fd is in mon fds list. >>>> >> >> But I'm afraid there are users like me who are already using this wrong use case. >>>> >> >> Because currently nothing in QEMU's docs disallow this. >>>> >> >> >>>> >> >> So which solution is better in your opinion? >>>> >> >> 1) Disallow fd's from mon fds list in fd_start_incoming_migration >>>> >> > >>>> >> > I'm surprised anything could be doing that - how would a user know what >>>> >> > the correct fd index was? >>>> >> > >>>> >> >>>> >> Hmm, add-fd returns correct fd value. Maybe I din't catch you question... >>>> > >>>> > I don't understand, where does it return it? >>>> > >>>> >>>> From misc.json: >>>> # Example: >>>> # >>>> # -> { "execute": "add-fd", "arguments": { "fdset-id": 1 } } >>>> # <- { "return": { "fdset-id": 1, "fd": 3 } } >>>> # >>>> >>>> "fd": 3 is a valid fd for migrate-incoming(uri = "fd:3") >>> >>> Ah OK. >>> >>>> >> >> 2) Allow these fds, but dup them or close them correctly >>>> >> > >>>> >> > I think I'd leave the current (confusing) fd: as it is, maybe put a note >>>> >> > in the manual. >>>> >> > >>>> >> >>>> >> So, using fd from fdset will be an undefined behavior, right? >>>> > >>>> > For incoming, yes. >>>> > >>>> >> >> And how to migrate-incoming defer through fd correctly? >>>> >> >> 1) Add "mon-fd:" protocol to work with fds passed by "add-fd/remove-fd" commands >>>> >> >> as suggested by Dave >>>> >> > >>>> >> > That's my preference; it's explicitly named and consistent, and it >>>> >> > doesn't touch the existing fd code. >>>> >> > >>>> >> >>>> >> Ok, but please tell me what you think of my suggestion (2) about using fd added >>>> >> by the "getfd" command for incoming migration. It doesn't requires introducing >>>> >> new protocol and will be consistent with outgoing migration through fd. >>>> > >>>> > I worry how qemu knows whether the command means it comes from the getfd >>>> > command or is actually a normal fd like now? >>>> > Can you give an example. >>>> > >>>> >>>> getfd manages naming fds list. >>>> # -> { "execute": "getfd", "arguments": { "fdname": "fd1" } } >>>> So, for migrate (not incoming) is now valid migrate(uri="fd:fd1") >>>> >>>> I want the same for migrate-incoming. If fdname is parseable int, then it is >>>> an old format. Otherwise - it is a name of fd added by addfd. >>>> >>>> There is a function "monitor_fd_param" which do exactly what I mean: >>>> int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) { >>>> ... local vars ... >>>> if (!qemu_isdigit(fdname[0]) && mon) { >>>> fd = monitor_get_fd(mon, fdname, &local_err); >>>> } else { >>>> fd = qemu_parse_fd(fdname); >>>> } >>>> ... report err to errp ... >>>> } >>> >>> OK, if we're already using monitor_fd_param everywhere then I think >>> we're already down the rat-hole of guessing whether we're an add-fd or >>> fd by whether it's an integer, and I agree with you that we should >>> just fix incoming to use that. >>> >>> Now, that means I guess we need to modify monitor_fd_param to tell us >>> which type of fd it got, so we know whether to close it later? >>> >>> Dave >>> P.S. I'm out from tomorrow for a weekish. >> >> I think the right way is to check whether fd is added by add-fd and if so then >> return error. Because IIUC the only valid usage of add-fd is to use >> qemu_open("/dev/fdset/<fdset_id>") which finds suitable fd from fdset. >> Such behavior is incompatible with fd:<fd_num> at all, as such syntax >> doesn't imply the using of particular fd. But if so, why add-fd returns >> the value of added fd?.. >> >> But if I'm right it's enough to: >> 1) Modify monitor_fd_param to check where fd comes from and disallow using >> fd of "add-fd", >> 2) As we discussed earlier, modify monitor_get_fd to remove named fd from its >> list before return, > > Omg, monitor_fd_param is already do so... Sorry, so the problem only in > incoming case. > >> 3) Use monitor_fd_param in migrate_incoming for "fd:" proto. >> >> I'm not insist. May be it's ok to use fd from fdset directly and so qemu_close >> should be modifyed. >> >> Just to clarify what I mean: >> fdset is a struct: >> struct MonFdset { >> int64_t id; >> QLIST_HEAD(, MonFdsetFd) fds; >> QLIST_HEAD(, MonFdsetFd) dup_fds; >> QLIST_ENTRY(MonFdset) next; >> }; >> >> * add-fd appends new fd to "->fds" list. >> * qemu_open("/dev/fdset/X", int perms) is looking for suitable (by perms) fd >> from fdset with id X, dup it and append "->dup_fds" list. >> * qemu_close(int fd) tryes to find the fd in all "->dup_fds" lists >> of all fdsets and remove it. And closes fd anyway. >> >> If not to disallow using fds added by add-fd then there are three >> possible solutions: >> 1) dup fd in monitor_fd_param it the fd is from some fdset, >> 2) remove the fd from "->fds" list in qemu_close >> 3) don't close it in qemu_close, so client is responsible to close it by >> remove-fd. >> >>>> >> > >>>> >> >> 2) My suggestion about monitor_fd_param and make "fd:" for >>>> >> >> migrate/migrate-incoming consistent. So user will be able to use >>>> >> >> getfd + migrate-incoming >>>> >> >> 3) Both of them or something else >>>> >> >> >>>> >> > > Regards, > Yury
diff --git a/migration/fd.c b/migration/fd.c index a7c13df4ad..c9ff07ac41 100644 --- a/migration/fd.c +++ b/migration/fd.c @@ -15,6 +15,7 @@ */ #include "qemu/osdep.h" +#include "qapi/error.h" #include "channel.h" #include "fd.h" #include "migration.h" @@ -26,15 +27,27 @@ void fd_start_outgoing_migration(MigrationState *s, const char *fdname, Error **errp) { QIOChannel *ioc; - int fd = monitor_get_fd(cur_mon, fdname, errp); + int fd, dup_fd; + + fd = monitor_get_fd(cur_mon, fdname, errp); if (fd == -1) { return; } - trace_migration_fd_outgoing(fd); - ioc = qio_channel_new_fd(fd, errp); + /* fd is previously created by qmp command 'getfd', + * so client is responsible to close it. Dup it to save original value from + * QIOChannel's destructor */ + dup_fd = qemu_dup(fd); + if (dup_fd == -1) { + error_setg(errp, "Cannot dup fd %s: %s (%d)", fdname, strerror(errno), + errno); + return; + } + + trace_migration_fd_outgoing(fd, dup_fd); + ioc = qio_channel_new_fd(dup_fd, errp); if (!ioc) { - close(fd); + close(dup_fd); return; } @@ -55,14 +68,24 @@ static gboolean fd_accept_incoming_migration(QIOChannel *ioc, void fd_start_incoming_migration(const char *infd, Error **errp) { QIOChannel *ioc; - int fd; + int fd, dup_fd; fd = strtol(infd, NULL, 0); - trace_migration_fd_incoming(fd); - ioc = qio_channel_new_fd(fd, errp); + /* fd is previously created by qmp command 'add-fd' or something else, + * so client is responsible to close it. Dup it to save original value from + * QIOChannel's destructor */ + dup_fd = qemu_dup(fd); + if (dup_fd == -1) { + error_setg(errp, "Cannot dup fd %d: %s (%d)", fd, strerror(errno), + errno); + return; + } + + trace_migration_fd_incoming(fd, dup_fd); + ioc = qio_channel_new_fd(dup_fd, errp); if (!ioc) { - close(fd); + close(dup_fd); return; } diff --git a/migration/trace-events b/migration/trace-events index de2e136e57..d2d30a6b3c 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -258,8 +258,8 @@ migration_exec_outgoing(const char *cmd) "cmd=%s" migration_exec_incoming(const char *cmd) "cmd=%s" # fd.c -migration_fd_outgoing(int fd) "fd=%d" -migration_fd_incoming(int fd) "fd=%d" +migration_fd_outgoing(int fd, int dup_fd) "fd=%d dup_fd=%d" +migration_fd_incoming(int fd, int dup_fd) "fd=%d dup_fd=%d" # socket.c migration_socket_incoming_accepted(void) ""
Currently such case is possible for incoming: QMP: add-fd (fdset = 0, fd of some file): adds fd to fdset 0 and returns QEMU's fd (e.g. 33) QMP: migrate-incoming (uri = "fd:33"): fd is stored in QIOChannel *ioc ... Incoming migration completes and unrefs ioc -> close(33) QMP: remove-fd (fdset = 0, fd = 33): removes fd from fdset 0 and qemu_close() -> close(33) => double close For outgoing migration the case is the same but getfd instead of add-fd. Fix it by duping client's fd. Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru> --- migration/fd.c | 39 +++++++++++++++++++++++++++++++-------- migration/trace-events | 4 ++-- 2 files changed, 33 insertions(+), 10 deletions(-)