mbox series

[v2,0/2] migration mapped-ram fixes

Message ID 20240313212824.16974-1-farosas@suse.de (mailing list archive)
Headers show
Series migration mapped-ram fixes | expand

Message

Fabiano Rosas March 13, 2024, 9:28 p.m. UTC
Hi,

In this v2:

patch 1 - The fix for the ioc leaks, now including the main channel

patch 2 - A fix for an fd: migration case I thought I had written code
          for, but obviously didn't.

Thank you for your patience.

based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701

Fabiano Rosas (2):
  migration: Fix iocs leaks during file and fd migration
  migration/multifd: Ensure we're not given a socket for file migration

 migration/fd.c   | 35 +++++++++++---------------
 migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
 migration/file.h |  1 +
 3 files changed, 60 insertions(+), 41 deletions(-)

Comments

Peter Xu March 14, 2024, 3:22 p.m. UTC | #1
On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> Hi,
> 
> In this v2:
> 
> patch 1 - The fix for the ioc leaks, now including the main channel
> 
> patch 2 - A fix for an fd: migration case I thought I had written code
>           for, but obviously didn't.

Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().

    ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
                                                then owned by the IOC
    if (!ioc) {
        close(fd);
        return;
    }

    outgoing_args.fd = fd;               <----- here we use the fd again,
                                                and "owned" by outgoing_args
                                                even if it shouldn't?

The problem is outgoing_args.fd will be cleaned up with a close().  I had a
feeling that it's possible it will close() something else if the fd reused
before that close() but after the IOC's.  We may want yet another dup() for
outgoing_args.fd?

If you agree, we may also want to avoid doing:

    outgoing_args.fd = -1;

We could assert it instead making sure no fd leak.

> 
> Thank you for your patience.
> 
> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
> 
> Fabiano Rosas (2):
>   migration: Fix iocs leaks during file and fd migration
>   migration/multifd: Ensure we're not given a socket for file migration
> 
>  migration/fd.c   | 35 +++++++++++---------------
>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
>  migration/file.h |  1 +
>  3 files changed, 60 insertions(+), 41 deletions(-)
> 
> -- 
> 2.35.3
>
Fabiano Rosas March 14, 2024, 4:55 p.m. UTC | #2
Peter Xu <peterx@redhat.com> writes:

> On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
>> Hi,
>> 
>> In this v2:
>> 
>> patch 1 - The fix for the ioc leaks, now including the main channel
>> 
>> patch 2 - A fix for an fd: migration case I thought I had written code
>>           for, but obviously didn't.
>
> Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().
>
>     ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
>                                                 then owned by the IOC
>     if (!ioc) {
>         close(fd);
>         return;
>     }
>
>     outgoing_args.fd = fd;               <----- here we use the fd again,
>                                                 and "owned" by outgoing_args
>                                                 even if it shouldn't?
>
> The problem is outgoing_args.fd will be cleaned up with a close().  I had a
> feeling that it's possible it will close() something else if the fd reused
> before that close() but after the IOC's.  We may want yet another dup() for
> outgoing_args.fd?

I think the right fix is to not close() it at
fd_cleanup_outgoing_migration(). That fd is already owned by the ioc.

>
> If you agree, we may also want to avoid doing:
>
>     outgoing_args.fd = -1;

We will always need this. This is just initialization of the field
because 0 is a valid fd value. Otherwise the file.c code can't know if
we're actually using an fd at all.

@file_send_channel_create:

int fd = fd_args_get_fd();

if (fd && fd != -1) {
    <new IOC from fd>
} else {
    <new IOC from file name>
}

>
> We could assert it instead making sure no fd leak.
>
>> 
>> Thank you for your patience.
>> 
>> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
>> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
>> 
>> Fabiano Rosas (2):
>>   migration: Fix iocs leaks during file and fd migration
>>   migration/multifd: Ensure we're not given a socket for file migration
>> 
>>  migration/fd.c   | 35 +++++++++++---------------
>>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
>>  migration/file.h |  1 +
>>  3 files changed, 60 insertions(+), 41 deletions(-)
>> 
>> -- 
>> 2.35.3
>>
Peter Xu March 14, 2024, 5:35 p.m. UTC | #3
On Thu, Mar 14, 2024 at 01:55:31PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> >> Hi,
> >> 
> >> In this v2:
> >> 
> >> patch 1 - The fix for the ioc leaks, now including the main channel
> >> 
> >> patch 2 - A fix for an fd: migration case I thought I had written code
> >>           for, but obviously didn't.
> >
> > Maybe I found one more issue.. I'm looking at fd_start_outgoing_migration().
> >
> >     ioc = qio_channel_new_fd(fd, errp);  <----- here the fd is consumed and
> >                                                 then owned by the IOC
> >     if (!ioc) {
> >         close(fd);
> >         return;
> >     }
> >
> >     outgoing_args.fd = fd;               <----- here we use the fd again,
> >                                                 and "owned" by outgoing_args
> >                                                 even if it shouldn't?
> >
> > The problem is outgoing_args.fd will be cleaned up with a close().  I had a
> > feeling that it's possible it will close() something else if the fd reused
> > before that close() but after the IOC's.  We may want yet another dup() for
> > outgoing_args.fd?
> 
> I think the right fix is to not close() it at
> fd_cleanup_outgoing_migration(). That fd is already owned by the ioc.

But outgoing_args.fd can point to other things if the IOC (along with the
ioc->fd) is released.  Keeping outgoing_args.fd pointing to that fd index
should be dangerous because the integer can be reused.

> 
> >
> > If you agree, we may also want to avoid doing:
> >
> >     outgoing_args.fd = -1;
> 
> We will always need this. This is just initialization of the field
> because 0 is a valid fd value. Otherwise the file.c code can't know if
> we're actually using an fd at all.

I meant avoid setting it to -1 only in fd_start_outgoing_migration().
Using -1 to represent "no fd" is fine.

> 
> @file_send_channel_create:
> 
> int fd = fd_args_get_fd();
> 
> if (fd && fd != -1) {
>     <new IOC from fd>
> } else {
>     <new IOC from file name>
> }
> 
> >
> > We could assert it instead making sure no fd leak.
> >
> >> 
> >> Thank you for your patience.
> >> 
> >> based-on: https://gitlab.com/peterx/qemu/-/commits/migration-stable
> >> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1212483701
> >> 
> >> Fabiano Rosas (2):
> >>   migration: Fix iocs leaks during file and fd migration
> >>   migration/multifd: Ensure we're not given a socket for file migration
> >> 
> >>  migration/fd.c   | 35 +++++++++++---------------
> >>  migration/file.c | 65 ++++++++++++++++++++++++++++++++----------------
> >>  migration/file.h |  1 +
> >>  3 files changed, 60 insertions(+), 41 deletions(-)
> >> 
> >> -- 
> >> 2.35.3
> >> 
>
Peter Xu March 14, 2024, 6 p.m. UTC | #4
On Wed, Mar 13, 2024 at 06:28:22PM -0300, Fabiano Rosas wrote:
> Hi,
> 
> In this v2:
> 
> patch 1 - The fix for the ioc leaks, now including the main channel
> 
> patch 2 - A fix for an fd: migration case I thought I had written code
>           for, but obviously didn't.

The two issues are separate.  I assume patch 2 will need a rework, but I
queued patch 1 first.  Thanks.