mbox series

[0/9] migration/mapped-ram: Add direct-io support

Message ID 20240426142042.14573-1-farosas@suse.de (mailing list archive)
Headers show
Series migration/mapped-ram: Add direct-io support | expand

Message

Fabiano Rosas April 26, 2024, 2:20 p.m. UTC
Hi everyone, here's the rest of the migration "mapped-ram" feature
that didn't get merged for 9.0. This series adds support for direct
I/O, the missing piece to get the desired performance improvements.

There's 3 parts to this:

1- The plumbing for the new "direct-io" migration parameter. With this
   we can already use direct-io with the file transport + multifd +
   mapped-ram. Patches 1-3.

Due to the alignment requirements of O_DIRECT and the fact that
multifd runs the channels in parallel with the migration thread, we
must open the migration file two times, one with O_DIRECT set and
another with it clear.

If the user is not passing in a file name which QEMU can open at will,
we must then require that the user pass the two file descriptors with
the flags already properly set. We'll use the already existing fdset +
QMP add-fd infrastructure for this.

2- Changes to the fdset infrastructure to support O_DIRECT. We need
   those to be able to select from the user-provided fdset the file
   descriptor that contains the O_DIRECT flag. Patches 4-5.

3- Some fdset validation to make sure the two-fds requirement is being
   met. Patches 6-7.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1269352083

Fabiano Rosas (9):
  monitor: Honor QMP request for fd removal immediately
  migration: Fix file migration with fdset
  tests/qtest/migration: Fix file migration offset check
  migration: Add direct-io parameter
  migration/multifd: Add direct-io support
  tests/qtest/migration: Add tests for file migration with direct-io
  monitor: fdset: Match against O_DIRECT
  migration: Add support for fdset with multifd + file
  tests/qtest/migration: Add a test for mapped-ram with passing of fds

 docs/devel/migration/main.rst       |  18 +++
 docs/devel/migration/mapped-ram.rst |   6 +-
 include/qemu/osdep.h                |   2 +
 migration/file.c                    | 108 ++++++++++++++-
 migration/migration-hmp-cmds.c      |  11 ++
 migration/migration.c               |  23 ++++
 migration/options.c                 |  30 +++++
 migration/options.h                 |   1 +
 monitor/fds.c                       |  13 +-
 qapi/migration.json                 |  18 ++-
 tests/qtest/migration-helpers.c     |  42 ++++++
 tests/qtest/migration-helpers.h     |   1 +
 tests/qtest/migration-test.c        | 202 +++++++++++++++++++++++++++-
 util/osdep.c                        |   9 ++
 14 files changed, 465 insertions(+), 19 deletions(-)


base-commit: a118c4aff4087eafb68f7132b233ad548cf16376

Comments

Peter Xu May 2, 2024, 8:01 p.m. UTC | #1
On Fri, Apr 26, 2024 at 11:20:33AM -0300, Fabiano Rosas wrote:
> If the user is not passing in a file name which QEMU can open at will,
> we must then require that the user pass the two file descriptors with
> the flags already properly set. We'll use the already existing fdset +
> QMP add-fd infrastructure for this.

Yes I remember such requirement that one extra fd is needed for direct-io,
however today when I looked closer at the man page it looks like F_SETFL
works with O_DIRECT too?

       F_SETFL (int)
              Set the file status flags to the value specified by arg.
              File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file
              creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in
              arg are ignored.  On Linux, this command can change only the
              O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags.
              It is not possible to change the O_DSYNC and O_SYNC flags;
              see BUGS, below.

====8<====
$ cat fcntl.c
#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <assert.h>
#include <unistd.h>

int main(void)
{
    int fd, newfd, ret, flags;

    fd = open("test.txt", O_RDWR | O_CREAT, 0660);
    assert(fd != -1);

    flags = fcntl(fd, F_GETFL);
    printf("old fd flags: 0x%x\n", flags);

    newfd = dup(fd);
    assert(newfd != -1);

    flags = fcntl(newfd, F_GETFL);
    printf("new fd flags: 0x%x\n", flags);

    flags |= O_DIRECT;
    ret = fcntl(newfd, F_SETFL, flags);

    flags = fcntl(fd, F_GETFL);
    printf("updated new flags: 0x%x\n", flags);
    
    return 0;
}
$ make fcntl
cc     fcntl.c   -o fcntl
$ ./fcntl 
old fd flags: 0x8002
new fd flags: 0x8002
updated new flags: 0xc002
====8<====

Perhaps I missed something important?
Fabiano Rosas May 2, 2024, 8:34 p.m. UTC | #2
Peter Xu <peterx@redhat.com> writes:

> On Fri, Apr 26, 2024 at 11:20:33AM -0300, Fabiano Rosas wrote:
>> If the user is not passing in a file name which QEMU can open at will,
>> we must then require that the user pass the two file descriptors with
>> the flags already properly set. We'll use the already existing fdset +
>> QMP add-fd infrastructure for this.
>
> Yes I remember such requirement that one extra fd is needed for direct-io,
> however today when I looked closer at the man page it looks like F_SETFL
> works with O_DIRECT too?
>
>        F_SETFL (int)
>               Set the file status flags to the value specified by arg.
>               File access mode (O_RDONLY, O_WRONLY, O_RDWR) and file
>               creation flags (i.e., O_CREAT, O_EXCL, O_NOCTTY, O_TRUNC) in
>               arg are ignored.  On Linux, this command can change only the
>               O_APPEND, O_ASYNC, O_DIRECT, O_NOATIME, and O_NONBLOCK flags.
>               It is not possible to change the O_DSYNC and O_SYNC flags;
>               see BUGS, below.
>
> ====8<====
> $ cat fcntl.c
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <fcntl.h>
> #include <assert.h>
> #include <unistd.h>
>
> int main(void)
> {
>     int fd, newfd, ret, flags;
>
>     fd = open("test.txt", O_RDWR | O_CREAT, 0660);
>     assert(fd != -1);
>
>     flags = fcntl(fd, F_GETFL);
>     printf("old fd flags: 0x%x\n", flags);
>
>     newfd = dup(fd);
>     assert(newfd != -1);
>
>     flags = fcntl(newfd, F_GETFL);
>     printf("new fd flags: 0x%x\n", flags);
>
>     flags |= O_DIRECT;
>     ret = fcntl(newfd, F_SETFL, flags);
>
>     flags = fcntl(fd, F_GETFL);
>     printf("updated new flags: 0x%x\n", flags);
>     
>     return 0;
> }
> $ make fcntl
> cc     fcntl.c   -o fcntl
> $ ./fcntl 
> old fd flags: 0x8002
> new fd flags: 0x8002
> updated new flags: 0xc002
> ====8<====
>
> Perhaps I missed something important?

The dup()'ed file descriptor shares file status flags with the original
fd. Your code example proves just that. In the last two blocks you're
doing F_SETFL on the 'newfd' and then seeing the change take effect on
'fd'. That's what we don't want to happen.