mbox series

[0/3] fs/9p: d_revalidate changes and using it for uncached mode

Message ID cover.1743956147.git.m@maowtm.org (mailing list archive)
Headers show
Series fs/9p: d_revalidate changes and using it for uncached mode | expand

Message

Tingmao Wang April 6, 2025, 4:18 p.m. UTC
Hi 9p-fs maintainers,

This patchset fixes a bug I've noticed while testing 9pfs with Landlock,
although this is not otherwise related to Landlock.  The main gist of the
fix is in the first patch, quoting the message here:

> Currently if another process keeps a file open, due to existing dentry in
> the dcache, other processes will not see updated metadata of that file if
> it is changed on the server, even in uncached mode.
>
> This can also manifest as -ENODATA when reading a file that has shrunk on
> the server (even if it's re-opened in another process), or -ENOTSUPP if
> the file has changed type (e.g. regular file to directory) on the server.
> We can end up in a situation where both `readdir` or `read` fails until
> the file is closed by all processes using it.

Here is a sample program that can be used to reproduce this issue.  It
takes a filename as argument and keeps that file open.

    #include <unistd.h>
    #include <stdio.h>
    #include <fcntl.h>
    #include <stdbool.h>

    int main(int argc, char const *argv[])
    {
        if (argc < 2) {
            printf("Usage: %s <filename>\n", argv[0]);
            return 1;
        }
        const char *filename = argv[1];
        int fd = open(filename, O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        while (true) {
            char buf[4096];
            ssize_t bytes_read = read(fd, buf, sizeof(buf)-1);
            if (bytes_read < 0) {
                perror("read");
                return 1;
            }
            buf[bytes_read] = '\0';
            printf("Read %zd bytes: %s\n", bytes_read, buf);
            printf("Press Enter to read again or Ctrl+C to exit...\n");
            while (getchar() != '\n') {}
            lseek(fd, 0, SEEK_SET);
        }
        return 0;
    }

There are two kinds of error. For -ENODATA:

On the server, prepare a folder with a file that initially has a longer content:

    # mkdir -p /tmp/linux-test
    # echo aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa > /tmp/linux-test/readme

Export the folder with 9pfs, and mount it in a QEMU guest with 9p share:

    guest # mount -t 9p -o trans=virtio,version=9p2000.L,debug=9,cache=none test /mnt
    [   16.254728][  T163] 9pnet: -- v9fs_mount (163):  simple set mount, return 0

Run openread in the background:

    guest # ./openread /mnt/readme &
    [1] 164
    guest # [   27.159295][  T164] 9pnet: -- v9fs_vfs_lookup (164): dir: ffff888105590000 dentry: (readme) ffff888102f0e000 flags: 0
    [   27.159827][  T164] 9pnet: -- v9fs_fid_find (164):  dentry: / (ffff888102f0d780) uid 0 any 0
    [   27.160161][  T164] 9pnet: -- v9fs_fid_find_inode (164):  inode: ffff888105590000
    [   27.160464][  T164] 9pnet: -- v9fs_fid_find (164):  dentry: / (ffff888102f0d780) uid 0 any 0
    [   27.160720][  T164] 9pnet: -- v9fs_fid_find_inode (164):  inode: ffff888105590000
    [   27.160913][  T164] 9pnet: -- v9fs_fid_find (164):  dentry: / (ffff888102f0d780) uid 0 any 0
    [   27.161130][  T164] 9pnet: -- v9fs_fid_find_inode (164):  inode: ffff888105590000
    [   27.162124][  T164] 9pnet: -- v9fs_vfs_lookup (164): readme -> qid->path 51094
    [   27.163190][  T164] 9pnet: -- v9fs_file_open (164): inode: ffff888105590648 file: ffff88810a779c00
    [   27.163679][  T164] 9pnet: -- v9fs_fid_find (164):  dentry: readme (ffff888102f0e000) uid 0 any 0
    [   27.165294][  T164] 9pnet: -- v9fs_file_read_iter (164): fid 3 count 4095 offset 0
    Read 68 bytes: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

    Press Enter to read again or Ctrl+C to exit...

Shrink the file on the server:

    # echo aa > /tmp/linux-test/readme

Try to read it in the client.  Note that this is from a new process -- the
original openread process would also fail with the same error if we `fg`
then press enter on it.

    guest # head /mnt/readme
    [   75.992862][  T165] 9pnet: -- v9fs_file_open (165): inode: ffff888105590648 file: ffff888107aeaf40
    [   75.997456][  T165] 9pnet: -- v9fs_fid_find (165):  dentry: readme (ffff888102f0e000) uid 0 any 0
    [   75.999052][  T165] 9pnet: -- v9fs_file_read_iter (165): fid 4 count 8192 offset 0
    head: error reading '/mnt/readme': No data available
    [   76.001290][  T165] 9pnet: -- v9fs_dir_release (165): inode: ffff888105590648 filp: ffff888107aeaf40 fid: 4

Note that a stat will resolve the situation in this case, since that
updates the metadata.  It also happens that `cat` does a fstat, so using
`cat` to read the file would not reproduce this issue:

    guest # stat /mnt/readme
    [  371.948720][  T172] 9pnet: -- v9fs_vfs_getattr_dotl (172): dentry: ffff888102f0e000
    [  371.952462][  T172] 9pnet: -- v9fs_fid_find (172):  dentry: readme (ffff888102f0e000) uid 0 any 0
    File: /mnt/readme
    Size: 3         	Blocks: 8          IO Block: 131072 regular file
    guest # head /mnt/readme
    [  373.141565][  T173] 9pnet: -- v9fs_file_open (173): inode: ffff888105590648 file: ffff8881071f32c0
    [  373.141969][  T173] 9pnet: -- v9fs_fid_find (173):  dentry: readme (ffff888102f0e000) uid 0 any 0
    [  373.143636][  T173] 9pnet: -- v9fs_file_read_iter (173): fid 4 count 8192 offset 0
    aa
    [  373.145526][  T173] 9pnet: -- v9fs_file_read_iter (173): fid 4 count 8192 offset 3
    [  373.146358][  T173] 9pnet: -- v9fs_dir_release (173): inode: ffff888105590648 filp: ffff8881071f32c0 fid: 4

The other failure mode, which is perhaps less common but more problematic,
is when the file changes type on the host.  Continuing the above, on the
server:

    # rm /tmp/linux-test/readme
    # mkdir /tmp/linux-test/readme

Now the file/dir is inaccessible in the client:

    guest # cat /mnt/readme
    [  509.682250][  T177] 9pnet: -- v9fs_file_open (177): inode: ffff888105590648 file: ffff88810b2756c0
    [  509.682836][  T177] 9pnet: -- v9fs_fid_find (177):  dentry: readme (ffff888102f0e000) uid 0 any 0
    [  509.688049][  T177] 9pnet: -- v9fs_vfs_getattr_dotl (177): dentry: ffff888102f0e000
    [  509.691997][  T177] 9pnet: -- v9fs_fid_find (177):  dentry: readme (ffff888102f0e000) uid 0 any 0
    [  509.696235][  T177] 9pnet: -- v9fs_file_read_iter (177): fid 4 count 131072 offset 0
    qemu-system-x86_64: warning: 9p: bad client: T_read request on directory only expected with 9P2000.u protocol version
    cat: /mnt/readme: Operation not supported
    [  509.700970][  T177] 9pnet: -- v9fs_dir_release (177): inode: ffff888105590648 filp: ffff88810b2756c0 fid: 4
    guest # ls /mnt/readme/
    ls: cannot access '/mnt/readme/': Not a directory

Both issues are fixed by this patchset, with the debug prints added in
commit #3 shown below:

    ... Same setup as before, readme is a long file, we "openread" it,
    shortens it on the host, then read it with head ...

    guest # head /mnt/readme
    [  114.024938][  T196] 9pnet: -- v9fs_fid_find (196):  dentry: readme (ffff8881075d0758) uid 0 any 0
    [  114.026228][  T196] 9pnet: -- __v9fs_lookup_revalidate (196): dentry: readme (ffff8881075d0758) is valid
    [  114.026596][  T196] 9pnet: -- v9fs_file_open (196): inode: ffff8881074b8c90 file: ffff88810bee7480
    [  114.026971][  T196] 9pnet: -- v9fs_fid_find (196):  dentry: readme (ffff8881075d0758) uid 0 any 0
    [  114.027970][  T196] 9pnet: -- v9fs_file_read_iter (196): fid 4 count 8192 offset 0
    aa
    [  114.029430][  T196] 9pnet: -- v9fs_file_read_iter (196): fid 4 count 8192 offset 3
    [  114.030204][  T196] 9pnet: -- v9fs_dir_release (196): inode: ffff8881074b8c90 filp: ffff88810bee7480 fid: 4

    ... Now change the file to a dir on the host ...

    root@d8c28a676d72:/# ls -lah /mnt/readme
    [  177.047326][  T197] 9pnet: -- v9fs_fid_find (197):  dentry: readme (ffff8881075d0758) uid 0 any 0
    [  177.048901][  T197] 9pnet: -- __v9fs_lookup_revalidate (197): dentry: readme (ffff8881075d0758) invalidated due to type change
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    [  177.049400][  T197] 9pnet: -- v9fs_vfs_lookup (197): dir: ffff8881074b8000 dentry: (readme) ffff888102ecc000 flags: 0
    [  177.049914][  T197] 9pnet: -- v9fs_fid_find (197):  dentry: / (ffff8881075ce5e0) uid 0 any 0
    ...
    [  177.095001][  T197] 9pnet: -- v9fs_fid_find (197):  dentry: readme (ffff888102ecf8f8) uid 0 any 0
    [  177.096224][  T197] 9pnet: -- v9fs_dir_readdir_dotl (197): name readme
    [  177.097073][  T197] 9pnet: -- v9fs_dir_release (197): inode: ffff888102f192d8 filp: ffff88810aebf640 fid: 5
    [  177.097616][  T197] 9pnet: -- v9fs_dentry_release (197):  dentry: readme (ffff888102ecf8f8)
    total 8.0K
    drwxr-x--- 2 1000 1000 4.0K Apr  6 15:20 .
    drwxr-x--- 7 1000 1000 4.0K Apr  6 15:20 ..

While I haven't measured the impact, I expect there to be performance hit
only when opening a file that's already open by another process in
uncached mode, due to the extra getattr request.  When a file is not
opened by anyone, it won't have a dentry due to the use of
always_delete_dentry, and thus this d_revalidate change isn't applicable.

Also, I decided to reuse the dentry and call v9fs_refresh_inode_dotl if
the type stayed the same, even in uncached mode.  From looking at the
usages of, and the comment in v9fs_refresh_inode_dotl, I believe this is
safe to do.

Let me know if you wants more testing -- I'm happy to do so.

Tingmao

Tingmao Wang (3):
  fs/9p: Refresh metadata in d_revalidate for uncached mode too
  fs/9p: Invalidate dentry if inode type change detected in cached mode
  fs/9p: Add p9_debug(VFS) in d_revalidate

 fs/9p/vfs_dentry.c     | 33 +++++++++++++++++++++++++++++----
 fs/9p/vfs_inode.c      |  8 +++++++-
 fs/9p/vfs_inode_dotl.c |  8 +++++++-
 3 files changed, 43 insertions(+), 6 deletions(-)


base-commit: 38fec10eb60d687e30c8c6b5420d86e8149f7557
--
2.39.5

Comments

Dominique Martinet April 14, 2025, 9:01 p.m. UTC | #1
Hi,

Thanks for the patches; I think it makes sense overall. We've had
problems with files modifiede on host while guest was looking at it
semi-recently and I thought it was fixed but it either came back or I
missed something...

Just a note so you don't wonder what's happening that I won't be able to
give them a proper look for ~a month with real life getting in the way.

If someone else gives it a fair review I can probably pick up this
thread up for 6.16 but otherwise it probably won't make it, the landlock
part will probably come even later...

Sorry for the delay,