mbox series

[v4,0/5] io_uring getdents

Message ID 20230718132112.461218-1-hao.xu@linux.dev (mailing list archive)
Headers show
Series io_uring getdents | expand

Message

Hao Xu July 18, 2023, 1:21 p.m. UTC
From: Hao Xu <howeyxu@tencent.com>

This series introduce getdents64 to io_uring, the code logic is similar
with the snychronized version's. It first try nowait issue, and offload
it to io-wq threads if the first try fails.

Tested it with a liburing case:
https://github.com/HowHsu/liburing/blob/getdents/test/getdents2.c

The test is controlled by the below script[2] which runs getdents2.t 100
times and calulate the avg.
The result show that io_uring version is about 3% faster:

python3 run_getdents.py
    Average of sync: 0.1036849
    Average of iouring: 0.1005568

(0.1036849-0.1005568)/0.1036849 = 3.017%

note:
[1] the number of getdents call/request in io_uring and normal sync version
are made sure to be same beforehand.

[2] run_getdents.py

```python3

import subprocess

N = 100
sum = 0.0
args = ["/data/home/howeyxu/tmpdir", "sync"]

for i in range(N):
    output = subprocess.check_output(["./liburing/test/getdents2.t"] + args)
    sum += float(output)

average = sum / N
print("Average of sync:", average)

sum = 0.0
args = ["/data/home/howeyxu/tmpdir", "iouring"]

for i in range(N):
    output = subprocess.check_output(["./liburing/test/getdents2.t"] + args)
    sum += float(output)

average = sum / N
print("Average of iouring:", average)

```

v3->v4:
 - add Dave's xfs nowait code and fix a deadlock problem, with some code
   style tweak.
 - disable fixed file to avoid a race problem for now
 - add a test program.

v2->v3:
 - removed the kernfs patches
 - add f_pos_lock logic
 - remove the "reduce last EOF getdents try" optimization since
   Dominique reports that doesn't make difference
 - remove the rewind logic, I think the right way is to introduce lseek
   to io_uring not to patch this logic to getdents.
 - add Singed-off-by of Stefan Roesch for patch 1 since checkpatch
   complained that Co-developed-by someone should be accompanied with
   Signed-off-by same person, I can remove them if Stefan thinks that's
   not proper.


Dominique Martinet (1):
  fs: split off vfs_getdents function of getdents64 syscall

Hao Xu (4):
  vfs_getdents/struct dir_context: add flags field
  io_uring: add support for getdents
  xfs: add NOWAIT semantics for readdir
  disable fixed file for io_uring getdents for now

 fs/internal.h                  |  8 +++++
 fs/readdir.c                   | 36 ++++++++++++++++-----
 fs/xfs/libxfs/xfs_da_btree.c   | 16 ++++++++++
 fs/xfs/libxfs/xfs_da_btree.h   |  1 +
 fs/xfs/libxfs/xfs_dir2_block.c |  7 ++--
 fs/xfs/libxfs/xfs_dir2_priv.h  |  2 +-
 fs/xfs/scrub/dir.c             |  2 +-
 fs/xfs/scrub/readdir.c         |  2 +-
 fs/xfs/xfs_dir2_readdir.c      | 58 +++++++++++++++++++++++++++-------
 fs/xfs/xfs_inode.c             | 17 ++++++++++
 fs/xfs/xfs_inode.h             | 15 +++++----
 include/linux/fs.h             |  8 +++++
 include/uapi/linux/io_uring.h  |  7 ++++
 io_uring/fs.c                  | 57 +++++++++++++++++++++++++++++++++
 io_uring/fs.h                  |  3 ++
 io_uring/opdef.c               |  8 +++++
 16 files changed, 215 insertions(+), 32 deletions(-)

Comments

Christian Brauner July 19, 2023, 6:04 a.m. UTC | #1
On Tue, Jul 18, 2023 at 09:21:07PM +0800, Hao Xu wrote:
> From: Hao Xu <howeyxu@tencent.com>
> 
> This series introduce getdents64 to io_uring, the code logic is similar
> with the snychronized version's. It first try nowait issue, and offload
> it to io-wq threads if the first try fails.
> 
> Tested it with a liburing case:
> https://github.com/HowHsu/liburing/blob/getdents/test/getdents2.c
> 
> The test is controlled by the below script[2] which runs getdents2.t 100
> times and calulate the avg.
> The result show that io_uring version is about 3% faster:
> 
> python3 run_getdents.py
>     Average of sync: 0.1036849
>     Average of iouring: 0.1005568
> 
> (0.1036849-0.1005568)/0.1036849 = 3.017%
> 
> note:
> [1] the number of getdents call/request in io_uring and normal sync version
> are made sure to be same beforehand.
> 
> [2] run_getdents.py
> 
> ```python3
> 
> import subprocess
> 
> N = 100
> sum = 0.0
> args = ["/data/home/howeyxu/tmpdir", "sync"]
> 
> for i in range(N):
>     output = subprocess.check_output(["./liburing/test/getdents2.t"] + args)
>     sum += float(output)
> 
> average = sum / N
> print("Average of sync:", average)
> 
> sum = 0.0
> args = ["/data/home/howeyxu/tmpdir", "iouring"]
> 
> for i in range(N):
>     output = subprocess.check_output(["./liburing/test/getdents2.t"] + args)
>     sum += float(output)
> 
> average = sum / N
> print("Average of iouring:", average)
> 
> ```
> 
> v3->v4:

I'm out this week so will review next week.