diff mbox series

[3/3] io_uring: add io_uring_event cache hit information

Message ID 20190315145938.21516-4-axboe@kernel.dk (mailing list archive)
State New, archived
Headers show
Series io_uring improvements | expand

Commit Message

Jens Axboe March 15, 2019, 2:59 p.m. UTC
Add hint on whether a read was served out of the page cache, or if it
hit media. This is useful for buffered async IO, O_DIRECT reads would
never have this set (for obvious reasons).

If the read hit page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT
set.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 6 +++++-
 include/uapi/linux/io_uring.h | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)

Comments

Jens Axboe March 15, 2019, 4:27 p.m. UTC | #1
On 3/15/19 8:59 AM, Jens Axboe wrote:
> Add hint on whether a read was served out of the page cache, or if it
> hit media. This is useful for buffered async IO, O_DIRECT reads would
> never have this set (for obvious reasons).
> 
> If the read hit page cache, cqe->flags will have IOCQE_FLAG_CACHEHIT
> set.

Linus, curious on your opinion on this one. I had this as part of the
original series, but removed it from the pull request due to the
mincore etc discussions.

Leaving the patch intact for you. This is part of a small series of
improvements, which sit on top of a series of fixes for this release.

> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index f66a4a5daf35..513df722702e 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -635,10 +635,14 @@ static void kiocb_end_write(struct kiocb *kiocb)
>  static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
>  {
>  	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw);
> +	unsigned ev_flags = 0;
>  
>  	kiocb_end_write(kiocb);
>  
> -	io_cqring_add_event(req->ctx, req->user_data, res, 0);
> +	if (res > 0 && (req->flags & REQ_F_FORCE_NONBLOCK))
> +		ev_flags = IOCQE_FLAG_CACHEHIT;
> +
> +	io_cqring_add_event(req->ctx, req->user_data, res, ev_flags);
>  	io_put_req(req);
>  }
>  
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index e23408692118..24906e99fdc7 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -69,6 +69,11 @@ struct io_uring_cqe {
>  	__u32	flags;
>  };
>  
> +/*
> + * io_uring_event->flags
> + */
> +#define IOCQE_FLAG_CACHEHIT	(1U << 0)	/* IO did not hit media */
> +
>  /*
>   * Magic offsets for the application to mmap the data it needs
>   */
>
Linus Torvalds March 16, 2019, 1:34 a.m. UTC | #2
On Fri, Mar 15, 2019 at 9:27 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Linus, curious on your opinion on this one. I had this as part of the
> original series, but removed it from the pull request due to the
> mincore etc discussions.

I'd rather not have new ways to leak cache information, and in fact
already the IOCB_NOWAIT is a bit questionable for that. But afaik, the
non-O_DIRECT paths don't even support it to begin with for some
filesystems.

Wasn't the whole point of this io_ring that we'd move *away* from
direct block access and O_DIRECT?

I'm seeing a lot of stuff that looks like just "more of the same old
O_DIRECT garbage" that seems to be all about thinking that IO doesn't
cache.

Haven't we gotten over that already? It was one of the arguments you
used for io_ring to begin with.

                      Linus
Jens Axboe March 16, 2019, 4:27 p.m. UTC | #3
On 3/15/19 7:34 PM, Linus Torvalds wrote:
> On Fri, Mar 15, 2019 at 9:27 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Linus, curious on your opinion on this one. I had this as part of the
>> original series, but removed it from the pull request due to the
>> mincore etc discussions.
> 
> I'd rather not have new ways to leak cache information, and in fact
> already the IOCB_NOWAIT is a bit questionable for that. But afaik, the
> non-O_DIRECT paths don't even support it to begin with for some
> filesystems.

For the most popular it works fine, though. For buffered async IO it'd
be nice to have it be fully reliable, so we can ensure we punt when we
have to. Currently for writes, we just ignore it and punt to async,
since it doesn't work at all there. The latter should be fixable.

On the hint, I hear ya, that's what I figured. I'll drop the patch for
now.

> Wasn't the whole point of this io_ring that we'd move *away* from
> direct block access and O_DIRECT?
> 
> I'm seeing a lot of stuff that looks like just "more of the same old
> O_DIRECT garbage" that seems to be all about thinking that IO doesn't
> cache.
> 
> Haven't we gotten over that already? It was one of the arguments you
> used for io_ring to begin with.

There are still folks that use it, even if O_DIRECT is nasty. If you
want to get peak performance, there's no way around it. That doesn't
mean that io_uring is in any way centered around O_DIRECT at all, it
isn't. At its core, it's "just" an efficient delivery mechanism for
submissions and completions.

Not sure what "stuff" you are referring to, the patches in this short
series? That's not centered around block devices at all, works just fine
for files. The bvec iterator is what io_uring uses for fixed buffers,
it's not exclusive to block devices at all.
diff mbox series

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index f66a4a5daf35..513df722702e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -635,10 +635,14 @@  static void kiocb_end_write(struct kiocb *kiocb)
 static void io_complete_rw(struct kiocb *kiocb, long res, long res2)
 {
 	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw);
+	unsigned ev_flags = 0;
 
 	kiocb_end_write(kiocb);
 
-	io_cqring_add_event(req->ctx, req->user_data, res, 0);
+	if (res > 0 && (req->flags & REQ_F_FORCE_NONBLOCK))
+		ev_flags = IOCQE_FLAG_CACHEHIT;
+
+	io_cqring_add_event(req->ctx, req->user_data, res, ev_flags);
 	io_put_req(req);
 }
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e23408692118..24906e99fdc7 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -69,6 +69,11 @@  struct io_uring_cqe {
 	__u32	flags;
 };
 
+/*
+ * io_uring_event->flags
+ */
+#define IOCQE_FLAG_CACHEHIT	(1U << 0)	/* IO did not hit media */
+
 /*
  * Magic offsets for the application to mmap the data it needs
  */