[1/2] io_uring: fix issue when IOSQE_IO_DRAIN pass with IOSQE_IO_LINK

Message ID 1565775322-10296-1-git-send-email-liuyun01@kylinos.cn (mailing list archive)
State New, archived

Commit Message

Jackie Liu Aug. 14, 2019, 9:35 a.m. UTC
Suppose there are three IOs here, and their order is as follows:

Submit:
	[1] IO_LINK
	    |
	    |---  [2] IO_LINK | IO_DRAIN
		      |
		      |- [3] NORMAL_IO

In theory, all three should be inserted into the link list. However,
because io[2] carries the IO_DRAIN flag, io[2] and io[3] are inserted
into the defer_list instead, and io[3] and io[2] end up being processed
at the same time.

For now, passing these two flags at the same time is simply forbidden.

Fixes: 9e645e1105c ("io_uring: add support for sqe links")
Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
---
 fs/io_uring.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Jens Axboe Aug. 15, 2019, 5:07 p.m. UTC | #1
On 8/14/19 3:35 AM, Jackie Liu wrote:
> Suppose there are three IOs here, and their order is as follows:
> 
> Submit:
> 	[1] IO_LINK
> 	    |
> 	    |---  [2] IO_LINK | IO_DRAIN
> 		      |
> 		      |- [3] NORMAL_IO
> 
> In theory, all three should be inserted into the link list. However,
> because io[2] carries the IO_DRAIN flag, io[2] and io[3] are inserted
> into the defer_list instead, and io[3] and io[2] end up being processed
> at the same time.
> 
> For now, passing these two flags at the same time is simply forbidden.
> 
> Fixes: 9e645e1105c ("io_uring: add support for sqe links")
> Signed-off-by: Jackie Liu <liuyun01@kylinos.cn>
> ---
>   fs/io_uring.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index d542f1c..05ee628 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -2074,10 +2074,13 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
>   {
>   	struct io_uring_sqe *sqe_copy;
>   	struct io_kiocb *req;
> +	unsigned int flags;
>   	int ret;
>   
> +	flags = READ_ONCE(s->sqe->flags);
>   	/* enforce forwards compatibility on users */
> -	if (unlikely(s->sqe->flags & ~SQE_VALID_FLAGS)) {
> +	if (unlikely((flags & ~SQE_VALID_FLAGS) ||
> +		     (flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)))) {

This doesn't look right, as any setting of either DRAIN or LINK would now
fail?

Did you mean something ala:

	if ((flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) ==
	    (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) {
		... fail ...
	}

which makes me worried that you didn't test this at all...
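
The difference between the two mask tests can be checked in isolation. The
following is a minimal user-space sketch, not kernel code: the flag values
are local stand-ins assumed for illustration, and the helper names
(rejects_as_posted, rejects_as_suggested, check_mask_tests) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the SQE flag bits (values assumed for this sketch). */
#define IOSQE_IO_DRAIN (1U << 1)
#define IOSQE_IO_LINK  (1U << 2)

/* Test as posted in the patch: true when EITHER flag is set. */
static bool rejects_as_posted(unsigned int flags)
{
	return (flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) != 0;
}

/* Test as suggested in review: true only when BOTH flags are set. */
static bool rejects_as_suggested(unsigned int flags)
{
	const unsigned int both = IOSQE_IO_DRAIN | IOSQE_IO_LINK;

	return (flags & both) == both;
}

/* Walk all four combinations of the two flags. */
static void check_mask_tests(void)
{
	const unsigned int both = IOSQE_IO_DRAIN | IOSQE_IO_LINK;

	/* As posted: any request carrying DRAIN or LINK is rejected. */
	assert(!rejects_as_posted(0));
	assert(rejects_as_posted(IOSQE_IO_DRAIN));
	assert(rejects_as_posted(IOSQE_IO_LINK));
	assert(rejects_as_posted(both));

	/* As suggested: only the combination is rejected. */
	assert(!rejects_as_suggested(0));
	assert(!rejects_as_suggested(IOSQE_IO_DRAIN));
	assert(!rejects_as_suggested(IOSQE_IO_LINK));
	assert(rejects_as_suggested(both));
}
```

Calling check_mask_tests() from a small main() exercises both variants; the
posted form fails every request that uses DRAIN or LINK on its own.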
Jackie Liu Aug. 16, 2019, 12:48 a.m. UTC | #2
On Aug 16, 2019, at 01:07, Jens Axboe <axboe@kernel.dk> wrote:

>> +	flags = READ_ONCE(s->sqe->flags);
>>  	/* enforce forwards compatibility on users */
>> -	if (unlikely(s->sqe->flags & ~SQE_VALID_FLAGS)) {
>> +	if (unlikely((flags & ~SQE_VALID_FLAGS) ||
>> +		     (flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)))) {
> 
> This doesn't look right, as any setting of either DRAIN or LINK would now
> fail?
> 
> Did you mean something ala:
> 
> 	if ((flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) ==
> 	    (IOSQE_IO_DRAIN | IOSQE_IO_LINK)) {
> 		... fail ...
> 	}
> 
> which makes me worried that you didn't test this at all...
> 
> -- 
> Jens Axboe

Oh, yes, that's my fault. I only simulated it in my head; thank you for
pointing it out. I think I'll tag it [RFC PATCH] next time.

For this issue I have two solutions. The first is this one: simply
disallow passing DRAIN and LINK at the same time. The second is to
allow it and let the SQE following the LINK inherit the DRAIN flag,
but that is more complicated, so I prefer the first.

I will rewrite this patch later, with a real test. Thanks again.

--
Jackie Liu
Jens Axboe Aug. 16, 2019, 1:21 a.m. UTC | #3
On 8/15/19 6:48 PM, Jackie Liu wrote:
> Oh, yes, that's my fault. I only simulated it in my head; thank you for
> pointing it out.  I think I'll tag it [RFC PATCH] next time.

Even for an RFC, it better be more tested than just being thought
about... If something hasn't been run at all, it should always include
wording to that effect ("Totally untested, but something like this
perhaps"). I have higher expectations for even an RFC patch, I do expect
that to be both thought about AND tested.

> For this issue I have two solutions. The first is this one: simply
> disallow passing DRAIN and LINK at the same time. The second is to
> allow it and let the SQE following the LINK inherit the DRAIN flag,
> but that is more complicated, so I prefer the first.
> 
> I will rewrite this patch later, with a real test. Thanks again.

If an SQE has both set, it should first wait for any inflight sqe to
complete, then execute the chain. Once things have drained, it should
behave like an SQE that just had LINK set. I'd be interested in seeing a
patch that fixes this instead of just making it illegal, it seems to be
a valid use case.
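
The behaviour described above can be modelled as a small decision function.
This is only a toy sketch under stated assumptions, not how fs/io_uring.c
implements it: the flag values are local stand-ins, and toy_action,
toy_next_action and toy_check are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins for the SQE flag bits (values assumed for this sketch). */
#define IOSQE_IO_DRAIN (1U << 1)
#define IOSQE_IO_LINK  (1U << 2)

/* Hypothetical outcomes for a request that is about to be issued. */
enum toy_action {
	TOY_ISSUE,      /* may be issued now */
	TOY_WAIT_DRAIN, /* must wait for earlier inflight requests */
	TOY_WAIT_LINK,  /* must wait for the previous member of its chain */
};

/*
 * inflight:          earlier requests that have not completed yet
 * prev_link_pending: the previous chain member has not completed yet
 *
 * DRAIN is honoured first; once nothing is inflight, the request is
 * treated exactly like one that only had LINK set.
 */
static enum toy_action toy_next_action(unsigned int flags, size_t inflight,
				       bool prev_link_pending)
{
	if ((flags & IOSQE_IO_DRAIN) && inflight > 0)
		return TOY_WAIT_DRAIN;
	if (prev_link_pending)
		return TOY_WAIT_LINK;
	return TOY_ISSUE;
}

/* A request with both flags set, observed at three points in time. */
static void toy_check(void)
{
	const unsigned int both = IOSQE_IO_DRAIN | IOSQE_IO_LINK;

	assert(toy_next_action(both, 2, false) == TOY_WAIT_DRAIN);
	assert(toy_next_action(both, 0, true) == TOY_WAIT_LINK);
	assert(toy_next_action(both, 0, false) == TOY_ISSUE);
}
```

In this model a LINK-only request never waits on unrelated inflight IO; only
the DRAIN bit adds that extra wait before normal link ordering applies.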
Jackie Liu Aug. 16, 2019, 6:47 a.m. UTC | #4
> On Aug 16, 2019, at 09:21, Jens Axboe <axboe@kernel.dk> wrote:
> If an SQE has both set, it should first wait for any inflight sqe to
> complete, then execute the chain. Once things have drained, it should
> behave like an SQE that just had LINK set. I'd be interested in seeing a
> patch that fixes this instead of just making it illegal, it seems to be
> a valid use case.
> 

How about this: we treat the link list as a whole, and if any IO in it
has drain set, the first IO is marked as drain, which is easy to implement.
Of course, there is no clear ordering between the IOs in the link list and
the IOs in the defer_list, so there may be problems.

If we want to keep a clear ordering between the link list and the
defer_list, we may need to add more flags and variables. My initial
implementation is very complicated. Do you have any good ideas?

A patch for the first idea follows. Untested, just an idea.

---
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6b572c4..bc0b535 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1995,10 +1995,15 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
        flags = READ_ONCE(s->sqe->flags);
        fd = READ_ONCE(s->sqe->fd);

-       if (flags & IOSQE_IO_DRAIN) {
+       if (flags & IOSQE_IO_DRAIN)
                req->flags |= REQ_F_IO_DRAIN;
-               req->sequence = ctx->cached_sq_head - 1;
-       }
+
+       /*
+        * All IOs need to record the previous position; if LINK is
+        * combined with DRAIN, it can be used to mark the position of
+        * the first IO in the link list.
+        */
+       req->sequence = ctx->cached_sq_head - 1;

        if (!io_op_needs_file(s->sqe))
                return 0;
@@ -2123,6 +2128,12 @@ static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
                }

                s->sqe = sqe_copy;
+               /*
+                * Mark the first IO in link list as DRAIN, let all the following
+                * IOs enter the defer list.
+                */
+               if (s->sqe->flags & IOSQE_IO_DRAIN)
+                       prev->flags |= REQ_F_IO_DRAIN;
                memcpy(&req->submit, s, sizeof(*s));
                list_add_tail(&req->list, &prev->link_list);
        } else if (s->sqe->flags & IOSQE_IO_LINK) {


--
Jackie Liu

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d542f1c..05ee628 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2074,10 +2074,13 @@  static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
 {
 	struct io_uring_sqe *sqe_copy;
 	struct io_kiocb *req;
+	unsigned int flags;
 	int ret;
 
+	flags = READ_ONCE(s->sqe->flags);
 	/* enforce forwards compatibility on users */
-	if (unlikely(s->sqe->flags & ~SQE_VALID_FLAGS)) {
+	if (unlikely((flags & ~SQE_VALID_FLAGS) ||
+		     (flags & (IOSQE_IO_DRAIN | IOSQE_IO_LINK)))) {
 		ret = -EINVAL;
 		goto err;
 	}
@@ -2093,6 +2096,8 @@  static void io_submit_sqe(struct io_ring_ctx *ctx, struct sqe_submit *s,
 err_req:
 		io_free_req(req);
 err:
+		if (*link)
+			io_fail_links(*link);
 		io_cqring_add_event(ctx, s->sqe->user_data, ret);
 		return;
 	}