io_uring: use __kernel_timespec in timeout ABI

Message ID 20190930202055.1748710-1-arnd@arndb.de (mailing list archive)
State New, archived

Commit Message

Arnd Bergmann Sept. 30, 2019, 8:20 p.m. UTC
All system calls use struct __kernel_timespec instead of the old struct
timespec, but this one was just added with the old-style ABI. Change it
now to enforce the use of __kernel_timespec, avoiding ABI confusion and
the need for compat handlers on 32-bit architectures.

Any user space caller will have to use __kernel_timespec now, but this
is unambiguous and works for any C library regardless of the time_t
definition. A nicer way to specify the timeout would have been a less
ambiguous 64-bit nanosecond value, but I suppose it's too late now to
change that as this would impact both 32-bit and 64-bit users.

Fixes: 5262f567987d ("io_uring: IORING_OP_TIMEOUT support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/io_uring.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Jens Axboe Oct. 1, 2019, 2:09 p.m. UTC | #1
On 9/30/19 2:20 PM, Arnd Bergmann wrote:
> All system calls use struct __kernel_timespec instead of the old struct
> timespec, but this one was just added with the old-style ABI. Change it
> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
> the need for compat handlers on 32-bit architectures.
> 
> Any user space caller will have to use __kernel_timespec now, but this
> is unambiguous and works for any C library regardless of the time_t
> definition. A nicer way to specify the timeout would have been a less
> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
> change that as this would impact both 32-bit and 64-bit users.

Thanks for catching that, Arnd. Applied.

Jens Axboe Oct. 1, 2019, 3:38 p.m. UTC | #2
On 10/1/19 8:09 AM, Jens Axboe wrote:
> On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>> All system calls use struct __kernel_timespec instead of the old struct
>> timespec, but this one was just added with the old-style ABI. Change it
>> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>> the need for compat handlers on 32-bit architectures.
>>
>> Any user space caller will have to use __kernel_timespec now, but this
>> is unambiguous and works for any C library regardless of the time_t
>> definition. A nicer way to specify the timeout would have been a less
>> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>> change that as this would impact both 32-bit and 64-bit users.
> 
> Thanks for catching that, Arnd. Applied.

On second thought - since there appears to be no good 64-bit timespec
available to userspace, the alternative here is including one in liburing.
That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
value as you suggest? There's no released kernel with this feature yet, so
there's nothing stopping us from just changing the API to be based on
a single 64-bit nanosecond timeout.

diff --git a/fs/io_uring.c b/fs/io_uring.c
index dd094b387cab..de3d14fe3025 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1892,16 +1892,13 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	unsigned count, req_dist, tail_index;
 	struct io_ring_ctx *ctx = req->ctx;
 	struct list_head *entry;
-	struct timespec ts;
+	u64 timeout;
 
 	if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
 		return -EINVAL;
 	if (sqe->flags || sqe->ioprio || sqe->buf_index || sqe->timeout_flags ||
 	    sqe->len != 1)
 		return -EINVAL;
-	if (copy_from_user(&ts, (void __user *) (unsigned long) sqe->addr,
-	    sizeof(ts)))
-		return -EFAULT;
 
 	/*
 	 * sqe->off holds how many events that need to occur for this
@@ -1932,9 +1929,10 @@ static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	list_add(&req->list, entry);
 	spin_unlock_irq(&ctx->completion_lock);
 
+	timeout = READ_ONCE(sqe->addr);
 	hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	req->timeout.timer.function = io_timeout_fn;
-	hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
+	hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),
 			HRTIMER_MODE_REL);
 	return 0;
 }
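
For illustration: with the nanosecond-based variant in the diff above, a caller would have to flatten its seconds/nanoseconds pair into one 64-bit value before storing it in sqe->addr. A minimal userspace sketch of that conversion follows. The helper name example_timeout_to_ns and the choice to saturate on overflow are assumptions made for this example; neither comes from the patch or from liburing.

```c
#include <stdint.h>

#define EXAMPLE_NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper (not part of the patch or of liburing): flatten a
 * seconds/nanoseconds pair into the single 64-bit nanosecond count that
 * the variant above would read from sqe->addr.
 * Assumes nsec < 1e9; saturates to UINT64_MAX instead of wrapping.
 */
static uint64_t example_timeout_to_ns(uint64_t sec, uint32_t nsec)
{
	if (sec > (UINT64_MAX - nsec) / EXAMPLE_NSEC_PER_SEC)
		return UINT64_MAX;
	return sec * EXAMPLE_NSEC_PER_SEC + nsec;
}
```

A single integer like this has the same size on 32-bit and 64-bit builds, which is the property that made the idea attractive in the commit message.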

Arnd Bergmann Oct. 1, 2019, 3:49 p.m. UTC | #3
On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 10/1/19 8:09 AM, Jens Axboe wrote:
> > On 9/30/19 2:20 PM, Arnd Bergmann wrote:
> >> All system calls use struct __kernel_timespec instead of the old struct
> >> timespec, but this one was just added with the old-style ABI. Change it
> >> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
> >> the need for compat handlers on 32-bit architectures.
> >>
> >> Any user space caller will have to use __kernel_timespec now, but this
> >> is unambiguous and works for any C library regardless of the time_t
> >> definition. A nicer way to specify the timeout would have been a less
> >> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
> >> change that as this would impact both 32-bit and 64-bit users.
> >
> > Thanks for catching that, Arnd. Applied.
>
> On second thought - since there appears to be no good 64-bit timespec
> available to userspace, the alternative here is including one in liburing.

What's wrong with using __kernel_timespec? Just the name?
I suppose liburing could add a macro to give it a different name
for its users.

> That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
> value as you suggest? There's no released kernel with this feature yet, so
> there's nothing stopping us from just changing the API to be based on
> a single 64-bit nanosecond timeout.

Certainly fine with me.

> +       timeout = READ_ONCE(sqe->addr);
>         hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>         req->timeout.timer.function = io_timeout_fn;
> -       hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
> +       hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),

It seems a little odd to use the 'addr' field as something that's not an
address, and I'm not sure I understand the logic behind when you use a
READ_ONCE() as opposed to simply accessing the sqe the way it is done a
few lines earlier.

The time handling definitely looks good to me.

       Arnd

Jens Axboe Oct. 1, 2019, 3:52 p.m. UTC | #4
On 10/1/19 9:49 AM, Arnd Bergmann wrote:
> On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 10/1/19 8:09 AM, Jens Axboe wrote:
>>> On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>>>> All system calls use struct __kernel_timespec instead of the old struct
>>>> timespec, but this one was just added with the old-style ABI. Change it
>>>> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>>>> the need for compat handlers on 32-bit architectures.
>>>>
>>>> Any user space caller will have to use __kernel_timespec now, but this
>>>> is unambiguous and works for any C library regardless of the time_t
>>>> definition. A nicer way to specify the timeout would have been a less
>>>> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>>>> change that as this would impact both 32-bit and 64-bit users.
>>>
>>> Thanks for catching that, Arnd. Applied.
>>
>> On second thought - since there appears to be no good 64-bit timespec
>> available to userspace, the alternative here is including one in liburing.
> 
> What's wrong with using __kernel_timespec? Just the name?
> I suppose liburing could add a macro to give it a different name
> for its users.

Just that it seems I need to make it available through liburing on
systems that don't have it yet. Not a big deal, though.

>> That seems kinda crappy in terms of API, so why not just use a 64-bit nsec
>> value as you suggest? There's no released kernel with this feature yet, so
>> there's nothing stopping us from just changing the API to be based on
>> a single 64-bit nanosecond timeout.
> 
> Certainly fine with me.
> 
>> +       timeout = READ_ONCE(sqe->addr);
>>          hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
>>          req->timeout.timer.function = io_timeout_fn;
>> -       hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
>> +       hrtimer_start(&req->timeout.timer, ns_to_ktime(timeout),
> 
> It seems a little odd to use the 'addr' field as something that's not
> an address,
> and I'm not sure I understand the logic behind when you use a READ_ONCE()
> as opposed to simply accessing the sqe the way it is done a few lines
> earlier.
> 
> The time handling definitely looks good to me.

One thing that struck me about this approach - we then lose the ability to
differentiate between "don't want a timed timeout" with ts == NULL, vs
tv_sec and tv_nsec both being 0.

I think I'll stick with what you had and just use __kernel_timespec in
liburing.
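
For illustration: the distinction Jens describes is the usual pointer-based timeout convention, where a NULL pointer means "no timer at all" and a non-NULL pointer to an all-zero value means "expire immediately". A single 64-bit count has only one zero and cannot express both. The sketch below shows the convention from a caller's point of view. All names in it are hypothetical, and it is not the kernel's or liburing's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same two 64-bit fields as __kernel_timespec. */
struct example_timespec64 {
	int64_t   tv_sec;
	long long tv_nsec;
};

enum example_timeout_kind {
	EXAMPLE_NO_TIMEOUT,	/* ts == NULL: caller does not want a timer */
	EXAMPLE_ZERO_TIMEOUT,	/* ts set, both fields 0: expire immediately */
	EXAMPLE_TIMED,		/* ts set, non-zero duration */
};

/* Classify a timeout argument the way a pointer-based API can. */
static enum example_timeout_kind
example_classify(const struct example_timespec64 *ts)
{
	if (!ts)
		return EXAMPLE_NO_TIMEOUT;
	if (ts->tv_sec == 0 && ts->tv_nsec == 0)
		return EXAMPLE_ZERO_TIMEOUT;
	return EXAMPLE_TIMED;
}
```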

Arnd Bergmann Oct. 1, 2019, 3:57 p.m. UTC | #5
On Tue, Oct 1, 2019 at 5:52 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 10/1/19 9:49 AM, Arnd Bergmann wrote:
> > On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:

> > What's wrong with using __kernel_timespec? Just the name?
> > I suppose liburing could add a macro to give it a different name
> > for its users.
>
> Just that it seems I need to make it available through liburing on
> systems that don't have it yet. Not a big deal, though.

Ah, right. It would not cover the case of building against kernel
headers earlier than linux-5.1 but running on a 5.4+ kernel.

I assumed that you would require new kernel headers anyway,
but if you have a copy of the io_uring header, that is not necessary.

> One thing that struck me about this approach - we then lose the ability to
> differentiate between "don't want a timed timeout" with ts == NULL, vs
> tv_sec and tv_nsec both being 0.

You could always define a special constant such as
'#define IO_URING_TIMEOUT_NEVER -1ull' if you want to support both
'never wait if it's not already done' and 'wait indefinitely'.

> I think I'll stick with what you had and just use __kernel_timespec in
> liburing.

Ok.

       Arnd
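
For illustration: the sentinel idea above could be decoded roughly as follows. IO_URING_TIMEOUT_NEVER is the name suggested in this mail and was only a proposal in this thread. The enum, the function name, and the reading of 0 as "never wait" are assumptions made for this sketch.

```c
#include <stdint.h>

/* Sentinel proposed in the mail above; parenthesized here for safe use. */
#define IO_URING_TIMEOUT_NEVER (-1ull)

enum example_wait {
	EXAMPLE_DONT_WAIT,	/* 0: return at once if nothing is ready */
	EXAMPLE_WAIT_FOREVER,	/* all bits set: no timer is armed */
	EXAMPLE_WAIT_NS,	/* anything else: a duration in nanoseconds */
};

/* Hypothetical decoder for a single 64-bit nanosecond timeout field. */
static enum example_wait example_decode(uint64_t timeout_ns)
{
	if (timeout_ns == 0)
		return EXAMPLE_DONT_WAIT;
	if (timeout_ns == IO_URING_TIMEOUT_NEVER)
		return EXAMPLE_WAIT_FOREVER;
	return EXAMPLE_WAIT_NS;
}
```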

Jens Axboe Oct. 1, 2019, 4:02 p.m. UTC | #6
On 10/1/19 9:57 AM, Arnd Bergmann wrote:
> On Tue, Oct 1, 2019 at 5:52 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 10/1/19 9:49 AM, Arnd Bergmann wrote:
>>> On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
> 
>>> What's wrong with using __kernel_timespec? Just the name?
>>> I suppose liburing could add a macro to give it a different name
>>> for its users.
>>
>> Just that it seems I need to make it available through liburing on
>> systems that don't have it yet. Not a big deal, though.
> 
> Ah, right. It would not cover the case of building against kernel
> headers earlier than linux-5.1 but running on a 5.4+ kernel.
> 
> I assumed that you would require new kernel headers anyway,
> but if you have a copy of the io_uring header, that is not necessary.

Since I rely mostly on folks using liburing, we include the header as
well. So I'm just going to use __kernel_timespec in liburing, and have
a check to define it if we don't have it.
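
For illustration: a fallback definition of the kind described above might look like the sketch below. EXAMPLE_HAVE_KERNEL_TIMESPEC is a hypothetical macro standing in for whatever a configure step would define when the installed kernel headers already provide the type, and example_ms_to_ts is an invented helper. The real liburing check may differ.

```c
#include <limits.h>
#include <stdint.h>

/*
 * EXAMPLE_HAVE_KERNEL_TIMESPEC is a hypothetical macro that a configure
 * step would define when <linux/time_types.h> provides the type.
 */
#ifdef EXAMPLE_HAVE_KERNEL_TIMESPEC
#include <linux/time_types.h>
#else
struct __kernel_timespec {
	int64_t   tv_sec;	/* seconds, 64-bit even where time_t is 32-bit */
	long long tv_nsec;	/* nanoseconds, also 64-bit */
};
#endif

/* The layout must not depend on the C library's time_t. */
_Static_assert(sizeof(struct __kernel_timespec) * CHAR_BIT == 128,
	       "__kernel_timespec must be two 64-bit fields");

/* Invented helper: build a relative timeout from milliseconds. */
static struct __kernel_timespec example_ms_to_ts(unsigned int ms)
{
	struct __kernel_timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (long long)(ms % 1000) * 1000000LL;
	return ts;
}
```

The compile-time size check is what protects callers on 32-bit systems: the structure stays 16 bytes no matter how the C library defines time_t.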

>> One thing that struck me about this approach - we then lose the ability to
>> differentiate between "don't want a timed timeout" with ts == NULL, vs
>> tv_sec and tv_nsec both being 0.
> 
> You could always define a special constant such as
> '#define IO_URING_TIMEOUT_NEVER -1ull' if you want to support both
> 'never wait if it's not already done' and 'wait indefinitely'.

That thought did occur to me, but that seems pretty ugly... The ts == NULL
vs ts != NULL and timeout set is a more well understood pattern.

Florian Weimer Oct. 1, 2019, 4:07 p.m. UTC | #7
* Arnd Bergmann:

> On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> On 10/1/19 8:09 AM, Jens Axboe wrote:
>> > On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>> >> All system calls use struct __kernel_timespec instead of the old struct
>> >> timespec, but this one was just added with the old-style ABI. Change it
>> >> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>> >> the need for compat handlers on 32-bit architectures.
>> >>
>> >> Any user space caller will have to use __kernel_timespec now, but this
>> >> is unambiguous and works for any C library regardless of the time_t
>> >> definition. A nicer way to specify the timeout would have been a less
>> >> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>> >> change that as this would impact both 32-bit and 64-bit users.
>> >
>> > Thanks for catching that, Arnd. Applied.
>>
>> On second thought - since there appears to be no good 64-bit timespec
>> available to userspace, the alternative here is including one in liburing.
>
> What's wrong with using __kernel_timespec? Just the name?
> I suppose liburing could add a macro to give it a different name
> for its users.

Yes, mostly the name.

__ names are reserved for the C/C++ implementation (which does not
include the kernel).  __kernel_timespec looks like an internal kernel
type to the uninitiated, not a UAPI type.

Once we have struct timespec64 in userspace, you also end up with
copying stuff around or introducing aliasing violations.

I'm not saying those concerns are valid, but you asked what's wrong with
it. 8-)

Thanks,
Florian
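
For illustration: the "copying stuff around" concern above refers to code that holds a time value in one struct type and must hand it over in another. A minimal sketch of the copying approach follows, assuming a stand-in type example_ktimespec with the layout of __kernel_timespec. Both the type name and the function name are hypothetical.

```c
#include <stdint.h>
#include <time.h>

/* Stand-in with the layout of __kernel_timespec (two 64-bit fields). */
struct example_ktimespec {
	int64_t   tv_sec;
	long long tv_nsec;
};

/*
 * Convert by copying each field. Casting a `struct timespec *` to the
 * 64-bit type instead would read the wrong bytes where time_t is 32-bit,
 * and would violate C's aliasing rules even where the layouts match.
 */
static struct example_ktimespec
example_from_timespec(const struct timespec *ts)
{
	struct example_ktimespec out;

	out.tv_sec = (int64_t)ts->tv_sec;
	out.tv_nsec = (long long)ts->tv_nsec;
	return out;
}
```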

Jens Axboe Oct. 1, 2019, 6:08 p.m. UTC | #8
On 10/1/19 10:07 AM, Florian Weimer wrote:
> * Arnd Bergmann:
> 
>> On Tue, Oct 1, 2019 at 5:38 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>
>>> On 10/1/19 8:09 AM, Jens Axboe wrote:
>>>> On 9/30/19 2:20 PM, Arnd Bergmann wrote:
>>>>> All system calls use struct __kernel_timespec instead of the old struct
>>>>> timespec, but this one was just added with the old-style ABI. Change it
>>>>> now to enforce the use of __kernel_timespec, avoiding ABI confusion and
>>>>> the need for compat handlers on 32-bit architectures.
>>>>>
>>>>> Any user space caller will have to use __kernel_timespec now, but this
>>>>> is unambiguous and works for any C library regardless of the time_t
>>>>> definition. A nicer way to specify the timeout would have been a less
>>>>> ambiguous 64-bit nanosecond value, but I suppose it's too late now to
>>>>> change that as this would impact both 32-bit and 64-bit users.
>>>>
>>>> Thanks for catching that, Arnd. Applied.
>>>
>>> On second thought - since there appears to be no good 64-bit timespec
>>> available to userspace, the alternative here is including one in liburing.
>>
>> What's wrong with using __kernel_timespec? Just the name?
>> I suppose liburing could add a macro to give it a different name
>> for its users.
> 
> Yes, mostly the name.
> 
> __ names are reserved for the C/C++ implementation (which does not
> include the kernel).  __kernel_timespec looks like an internal kernel
> type to the uninitiated, not a UAPI type.
> 
> Once we have struct timespec64 in userspace, you also end up with
> copying stuff around or introducing aliasing violations.
> 
> I'm not saying those concerns are valid, but you asked what's wrong with
> it. 8-)

FWIW, I do agree, __kernel_timespec sounds like an internal type, not
something apps should be using. timespec64 works a lot better for that.
Oh well.

Patch

diff --git a/fs/io_uring.c b/fs/io_uring.c
index aa8ac557493c..8a0381f1a43b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -1892,15 +1892,15 @@  static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	unsigned count, req_dist, tail_index;
 	struct io_ring_ctx *ctx = req->ctx;
 	struct list_head *entry;
-	struct timespec ts;
+	struct timespec64 ts;
 
 	if (unlikely(ctx->flags & IORING_SETUP_IOPOLL))
 		return -EINVAL;
 	if (sqe->flags || sqe->ioprio || sqe->buf_index || sqe->timeout_flags ||
 	    sqe->len != 1)
 		return -EINVAL;
-	if (copy_from_user(&ts, (void __user *) (unsigned long) sqe->addr,
-	    sizeof(ts)))
+
+	if (get_timespec64(&ts, u64_to_user_ptr(sqe->addr)))
 		return -EFAULT;
 
 	/*
@@ -1934,7 +1934,7 @@  static int io_timeout(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 
 	hrtimer_init(&req->timeout.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	req->timeout.timer.function = io_timeout_fn;
-	hrtimer_start(&req->timeout.timer, timespec_to_ktime(ts),
+	hrtimer_start(&req->timeout.timer, timespec64_to_ktime(ts),
 			HRTIMER_MODE_REL);
 	return 0;
 }