[v4,18/26] arm64: vdso32: Replace TASK_SIZE_32 check in vgettimeofday

Message ID 20200317122220.30393-19-vincenzo.frascino@arm.com (mailing list archive)
State New, archived
Series Introduce common headers for vDSO

Commit Message

Vincenzo Frascino March 17, 2020, 12:22 p.m. UTC
For ABI compatibility with arm32, the compat vDSO layer on arm64 needs
to return -EFAULT when UINTPTR_MAX is passed as argument to the
clock_get* functions.

Replace TASK_SIZE_32 with a more semantically correct formula that checks
for wrapping around 0.

Note: This removes the need to define TASK_SIZE_32 for the vdso headers in a
future patch that will introduce asm/vdso/processor.h on arm64.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---
 arch/arm64/kernel/vdso32/vgettimeofday.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

Comments

Catalin Marinas March 17, 2020, 2:38 p.m. UTC | #1
On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
> index 54fc1c2ce93f..91138077b073 100644
> --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> @@ -8,11 +8,14 @@
>  #include <linux/time.h>
>  #include <linux/types.h>
>  
> +#define VALID_CLOCK_ID(x) \
> +	((x >= 0) && (x < VDSO_BASES))
> +
>  int __vdso_clock_gettime(clockid_t clock,
>  			 struct old_timespec32 *ts)
>  {
>  	/* The checks below are required for ABI consistency with arm */
> -	if ((u32)ts >= TASK_SIZE_32)
> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
>  		return -EFAULT;
>  
>  	return __cvdso_clock_gettime32(clock, ts);

I'm probably missing something, but I can't find the TASK_SIZE check in the
arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?

> @@ -22,7 +25,7 @@ int __vdso_clock_gettime64(clockid_t clock,
>  			   struct __kernel_timespec *ts)
>  {
>  	/* The checks below are required for ABI consistency with arm */
> -	if ((u32)ts >= TASK_SIZE_32)
> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
>  		return -EFAULT;
>  
>  	return __cvdso_clock_gettime(clock, ts);
> @@ -38,9 +41,12 @@ int __vdso_clock_getres(clockid_t clock_id,
>  			struct old_timespec32 *res)
>  {
>  	/* The checks below are required for ABI consistency with arm */
> -	if ((u32)res >= TASK_SIZE_32)
> +	if ((u32)res > UINTPTR_MAX - sizeof(res) + 1)
>  		return -EFAULT;
>  
> +	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
> +		return -EINVAL;

This last check needs an explanation. If the clock_id is invalid but res
is not NULL, we allow it. I don't see where the compatibility issue is;
arm32 doesn't have such a check.
Vincenzo Frascino March 17, 2020, 3:04 p.m. UTC | #2
Hi Catalin,

On 3/17/20 2:38 PM, Catalin Marinas wrote:
> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
>> index 54fc1c2ce93f..91138077b073 100644
>> --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
>> @@ -8,11 +8,14 @@
>>  #include <linux/time.h>
>>  #include <linux/types.h>
>>  
>> +#define VALID_CLOCK_ID(x) \
>> +	((x >= 0) && (x < VDSO_BASES))
>> +
>>  int __vdso_clock_gettime(clockid_t clock,
>>  			 struct old_timespec32 *ts)
>>  {
>>  	/* The checks below are required for ABI consistency with arm */
>> -	if ((u32)ts >= TASK_SIZE_32)
>> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
>>  		return -EFAULT;
>>  
>>  	return __cvdso_clock_gettime32(clock, ts);
> 
> I probably miss something but I can't find the TASK_SIZE check in the
> arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?

Can TASK_SIZE > UINTPTR_MAX on an arm64 system?

> 
>> @@ -22,7 +25,7 @@ int __vdso_clock_gettime64(clockid_t clock,
>>  			   struct __kernel_timespec *ts)
>>  {
>>  	/* The checks below are required for ABI consistency with arm */
>> -	if ((u32)ts >= TASK_SIZE_32)
>> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
>>  		return -EFAULT;
>>  
>>  	return __cvdso_clock_gettime(clock, ts);
>> @@ -38,9 +41,12 @@ int __vdso_clock_getres(clockid_t clock_id,
>>  			struct old_timespec32 *res)
>>  {
>>  	/* The checks below are required for ABI consistency with arm */
>> -	if ((u32)res >= TASK_SIZE_32)
>> +	if ((u32)res > UINTPTR_MAX - sizeof(res) + 1)
>>  		return -EFAULT;
>>  
>> +	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
>> +		return -EINVAL;
> 
> This last check needs an explanation. If the clock_id is invalid but res
> is not NULL, we allow it. I don't see where the compatibility issue is,
> arm32 doesn't have such check.
> 

The case that you are describing has to return -EPERM per the ABI spec. This
case has to return -EINVAL.

The first case is taken care of by the generic code. But if we don't do this
check beforehand on arm64 compat, we end up returning the wrong error code.

For the non-compat case, the same is taken care of by the syscall fallback [1].

[1] lib/vdso/gettimeofday.c
Catalin Marinas March 17, 2020, 3:50 p.m. UTC | #3
On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
> On 3/17/20 2:38 PM, Catalin Marinas wrote:
> > On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> >> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
> >> index 54fc1c2ce93f..91138077b073 100644
> >> --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
> >> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> >> @@ -8,11 +8,14 @@
> >>  #include <linux/time.h>
> >>  #include <linux/types.h>
> >>  
> >> +#define VALID_CLOCK_ID(x) \
> >> +	((x >= 0) && (x < VDSO_BASES))
> >> +
> >>  int __vdso_clock_gettime(clockid_t clock,
> >>  			 struct old_timespec32 *ts)
> >>  {
> >>  	/* The checks below are required for ABI consistency with arm */
> >> -	if ((u32)ts >= TASK_SIZE_32)
> >> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
> >>  		return -EFAULT;
> >>  
> >>  	return __cvdso_clock_gettime32(clock, ts);
> > 
> > I probably miss something but I can't find the TASK_SIZE check in the
> > arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?
> 
> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?

TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
-EFAULT on arm32? Which code path causes this in the user vdso code?

My guess is that on arm32 it only fails with -EFAULT in the syscall
fallback path since a copy_to_user() would fail the access_ok() check.
Does it always take the fallback path if ts > TASK_SIZE?

On arm64, while we have a similar access_ok() check, USER_DS is (1 <<
VA_BITS) even for compat tasks (52-bit maximum), so it doesn't detect
the end of the user address space for 32-bit tasks.
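
To make the point above concrete, here is a minimal sketch of a range check
against two different limits. It is only a model: range_ok() and the two
MODEL_* limits are names made up for this example, the 52-bit value mirrors
the "(1 << VA_BITS)" figure mentioned above, and the 4 GiB compat limit is an
assumption. The real access_ok() is more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed limits for the sketch, not the kernel's actual definitions. */
#define MODEL_USER_DS_LIMIT	(UINT64_C(1) << 52)	/* 52-bit VA */
#define MODEL_COMPAT_LIMIT	(UINT64_C(1) << 32)	/* 32-bit task */

/* Simplified check: true if [addr, addr + size) lies entirely below limit. */
static bool range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	return size <= limit && addr <= limit - size;
}
```

An 8-byte write at 0xffffffff passes range_ok() against the 52-bit limit but
fails it against the 4 GiB one, which is the gap being described for compat
tasks.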

Is this an issue for other syscalls expecting EFAULT at UINTPTR_MAX and
instead getting a signal? The vdsotest seems to be the only one assuming
this. I don't have a simple solution here since USER_DS currently needs
to be a constant (used in entry.S).

I could as well argue that this is not a valid ABI test, no real-world
program relying on this behaviour ;).

> >> @@ -22,7 +25,7 @@ int __vdso_clock_gettime64(clockid_t clock,
> >>  			   struct __kernel_timespec *ts)
> >>  {
> >>  	/* The checks below are required for ABI consistency with arm */
> >> -	if ((u32)ts >= TASK_SIZE_32)
> >> +	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
> >>  		return -EFAULT;
> >>  
> >>  	return __cvdso_clock_gettime(clock, ts);
> >> @@ -38,9 +41,12 @@ int __vdso_clock_getres(clockid_t clock_id,
> >>  			struct old_timespec32 *res)
> >>  {
> >>  	/* The checks below are required for ABI consistency with arm */
> >> -	if ((u32)res >= TASK_SIZE_32)
> >> +	if ((u32)res > UINTPTR_MAX - sizeof(res) + 1)
> >>  		return -EFAULT;
> >>  
> >> +	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
> >> +		return -EINVAL;
> > 
> > This last check needs an explanation. If the clock_id is invalid but res
> > is not NULL, we allow it. I don't see where the compatibility issue is,
> > arm32 doesn't have such check.
> 
> The case that you are describing has to return -EPERM per ABI spec. This case
> has to return -EINVAL.
> 
> The first case is taken care from the generic code. But if we don't do this
> check before on arm64 compat we end up returning the wrong error code.

I guess I have the same question as above. Where does the arm32 code
return -EINVAL for that case? Did it work correctly before you removed
the TASK_SIZE_32 check?

Sorry, just trying to figure out where the compatibility aspect is and
that we don't add some artificial checks only to satisfy a vdsotest case
that may or may not have relevance to any other user program.
Vincenzo Frascino March 17, 2020, 4:40 p.m. UTC | #4
Hi Catalin,

On 3/17/20 3:50 PM, Catalin Marinas wrote:
> On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
>> On 3/17/20 2:38 PM, Catalin Marinas wrote:
>>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:

[...]

>>
>> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
> 
> TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
> arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
> -EFAULT on arm32? Which code path causes this in the user vdso code?
>

Sorry I got confused because you referred to arch/arm/vdso/vgettimeofday.c which
is the arm64 implementation, not the compat one :)

In the case of arm32 everything is handled via syscall fallback.

> My guess is that on arm32 it only fails with -EFAULT in the syscall
> fallback path since a copy_to_user() would fail the access_ok() check.
> Does it always take the fallback path if ts > TASK_SIZE?
> 

Correct, it goes via the fallback. The return codes for these syscalls are
specified by the ABI [1]. I also agree with you that the way in which arm32
achieves this should be via the access_ok() check.

> On arm64, while we have a similar access_ok() check, USER_DS is (1 <<
> VA_BITS) even for compat tasks (52-bit maximum), so it doesn't detect
> the end of the user address space for 32-bit tasks.
> 

I agree with this as well; if you remember, we discussed it in the past.

> Is this an issue for other syscalls expecting EFAULT at UINTPTR_MAX and
> instead getting a signal? The vdsotest seems to be the only one assuming
> this. I don't have a simple solution here since USER_DS currently needs
> to be a constant (used in entry.S).
> 
> I could as well argue that this is not a valid ABI test, no real-world
> program relying on this behaviour ;).
> 

Ok, but I could argue that unless you manage to prove to me that there is no
software out there relying on this behavior, I guess that the safest way to go
is to have a check here ;)

More than that, being a simple check, it has no performance impact.

[...]

>>>
>>> This last check needs an explanation. If the clock_id is invalid but res
>>> is not NULL, we allow it. I don't see where the compatibility issue is,
>>> arm32 doesn't have such check.
>>
>> The case that you are describing has to return -EPERM per ABI spec. This case
>> has to return -EINVAL.
>>
>> The first case is taken care from the generic code. But if we don't do this
>> check before on arm64 compat we end up returning the wrong error code.
> 
> I guess I have the same question as above. Where does the arm32 code
> return -EINVAL for that case? Did it work correctly before you removed
> the TASK_SIZE_32 check?
>

I repeated the test and it seems that it was failing even before I removed
TASK_SIZE_32. For reasons I can't explain, I did not catch it before.

The getres syscall should return -EINVAL in the cases specified in [1].


> Sorry, just trying to figure out where the compatibility aspect is and
> that we don't add some artificial checks only to satisfy a vdsotest case
> that may or may not have relevance to any other user program.
> 

No issue Catalin. I understand the implications of the change that I am
proposing with this series and I am the first one who wants to get it right.

That said, vdsotest follows the ABI spec "pedantically", and I chose it at the
beginning of this journey to have as few surprises as I could in the long run.

[1] http://man7.org/linux/man-pages/man2/clock_getres.2.html
Vincenzo Frascino March 17, 2020, 4:43 p.m. UTC | #5
On 3/17/20 4:40 PM, Vincenzo Frascino wrote:
> Hi Catalin,
> 
> On 3/17/20 3:50 PM, Catalin Marinas wrote:
>> On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
>>> On 3/17/20 2:38 PM, Catalin Marinas wrote:
>>>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> 
> [...]
> 
>>>
>>> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
>>
>> TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
>> arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
>> -EFAULT on arm32? Which code path causes this in the user vdso code?
>>
> 
> Sorry I got confused because you referred to arch/arm/vdso/vgettimeofday.c which
> is the arm64 implementation, not the compat one :)
> 

I stand corrected: arch/*arm*/vdso/vgettimeofday.c is definitely the arm32
implementation... I got completely confused here :)
Catalin Marinas March 17, 2020, 5:48 p.m. UTC | #6
On Tue, Mar 17, 2020 at 04:40:48PM +0000, Vincenzo Frascino wrote:
> On 3/17/20 3:50 PM, Catalin Marinas wrote:
> > On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
> >> On 3/17/20 2:38 PM, Catalin Marinas wrote:
> >>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> >>
> >> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
> > 
> > TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
> > arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
> > -EFAULT on arm32? Which code path causes this in the user vdso code?
> 
> Sorry I got confused because you referred to arch/arm/vdso/vgettimeofday.c which
> is the arm64 implementation, not the compat one :)

You figured out (in your subsequent reply) that I was indeed talking
about arm32 ;).

> In the case of arm32 everything is handled via syscall fallback.

So clock_gettime() on arm32 always falls back to the syscall?

> > My guess is that on arm32 it only fails with -EFAULT in the syscall
> > fallback path since a copy_to_user() would fail the access_ok() check.
> > Does it always take the fallback path if ts > TASK_SIZE?
> 
> Correct, it goes via fallback. The return codes for these syscalls are specified
> by the ABI [1]. Then I agree with you the way on which arm32 achieves it should
> be via access_ok() check.

"it should be" or "it is" on arm32?

If, on arm32, clock_gettime() is (would be?) handled in the vdso
entirely, who checks for the pointer outside the accessible address
space (as per the clock_gettime man page)?

I'm fine with such check as long as it is consistent across arm32 and
arm64 compat. Or even on arm64 native between syscall fallback and vdso
execution. I haven't figured out yet whether this is the case.

> >>> This last check needs an explanation. If the clock_id is invalid but res
> >>> is not NULL, we allow it. I don't see where the compatibility issue is,
> >>> arm32 doesn't have such check.
> >>
> >> The case that you are describing has to return -EPERM per ABI spec. This case
> >> has to return -EINVAL.
> >>
> >> The first case is taken care from the generic code. But if we don't do this
> >> check before on arm64 compat we end up returning the wrong error code.
> > 
> > I guess I have the same question as above. Where does the arm32 code
> > return -EINVAL for that case? Did it work correctly before you removed
> > the TASK_SIZE_32 check?
> 
> I repeated the test and seems that it was failing even before I removed
> TASK_SIZE_32. For reasons I can't explain I did not catch it before.
> 
> The getres syscall should return -EINVAL in the cases specified in [1].

It states 'clk_id specified is not supported on this system'. Fair
enough but it doesn't say that it returns -EINVAL only if res == NULL.
You also don't explain why __cvdso_clock_getres_time32() doesn't already
detect an invalid clk_id on arm64 compat (but does it on arm32).
Vincenzo Frascino March 18, 2020, 4:14 p.m. UTC | #7
Hi Catalin,

On 3/17/20 5:48 PM, Catalin Marinas wrote:
> On Tue, Mar 17, 2020 at 04:40:48PM +0000, Vincenzo Frascino wrote:
>> On 3/17/20 3:50 PM, Catalin Marinas wrote:
>>> On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
>>>> On 3/17/20 2:38 PM, Catalin Marinas wrote:
>>>>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
>>>>
>>>> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
>>>
>>> TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
>>> arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
>>> -EFAULT on arm32? Which code path causes this in the user vdso code?
>>
>> Sorry I got confused because you referred to arch/arm/vdso/vgettimeofday.c which
>> is the arm64 implementation, not the compat one :)
> 
> You figured out (in your subsequent reply) that I was indeed talking
> about arm32 ;).
> 

Since I stopped drinking coffee, afternoons have become more difficult ;)

>> In the case of arm32 everything is handled via syscall fallback.
> 
> So clock_gettime() on arm32 always falls back to the syscall?
> 

This does not seem to be what you asked, and I think I answered accordingly.
Anyway, in the case of arm32 the error code path is handled via syscall
fallback.

Look at the code below as an example (I am using getres because I know this
email will already be too long, and I do not want to add pointless code, but the
concept is the same for gettime and the others):

static __maybe_unused
int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
{
	int ret = __cvdso_clock_getres_common(clock, res);

	if (unlikely(ret))
		return clock_getres_fallback(clock, res);
	return 0;
}

When the "vdso" internal function returns an error, the system call is
triggered.

In general, arm32 has been ported to the unified vDSO library, hence it has a
proper implementation on par with all the other architectures supported by the
unified library.
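
The fast-path-then-fallback shape can be sketched in userspace C as below.
This is only an illustration: fast_path_getres(), syscall_getres() and
struct res_model are hypothetical stand-ins, not the functions in
lib/vdso/gettimeofday.c, and the set of clocks the fast path accepts is
invented.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct __kernel_timespec. */
struct res_model {
	long long tv_sec;
	long long tv_nsec;
};

/* Hypothetical fast path: knows only clock ids 0 and 1. */
static int fast_path_getres(int clock, struct res_model *res)
{
	if (clock < 0 || clock > 1)
		return -1;	/* cannot handle it, caller must fall back */
	if (res) {
		res->tv_sec = 0;
		res->tv_nsec = 1;
	}
	return 0;
}

/* Hypothetical stand-in for the syscall: here it rejects every clock. */
static int syscall_getres(int clock, struct res_model *res)
{
	(void)clock;
	(void)res;
	return -EINVAL;
}

/* Same shape as the wrapper quoted above: try the fast path, else fall back. */
static int getres_with_fallback(int clock, struct res_model *res)
{
	int ret = fast_path_getres(clock, res);

	if (ret)
		return syscall_getres(clock, res);
	return 0;
}
```

In this model the error code the caller sees always comes from the fallback,
never from the fast path itself, so the two paths can only disagree when the
fast path succeeds where the syscall would have failed.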

>>> My guess is that on arm32 it only fails with -EFAULT in the syscall
>>> fallback path since a copy_to_user() would fail the access_ok() check.
>>> Does it always take the fallback path if ts > TASK_SIZE?
>>
>> Correct, it goes via fallback. The return codes for these syscalls are specified
>> by the ABI [1]. Then I agree with you the way on which arm32 achieves it should
>> be via access_ok() check.
> 
> "it should be" or "it is" on arm32?
> 

What I meant is that I did not check the copy_to_user() implementation on
arm32 before answering, but I imagined at that point that it would use
access_ok(), as it does.

For better clarification look at the code below (kernel/time/posix-timers.c if
you want to have a look at the rest of the code):

SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
		struct __kernel_timespec __user *, tp)
{
	const struct k_clock *kc = clockid_to_kclock(which_clock);
	struct timespec64 kernel_tp;
	int error;

	if (!kc)
		return -EINVAL;

	error = kc->clock_get_timespec(which_clock, &kernel_tp);

	if (!error && put_timespec64(&kernel_tp, tp))
		error = -EFAULT;

	return error;
}

This is the syscall to which we fall back when the "vdso" internal function
returns an error. The behavior of the vdso has to be exactly the same as that of
the syscall, otherwise we end up with an ABI breakage.

The path followed by put_timespec64() is:

put_timespec64() -> copy_to_user() -> _copy_to_user() -> access_ok()

and this path is the same for every architecture, since this is common code.

Hope this provides better insight on my previous answer.

> If, on arm32, clock_gettime() is (would be?) handled in the vdso
> entirely, who checks for the pointer outside the accessible address
> space (as per the clock_gettime man page)?
> 
> I'm fine with such check as long as it is consistent across arm32 and
> arm64 compat. Or even on arm64 native between syscall fallback and vdso
> execution. I haven't figured out yet whether this is the case.
> 

Just to give the context again, we are discussing this check:

	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
		return -EFAULT;

On all the architectures we return -EFAULT if copy_to_user() fails due to
access_ok() failing (kernel/time/time.c):

int put_timespec64(const struct timespec64 *ts,
		   struct __kernel_timespec __user *uts)
{
	[...]

	return copy_to_user(uts, &kts, sizeof(kts)) ? -EFAULT : 0;
}

On arm64 compat it gets tricky, because access_ok() validates against USER_DS
(addr_limit is set in arch/arm64/kernel/entry.S), which is defined as
(1 << VA_BITS) even for compat tasks. Since arm64 supports up to 52-bit VA, this
does not detect the end of the user address space for a 32-bit task.

So, to be logically consistent with the ABI on arm32 and arm64 (and all the other
architectures), we need to make an explicit check in the case of arm64 compat.
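
As a rough illustration of the explicit check, the sketch below models it in
plain C, with uint32_t standing in for a 32-bit pointer and UINT32_MAX for the
compat UINTPTR_MAX. The names wraps_past_top() and timespec32_model are made up
for the example. It takes the size of the pointed-to object, as the gettime
hunks do with sizeof(*ts); note that the getres hunk as posted passes
sizeof(res), which is the size of the pointer itself.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct old_timespec32: two 32-bit fields. */
struct timespec32_model {
	int32_t tv_sec;
	int32_t tv_nsec;
};

/*
 * True when an object of 'size' bytes starting at 'addr' would run past
 * the top of the 32-bit address space, i.e. wrap around 0.
 * 'size' is assumed to be non-zero and to fit in 32 bits.
 */
static bool wraps_past_top(uint32_t addr, size_t size)
{
	return addr > UINT32_MAX - (uint32_t)size + 1;
}
```

With an 8-byte object the threshold is 0xfffffff8: an object starting there
ends exactly at the top of the address space, and any higher start address
would wrap around 0.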

>>>>> This last check needs an explanation. If the clock_id is invalid but res
>>>>> is not NULL, we allow it. I don't see where the compatibility issue is,
>>>>> arm32 doesn't have such check.
>>>>
>>>> The case that you are describing has to return -EPERM per ABI spec. This case
>>>> has to return -EINVAL.
>>>>
>>>> The first case is taken care from the generic code. But if we don't do this
>>>> check before on arm64 compat we end up returning the wrong error code.
>>>
>>> I guess I have the same question as above. Where does the arm32 code
>>> return -EINVAL for that case? Did it work correctly before you removed
>>> the TASK_SIZE_32 check?
>>
>> I repeated the test and seems that it was failing even before I removed
>> TASK_SIZE_32. For reasons I can't explain I did not catch it before.
>>
>> The getres syscall should return -EINVAL in the cases specified in [1].
> 
> It states 'clk_id specified is not supported on this system'. Fair
> enough but it doesn't say that it returns -EINVAL only if res == NULL.

Actually it does; the description of getres() starts with:

'The function clock_getres() finds the resolution (precision) of the
specified clock clk_id, and, if res is *non-NULL*, stores it in the
struct timespec pointed to by res.'

Please refer to the system call below, whose behavior we mimic in the vdso
(kernel/time/posix-timers.c):

SYSCALL_DEFINE2(clock_getres_time32, clockid_t, which_clock,
		struct old_timespec32 __user *, tp)
{
	const struct k_clock *kc = clockid_to_kclock(which_clock);
	struct timespec64 ts;
	int err;

	if (!kc)
		return -EINVAL;

	err = kc->clock_getres(which_clock, &ts);
	if (!err && tp && put_old_timespec32(&ts, tp))
		return -EFAULT;

	return err;
}

If the clock is bogus and res == NULL, we are supposed to return -EINVAL and not
-EFAULT or something else. This is what the test is trying to verify. If the
check below is not in place on arm64 compat, I get an error report from the test
suite:

	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
		return -EINVAL;

Error message from vdsotest:

passing bogus clock id and NULL to clock_getres (VDSO): unexpected errno 14 (Bad address), expected 22 (Invalid argument)
passing bogus clock id and NULL to clock_getres (VDSO): exited with status 1, expected 0
clock-getres-monotonic-coarse/abi: 1 failures/inconsistencies encountered

> You also don't explain why __cvdso_clock_getres_time32() doesn't already
> detect an invalid clk_id on arm64 compat (but does it on arm32).
> 

Thanks for asking me this question. If I had not spent the day trying to
explain it, I would not have found a bug in the getres() fallback:

diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 1dd22da1c3a9..803039d504de 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -25,8 +25,8 @@
 #define __NR_compat_gettimeofday       78
 #define __NR_compat_sigreturn          119
 #define __NR_compat_rt_sigreturn       173
-#define __NR_compat_clock_getres       247
 #define __NR_compat_clock_gettime      263
+#define __NR_compat_clock_getres       264
 #define __NR_compat_clock_gettime64    403
 #define __NR_compat_clock_getres_time64        406

In particular compat getres is mis-numbered and that is what causes the issue.
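
A toy dispatcher shows how a wrong syscall number alone can change the errno
the caller sees. Only the two numbers (247 and 264) come from the diff above;
the handlers, their behaviour and all model_* names are invented for the
sketch, chosen so that the wrong slot happens to return -EFAULT on a NULL
pointer as in the vdsotest report.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_WRONG	247	/* number used before the fix */
#define MODEL_NR_GETRES	264	/* number used after the fix */

/* Invented handler living at the wrong number: faults on a NULL pointer. */
static int model_unrelated(int arg0, void *arg1)
{
	(void)arg0;
	return arg1 ? 0 : -EFAULT;
}

/* Invented getres handler: rejects unknown clock ids before touching res. */
static int model_clock_getres(int clock, void *res)
{
	(void)res;
	return (clock >= 0 && clock < 16) ? 0 : -EINVAL;
}

/* Toy dispatcher: the number alone decides which handler runs. */
static int model_dispatch(int nr, int arg0, void *arg1)
{
	if (nr == MODEL_NR_GETRES)
		return model_clock_getres(arg0, arg1);
	if (nr == MODEL_NR_WRONG)
		return model_unrelated(arg0, arg1);
	return -ENOSYS;
}
```

Calling model_dispatch() with a bogus clock id and NULL yields -EFAULT through
the wrong number and -EINVAL through the right one, without any change to the
caller's arguments.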

I am going to add a patch to my v5 that addresses the issue (or probably a
separate one and cc stable since it fixes a bug) and in this patch I will remove
the check on VALID_CLOCK_ID.

I hope that this long email helps you to have a clearer picture of what is going
on. Please let me know if there is still something missing.
Catalin Marinas March 18, 2020, 6:36 p.m. UTC | #8
On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
> On 3/17/20 5:48 PM, Catalin Marinas wrote:
> > On Tue, Mar 17, 2020 at 04:40:48PM +0000, Vincenzo Frascino wrote:
> >> On 3/17/20 3:50 PM, Catalin Marinas wrote:
> >>> On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
> >>>> On 3/17/20 2:38 PM, Catalin Marinas wrote:
> >>>>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> >>>>
> >>>> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
> >>>
> >>> TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
> >>> arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
> >>> -EFAULT on arm32? Which code path causes this in the user vdso code?
[...]
> > So clock_gettime() on arm32 always falls back to the syscall?
> 
> This seems not what you asked, and I think I answered accordingly. Anyway, in
> the case of arm32 the error code path is handled via syscall fallback.
> 
> Look at the code below as an example (I am using getres because I know this
> email will be already too long, and I do not want to add pointless code, but the
> concept is the same for gettime and the others):
> 
> static __maybe_unused
> int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
> {
> 	int ret = __cvdso_clock_getres_common(clock, res);
> 
> 	if (unlikely(ret))
> 		return clock_getres_fallback(clock, res);
> 	return 0;
> }
> 
> When the return code of the "vdso" internal function returns an error the system
> call is triggered.

But when __cvdso_clock_getres_common() does *not* return an error, it
means that it handled the clock_getres() call without a fallback to the
syscall. I assume this is possible on arm32. When the clock_getres() is
handled directly (not as a syscall), why doesn't arm32 need the same
(res >= TASK_SIZE) check?

> In general arm32 has been ported to the unified vDSO library hence it has a
> proper implementation on par with all the other architectures supported by the
> unified library.

And that's what I do not fully understand. When the call doesn't fall
back to a syscall, why do arm32 and arm64 compat need to differ in the
implementation? I may be missing something here.

> >>> My guess is that on arm32 it only fails with -EFAULT in the syscall
> >>> fallback path since a copy_to_user() would fail the access_ok() check.
> >>> Does it always take the fallback path if ts > TASK_SIZE?
> >>
> >> Correct, it goes via fallback. The return codes for these syscalls are specified
> >> by the ABI [1]. Then I agree with you the way on which arm32 achieves it should
> >> be via access_ok() check.
> > 
> > "it should be" or "it is" on arm32?
[...]
> SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
> 		struct __kernel_timespec __user *, tp)
[...]
> This is the syscall on which we fallback when the "vdso" internal function
> returns an error. The behavior of the vdso has to be exactly the same of the
> syscall otherwise we end up in an ABI breakage.

I agree. I just don't understand why on arm32 the vdso code doesn't need
to check for tp >= TASK_SIZE while the arm64 compat one does when it
does *not* fall back to a syscall. I understand the syscall fallback
case, that's caused by how we handle access_ok(), but this doesn't
explain the vdso-only case.

> >>>>> This last check needs an explanation. If the clock_id is invalid but res
> >>>>> is not NULL, we allow it. I don't see where the compatibility issue is,
> >>>>> arm32 doesn't have such check.
> >>>>
> >>>> The case that you are describing has to return -EPERM per ABI spec. This case
> >>>> has to return -EINVAL.
> >>>>
> >>>> The first case is taken care from the generic code. But if we don't do this
> >>>> check before on arm64 compat we end up returning the wrong error code.
> >>>
> >>> I guess I have the same question as above. Where does the arm32 code
> >>> return -EINVAL for that case? Did it work correctly before you removed
> >>> the TASK_SIZE_32 check?
> >>
> >> I repeated the test and seems that it was failing even before I removed
> >> TASK_SIZE_32. For reasons I can't explain I did not catch it before.
> >>
> >> The getres syscall should return -EINVAL in the cases specified in [1].
> > 
> > It states 'clk_id specified is not supported on this system'. Fair
> > enough but it doesn't say that it returns -EINVAL only if res == NULL.
> 
> Actually it does, the description of getres() starts with:
> 
> 'The function clock_getres() finds the resolution (precision) of the
> specified clock clk_id, and, if res is *non-NULL*, stores it in the
> struct timespec pointed to by res.'
> 
> Please refer to the system call below of which we mimic the behavior in the vdso
> (kernel/time/posix-timers.c):
> 
> SYSCALL_DEFINE2(clock_getres_time32, clockid_t, which_clock,
> 		struct old_timespec32 __user *, tp)
> {
> 	const struct k_clock *kc = clockid_to_kclock(which_clock);
> 	struct timespec64 ts;
> 	int err;
> 
> 	if (!kc)
> 		return -EINVAL;
> 
> 	err = kc->clock_getres(which_clock, &ts);
> 	if (!err && tp && put_old_timespec32(&ts, tp))
> 		return -EFAULT;
> 
> 	return err;
> }
> 
> If the clock is bogus and res == NULL we are supposed to return -EINVAL and not
> -EFAULT or something else.

If the clock is bogus, the above returns 'err' irrespective of the value
of 'tp'. I presume 'err' is -EINVAL in this case. But there is no
condition that tp == NULL above.

What the above tries to achieve is that if there is no error (err == 0)
and tp != NULL, try to write the timespec to the user tp pointer. If
put_old_timespec32() fails, that's when we return -EFAULT.
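
The ordering described above can be modelled in a few lines of C. This is a
sketch under assumptions: model_getres(), model_clock_valid() and the
tp_writable flag (standing in for whether the copy to user space would
succeed) are hypothetical, and the range of valid clock ids is arbitrary.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct ts_model {
	long tv_sec;
	long tv_nsec;
};

/* Arbitrary set of valid ids for the sketch. */
static bool model_clock_valid(int clock)
{
	return clock >= 0 && clock < 16;
}

static int model_getres(int clock, struct ts_model *tp, bool tp_writable)
{
	/* The clock id is rejected first, whatever tp is. */
	if (!model_clock_valid(clock))
		return -EINVAL;

	/* Only a valid clock with a non-NULL tp can produce -EFAULT. */
	if (tp && !tp_writable)
		return -EFAULT;

	if (tp) {
		tp->tv_sec = 0;
		tp->tv_nsec = 1;
	}
	return 0;
}
```

In this model a bogus clock id gives -EINVAL for both NULL and non-NULL tp,
which is why a check that also requires res == NULL is narrower than the
syscall's behaviour.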

> This is what the test is trying to verify. If the check below is not
> in place on arm64 compat, I get error report from the test suite.
> 	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
> 		return -EINVAL;

I really don't get where you deduced that you need to check for res ==
NULL. The above should be:

	if (!VALID_CLOCK_ID(clock_id))
		return -EINVAL;

Furthermore, my assumption is that __cvdso_clock_getres_common() should
handle this case already and we don't need it in the arch vdso code.

> > You also don't explain why __cvdso_clock_getres_time32() doesn't already
> > detect an invalid clk_id on arm64 compat (but does it on arm32).
> > 
> 
> Thanks for asking to me this question, if I would not have spent the day trying
> to explain it, I would not have found a bug in the getres() fallback:
> 
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index 1dd22da1c3a9..803039d504de 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -25,8 +25,8 @@
>  #define __NR_compat_gettimeofday       78
>  #define __NR_compat_sigreturn          119
>  #define __NR_compat_rt_sigreturn       173
> -#define __NR_compat_clock_getres       247
>  #define __NR_compat_clock_gettime      263
> +#define __NR_compat_clock_getres       264
>  #define __NR_compat_clock_gettime64    403
>  #define __NR_compat_clock_getres_time64        406
> 
> In particular compat getres is mis-numbered and that is what causes the issue.
> 
> I am going to add a patch to my v5 that addresses the issue (or probably a
> separate one and cc stable since it fixes a bug) and in this patch I will remove
> the check on VALID_CLOCK_ID.

Please send this as a separate patch that should be merged as a fix, cc
stable.

> I hope that this long email helps you to have a clearer picture of what is going
> on. Please let me know if there is still something missing.

Not entirely. Sorry.
Vincenzo Frascino March 19, 2020, 12:38 p.m. UTC | #9
Hi Catalin,

On 3/18/20 6:36 PM, Catalin Marinas wrote:
> On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
>> On 3/17/20 5:48 PM, Catalin Marinas wrote:
>>> On Tue, Mar 17, 2020 at 04:40:48PM +0000, Vincenzo Frascino wrote:
>>>> On 3/17/20 3:50 PM, Catalin Marinas wrote:
>>>>> On Tue, Mar 17, 2020 at 03:04:01PM +0000, Vincenzo Frascino wrote:
>>>>>> On 3/17/20 2:38 PM, Catalin Marinas wrote:
>>>>>>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
>>>>>>
>>>>>> Can TASK_SIZE > UINTPTR_MAX on an arm64 system?
>>>>>
>>>>> TASK_SIZE yes on arm64 but not TASK_SIZE_32. I was asking about the
>>>>> arm32 check where TASK_SIZE < UINTPTR_MAX. How does the vdsotest return
>>>>> -EFAULT on arm32? Which code path causes this in the user vdso code?
> [...]
>>> So clock_gettime() on arm32 always falls back to the syscall?
>>
>> This seems not what you asked, and I think I answered accordingly. Anyway, in
>> the case of arm32 the error code path is handled via syscall fallback.
>>
>> Look at the code below as an example (I am using getres because I know this
>> email will be already too long, and I do not want to add pointless code, but the
>> concept is the same for gettime and the others):
>>
>> static __maybe_unused
>> int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
>> {
>> 	int ret = __cvdso_clock_getres_common(clock, res);
>>
>> 	if (unlikely(ret))
>> 		return clock_getres_fallback(clock, res);
>> 	return 0;
>> }
>>
>> When the return code of the "vdso" internal function returns an error the system
>> call is triggered.
> 
> But when __cvdso_clock_getres_common() does *not* return an error, it
> means that it handled the clock_getres() call without a fallback to the
> syscall. I assume this is possible on arm32. When the clock_getres() is
> handled directly (not as a syscall), why doesn't arm32 need the same
> (res >= TASK_SIZE) check?
> 

Ok, I see what you mean.

It does not need to differ when __cvdso_clock_getres_common() does *not* return
an error: we can move the checks into the fallback and leave the vdso code the
same. I put the checks at the beginning of the vdso code because I know such a
condition is going to fail, so I prefer to bail out immediately when it is
detected instead of going through a bus error and a syscall before I can bail
out.

>> In general arm32 has been ported to the unified vDSO library hence it has a
>> proper implementation on par with all the other architectures supported by the
>> unified library.
> 
> And that's what I do not fully understand. When the call doesn't fall
> back to a syscall, why does arm32 and arm64 compat need to differ in the
> implementation? I may be missing something here.
> 

I think I replied above, please let me know if this is not the case.

>>>>> My guess is that on arm32 it only fails with -EFAULT in the syscall
>>>>> fallback path since a copy_to_user() would fail the access_ok() check.
>>>>> Does it always take the fallback path if ts > TASK_SIZE?
>>>>
>>>> Correct, it goes via fallback. The return codes for these syscalls are specified
>>>> by the ABI [1]. Then I agree with you the way on which arm32 achieves it should
>>>> be via access_ok() check.
>>>
>>> "it should be" or "it is" on arm32?
> [...]
>> SYSCALL_DEFINE2(clock_gettime, const clockid_t, which_clock,
>> 		struct __kernel_timespec __user *, tp)
> [...]
>> This is the syscall on which we fallback when the "vdso" internal function
>> returns an error. The behavior of the vdso has to be exactly the same of the
>> syscall otherwise we end up in an ABI breakage.
> 
> I agree. I just don't understand why on arm32 the vdso code doesn't need
> to check for tp >= TASK_SIZE while the arm64 compat one does when it
> does *not* fall back to a syscall. I understand the syscall fallback
> case, that's caused by how we handle access_ok(), but this doesn't
> explain the vdso-only case.
> 

It is mainly a design choice based on what I explained above but I am open to
suggestions if you have a better way to proceed.

[...]

> 
> Furthermore, my assumption is that __cvdso_clock_getres_common() should
> handle this case already and we don't need it in the arch vdso code.
> 

This is not the point I was trying to make, what I was trying to analyze here
was the check compared to why the test verifies it, not the correctness of the
check itself. Anyway, in my opinion it is not worth discussing this further,
since in my previous email I already said that I am going to remove the check
entirely because of the fix below.

>>> You also don't explain why __cvdso_clock_getres_time32() doesn't already
>>> detect an invalid clk_id on arm64 compat (but does it on arm32).
>>>
>>
>> Thanks for asking to me this question, if I would not have spent the day trying
>> to explain it, I would not have found a bug in the getres() fallback:
>>
>> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
>> index 1dd22da1c3a9..803039d504de 100644
>> --- a/arch/arm64/include/asm/unistd.h
>> +++ b/arch/arm64/include/asm/unistd.h
>> @@ -25,8 +25,8 @@
>>  #define __NR_compat_gettimeofday       78
>>  #define __NR_compat_sigreturn          119
>>  #define __NR_compat_rt_sigreturn       173
>> -#define __NR_compat_clock_getres       247
>>  #define __NR_compat_clock_gettime      263
>> +#define __NR_compat_clock_getres       264
>>  #define __NR_compat_clock_gettime64    403
>>  #define __NR_compat_clock_getres_time64        406
>>
>> In particular compat getres is mis-numbered and that is what causes the issue.
>>
>> I am going to add a patch to my v5 that addresses the issue (or probably a
>> separate one and cc stable since it fixes a bug) and in this patch I will remove
>> the check on VALID_CLOCK_ID.
> 
> Please send this as a separate patch that should be merged as a fix, cc
> stable.
> 

Ok, I will prepare a fix today.

>> I hope that this long email helps you to have a clearer picture of what is going
>> on. Please let me know if there is still something missing.
> 
> Not entirely. Sorry.
> 

Let's try again :)
Andy Lutomirski March 19, 2020, 3:49 p.m. UTC | #10
On Tue, Mar 17, 2020 at 7:38 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> > diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
> > index 54fc1c2ce93f..91138077b073 100644
> > --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
> > +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> > @@ -8,11 +8,14 @@
> >  #include <linux/time.h>
> >  #include <linux/types.h>
> >
> > +#define VALID_CLOCK_ID(x) \
> > +     ((x >= 0) && (x < VDSO_BASES))
> > +
> >  int __vdso_clock_gettime(clockid_t clock,
> >                        struct old_timespec32 *ts)
> >  {
> >       /* The checks below are required for ABI consistency with arm */
> > -     if ((u32)ts >= TASK_SIZE_32)
> > +     if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
> >               return -EFAULT;
> >
> >       return __cvdso_clock_gettime32(clock, ts);
>
> I probably miss something but I can't find the TASK_SIZE check in the
> arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?
>

Can you not just remove the TASK_SIZE_32 check entirely?  If you pass
a garbage address to the vDSO, you are quite likely to get SIGSEGV.
Why does this particular type of error need special handling?
Vincenzo Frascino March 19, 2020, 4:58 p.m. UTC | #11
Hi Andy,

On 3/19/20 3:49 PM, Andy Lutomirski wrote:
> On Tue, Mar 17, 2020 at 7:38 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
>>
>> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
>>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
>>> index 54fc1c2ce93f..91138077b073 100644
>>> --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
>>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
>>> @@ -8,11 +8,14 @@
>>>  #include <linux/time.h>
>>>  #include <linux/types.h>
>>>
>>> +#define VALID_CLOCK_ID(x) \
>>> +     ((x >= 0) && (x < VDSO_BASES))
>>> +
>>>  int __vdso_clock_gettime(clockid_t clock,
>>>                        struct old_timespec32 *ts)
>>>  {
>>>       /* The checks below are required for ABI consistency with arm */
>>> -     if ((u32)ts >= TASK_SIZE_32)
>>> +     if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
>>>               return -EFAULT;
>>>
>>>       return __cvdso_clock_gettime32(clock, ts);
>>
>> I probably miss something but I can't find the TASK_SIZE check in the
>> arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?
>>
> 
> Can you not just remove the TASK_SIZE_32 check entirely?  If you pass
> a garbage address to the vDSO, you are quite likely to get SIGSEGV.
> Why does this particular type of error need special handling?
> 

In this particular case the system call and the vDSO library, as they stand,
return -EFAULT on all the architectures that support the vdso library except
arm64 compat. The reason why it does not return the correct error code, as I was
discussing with Catalin, is that arm64 uses USER_DS (the addr_limit is set in
arch/arm64/kernel/entry.S), which is defined as (1 << VA_BITS), for
access_ok() validation even on compat tasks. Since arm64 supports up to 52-bit
VA, this does not detect the end of the user address space for a 32-bit task.
Hence when we fall back on the system call we get the wrong error code out of it.

In my opinion we need this check for ABI consistency, but if we decide that we
can make an ABI exception in this case, I am fine with that too, provided there
is enough consensus.

Please let me know your thoughts.
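
A rough model of the range-check argument made above, under stated assumptions. It is not the kernel's access_ok(); `model_range_ok` and the two limit macros are hypothetical names, and the 52-bit limit is simply the figure quoted in the mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_LIMIT_32 (UINT64_C(1) << 32)	/* end of a 32-bit address space */
#define MODEL_LIMIT_52 (UINT64_C(1) << 52)	/* 52-bit VA, as quoted in the mail */
static_assert(MODEL_LIMIT_32 < MODEL_LIMIT_52, "limits must be ordered");

/* True when the byte range [addr, addr + size) fits entirely below limit. */
bool model_range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	/* Written without addr + size so the sum cannot overflow. */
	return size <= limit && addr <= limit - size;
}
```

Under these assumptions `model_range_ok(0xFFFFFFFF, 8, MODEL_LIMIT_32)` is false while the same call with `MODEL_LIMIT_52` is true. A passing range check is not a successful write, though: as noted later in the thread, copy_to_user() can still fail for such an address.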
Catalin Marinas March 19, 2020, 6:10 p.m. UTC | #12
Hi Vincenzo,

On Thu, Mar 19, 2020 at 12:38:42PM +0000, Vincenzo Frascino wrote:
> On 3/18/20 6:36 PM, Catalin Marinas wrote:
> > On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
> >> On 3/17/20 5:48 PM, Catalin Marinas wrote:
> >>> So clock_gettime() on arm32 always falls back to the syscall?
> >>
> >> This seems not what you asked, and I think I answered accordingly. Anyway, in
> >> the case of arm32 the error code path is handled via syscall fallback.
> >>
> >> Look at the code below as an example (I am using getres because I know this
> >> email will be already too long, and I do not want to add pointless code, but the
> >> concept is the same for gettime and the others):
> >>
> >> static __maybe_unused
> >> int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
> >> {
> >> 	int ret = __cvdso_clock_getres_common(clock, res);
> >>
> >> 	if (unlikely(ret))
> >> 		return clock_getres_fallback(clock, res);
> >> 	return 0;
> >> }
> >>
> >> When the return code of the "vdso" internal function returns an error the system
> >> call is triggered.
> > 
> > But when __cvdso_clock_getres_common() does *not* return an error, it
> > means that it handled the clock_getres() call without a fallback to the
> > syscall. I assume this is possible on arm32. When the clock_getres() is
> > handled directly (not as a syscall), why doesn't arm32 need the same
> > (res >= TASK_SIZE) check?
> 
> Ok, I see what you mean.

I'm not sure.

> It does not need to differ when __cvdso_clock_getres_common() does *not* return
> an error, we can move the checks in the fallback and leave the vdso code the
> same. The reason why I put the checks at the beginning of vdso code is because
> since I know such a condition it is going to fail I prefer to bailout
> immediately when it is detected instead of going through a bus error and a
> syscall before I can bailout.

I don't dispute your choice of choosing to bail out early, that's fine
by me. What I'm asking above, and you haven't answered, is why we don't
need exactly the same check on arm32. I.e.:

diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
index 1976c6f325a4..17ee5d211228 100644
--- a/arch/arm/vdso/vgettimeofday.c
+++ b/arch/arm/vdso/vgettimeofday.c
@@ -28,6 +28,9 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
+	if ((u32)res >= TASK_SIZE)
+		return -EFAULT;
+
 	return __cvdso_clock_getres_time32(clock_id, res);
 }
 

(where arch/arm means arm32 ;)).

If the arm32 vdsotest passes, I'd like to know why.

> It is mainly a design choice based on what I explained above but I am open to
> suggestions if you have a better way to proceed.

I suggest just dropping the TASK_SIZE_32 test altogether in this series to
get it merged for 5.7-rc1. We'll fix the ABI issues in -rc2/-rc3 once
you confirm that the test fully passes on arm32 when it doesn't fall
back to the syscall handling and we understood why.

> > Furthermore, my assumption is that __cvdso_clock_getres_common() should
> > handle this case already and we don't need it in the arch vdso code.
> > 
> 
> This is not the point I was trying to make, what I was trying to analyze here
> was the check compared to why the test verifies it, not the correctness of the
> check itself.

You should implement it based on what the man page defines, not some
specific test. Tests are rarely exhaustive (unless you do formal
modelling).
Will Deacon March 19, 2020, 6:32 p.m. UTC | #13
On Thu, Mar 19, 2020 at 04:58:00PM +0000, Vincenzo Frascino wrote:
> On 3/19/20 3:49 PM, Andy Lutomirski wrote:
> > On Tue, Mar 17, 2020 at 7:38 AM Catalin Marinas <catalin.marinas@arm.com> wrote:
> >>
> >> On Tue, Mar 17, 2020 at 12:22:12PM +0000, Vincenzo Frascino wrote:
> >>> diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
> >>> index 54fc1c2ce93f..91138077b073 100644
> >>> --- a/arch/arm64/kernel/vdso32/vgettimeofday.c
> >>> +++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
> >>> @@ -8,11 +8,14 @@
> >>>  #include <linux/time.h>
> >>>  #include <linux/types.h>
> >>>
> >>> +#define VALID_CLOCK_ID(x) \
> >>> +     ((x >= 0) && (x < VDSO_BASES))
> >>> +
> >>>  int __vdso_clock_gettime(clockid_t clock,
> >>>                        struct old_timespec32 *ts)
> >>>  {
> >>>       /* The checks below are required for ABI consistency with arm */
> >>> -     if ((u32)ts >= TASK_SIZE_32)
> >>> +     if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
> >>>               return -EFAULT;
> >>>
> >>>       return __cvdso_clock_gettime32(clock, ts);
> >>
> >> I probably miss something but I can't find the TASK_SIZE check in the
> >> arch/arm/vdso/vgettimeofday.c code. Is this done elsewhere?
> >>
> > 
> > Can you not just remove the TASK_SIZE_32 check entirely?  If you pass
> > a garbage address to the vDSO, you are quite likely to get SIGSEGV.
> > Why does this particular type of error need special handling?
> > 
> 
> In this particular case the system call and the vDSO library as it stands
> returns -EFAULT on all the architectures that support the vdso library except on
> arm64 compat. The reason why it does not return the correct error code, as I was
> discussing with Catalin, it is because arm64 uses USER_DS (addr_limit set
> happens in arch/arm64/kernel/entry.S), which is defined as (1 << VA_BITS), as
> access_ok() validation even on compat tasks and since arm64 supports up to 52bit
> VA, this does not detect the end of the user address space for a 32 bit task.
> Hence when we fall back on the system call we get the wrong error code out of it.
> 
> According to me to have ABI consistency we need this check, but if we say that
> we can make an ABI exception in this case, I am fine with that either if it has
> enough consensus.
> 
> Please let me know your thoughts.

I don't agree with your reasoning -- letting the thing SEGV is perfectly
fine and we don't need to perform additional checking in userspace here.
If you treat the vDSO more as being part of libc than part of the kernel
then I think it makes perfect sense.

There are other system calls that will SEGV in libc if they are passed dodgy
pointers before the kernel has a chance to return -EFAULT.

Will
Vincenzo Frascino March 20, 2020, 1:05 p.m. UTC | #14
Hi Catalin,

On 3/19/20 6:10 PM, Catalin Marinas wrote:
> Hi Vincenzo,
> 
> On Thu, Mar 19, 2020 at 12:38:42PM +0000, Vincenzo Frascino wrote:
>> On 3/18/20 6:36 PM, Catalin Marinas wrote:
>>> On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
>>>> On 3/17/20 5:48 PM, Catalin Marinas wrote:
>>>>> So clock_gettime() on arm32 always falls back to the syscall?
>>>>
>>>> This seems not what you asked, and I think I answered accordingly. Anyway, in
>>>> the case of arm32 the error code path is handled via syscall fallback.
>>>>
>>>> Look at the code below as an example (I am using getres because I know this
>>>> email will be already too long, and I do not want to add pointless code, but the
>>>> concept is the same for gettime and the others):
>>>>
>>>> static __maybe_unused
>>>> int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
>>>> {
>>>> 	int ret = __cvdso_clock_getres_common(clock, res);
>>>>
>>>> 	if (unlikely(ret))
>>>> 		return clock_getres_fallback(clock, res);
>>>> 	return 0;
>>>> }
>>>>
>>>> When the return code of the "vdso" internal function returns an error the system
>>>> call is triggered.
>>>
>>> But when __cvdso_clock_getres_common() does *not* return an error, it
>>> means that it handled the clock_getres() call without a fallback to the
>>> syscall. I assume this is possible on arm32. When the clock_getres() is
>>> handled directly (not as a syscall), why doesn't arm32 need the same
>>> (res >= TASK_SIZE) check?
>>
>> Ok, I see what you mean.
> 
> I'm not sure.
> 
Thank you for the long chat this morning. As we agreed I am going to repost the
patches removing the checks discussed in this thread and we will address the
syscall ABI difference subsequently with a different series.
Catalin Marinas March 20, 2020, 2:22 p.m. UTC | #15
On Fri, Mar 20, 2020 at 01:05:14PM +0000, Vincenzo Frascino wrote:
> On 3/19/20 6:10 PM, Catalin Marinas wrote:
> > On Thu, Mar 19, 2020 at 12:38:42PM +0000, Vincenzo Frascino wrote:
> >> On 3/18/20 6:36 PM, Catalin Marinas wrote:
> >>> On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
> >>>> On 3/17/20 5:48 PM, Catalin Marinas wrote:
> >>>>> So clock_gettime() on arm32 always falls back to the syscall?
> >>>>
> >>>> This seems not what you asked, and I think I answered accordingly. Anyway, in
> >>>> the case of arm32 the error code path is handled via syscall fallback.
> >>>>
> >>>> Look at the code below as an example (I am using getres because I know this
> >>>> email will be already too long, and I do not want to add pointless code, but the
> >>>> concept is the same for gettime and the others):
> >>>>
> >>>> static __maybe_unused
> >>>> int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
> >>>> {
> >>>> 	int ret = __cvdso_clock_getres_common(clock, res);
> >>>>
> >>>> 	if (unlikely(ret))
> >>>> 		return clock_getres_fallback(clock, res);
> >>>> 	return 0;
> >>>> }
> >>>>
> >>>> When the return code of the "vdso" internal function returns an error the system
> >>>> call is triggered.
> >>>
> >>> But when __cvdso_clock_getres_common() does *not* return an error, it
> >>> means that it handled the clock_getres() call without a fallback to the
> >>> syscall. I assume this is possible on arm32. When the clock_getres() is
> >>> handled directly (not as a syscall), why doesn't arm32 need the same
> >>> (res >= TASK_SIZE) check?
> >>
> >> Ok, I see what you mean.
> > 
> > I'm not sure.
> 
> Thank you for the long chat this morning. As we agreed I am going to repost the
> patches removing the checks discussed in this thread

Great, thanks.

> and we will address the syscall ABI difference subsequently with a
> different series.

Now I'm even less convinced we need any additional patches. The arm64
compat syscall would still return -EFAULT for res >= TASK_SIZE_32
because copy_to_user() will fail. So it would be entirely consistent
with the arm32 syscall. In the vdso-only case, both arm32 and arm64
compat would generate a signal.

As Will said, arguably, the syscall semantics may not be applicable to
the vdso implementation. But if you do want to go down this route (tp =
UINTPTR_MAX - sizeof(*tp) returning -EFAULT), please do it for all
architectures, not just arm64 compat. However, I'm not sure anyone
relies on this functionality, other than the vdsotest, so no real
application broken.
Vincenzo Frascino March 20, 2020, 2:41 p.m. UTC | #16
Hi Catalin,

On 3/20/20 2:22 PM, Catalin Marinas wrote:
> On Fri, Mar 20, 2020 at 01:05:14PM +0000, Vincenzo Frascino wrote:
>> On 3/19/20 6:10 PM, Catalin Marinas wrote:
>>> On Thu, Mar 19, 2020 at 12:38:42PM +0000, Vincenzo Frascino wrote:
>>>> On 3/18/20 6:36 PM, Catalin Marinas wrote:
>>>>> On Wed, Mar 18, 2020 at 04:14:26PM +0000, Vincenzo Frascino wrote:
>>>>>> On 3/17/20 5:48 PM, Catalin Marinas wrote:
[...]

>>
>> Thank you for the long chat this morning. As we agreed I am going to repost the
>> patches removing the checks discussed in this thread
> 
> Great, thanks.
> 
>> and we will address the syscall ABI difference subsequently with a
>> different series.
> 
> Now I'm even less convinced we need any additional patches. The arm64
> compat syscall would still return -EFAULT for res >= TASK_SIZE_32
> because copy_to_user() will fail. So it would be entirely consistent
> with the arm32 syscall. In the vdso-only case, both arm32 and arm64
> compat would generate a signal.
> 
> As Will said, arguably, the syscall semantics may not be applicable to
> the vdso implementation. But if you do want to get down this route (tp =
> UINTPTR_MAX - sizeof(*tp) returning -EFAULT), please do it for all
> architectures, not just arm64 compat. However, I'm not sure anyone
> relies on this functionality, other than the vdsotest, so no real
> application broken.
> 

That is OK, we will discuss the topic when we come to that bridge. I am already
happy that I finally managed to explain my reasons ;)

Anyway, I think that if there is an application that relies on this behavior (or
similar) and uses compat, we will discover it as soon as these patches are out
in the wild. For this reason I am putting a link to this discussion in the
commit message of the relevant patch so that we can take it from there.

Patch

diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
index 54fc1c2ce93f..91138077b073 100644
--- a/arch/arm64/kernel/vdso32/vgettimeofday.c
+++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
@@ -8,11 +8,14 @@ 
 #include <linux/time.h>
 #include <linux/types.h>
 
+#define VALID_CLOCK_ID(x) \
+	((x >= 0) && (x < VDSO_BASES))
+
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
 	/* The checks below are required for ABI consistency with arm */
-	if ((u32)ts >= TASK_SIZE_32)
+	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
 		return -EFAULT;
 
 	return __cvdso_clock_gettime32(clock, ts);
@@ -22,7 +25,7 @@  int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
 	/* The checks below are required for ABI consistency with arm */
-	if ((u32)ts >= TASK_SIZE_32)
+	if ((u32)ts > UINTPTR_MAX - sizeof(*ts) + 1)
 		return -EFAULT;
 
 	return __cvdso_clock_gettime(clock, ts);
@@ -38,9 +41,12 @@  int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
 	/* The checks below are required for ABI consistency with arm */
-	if ((u32)res >= TASK_SIZE_32)
+	if ((u32)res > UINTPTR_MAX - sizeof(res) + 1)
 		return -EFAULT;
 
+	if (!VALID_CLOCK_ID(clock_id) && res == NULL)
+		return -EINVAL;
+
 	return __cvdso_clock_getres_time32(clock_id, res);
 }
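
The wrap-around test used in the hunks above can be checked in isolation. The following is a sketch under assumptions: `model_wraps` and `model_timespec32` are hypothetical names, a 32-bit address space is modelled with `uint32_t`, and the timespec is assumed to be two 32-bit fields (8 bytes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct old_timespec32: two 32-bit fields. */
struct model_timespec32 {
	int32_t tv_sec;
	int32_t tv_nsec;
};
static_assert(sizeof(struct model_timespec32) == 8,
	      "model assumes an 8-byte timespec");

/*
 * True when an object of obj_size bytes starting at addr would run past
 * the top of a 32-bit address space, i.e. its end wraps around 0.
 * Same formula as the patch; assumes obj_size >= 1.
 */
bool model_wraps(uint32_t addr, uint32_t obj_size)
{
	uint32_t last_valid_start = (uint32_t)(UINT32_MAX - obj_size + 1u);

	return addr > last_valid_start;
}
```

For an 8-byte object the last accepted start address is 0xFFFFFFF8; 0xFFFFFFF9 and above are rejected. Note that the getres hunk passes `sizeof(res)`, the size of the pointer, rather than `sizeof(*res)`. If pointers are 4 bytes on the compat target, that variant would accept start addresses up to 0xFFFFFFFC even though an 8-byte object placed there wraps.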