
[1/3] arm64: ptrace: Add is_syscall_success to handle compat

Message ID 20210416075533.7720-1-zhe.he@windriver.com (mailing list archive)
State New

Commit Message

He Zhe April 16, 2021, 7:55 a.m. UTC
The general version of is_syscall_success does not handle 32-bit
compatible case, which would cause 32-bit negative return code to be
recognized as a positive number later and seen as a "success".

Since is_compat_thread is defined in compat.h, implementing
is_syscall_success in ptrace.h would introduce build failure due to
recursive inclusion of some basic headers like mutex.h. We put the
implementation to ptrace.c

Signed-off-by: He Zhe <zhe.he@windriver.com>
---
 arch/arm64/include/asm/ptrace.h |  3 +++
 arch/arm64/kernel/ptrace.c      | 10 ++++++++++
 2 files changed, 13 insertions(+)
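
The failure mode described in the commit message can be reproduced in a small userspace model. This is only a sketch: model_is_err_value() and model_sign_extend64() are hypothetical stand-ins written to resemble the kernel's IS_ERR_VALUE() and sign_extend64(), and truncated_x0 is an illustrative value (-EBADF as it would appear in x0 once the upper 32 bits have been cleared).

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's MAX_ERRNO; an assumption of this model. */
#define MODEL_MAX_ERRNO 4095ULL

/* Error codes occupy the top 4095 values of the 64-bit range. */
static bool model_is_err_value(uint64_t x)
{
	return x >= (uint64_t)-MODEL_MAX_ERRNO;
}

/*
 * Treat bit 'index' as the sign bit and extend it to 64 bits.
 * Relies on arithmetic right shift of negative values, which is
 * what common compilers provide.
 */
static int64_t model_sign_extend64(uint64_t value, int index)
{
	int shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

/* -EBADF (-9) as a 32-bit value with the upper 32 bits cleared. */
static const uint64_t truncated_x0 = 0xfffffff7ULL;
```

Checked as a 64-bit quantity, truncated_x0 lies outside the error range and so looks like a success; after sign extension from bit 31 it becomes -9 and is recognised as an error.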

Comments

Catalin Marinas April 16, 2021, 12:33 p.m. UTC | #1
On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
> The general version of is_syscall_success does not handle 32-bit
> compatible case, which would cause 32-bit negative return code to be
> recognized as a positive number later and seen as a "success".
> 
> Since is_compat_thread is defined in compat.h, implementing
> is_syscall_success in ptrace.h would introduce build failure due to
> recursive inclusion of some basic headers like mutex.h. We put the
> implementation to ptrace.c
> 
> Signed-off-by: He Zhe <zhe.he@windriver.com>
> ---
>  arch/arm64/include/asm/ptrace.h |  3 +++
>  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> index e58bca832dff..3c415e9e5d85 100644
> --- a/arch/arm64/include/asm/ptrace.h
> +++ b/arch/arm64/include/asm/ptrace.h
> @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
>  	regs->regs[0] = rc;
>  }
>  
> +extern inline int is_syscall_success(struct pt_regs *regs);
> +#define is_syscall_success(regs) is_syscall_success(regs)
> +
>  /**
>   * regs_get_kernel_argument() - get Nth function argument in kernel
>   * @regs:	pt_regs of that context
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 170f42fd6101..3266201f8c60 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
>  	else
>  		return valid_native_regs(regs);
>  }
> +
> +inline int is_syscall_success(struct pt_regs *regs)
> +{
> +	unsigned long val = regs->regs[0];
> +
> +	if (is_compat_thread(task_thread_info(current)))
> +		val = sign_extend64(val, 31);
> +
> +	return !IS_ERR_VALUE(val);
> +}

It's better to use compat_user_mode(regs) here instead of
is_compat_thread(). It saves us from worrying whether regs are for the
current context.

I think we should change regs_return_value() instead. This function
seems to be called from several other places and it has the same
potential problems if called on compat pt_regs.
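
As a rough illustration of this suggestion, the sketch below decides whether to sign-extend by looking only at the saved register state, not at the current task. It is a userspace model, not the kernel code: struct model_regs, the MODEL_PSR_* constants and both functions are hypothetical names, and the constants are assumed to mirror the arm64 PSTATE mode encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for struct pt_regs; only the fields used here. */
struct model_regs {
	uint64_t regs[31];
	uint64_t pstate;
};

/* Assumed to mirror the arm64 PSTATE mode bits. */
#define MODEL_PSR_MODE_EL0t	0x0ULL
#define MODEL_PSR_MODE_MASK	0xfULL
#define MODEL_PSR_MODE32_BIT	0x10ULL

/* True when the saved state describes 32-bit (AArch32) EL0. */
static bool model_compat_user_mode(const struct model_regs *regs)
{
	return (regs->pstate & (MODEL_PSR_MODE32_BIT | MODEL_PSR_MODE_MASK)) ==
	       (MODEL_PSR_MODE32_BIT | MODEL_PSR_MODE_EL0t);
}

/* Return x0, sign-extended from 32 bits if the regs are compat regs. */
static int64_t model_regs_return_value(const struct model_regs *regs)
{
	uint64_t val = regs->regs[0];

	if (model_compat_user_mode(regs))
		return (int64_t)(int32_t)(uint32_t)val;

	return (int64_t)val;
}
```

Because the decision depends only on regs->pstate, the same helper gives a consistent answer whichever task happens to be running when it is called.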
Mark Rutland April 16, 2021, 1:34 p.m. UTC | #2
On Fri, Apr 16, 2021 at 01:33:22PM +0100, Catalin Marinas wrote:
> On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
> > The general version of is_syscall_success does not handle 32-bit
> > compatible case, which would cause 32-bit negative return code to be
> > recognized as a positive number later and seen as a "success".
> > 
> > Since is_compat_thread is defined in compat.h, implementing
> > is_syscall_success in ptrace.h would introduce build failure due to
> > recursive inclusion of some basic headers like mutex.h. We put the
> > implementation to ptrace.c
> > 
> > Signed-off-by: He Zhe <zhe.he@windriver.com>
> > ---
> >  arch/arm64/include/asm/ptrace.h |  3 +++
> >  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
> >  2 files changed, 13 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> > index e58bca832dff..3c415e9e5d85 100644
> > --- a/arch/arm64/include/asm/ptrace.h
> > +++ b/arch/arm64/include/asm/ptrace.h
> > @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
> >  	regs->regs[0] = rc;
> >  }
> >  
> > +extern inline int is_syscall_success(struct pt_regs *regs);
> > +#define is_syscall_success(regs) is_syscall_success(regs)
> > +
> >  /**
> >   * regs_get_kernel_argument() - get Nth function argument in kernel
> >   * @regs:	pt_regs of that context
> > diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> > index 170f42fd6101..3266201f8c60 100644
> > --- a/arch/arm64/kernel/ptrace.c
> > +++ b/arch/arm64/kernel/ptrace.c
> > @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
> >  	else
> >  		return valid_native_regs(regs);
> >  }
> > +
> > +inline int is_syscall_success(struct pt_regs *regs)
> > +{
> > +	unsigned long val = regs->regs[0];
> > +
> > +	if (is_compat_thread(task_thread_info(current)))
> > +		val = sign_extend64(val, 31);
> > +
> > +	return !IS_ERR_VALUE(val);
> > +}
> 
> It's better to use compat_user_mode(regs) here instead of
> is_compat_thread(). It saves us from worrying whether regs are for the
> current context.
> 
> I think we should change regs_return_value() instead. This function
> seems to be called from several other places and it has the same
> potential problems if called on compat pt_regs.

I think this is a problem we created for ourselves back in commit:

  15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")

AFAICT, the perf regs samples are the only place this matters, since for
ptrace the compat regs are implicitly truncated to compat_ulong_t, and
audit expects the non-truncated return value. Other architectures don't
truncate here, so I think we're setting ourselves up for a game of
whack-a-mole to truncate and extend wherever we need to.

Given that, I suspect it'd be better to do something like the below.

Will, thoughts?

Mark.

---->8----
From df0f7c160240d9ee6f20f87a180326d3253e80fb Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland@arm.com>
Date: Fri, 16 Apr 2021 13:58:54 +0100
Subject: [PATCH] arm64: perf: truncate compat regs

For compat userspace, it doesn't generally make sense for the upper 32
bits of GPRs to be set, as these bits don't really exist in AArch32.
However, for structural reasons the kernel may transiently set the upper
32 bits of registers in pt_regs at points where a perf sample can be
taken.

We tried to avoid this happening in commit:

  15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")

... by having invoke_syscall() truncate the return value for compat
tasks, with helpers in <asm/syscall.h> extending the return value when
required.

Unfortunately this is not complete, as there are other places where we
assign the return value, such as when el0_svc_common() sets up a return
of -ENOSYS.

Further, this approach breaks the audit code, which relies on the upper
32 bits of the return value.

Instead, let's have the perf code explicitly truncate the user regs to
32 bits, and otherwise preserve those within the kernel.

Fixes: 15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/syscall.h | 11 +----------
 arch/arm64/kernel/perf_regs.c    | 26 ++++++++++++++++----------
 arch/arm64/kernel/syscall.c      |  3 ---
 3 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index cfc0672013f6..0ebeaf6dbd45 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -35,9 +35,6 @@ static inline long syscall_get_error(struct task_struct *task,
 {
 	unsigned long error = regs->regs[0];
 
-	if (is_compat_thread(task_thread_info(task)))
-		error = sign_extend64(error, 31);
-
 	return IS_ERR_VALUE(error) ? error : 0;
 }
 
@@ -51,13 +48,7 @@ static inline void syscall_set_return_value(struct task_struct *task,
 					    struct pt_regs *regs,
 					    int error, long val)
 {
-	if (error)
-		val = error;
-
-	if (is_compat_thread(task_thread_info(task)))
-		val = lower_32_bits(val);
-
-	regs->regs[0] = val;
+	regs->regs[0] = (long) error ? error : val;
 }
 
 #define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/perf_regs.c b/arch/arm64/kernel/perf_regs.c
index f6f58e6265df..296f0c55b4e2 100644
--- a/arch/arm64/kernel/perf_regs.c
+++ b/arch/arm64/kernel/perf_regs.c
@@ -9,6 +9,17 @@
 #include <asm/perf_regs.h>
 #include <asm/ptrace.h>
 
+static u64 __perf_reg_value(struct pt_regs *regs, int idx)
+{
+	if ((u32)idx == PERF_REG_ARM64_SP)
+		return regs->sp;
+
+	if ((u32)idx == PERF_REG_ARM64_PC)
+		return regs->pc;
+
+	return regs->regs[idx];
+}
+
 u64 perf_reg_value(struct pt_regs *regs, int idx)
 {
 	if (WARN_ON_ONCE((u32)idx >= PERF_REG_ARM64_MAX))
@@ -38,20 +49,15 @@ u64 perf_reg_value(struct pt_regs *regs, int idx)
 	 */
 	if (compat_user_mode(regs)) {
 		if ((u32)idx == PERF_REG_ARM64_SP)
-			return regs->compat_sp;
+			return lower_32_bits(regs->compat_sp);
 		if ((u32)idx == PERF_REG_ARM64_LR)
-			return regs->compat_lr;
+			return lower_32_bits(regs->compat_lr);
 		if (idx == 15)
-			return regs->pc;
+			return lower_32_bits(regs->pc);
+		return lower_32_bits(__perf_reg_value(regs, idx));
 	}
 
-	if ((u32)idx == PERF_REG_ARM64_SP)
-		return regs->sp;
-
-	if ((u32)idx == PERF_REG_ARM64_PC)
-		return regs->pc;
-
-	return regs->regs[idx];
+	return __perf_reg_value(regs, idx);
 }
 
 #define REG_RESERVED (~((1ULL << PERF_REG_ARM64_MAX) - 1))
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index b9cf12b271d7..84314fae4b5c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -51,9 +51,6 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
 		ret = do_ni_syscall(regs, scno);
 	}
 
-	if (is_compat_task())
-		ret = lower_32_bits(ret);
-
 	regs->regs[0] = ret;
 }
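
The effect of truncating at the perf boundary can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the patch itself: model_lower_32_bits() stands in for the kernel's lower_32_bits(), model_perf_x0() is a hypothetical helper, and MODEL_ENOSYS assumes the Linux value of ENOSYS.

```c
#include <stdint.h>

/* Assumed Linux value of ENOSYS; used only for illustration. */
#define MODEL_ENOSYS 38

/* Stand-in for lower_32_bits(): keep only bits 31:0. */
static uint32_t model_lower_32_bits(uint64_t n)
{
	return (uint32_t)n;
}

/*
 * What a perf sample of x0 would report in this model: compat tasks
 * see only the lower 32 bits, native tasks see the full register.
 */
static uint64_t model_perf_x0(uint64_t x0, int compat)
{
	return compat ? model_lower_32_bits(x0) : x0;
}

/* x0 as the kernel would hold it internally: -ENOSYS, sign-extended. */
static uint64_t model_internal_x0(void)
{
	return (uint64_t)(int64_t)-MODEL_ENOSYS;
}
```

With this split, in-kernel consumers keep the full sign-extended value, while a compat perf sample reports 0xffffffda.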
David Laight April 17, 2021, 1:19 p.m. UTC | #3
From: Mark Rutland
> Sent: 16 April 2021 14:35
..
> @@ -51,13 +48,7 @@ static inline void syscall_set_return_value(struct task_struct *task,
>  					    struct pt_regs *regs,
>  					    int error, long val)
>  {
> -	if (error)
> -		val = error;
> -
> -	if (is_compat_thread(task_thread_info(task)))
> -		val = lower_32_bits(val);
> -
> -	regs->regs[0] = val;
> +	regs->regs[0] = (long) error ? error : val;

	= error ? (long)error : rval;

	David
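
For reference, the two spellings can be compared in a small standalone sketch. It assumes that "rval" in the suggestion refers to the val parameter; both function names are hypothetical. In the posted form the cast binds only to the condition, but the conditional expression still has type long because of the usual arithmetic conversions between int and long, so the two forms should produce the same value and differ mainly in readability.

```c
/* Expression as posted: the cast applies to the condition only. */
static long ret_as_posted(int error, long val)
{
	return (long) error ? error : val;
}

/* Expression as suggested: the cast is applied to the selected operand. */
static long ret_as_suggested(int error, long val)
{
	return error ? (long)error : val;
}
```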

Will Deacon April 19, 2021, 12:19 p.m. UTC | #4
On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
> On Fri, Apr 16, 2021 at 01:33:22PM +0100, Catalin Marinas wrote:
> > On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
> > > The general version of is_syscall_success does not handle 32-bit
> > > compatible case, which would cause 32-bit negative return code to be
> > > > recognized as a positive number later and seen as a "success".
> > > 
> > > Since is_compat_thread is defined in compat.h, implementing
> > > is_syscall_success in ptrace.h would introduce build failure due to
> > > recursive inclusion of some basic headers like mutex.h. We put the
> > > implementation to ptrace.c
> > > 
> > > Signed-off-by: He Zhe <zhe.he@windriver.com>
> > > ---
> > >  arch/arm64/include/asm/ptrace.h |  3 +++
> > >  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
> > >  2 files changed, 13 insertions(+)
> > > 
> > > diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> > > index e58bca832dff..3c415e9e5d85 100644
> > > --- a/arch/arm64/include/asm/ptrace.h
> > > +++ b/arch/arm64/include/asm/ptrace.h
> > > @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
> > >  	regs->regs[0] = rc;
> > >  }
> > >  
> > > +extern inline int is_syscall_success(struct pt_regs *regs);
> > > +#define is_syscall_success(regs) is_syscall_success(regs)
> > > +
> > >  /**
> > >   * regs_get_kernel_argument() - get Nth function argument in kernel
> > >   * @regs:	pt_regs of that context
> > > diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> > > index 170f42fd6101..3266201f8c60 100644
> > > --- a/arch/arm64/kernel/ptrace.c
> > > +++ b/arch/arm64/kernel/ptrace.c
> > > @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
> > >  	else
> > >  		return valid_native_regs(regs);
> > >  }
> > > +
> > > +inline int is_syscall_success(struct pt_regs *regs)
> > > +{
> > > +	unsigned long val = regs->regs[0];
> > > +
> > > +	if (is_compat_thread(task_thread_info(current)))
> > > +		val = sign_extend64(val, 31);
> > > +
> > > +	return !IS_ERR_VALUE(val);
> > > +}
> > 
> > It's better to use compat_user_mode(regs) here instead of
> > is_compat_thread(). It saves us from worrying whether regs are for the
> > current context.
> > 
> > I think we should change regs_return_value() instead. This function
> > seems to be called from several other places and it has the same
> > potential problems if called on compat pt_regs.
> 
> I think this is a problem we created for ourselves back in commit:
> 
>   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
> 
> AFAICT, the perf regs samples are the only place this matters, since for
> ptrace the compat regs are implicitly truncated to compat_ulong_t, and
> audit expects the non-truncated return value. Other architectures don't
> truncate here, so I think we're setting ourselves up for a game of
> whack-a-mole to truncate and extend wherever we need to.
> 
> Given that, I suspect it'd be better to do something like the below.
> 
> Will, thoughts?

I think perf is one example, but this is also visible to userspace via the
native ptrace interface and I distinctly remember needing this for some
versions of arm64 strace to work correctly when tracing compat tasks.

So I do think that clearing the upper bits on the return path is the right
approach, but it sounds like we need some more work to handle syscall(-1)
and audit (what exactly is the problem here after these patches have been
applied?)

Will
He Zhe April 20, 2021, 8:42 a.m. UTC | #5
On 4/16/21 8:33 PM, Catalin Marinas wrote:
> On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
>> The general version of is_syscall_success does not handle 32-bit
>> compatible case, which would cause 32-bit negative return code to be
>> recognized as a positive number later and seen as a "success".
>>
>> Since is_compat_thread is defined in compat.h, implementing
>> is_syscall_success in ptrace.h would introduce build failure due to
>> recursive inclusion of some basic headers like mutex.h. We put the
>> implementation to ptrace.c
>>
>> Signed-off-by: He Zhe <zhe.he@windriver.com>
>> ---
>>  arch/arm64/include/asm/ptrace.h |  3 +++
>>  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>> index e58bca832dff..3c415e9e5d85 100644
>> --- a/arch/arm64/include/asm/ptrace.h
>> +++ b/arch/arm64/include/asm/ptrace.h
>> @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
>>  	regs->regs[0] = rc;
>>  }
>>  
>> +extern inline int is_syscall_success(struct pt_regs *regs);
>> +#define is_syscall_success(regs) is_syscall_success(regs)
>> +
>>  /**
>>   * regs_get_kernel_argument() - get Nth function argument in kernel
>>   * @regs:	pt_regs of that context
>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>> index 170f42fd6101..3266201f8c60 100644
>> --- a/arch/arm64/kernel/ptrace.c
>> +++ b/arch/arm64/kernel/ptrace.c
>> @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
>>  	else
>>  		return valid_native_regs(regs);
>>  }
>> +
>> +inline int is_syscall_success(struct pt_regs *regs)
>> +{
>> +	unsigned long val = regs->regs[0];
>> +
>> +	if (is_compat_thread(task_thread_info(current)))
>> +		val = sign_extend64(val, 31);
>> +
>> +	return !IS_ERR_VALUE(val);
>> +}
> It's better to use compat_user_mode(regs) here instead of
> is_compat_thread(). It saves us from worrying whether regs are for the
> current context.

Thanks. I'll use this for v2.

>
> I think we should change regs_return_value() instead. This function
> seems to be called from several other places and it has the same
> potential problems if called on compat pt_regs.

IMHO, now that we have a dedicated function, syscall_get_return_value(), for
getting the syscall return code, we might as well use it. regs_return_value() can
be left for places that want the internal return code. I found the users listed
below, and no other places where syscall sign extension is a concern.

kernel/test_kprobes.c
kernel/trace/trace_kprobe.c
samples/kprobes/kretprobe_example.c


Regards,
Zhe

>
He Zhe April 20, 2021, 8:54 a.m. UTC | #6
On 4/19/21 8:19 PM, Will Deacon wrote:
> On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
>> On Fri, Apr 16, 2021 at 01:33:22PM +0100, Catalin Marinas wrote:
>>> On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
>>>> The general version of is_syscall_success does not handle 32-bit
>>>> compatible case, which would cause 32-bit negative return code to be
>>>> recognized as a positive number later and seen as a "success".
>>>>
>>>> Since is_compat_thread is defined in compat.h, implementing
>>>> is_syscall_success in ptrace.h would introduce build failure due to
>>>> recursive inclusion of some basic headers like mutex.h. We put the
>>>> implementation to ptrace.c
>>>>
>>>> Signed-off-by: He Zhe <zhe.he@windriver.com>
>>>> ---
>>>>  arch/arm64/include/asm/ptrace.h |  3 +++
>>>>  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
>>>>  2 files changed, 13 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
>>>> index e58bca832dff..3c415e9e5d85 100644
>>>> --- a/arch/arm64/include/asm/ptrace.h
>>>> +++ b/arch/arm64/include/asm/ptrace.h
>>>> @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
>>>>  	regs->regs[0] = rc;
>>>>  }
>>>>  
>>>> +extern inline int is_syscall_success(struct pt_regs *regs);
>>>> +#define is_syscall_success(regs) is_syscall_success(regs)
>>>> +
>>>>  /**
>>>>   * regs_get_kernel_argument() - get Nth function argument in kernel
>>>>   * @regs:	pt_regs of that context
>>>> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
>>>> index 170f42fd6101..3266201f8c60 100644
>>>> --- a/arch/arm64/kernel/ptrace.c
>>>> +++ b/arch/arm64/kernel/ptrace.c
>>>> @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
>>>>  	else
>>>>  		return valid_native_regs(regs);
>>>>  }
>>>> +
>>>> +inline int is_syscall_success(struct pt_regs *regs)
>>>> +{
>>>> +	unsigned long val = regs->regs[0];
>>>> +
>>>> +	if (is_compat_thread(task_thread_info(current)))
>>>> +		val = sign_extend64(val, 31);
>>>> +
>>>> +	return !IS_ERR_VALUE(val);
>>>> +}
>>> It's better to use compat_user_mode(regs) here instead of
>>> is_compat_thread(). It saves us from worrying whether regs are for the
>>> current context.
>>>
>>> I think we should change regs_return_value() instead. This function
>>> seems to be called from several other places and it has the same
>>> potential problems if called on compat pt_regs.
>> I think this is a problem we created for ourselves back in commit:
>>
>>   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
>>
>> AFAICT, the perf regs samples are the only place this matters, since for
>> ptrace the compat regs are implicitly truncated to compat_ulong_t, and
>> audit expects the non-truncated return value. Other architectures don't
>> truncate here, so I think we're setting ourselves up for a game of
>> whack-a-mole to truncate and extend wherever we need to.
>>
>> Given that, I suspect it'd be better to do something like the below.
>>
>> Will, thoughts?
> I think perf is one example, but this is also visible to userspace via the
> native ptrace interface and I distinctly remember needing this for some
> versions of arm64 strace to work correctly when tracing compat tasks.
>
> So I do think that clearing the upper bits on the return path is the right
> approach, but it sounds like we need some more work to handle syscall(-1)
> and audit (what exactly is the problem here after these patches have been
> applied?)

IIUC, IS_ERR_VALUE could handle -1, did I miss something? Thanks.

Regards,
Zhe

>
> Will
Mark Rutland April 21, 2021, 5:10 p.m. UTC | #7
On Mon, Apr 19, 2021 at 01:19:33PM +0100, Will Deacon wrote:
> On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
> > I think this is a problem we created for ourselves back in commit:
> > 
> >   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
> > 
> > AFAICT, the perf regs samples are the only place this matters, since for
> > ptrace the compat regs are implicitly truncated to compat_ulong_t, and
> > audit expects the non-truncated return value. Other architectures don't
> > truncate here, so I think we're setting ourselves up for a game of
> > whack-a-mole to truncate and extend wherever we need to.
> > 
> > Given that, I suspect it'd be better to do something like the below.
> > 
> > Will, thoughts?
> 
> I think perf is one example, but this is also visible to userspace via the
> native ptrace interface and I distinctly remember needing this for some
> versions of arm64 strace to work correctly when tracing compat tasks.

FWIW, you've convinced me on your approach (more on that below), but
when I went digging here this didn't seem to be exposed via ptrace --
for any task tracing a compat task, the GPRs are exposed via
compat_gpr_{get,set}(), which always truncate to compat_ulong_t, giving
the lower 32 bits. See task_user_regset_view() for where we get the
regset.

Am I missing something, or are you thinking of another issue you fixed
at the same time?

> So I do think that clearing the upper bits on the return path is the right
> approach, but it sounds like we need some more work to handle syscall(-1)
> and audit (what exactly is the problem here after these patches have been
> applied?)

From digging a bit more, I think I agree, and I think these patches are
sufficient for audit. I have some comments I'll leave separately.

The remaining issues are wherever we assign a signed value to a compat
GPR without explicit truncation. That'll leak via perf sampling the user
regs, but I haven't managed to convince myself whether that causes any
functional change in behaviour for audit, seccomp, or syscall tracing.

Since we mostly use compat_ulong_t for intermediate values in compat
code, it does look like this is only an issue for x0 where we assign an
error value, e.g. the -ENOSYS case in el0_svc_common. I'll go see if I
can find any more.

With those fixed up we can remove the x0 truncation from entry.S,
which'd be nice too.

Thanks,
Mark.
Mark Rutland April 21, 2021, 5:17 p.m. UTC | #8
On Tue, Apr 20, 2021 at 04:42:53PM +0800, He Zhe wrote:
> On 4/16/21 8:33 PM, Catalin Marinas wrote:
> > On Fri, Apr 16, 2021 at 03:55:31PM +0800, He Zhe wrote:
> >> The general version of is_syscall_success does not handle 32-bit
> >> compatible case, which would cause 32-bit negative return code to be
> >> recognized as a positive number later and seen as a "success".
> >>
> >> Since is_compat_thread is defined in compat.h, implementing
> >> is_syscall_success in ptrace.h would introduce build failure due to
> >> recursive inclusion of some basic headers like mutex.h. We put the
> >> implementation to ptrace.c
> >>
> >> Signed-off-by: He Zhe <zhe.he@windriver.com>
> >> ---
> >>  arch/arm64/include/asm/ptrace.h |  3 +++
> >>  arch/arm64/kernel/ptrace.c      | 10 ++++++++++
> >>  2 files changed, 13 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
> >> index e58bca832dff..3c415e9e5d85 100644
> >> --- a/arch/arm64/include/asm/ptrace.h
> >> +++ b/arch/arm64/include/asm/ptrace.h
> >> @@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
> >>  	regs->regs[0] = rc;
> >>  }
> >>  
> >> +extern inline int is_syscall_success(struct pt_regs *regs);
> >> +#define is_syscall_success(regs) is_syscall_success(regs)
> >> +
> >>  /**
> >>   * regs_get_kernel_argument() - get Nth function argument in kernel
> >>   * @regs:	pt_regs of that context
> >> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> >> index 170f42fd6101..3266201f8c60 100644
> >> --- a/arch/arm64/kernel/ptrace.c
> >> +++ b/arch/arm64/kernel/ptrace.c
> >> @@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
> >>  	else
> >>  		return valid_native_regs(regs);
> >>  }
> >> +
> >> +inline int is_syscall_success(struct pt_regs *regs)
> >> +{
> >> +	unsigned long val = regs->regs[0];
> >> +
> >> +	if (is_compat_thread(task_thread_info(current)))
> >> +		val = sign_extend64(val, 31);
> >> +
> >> +	return !IS_ERR_VALUE(val);
> >> +}
> > It's better to use compat_user_mode(regs) here instead of
> > is_compat_thread(). It saves us from worrying whether regs are for the
> > current context.
> 
> Thanks. I'll use this for v2.
> 
> >
> > I think we should change regs_return_value() instead. This function
> > seems to be called from several other places and it has the same
> > potential problems if called on compat pt_regs.
> 
> IMHO, now that we have a dedicated function, syscall_get_return_value(), for
> getting the syscall return code, we might as well use it. regs_return_value() can
> be left for places that want the internal return code. I found the users listed
> below, and no other places where syscall sign extension is a concern.
> 
> kernel/test_kprobes.c
> kernel/trace/trace_kprobe.c
> samples/kprobes/kretprobe_example.c

FWIW, I agree that we should use syscall_get_return_value(). If we make
the common implementation of is_syscall_success() use
syscall_get_return_value(), we shouldn't need to write our own
implementation, so I'd prefer if we could do that if possible.

IIUC regs_return_value() was originally meant to be used for kernel
regs, and is trying to do something quite different, so having the core
code use syscall_get_return_value() makes sense to allow architectures
to handle those cases differently.

Thanks,
Mark.
Will Deacon April 22, 2021, 4:07 p.m. UTC | #9
Hi Mark,

On Wed, Apr 21, 2021 at 06:10:05PM +0100, Mark Rutland wrote:
> On Mon, Apr 19, 2021 at 01:19:33PM +0100, Will Deacon wrote:
> > On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
> > > I think this is a problem we created for ourselves back in commit:
> > > 
> > >   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
> > > 
> > > AFAICT, the perf regs samples are the only place this matters, since for
> > > ptrace the compat regs are implicitly truncated to compat_ulong_t, and
> > > audit expects the non-truncated return value. Other architectures don't
> > > truncate here, so I think we're setting ourselves up for a game of
> > > whack-a-mole to truncate and extend wherever we need to.
> > > 
> > > Given that, I suspect it'd be better to do something like the below.
> > > 
> > > Will, thoughts?
> > 
> > I think perf is one example, but this is also visible to userspace via the
> > native ptrace interface and I distinctly remember needing this for some
> > versions of arm64 strace to work correctly when tracing compat tasks.
> 
> FWIW, you've convinced me on your approach (more on that below), but
> when I went digging here this didn't seem to be exposed via ptrace --
> for any task tracing a compat task, the GPRs are exposed via
> compat_gpr_{get,set}(), which always truncate to compat_ulong_t, giving
> the lower 32 bits. See task_user_regset_view() for where we get the
> regset.
> 
> Am I missing something, or are you thinking of another issue you fixed
> at the same time?

I think it may depend on whether strace pokes at the GPRs or instead issues
a PTRACE_GET_SYSCALL_INFO request but I've forgotten the details,
unfortunately. I do remember seeing an issue though, and it was only last
year.

> > So I do think that clearing the upper bits on the return path is the right
> > approach, but it sounds like we need some more work to handle syscall(-1)
> > and audit (what exactly is the problem here after these patches have been
> > applied?)
> 
> From digging a bit more, I think I agree, and I think these patches are
> sufficient for audit. I have some comments I'll leave separately.
> 
> The remaining issues are wherever we assign a signed value to a compat
> GPR without explicit truncation. That'll leak via perf sampling the user
> regs, but I haven't managed to convince myself whether that causes any
> functional change in behaviour for audit, seccomp, or syscall tracing.
> 
> Since we mostly use compat_ulong_t for intermediate values in compat
> code, it does look like this is only an issue for x0 where we assign an
> error value, e.g. the -ENOSYS case in el0_svc_common. I'll go see if I
> can find any more.
> 
> With those fixed up we can remove the x0 truncation from entry.S,
> which'd be nice too.

If we remove that then we should probably have a (debug?) check on the
return-to-user path just to make sure.

Will
Mark Rutland April 22, 2021, 4:42 p.m. UTC | #10
On Thu, Apr 22, 2021 at 05:07:53PM +0100, Will Deacon wrote:
> On Wed, Apr 21, 2021 at 06:10:05PM +0100, Mark Rutland wrote:
> > On Mon, Apr 19, 2021 at 01:19:33PM +0100, Will Deacon wrote:
> > > On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
> > > > I think this is a problem we created for ourselves back in commit:
> > > > 
> > > >   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
> > > > 
> > > > AFAICT, the perf regs samples are the only place this matters, since for
> > > > ptrace the compat regs are implicitly truncated to compat_ulong_t, and
> > > > audit expects the non-truncated return value. Other architectures don't
> > > > truncate here, so I think we're setting ourselves up for a game of
> > > > whack-a-mole to truncate and extend wherever we need to.
> > > > 
> > > > Given that, I suspect it'd be better to do something like the below.
> > > > 
> > > > Will, thoughts?
> > > 
> > > I think perf is one example, but this is also visible to userspace via the
> > > native ptrace interface and I distinctly remember needing this for some
> > > versions of arm64 strace to work correctly when tracing compat tasks.
> > 
> > FWIW, you've convinced me on your approach (more on that below), but
> > when I went digging here this didn't seem to be exposed via ptrace --
> > for any task tracing a compat task, the GPRs are exposed via
> > compat_gpr_{get,set}(), which always truncate to compat_ulong_t, giving
> > the lower 32 bits. See task_user_regset_view() for where we get the
> > regset.
> > 
> > Am I missing something, or are you thinking of another issue you fixed
> > at the same time?
> 
> I think it may depend on whether strace pokes at the GPRs or instead issues
> a PTRACE_GET_SYSCALL_INFO request but I've forgotten the details,
> unfortunately. I do remember seeing an issue though, and it was only last
> year.

Ah; I hadn't spotted PTRACE_GET_SYSCALL_INFO, thanks for the pointer. I
see that gets at the regs via syscall_get_arguments(), which doesn't
truncate them.

That makes me wonder whether x86 and others do the right thing here...

> > > So I do think that clearing the upper bits on the return path is the right
> > > approach, but it sounds like we need some more work to handle syscall(-1)
> > > and audit (what exactly is the problem here after these patches have been
> > > applied?)
> > 
> > From digging a bit more, I think I agree, and I think these patches are
> > sufficient for audit. I have some comments I'll leave separately.
> > 
> > The remaining issues are wherever we assign a signed value to a compat
> > GPR without explicit truncation. That'll leak via perf sampling the user
> > regs, but I haven't managed to convince myself whether that causes any
> > functional change in behaviour for audit, seccomp, or syscall tracing.
> > 
> > Since we mostly use compat_ulong_t for intermediate values in compat
> > code, it does look like this is only an issue for x0 where we assign an
> > error value, e.g. the -ENOSYS case in el0_svc_common. I'll go see if I
> > can find any more.
> > 
> > With those fixed up we can remove the x0 truncation from entry.S,
> > which'd be nice too.
> 
> If we remove that then we should probably have a (debug?) check on the
> return-to-user path just to make sure.

I've hacked that up locally; I'd certainly like to run that along with
some other sanity checks (e.g. PSTATE) while fuzzing, but agree that's
probably not something to enable in any defconfig build.

I'm happy to add that later as and when I remove the bit from entry.S.

Thanks,
Mark.
Dmitry V. Levin April 22, 2021, 6:57 p.m. UTC | #11
On Thu, Apr 22, 2021 at 05:42:28PM +0100, Mark Rutland wrote:
> On Thu, Apr 22, 2021 at 05:07:53PM +0100, Will Deacon wrote:
> > On Wed, Apr 21, 2021 at 06:10:05PM +0100, Mark Rutland wrote:
> > > On Mon, Apr 19, 2021 at 01:19:33PM +0100, Will Deacon wrote:
> > > > On Fri, Apr 16, 2021 at 02:34:41PM +0100, Mark Rutland wrote:
> > > > > I think this is a problem we created for ourselves back in commit:
> > > > > 
> > > > >   15956689a0e60aa0 ("arm64: compat: Ensure upper 32 bits of x0 are zero on syscall return")
> > > > > 
> > > > > AFAICT, the perf regs samples are the only place this matters, since for
> > > > > ptrace the compat regs are implicitly truncated to compat_ulong_t, and
> > > > > audit expects the non-truncated return value. Other architectures don't
> > > > > truncate here, so I think we're setting ourselves up for a game of
> > > > > whack-a-mole to truncate and extend wherever we need to.
> > > > > 
> > > > > Given that, I suspect it'd be better to do something like the below.
> > > > > 
> > > > > Will, thoughts?
> > > > 
> > > > I think perf is one example, but this is also visible to userspace via the
> > > > native ptrace interface and I distinctly remember needing this for some
> > > > versions of arm64 strace to work correctly when tracing compat tasks.
> > > 
> > > FWIW, you've convinced me on your approach (more on that below), but
> > > when I went digging here this didn't seem to be exposed via ptrace --
> > > for any task tracing a compat task, the GPRs are exposed via
> > > compat_gpr_{get,set}(), which always truncate to compat_ulong_t, giving
> > > the lower 32 bits. See task_user_regset_view() for where we get the
> > > regset.
> > > 
> > > Am I missing something, or are you thinking of another issue you fixed
> > > at the same time?
> > 
> > I think it may depend on whether strace pokes at the GPRs or instead issues
> > a PTRACE_GET_SYSCALL_INFO request but I've forgotten the details,
> > unfortunately. I do remember seeing an issue though, and it was only last
> > year.
> 
> Ah; I hadn't spotted PTRACE_GET_SYSCALL_INFO, thanks for the pointer. I
> see that gets at the regs via syscall_get_arguments(), which doesn't
> truncate them.
> 
> That makes me wonder whether x86 and others do the right thing here...

Yes, some architectures had to be fixed, but they mostly do the right
thing nowadays.

Feel free to use tools/testing/selftests/ptrace/get_syscall_info.c for
testing, or indeed use the strace test suite.

Patch

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
index e58bca832dff..3c415e9e5d85 100644
--- a/arch/arm64/include/asm/ptrace.h
+++ b/arch/arm64/include/asm/ptrace.h
@@ -328,6 +328,9 @@ static inline void regs_set_return_value(struct pt_regs *regs, unsigned long rc)
 	regs->regs[0] = rc;
 }
 
+extern inline int is_syscall_success(struct pt_regs *regs);
+#define is_syscall_success(regs) is_syscall_success(regs)
+
 /**
  * regs_get_kernel_argument() - get Nth function argument in kernel
  * @regs:	pt_regs of that context
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 170f42fd6101..3266201f8c60 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1909,3 +1909,13 @@ int valid_user_regs(struct user_pt_regs *regs, struct task_struct *task)
 	else
 		return valid_native_regs(regs);
 }
+
+inline int is_syscall_success(struct pt_regs *regs)
+{
+	unsigned long val = regs->regs[0];
+
+	if (is_compat_thread(task_thread_info(current)))
+		val = sign_extend64(val, 31);
+
+	return !IS_ERR_VALUE(val);
+}
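
The effect of the sign extension in the hunk above can be reproduced outside
the kernel. The following is a minimal userspace sketch, not the kernel
implementation: `sign_extend64_demo`, `is_err_value_demo` and
`syscall_success_demo` are hypothetical stand-ins that mimic `sign_extend64()`,
`IS_ERR_VALUE()` and the new `is_syscall_success()`, and `MAX_ERRNO_DEMO` is
assumed to be 4095 to match the kernel's MAX_ERRNO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match the kernel's MAX_ERRNO. */
#define MAX_ERRNO_DEMO 4095

/*
 * Sign-extend `value` from bit `index` (0-based) to 64 bits.
 * Like the kernel helper it mimics, this relies on the compiler
 * implementing >> on signed values as an arithmetic shift.
 */
static int64_t sign_extend64_demo(uint64_t value, int index)
{
	int shift = 63 - index;

	return (int64_t)(value << shift) >> shift;
}

/* True if x falls in the top MAX_ERRNO_DEMO values of the 64-bit range. */
static bool is_err_value_demo(uint64_t x)
{
	return x >= (uint64_t)-MAX_ERRNO_DEMO;
}

/*
 * Model of the patched check: for a compat task, the 32-bit return
 * value is sign-extended before the error-range test is applied.
 */
static bool syscall_success_demo(uint64_t x0, bool compat)
{
	uint64_t val = x0;

	if (compat)
		val = (uint64_t)sign_extend64_demo(val, 31);

	return !is_err_value_demo(val);
}
```

With x0 = 0xffffffda (a -ENOSYS whose upper 32 bits were cleared),
`syscall_success_demo(x0, false)` reports success, which is the misbehaviour
described in the commit message, while `syscall_success_demo(x0, true)`
reports failure.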