diff mbox series

riscv: entry: set a0 prior to syscall_handler

Message ID 20230711062202.3542367-1-CoelacanthusHex@gmail.com (mailing list archive)
State Superseded, archived
Headers show
Series riscv: entry: set a0 prior to syscall_handler | expand

Checks

Context Check Description
conchuod/cover_letter success Single patches do not need cover letters
conchuod/tree_selection success Guessed tree name to be fixes at HEAD 06c2afb862f9
conchuod/fixes_present success Fixes tag present in non-next series
conchuod/maintainers_pattern success MAINTAINERS pattern errors before the patch: 4 and now 4
conchuod/verify_signedoff success Signed-off-by tag matches author and committer
conchuod/kdoc success Errors and warnings before: 0 this patch: 0
conchuod/build_rv64_clang_allmodconfig success Errors and warnings before: 11 this patch: 11
conchuod/module_param success Was 0 now: 0
conchuod/build_rv64_gcc_allmodconfig success Errors and warnings before: 23 this patch: 23
conchuod/build_rv32_defconfig success Build OK
conchuod/dtb_warn_rv64 success Errors and warnings before: 3 this patch: 3
conchuod/header_inline success No static functions without inline keyword in header files
conchuod/checkpatch warning CHECK: From:/Signed-off-by: email comments mismatch: 'From: Celeste Liu <coelacanthushex@gmail.com>' != 'Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>' WARNING: ENOSYS means 'invalid syscall nr' and nothing else WARNING: Reported-by: should be immediately followed by Closes: with a URL to the report
conchuod/build_rv64_nommu_k210_defconfig success Build OK
conchuod/verify_fixes success Fixes tag looks correct
conchuod/build_rv64_nommu_virt_defconfig success Build OK

Commit Message

Celeste Liu July 11, 2023, 6:21 a.m. UTC
When we test seccomp with 6.4 kernel, we found errno has wrong value.
If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d, we will
get ENOSYS. We got same result with 9c2598d43510 ("riscv: entry: Save a0
prior syscall_enter_from_user_mode()").

Compared with x86 and loongarch's implementation of this part of the
function, we think that regs->a0 = -ENOSYS should be advanced before
syscall_handler to fix this problem. We have written the following patch,
which can fix this problem after testing. But we don't know enough about
this part of the code to explain the root cause. Hope someone can find
a reasonable explanation. And we'd like to reword this commit message
according to the explanation in v2

Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
Reported-by: Felix Yan <felixonmars@archlinux.org>
Co-developed-by: Ruizhe Pan <c141028@gmail.com>
Signed-off-by: Ruizhe Pan <c141028@gmail.com>
Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Tested-by: Felix Yan <felixonmars@archlinux.org>
---
 arch/riscv/kernel/traps.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

Comments

Guo Ren July 13, 2023, midnight UTC | #1
On Tue, Jul 11, 2023 at 2:22 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
>
> When we test seccomp with 6.4 kernel, we found errno has wrong value.
> If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d, we will
> get ENOSYS. We got same result with 9c2598d43510 ("riscv: entry: Save a0
> prior syscall_enter_from_user_mode()").
>
> Compared with x86 and loongarch's implementation of this part of the
> function, we think that regs->a0 = -ENOSYS should be advanced before
> syscall_handler to fix this problem. We have written the following patch,
> which can fix this problem after testing. But we don't know enough about
> this part of the code to explain the root cause. Hope someone can find
> a reasonable explanation. And we'd like to reword this commit message
> according to the explanation in v2
>
> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> Reported-by: Felix Yan <felixonmars@archlinux.org>
> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> Tested-by: Felix Yan <felixonmars@archlinux.org>
> ---
>  arch/riscv/kernel/traps.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> index f910dfccbf5d2..ccadb5ffd063c 100644
> --- a/arch/riscv/kernel/traps.c
> +++ b/arch/riscv/kernel/traps.c
> @@ -301,6 +301,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>
>                 regs->epc += 4;
>                 regs->orig_a0 = regs->a0;
> +               regs->a0 = -ENOSYS;
Oh, no. You destroyed the a0 for syscall_handler, right? It's not
reasonable. Let's see which syscall_handler needs a0=-ENOSYS.

Could you give out more detail on your test case?

>
>                 riscv_v_vstate_discard(regs);
>
> @@ -308,8 +309,6 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>
>                 if (syscall < NR_syscalls)
>                         syscall_handler(regs, syscall);
> -               else
> -                       regs->a0 = -ENOSYS;
>
>                 syscall_exit_to_user_mode(regs);
>         } else {
> --
> 2.41.0
>
Celeste Liu July 13, 2023, 5:26 a.m. UTC | #2
On 2023/7/13 08:00, Guo Ren wrote:
> On Tue, Jul 11, 2023 at 2:22 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
>>
>> When we test seccomp with 6.4 kernel, we found errno has wrong value.
>> If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d, we will
>> get ENOSYS. We got same result with 9c2598d43510 ("riscv: entry: Save a0
>> prior syscall_enter_from_user_mode()").
>>
>> Compared with x86 and loongarch's implementation of this part of the
>> function, we think that regs->a0 = -ENOSYS should be advanced before
>> syscall_handler to fix this problem. We have written the following patch,
>> which can fix this problem after testing. But we don't know enough about
>> this part of the code to explain the root cause. Hope someone can find
>> a reasonable explanation. And we'd like to reword this commit message
>> according to the explanation in v2
>>
>> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
>> Reported-by: Felix Yan <felixonmars@archlinux.org>
>> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
>> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
>> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
>> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
>> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
>> Tested-by: Felix Yan <felixonmars@archlinux.org>
>> ---
>>  arch/riscv/kernel/traps.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index f910dfccbf5d2..ccadb5ffd063c 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -301,6 +301,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>
>>                 regs->epc += 4;
>>                 regs->orig_a0 = regs->a0;
>> +               regs->a0 = -ENOSYS;
> Oh, no. You destroyed the a0 for syscall_handler, right? It's not
> reasonable. Let's see which syscall_handler needs a0=-ENOSYS.
> 
> Could you give out more detail on your test case?
> 

Our test case is here:

int main() {
  scmp_filter_ctx ctx;
  ctx = seccomp_init(SCMP_ACT_ALLOW);

  seccomp_rule_add_exact(ctx, SCMP_ACT_ERRNO(EAFNOSUPPORT), SCMP_SYS(socket), 2,
                         SCMP_CMP(0, SCMP_CMP_EQ, AF_NETLINK),
                         SCMP_CMP(2, SCMP_CMP_EQ, NETLINK_AUDIT));

  if (seccomp_load(ctx) < 0) {
    perror("seccomp_load");
    exit(EXIT_FAILURE);
  }

  int sock_fd1 = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
  if (sock_fd1 < 0) {
    printf("1st socket syscall failed with return value %d and errno %d (%s), which is unexpected\n",
           sock_fd1, errno, strerror(errno));
    seccomp_release(ctx);
    return 1;
  }
  printf("1st socket created successfully, as expected.\n");

  int sock_fd2 = socket(AF_NETLINK, SOCK_RAW, NETLINK_AUDIT);
  if (sock_fd2 < 0) {
    printf("2nd socket syscall failed with return value %d and errno %d (%s).\n", sock_fd2, errno,
           strerror(errno));

    if (errno == EAFNOSUPPORT) {
      printf("2nd socket syscall failed with EAFNOSUPPORT, as expected.\n");
      seccomp_release(ctx);
      return 0;
    } else {
      printf("2nd socket syscall failed with unexpected errno, which is unexpected.\n");
      seccomp_release(ctx);
      return 1;
    }
  }
  printf("2nd socket created successfully, which is unexpected.\n");
  seccomp_release(ctx);

  return 2;
}

>>
>>                 riscv_v_vstate_discard(regs);
>>
>> @@ -308,8 +309,6 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>
>>                 if (syscall < NR_syscalls)
>>                         syscall_handler(regs, syscall);
>> -               else
>> -                       regs->a0 = -ENOSYS;
>>
>>                 syscall_exit_to_user_mode(regs);
>>         } else {
>> --
>> 2.41.0
>>
> 
>
Celeste Liu July 13, 2023, 3:28 p.m. UTC | #3
On 2023/7/13 08:00, Guo Ren wrote:
> On Tue, Jul 11, 2023 at 2:22 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
>>
>> When we test seccomp with 6.4 kernel, we found errno has wrong value.
>> If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d, we will
>> get ENOSYS. We got same result with 9c2598d43510 ("riscv: entry: Save a0
>> prior syscall_enter_from_user_mode()").
>>
>> Compared with x86 and loongarch's implementation of this part of the
>> function, we think that regs->a0 = -ENOSYS should be advanced before
>> syscall_handler to fix this problem. We have written the following patch,
>> which can fix this problem after testing. But we don't know enough about
>> this part of the code to explain the root cause. Hope someone can find
>> a reasonable explanation. And we'd like to reword this commit message
>> according to the explanation in v2
>>
>> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
>> Reported-by: Felix Yan <felixonmars@archlinux.org>
>> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
>> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
>> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
>> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
>> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
>> Tested-by: Felix Yan <felixonmars@archlinux.org>
>> ---
>>  arch/riscv/kernel/traps.c | 3 +--
>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index f910dfccbf5d2..ccadb5ffd063c 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -301,6 +301,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>
>>                 regs->epc += 4;
>>                 regs->orig_a0 = regs->a0;
>> +               regs->a0 = -ENOSYS;
> Oh, no. You destroyed the a0 for syscall_handler, right? It's not
> reasonable. Let's see which syscall_handler needs a0=-ENOSYS.

syscall_handler always use orig_a0, not a0.
And I have a mistake in original email, corret one is
syscall_enter_from_user_mode not syscall_handler.

> Could you give out more detail on your test case?
> 
>>
>>                 riscv_v_vstate_discard(regs);
>>
>> @@ -308,8 +309,6 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
>>
>>                 if (syscall < NR_syscalls)
>>                         syscall_handler(regs, syscall);
>> -               else
>> -                       regs->a0 = -ENOSYS;
>>
>>                 syscall_exit_to_user_mode(regs);
>>         } else {
>> --
>> 2.41.0
>>
> 
>
Guo Ren July 13, 2023, 4:59 p.m. UTC | #4
On Thu, Jul 13, 2023 at 11:28 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
>
>
> On 2023/7/13 08:00, Guo Ren wrote:
> > On Tue, Jul 11, 2023 at 2:22 AM Celeste Liu <coelacanthushex@gmail.com> wrote:
> >>
> >> When we test seccomp with 6.4 kernel, we found errno has wrong value.
> >> If we deny NETLINK_AUDIT with EAFNOSUPPORT, after f0bddf50586d, we will
> >> get ENOSYS. We got same result with 9c2598d43510 ("riscv: entry: Save a0
> >> prior syscall_enter_from_user_mode()").
> >>
> >> Compared with x86 and loongarch's implementation of this part of the
> >> function, we think that regs->a0 = -ENOSYS should be advanced before
> >> syscall_handler to fix this problem. We have written the following patch,
> >> which can fix this problem after testing. But we don't know enough about
> >> this part of the code to explain the root cause. Hope someone can find
> >> a reasonable explanation. And we'd like to reword this commit message
> >> according to the explanation in v2
> >>
> >> Fixes: f0bddf50586d ("riscv: entry: Convert to generic entry")
> >> Reported-by: Felix Yan <felixonmars@archlinux.org>
> >> Co-developed-by: Ruizhe Pan <c141028@gmail.com>
> >> Signed-off-by: Ruizhe Pan <c141028@gmail.com>
> >> Co-developed-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> >> Signed-off-by: Shiqi Zhang <shiqi@isrc.iscas.ac.cn>
> >> Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
> >> Tested-by: Felix Yan <felixonmars@archlinux.org>
> >> ---
> >>  arch/riscv/kernel/traps.c | 3 +--
> >>  1 file changed, 1 insertion(+), 2 deletions(-)
> >>
> >> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
> >> index f910dfccbf5d2..ccadb5ffd063c 100644
> >> --- a/arch/riscv/kernel/traps.c
> >> +++ b/arch/riscv/kernel/traps.c
> >> @@ -301,6 +301,7 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> >>
> >>                 regs->epc += 4;
> >>                 regs->orig_a0 = regs->a0;
> >> +               regs->a0 = -ENOSYS;
> > Oh, no. You destroyed the a0 for syscall_handler, right? It's not
> > reasonable. Let's see which syscall_handler needs a0=-ENOSYS.
>
> syscall_handler always use orig_a0, not a0.
> And I have a mistake in original email, corret one is
> syscall_enter_from_user_mode not syscall_handler.
I misunderstood. Yes, a0 would be replaced by orig_a0:
syscall_enter_from_user_mode_work ->  syscall_rollback

If the syscall was denied by syscall_enter_from_user_mode(), the
return number is forced to be -ENOSYS. Maybe regs->a0 has already been
updated by SYSCALL_WORK_SECCOMP. eg:

__seccomp_filter() {
...
        case SECCOMP_RET_TRAP:
                /* Show the handler the original registers. */
                syscall_rollback(current, current_pt_regs());
                /* Let the filter pass back 16 bits of data. */
                force_sig_seccomp(this_syscall, data, false);
                goto skip;

>
> > Could you give out more detail on your test case?
> >
> >>
> >>                 riscv_v_vstate_discard(regs);
> >>
> >> @@ -308,8 +309,6 @@ asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
> >>
> >>                 if (syscall < NR_syscalls)
> >>                         syscall_handler(regs, syscall);
> >> -               else
> >> -                       regs->a0 = -ENOSYS;
> >>
> >>                 syscall_exit_to_user_mode(regs);
> >>         } else {
> >> --
> >> 2.41.0
> >>
> >
> >
>
diff mbox series

Patch

diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
index f910dfccbf5d2..ccadb5ffd063c 100644
--- a/arch/riscv/kernel/traps.c
+++ b/arch/riscv/kernel/traps.c
@@ -301,6 +301,7 @@  asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
 
 		regs->epc += 4;
 		regs->orig_a0 = regs->a0;
+		regs->a0 = -ENOSYS;
 
 		riscv_v_vstate_discard(regs);
 
@@ -308,8 +309,6 @@  asmlinkage __visible __trap_section void do_trap_ecall_u(struct pt_regs *regs)
 
 		if (syscall < NR_syscalls)
 			syscall_handler(regs, syscall);
-		else
-			regs->a0 = -ENOSYS;
 
 		syscall_exit_to_user_mode(regs);
 	} else {