Message ID | 1415346443-28915-1-git-send-email-takahiro.akashi@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote: > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. > It can be used to change a system call number as follows: > ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); > 'new_syscall_no' can be -1 to skip this system call, you need to modify > a register's value, in arch-specific way, as return value though. > > Please note that we can't define PTRACE_SET_SYSCALL macro in > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another > request on sparc. > > This patch also contains an example of change on arch side, arm. > Only syscall_set_nr() is required to be defined in asm/syscall.h. > > Currently only arm has this request, while arm64 would also have it > once my patch series of seccomp for arm64 is merged. It will also be > usable for most of other arches. > See the discussions in lak-ml: > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> > Can you describe why you are moving the implementation? Is this a feature that we want to have on all architectures in the future? As you say, only arm32 implements is at the moment. Arnd
On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote: > On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote: > > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. > > It can be used to change a system call number as follows: > > ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); > > 'new_syscall_no' can be -1 to skip this system call, you need to modify > > a register's value, in arch-specific way, as return value though. > > > > Please note that we can't define PTRACE_SET_SYSCALL macro in > > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another > > request on sparc. > > > > This patch also contains an example of change on arch side, arm. > > Only syscall_set_nr() is required to be defined in asm/syscall.h. > > > > Currently only arm has this request, while arm64 would also have it > > once my patch series of seccomp for arm64 is merged. It will also be > > usable for most of other arches. > > See the discussions in lak-ml: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html > > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> > > > > Can you describe why you are moving the implementation? Is this a feature > that we want to have on all architectures in the future? As you say, > only arm32 implements is at the moment. We need this for arm64 and, since all architectures seem to have a mechanism for setting a system call via ptrace, moving it to generic code should make sense for new architectures too, no? We don't have any arch-specific ptrace requests on arm64, so it would be a shame if we had to add one now, especially since there's nothing conceptually arch-specific about setting a syscall number. Will
On Friday 07 November 2014 11:55:51 Will Deacon wrote: > On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote: > > On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote: > > > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. > > > It can be used to change a system call number as follows: > > > ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); > > > 'new_syscall_no' can be -1 to skip this system call, you need to modify > > > a register's value, in arch-specific way, as return value though. > > > > > > Please note that we can't define PTRACE_SET_SYSCALL macro in > > > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another > > > request on sparc. > > > > > > This patch also contains an example of change on arch side, arm. > > > Only syscall_set_nr() is required to be defined in asm/syscall.h. > > > > > > Currently only arm has this request, while arm64 would also have it > > > once my patch series of seccomp for arm64 is merged. It will also be > > > usable for most of other arches. > > > See the discussions in lak-ml: > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html > > > > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> > > > > > > > Can you describe why you are moving the implementation? Is this a feature > > that we want to have on all architectures in the future? As you say, > > only arm32 implements is at the moment. > > We need this for arm64 and, since all architectures seem to have a mechanism > for setting a system call via ptrace, moving it to generic code should make > sense for new architectures too, no? It makes a little more sense now, but I still don't understand why you need to set the system call number via ptrace. What is this used for, and why doesn't any other architecture have this? Arnd
On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote: > On Friday 07 November 2014 11:55:51 Will Deacon wrote: > > We need this for arm64 and, since all architectures seem to have a mechanism > > for setting a system call via ptrace, moving it to generic code should make > > sense for new architectures too, no? > > It makes a little more sense now, but I still don't understand why you > need to set the system call number via ptrace. What is this used for, > and why doesn't any other architecture have this? All other architectures have a way. x86, for example, you set orig_eax (or orig_rax) to change the syscall number. On ARM, that doesn't work because we don't always pass the syscall number in a register.
On Fri, Nov 07, 2014 at 12:03:00PM +0000, Arnd Bergmann wrote: > On Friday 07 November 2014 11:55:51 Will Deacon wrote: > > On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote: > > > On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote: > > > > This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. > > > > It can be used to change a system call number as follows: > > > > ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); > > > > 'new_syscall_no' can be -1 to skip this system call, you need to modify > > > > a register's value, in arch-specific way, as return value though. > > > > > > > > Please note that we can't define PTRACE_SET_SYSCALL macro in > > > > uapi/linux/ptrace.h partly because its value on arm, 23, is used as another > > > > request on sparc. > > > > > > > > This patch also contains an example of change on arch side, arm. > > > > Only syscall_set_nr() is required to be defined in asm/syscall.h. > > > > > > > > Currently only arm has this request, while arm64 would also have it > > > > once my patch series of seccomp for arm64 is merged. It will also be > > > > usable for most of other arches. > > > > See the discussions in lak-ml: > > > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html > > > > > > > > Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> > > > > > > > > > > Can you describe why you are moving the implementation? Is this a feature > > > that we want to have on all architectures in the future? As you say, > > > only arm32 implements is at the moment. > > > > We need this for arm64 and, since all architectures seem to have a mechanism > > for setting a system call via ptrace, moving it to generic code should make > > sense for new architectures too, no? > > It makes a little more sense now, but I still don't understand why you > need to set the system call number via ptrace. What is this used for, > and why doesn't any other architecture have this? I went through the same thought process back in August, and Akashi eventually convinced me that this was the best thing to do: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/278692.html It comes down to a debugger (which could be GDB, seccomp, tracer ...) wanting to change the system call number. This is also used as a mechanism to skip a system call by setting it to '-1' (yeah, it's gross, and the interaction between all of these syscall hooks is horrible too). If we update w8 directly instead, we run into a couple of issues: - Needing to restore the original w8 if the value is set to '-1' for skip, but continuing to return -ENOSYS for syscall(-1) when not on a tracer path - seccomp assumes that syscall_get_nr will return the version set by the most recent tracer, so then we need hacks in ptrace to route register writes to w8 to syscallno in pt_regs, but again, only in the case that we're tracing. Akashi might be able to elaborate on other problems, since this was a couple of months ago and I take every opportunity I can to avoid looking at this part of the kernel. Will
On Friday 07 November 2014 12:11:19 Russell King - ARM Linux wrote: > On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote: > > On Friday 07 November 2014 11:55:51 Will Deacon wrote: > > > We need this for arm64 and, since all architectures seem to have a mechanism > > > for setting a system call via ptrace, moving it to generic code should make > > > sense for new architectures too, no? > > > > It makes a little more sense now, but I still don't understand why you > > need to set the system call number via ptrace. What is this used for, > > and why doesn't any other architecture have this? > > All other architectures have a way. x86, for example, you set orig_eax > (or orig_rax) to change the syscall number. On ARM, that doesn't work > because we don't always pass the syscall number in a register. > Sorry for being slow today, but why can't we use the same interface that s390 has on arm64: static int s390_system_call_get(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, void *kbuf, void __user *ubuf) { unsigned int *data = &task_thread_info(target)->system_call; return user_regset_copyout(&pos, &count, &kbuf, &ubuf, data, 0, sizeof(unsigned int)); } static int s390_system_call_set(struct task_struct *target, const struct user_regset *regset, unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { unsigned int *data = &task_thread_info(target)->system_call; return user_regset_copyin(&pos, &count, &kbuf, &ubuf, data, 0, sizeof(unsigned int)); } static const struct user_regset s390_regsets[] = { ... { .core_note_type = NT_S390_SYSTEM_CALL, .n = 1, .size = sizeof(unsigned int), .align = sizeof(unsigned int), .get = s390_system_call_get, .set = s390_system_call_set, }, ... }; Is it just preference for being consistent with ARM32, or is there a reason this won't work? It's not that I care strongly about the interface, my main point is that the changelog doesn't describe why one interface was used instead the other. Arnd
On Fri, Nov 07, 2014 at 12:44:07PM +0000, Arnd Bergmann wrote: > On Friday 07 November 2014 12:11:19 Russell King - ARM Linux wrote: > > On Fri, Nov 07, 2014 at 01:03:00PM +0100, Arnd Bergmann wrote: > > > On Friday 07 November 2014 11:55:51 Will Deacon wrote: > > > > We need this for arm64 and, since all architectures seem to have a mechanism > > > > for setting a system call via ptrace, moving it to generic code should make > > > > sense for new architectures too, no? > > > > > > It makes a little more sense now, but I still don't understand why you > > > need to set the system call number via ptrace. What is this used for, > > > and why doesn't any other architecture have this? > > > > All other architectures have a way. x86, for example, you set orig_eax > > (or orig_rax) to change the syscall number. On ARM, that doesn't work > > because we don't always pass the syscall number in a register. > > > > Sorry for being slow today, but why can't we use the same interface that > s390 has on arm64: > > static int s390_system_call_get(struct task_struct *target, > const struct user_regset *regset, > unsigned int pos, unsigned int count, > void *kbuf, void __user *ubuf) > { > unsigned int *data = &task_thread_info(target)->system_call; > return user_regset_copyout(&pos, &count, &kbuf, &ubuf, > data, 0, sizeof(unsigned int)); > } > > static int s390_system_call_set(struct task_struct *target, > const struct user_regset *regset, > unsigned int pos, unsigned int count, > const void *kbuf, const void __user *ubuf) > { > unsigned int *data = &task_thread_info(target)->system_call; > return user_regset_copyin(&pos, &count, &kbuf, &ubuf, > data, 0, sizeof(unsigned int)); > } > > static const struct user_regset s390_regsets[] = { > ... > { > .core_note_type = NT_S390_SYSTEM_CALL, > .n = 1, > .size = sizeof(unsigned int), > .align = sizeof(unsigned int), > .get = s390_system_call_get, > .set = s390_system_call_set, > }, > ... > }; > > Is it just preference for being consistent with ARM32, or is there a > reason this won't work? Interesting, I hadn't considered a unit-length regset. > It's not that I care strongly about the interface, my main point is > that the changelog doesn't describe why one interface was used instead > the other. I suspect the current approach was taken because it follows the same scheme as 32-bit ARM. If both methods are sufficient (Kees would have a better idea than me on that), then I don't have a strong preference. Will
On 11/07, AKASHI Takahiro wrote: > > --- a/arch/arm/kernel/ptrace.c > +++ b/arch/arm/kernel/ptrace.c > @@ -853,11 +853,6 @@ long arch_ptrace(struct task_struct *child, long request, > datap); > break; > > - case PTRACE_SET_SYSCALL: > - task_thread_info(child)->syscall = data; > - ret = 0; > - break; > - > #ifdef CONFIG_CRUNCH > case PTRACE_GETCRUNCHREGS: > ret = ptrace_getcrunchregs(child, datap); > diff --git a/kernel/ptrace.c b/kernel/ptrace.c > index 54e7522..d7048fa 100644 > --- a/kernel/ptrace.c > +++ b/kernel/ptrace.c > @@ -1001,6 +1001,12 @@ int ptrace_request(struct task_struct *child, long request, > break; > } > #endif > + > +#ifdef PTRACE_SET_SYSCALL > + case PTRACE_SET_SYSCALL: > + ret = syscall_set_nr(child, task_pt_regs(child), data); > + break; > +#endif I too do not understand why it makes sense to move PTRACE_SET_SYSCALL into the common kernel/ptrace.c. To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() is very much arch-dependant (but most probably trivial) means that this code should live in arch_ptrace(). In any case, I think it doesn't make sense to pass task_pt_regs(child), this helper can do this itself if it needs struct pt_regs. Oleg.
On Friday 07 November 2014 13:11:30 Will Deacon wrote: > > > It's not that I care strongly about the interface, my main point is > > that the changelog doesn't describe why one interface was used instead > > the other. > > I suspect the current approach was taken because it follows the same scheme > as 32-bit ARM. If both methods are sufficient (Kees would have a better idea > than me on that), then I don't have a strong preference. Using the regset would probably address Oleg's comment, and would keep the implementation architecture specific. You could even share the NT_S390_SYSTEM_CALL number, but I don't know if there any downsides to doing that. Arnd
On Fri, Nov 7, 2014 at 6:30 AM, Arnd Bergmann <arnd@arndb.de> wrote: > On Friday 07 November 2014 13:11:30 Will Deacon wrote: >> >> > It's not that I care strongly about the interface, my main point is >> > that the changelog doesn't describe why one interface was used instead >> > the other. >> >> I suspect the current approach was taken because it follows the same scheme >> as 32-bit ARM. If both methods are sufficient (Kees would have a better idea >> than me on that), then I don't have a strong preference. > > Using the regset would probably address Oleg's comment, and would keep the > implementation architecture specific. You could even share the NT_S390_SYSTEM_CALL > number, but I don't know if there any downsides to doing that. That's fine by me -- I only want an interface. :) I think it'd be nice to keep it the same between arm32 and arm64, but using a specific regset does seem to be the better approach. -Kees
Not that I'm actually involved any more, but I'd endorse the user_regset approach and not the new request. On many (most?) machines, it's already part of the main integer regset (orig_rax et al) and adding another mechanism is redundant. Using user_regset also means there won't be a word of hidden state that is not visible in core dumps.
On 11/07/2014 09:27 PM, Will Deacon wrote: > On Fri, Nov 07, 2014 at 12:03:00PM +0000, Arnd Bergmann wrote: >> On Friday 07 November 2014 11:55:51 Will Deacon wrote: >>> On Fri, Nov 07, 2014 at 09:30:53AM +0000, Arnd Bergmann wrote: >>>> On Friday 07 November 2014 16:47:23 AKASHI Takahiro wrote: >>>>> This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. >>>>> It can be used to change a system call number as follows: >>>>> ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); >>>>> 'new_syscall_no' can be -1 to skip this system call, you need to modify >>>>> a register's value, in arch-specific way, as return value though. >>>>> >>>>> Please note that we can't define PTRACE_SET_SYSCALL macro in >>>>> uapi/linux/ptrace.h partly because its value on arm, 23, is used as another >>>>> request on sparc. >>>>> >>>>> This patch also contains an example of change on arch side, arm. >>>>> Only syscall_set_nr() is required to be defined in asm/syscall.h. >>>>> >>>>> Currently only arm has this request, while arm64 would also have it >>>>> once my patch series of seccomp for arm64 is merged. It will also be >>>>> usable for most of other arches. >>>>> See the discussions in lak-ml: >>>>> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html >>>>> >>>>> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> >>>>> >>>> >>>> Can you describe why you are moving the implementation? Is this a feature >>>> that we want to have on all architectures in the future? As you say, >>>> only arm32 implements is at the moment. >>> >>> We need this for arm64 and, since all architectures seem to have a mechanism >>> for setting a system call via ptrace, moving it to generic code should make >>> sense for new architectures too, no? >> >> It makes a little more sense now, but I still don't understand why you >> need to set the system call number via ptrace. What is this used for, >> and why doesn't any other architecture have this? > > I went through the same thought process back in August, and Akashi > eventually convinced me that this was the best thing to do: > > http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/278692.html > > It comes down to a debugger (which could be GDB, seccomp, tracer ...) > wanting to change the system call number. This is also used as a mechanism > to skip a system call by setting it to '-1' (yeah, it's gross, and the > interaction between all of these syscall hooks is horrible too). > > If we update w8 directly instead, we run into a couple of issues: > > - Needing to restore the original w8 if the value is set to '-1' for > skip, but continuing to return -ENOSYS for syscall(-1) when not on a > tracer path Yeah, this restriction still exists on my recent patch, v7. (this is because arm64 uses the same register, x0, as the first argument and a return value.) > - seccomp assumes that syscall_get_nr will return the version set by > the most recent tracer, so then we need hacks in ptrace to route > register writes to w8 to syscallno in pt_regs, but again, only in the > case that we're tracing. The problem here is that, if we had a hack of replacinging syscallno with w8 in ptrace (ptrace_syscall_enter()), secure_computing() (actually, seccomp_phase2() on v3.18-rc) would have no chance of seeing a modified syscall number because the hack would be executed after secure_computing(). (Please note that a tracer simply modifies w8, not syscallno directly). This eventually results in missing a special case of -1 (skipping this system call). http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/280846.html That is why we needed to have a dedicated new interface. -Takahiro AKASHI > Akashi might be able to elaborate on other problems, since this was a > couple of months ago and I take every opportunity I can to avoid looking > at this part of the kernel. > > Will >
Will, On 11/07/2014 11:04 PM, Oleg Nesterov wrote: > On 11/07, AKASHI Takahiro wrote: >> >> --- a/arch/arm/kernel/ptrace.c >> +++ b/arch/arm/kernel/ptrace.c >> @@ -853,11 +853,6 @@ long arch_ptrace(struct task_struct *child, long request, >> datap); >> break; >> >> - case PTRACE_SET_SYSCALL: >> - task_thread_info(child)->syscall = data; >> - ret = 0; >> - break; >> - >> #ifdef CONFIG_CRUNCH >> case PTRACE_GETCRUNCHREGS: >> ret = ptrace_getcrunchregs(child, datap); >> diff --git a/kernel/ptrace.c b/kernel/ptrace.c >> index 54e7522..d7048fa 100644 >> --- a/kernel/ptrace.c >> +++ b/kernel/ptrace.c >> @@ -1001,6 +1001,12 @@ int ptrace_request(struct task_struct *child, long request, >> break; >> } >> #endif >> + >> +#ifdef PTRACE_SET_SYSCALL >> + case PTRACE_SET_SYSCALL: >> + ret = syscall_set_nr(child, task_pt_regs(child), data); >> + break; >> +#endif > > I too do not understand why it makes sense to move PTRACE_SET_SYSCALL into > the common kernel/ptrace.c. I think I explained why we need a new (atomic) interface of changing a system call number while tracing with ptrace. But I don't have a strong preference, either ptrace(SET_SYSCALL) or ptrace(SETREGSET, NT_SYSTEM_CALL). > To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() > is very much arch-dependant (but most probably trivial) means that this code > should live in arch_ptrace(). Thinking of Oleg's comment above, it doesn't make sense neither to define generic NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). Since we should have the same interface on arm and arm64, we'd better implement ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). -Takahiro AKASHI > In any case, I think it doesn't make sense to pass task_pt_regs(child), this > helper can do this itself if it needs struct pt_regs. > > Oleg. >
On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: > On 11/07/2014 11:04 PM, Oleg Nesterov wrote: > > To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() > > is very much arch-dependant (but most probably trivial) means that this code > > should live in arch_ptrace(). > > Thinking of Oleg's comment above, it doesn't make sense neither to define generic > NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() > in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). > > Since we should have the same interface on arm and arm64, we'd better implement > ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). I think the regset approach is cleaner. We already do something similar for TLS. That would be implemented under arch/arm64/ with it's own NT type. Will
On 11/12/2014 08:00 PM, Will Deacon wrote: > On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: >> On 11/07/2014 11:04 PM, Oleg Nesterov wrote: >>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() >>> is very much arch-dependant (but most probably trivial) means that this code >>> should live in arch_ptrace(). >> >> Thinking of Oleg's comment above, it doesn't make sense neither to define generic >> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() >> in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). >> >> Since we should have the same interface on arm and arm64, we'd better implement >> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). > > I think the regset approach is cleaner. We already do something similar for > TLS. That would be implemented under arch/arm64/ with it's own NT type. Okey, so arm64 goes its own way :) Or do you want to have a similar regset on arm, too? (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h) -Takahiro AKASHI > Will >
On Wed, Nov 12, 2014 at 11:06:59AM +0000, AKASHI Takahiro wrote: > On 11/12/2014 08:00 PM, Will Deacon wrote: > > On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: > >> On 11/07/2014 11:04 PM, Oleg Nesterov wrote: > >>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() > >>> is very much arch-dependant (but most probably trivial) means that this code > >>> should live in arch_ptrace(). > >> > >> Thinking of Oleg's comment above, it doesn't make sense neither to define generic > >> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() > >> in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). > >> > >> Since we should have the same interface on arm and arm64, we'd better implement > >> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). > > > > I think the regset approach is cleaner. We already do something similar for > > TLS. That would be implemented under arch/arm64/ with it's own NT type. > > Okey, so arm64 goes its own way :) > Or do you want to have a similar regset on arm, too? > (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h) Just do arm64. We already have the dedicated request for arch/arm/. Will
On Wednesday 12 November 2014 11:13:52 Will Deacon wrote: > On Wed, Nov 12, 2014 at 11:06:59AM +0000, AKASHI Takahiro wrote: > > On 11/12/2014 08:00 PM, Will Deacon wrote: > > > On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: > > >> On 11/07/2014 11:04 PM, Oleg Nesterov wrote: > > >>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() > > >>> is very much arch-dependant (but most probably trivial) means that this code > > >>> should live in arch_ptrace(). > > >> > > >> Thinking of Oleg's comment above, it doesn't make sense neither to define generic > > >> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() > > >> in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). > > >> > > >> Since we should have the same interface on arm and arm64, we'd better implement > > >> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). > > > > > > I think the regset approach is cleaner. We already do something similar for > > > TLS. That would be implemented under arch/arm64/ with it's own NT type. > > > > Okey, so arm64 goes its own way > > Or do you want to have a similar regset on arm, too? > > (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h) > > Just do arm64. We already have the dedicated request for arch/arm/. I wonder if we should define NT_ARM64_SYSTEM_CALL to the same value as NT_S390_SYSTEM_CALL (0x307), or even define it as an architecture- independent NT_SYSTEM_CALL number with that value, so other architectures don't have to introduce new types when they also want to implement it. Arnd
On Wed, Nov 12, 2014 at 12:19:26PM +0100, Arnd Bergmann wrote: > On Wednesday 12 November 2014 11:13:52 Will Deacon wrote: > > Just do arm64. We already have the dedicated request for arch/arm/. > > I wonder if we should define NT_ARM64_SYSTEM_CALL to the same value > as NT_S390_SYSTEM_CALL (0x307), or even define it as an architecture- > independent NT_SYSTEM_CALL number with that value, so other architectures > don't have to introduce new types when they also want to implement it. That would be a sane thing to do, so that tools which want to get at this information can do in an almost standardised way for architectures implementing this method.
On 11/12/2014 08:19 PM, Arnd Bergmann wrote: > On Wednesday 12 November 2014 11:13:52 Will Deacon wrote: >> On Wed, Nov 12, 2014 at 11:06:59AM +0000, AKASHI Takahiro wrote: >>> On 11/12/2014 08:00 PM, Will Deacon wrote: >>>> On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: >>>>> On 11/07/2014 11:04 PM, Oleg Nesterov wrote: >>>>>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() >>>>>> is very much arch-dependant (but most probably trivial) means that this code >>>>>> should live in arch_ptrace(). >>>>> >>>>> Thinking of Oleg's comment above, it doesn't make sense neither to define generic >>>>> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() >>>>> in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). >>>>> >>>>> Since we should have the same interface on arm and arm64, we'd better implement >>>>> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). >>>> >>>> I think the regset approach is cleaner. We already do something similar for >>>> TLS. That would be implemented under arch/arm64/ with it's own NT type. >>> >>> Okey, so arm64 goes its own way >>> Or do you want to have a similar regset on arm, too? >>> (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h) >> >> Just do arm64. We already have the dedicated request for arch/arm/. > > I wonder if we should define NT_ARM64_SYSTEM_CALL to the same value > as NT_S390_SYSTEM_CALL (0x307), or even define it as an architecture- > independent NT_SYSTEM_CALL number with that value, so other architectures > don't have to introduce new types when they also want to implement it. I digged into gdb code (gdb/bfd/elf.c): https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf.c;h=8b207ad872a3992381e93bdfa0a75ef444651613;hb=HEAD elf_parse_notes()->elfcore_grok_note()->elfcore_grok_s390_system_call() It seems to me that NT_S390_SYSTEM_CALL(=0x307) is recognized as a s390 specific value (without checking for machine type). So thinking of potential conflict, it might not be a good idea to use this value as a common number (of NT_SYSTEM_CALL). It's very unlikely that a "note" section for NT_(S390_)SYSTEM_CALL appears in a coredump file, though. What do you think? -Takahiro AKASHI > > Arnd >
On Thursday 13 November 2014 16:02:49 AKASHI Takahiro wrote: > On 11/12/2014 08:19 PM, Arnd Bergmann wrote: > > On Wednesday 12 November 2014 11:13:52 Will Deacon wrote: > >> On Wed, Nov 12, 2014 at 11:06:59AM +0000, AKASHI Takahiro wrote: > >>> On 11/12/2014 08:00 PM, Will Deacon wrote: > >>>> On Wed, Nov 12, 2014 at 10:46:01AM +0000, AKASHI Takahiro wrote: > >>>>> On 11/07/2014 11:04 PM, Oleg Nesterov wrote: > >>>>>> To me the fact that PTRACE_SET_SYSCALL can be undefined and syscall_set_nr() > >>>>>> is very much arch-dependant (but most probably trivial) means that this code > >>>>>> should live in arch_ptrace(). > >>>>> > >>>>> Thinking of Oleg's comment above, it doesn't make sense neither to define generic > >>>>> NT_SYSTEM_CALL (user_regset) in uapi/linux/elf.h and implement it in ptrace_regset() > >>>>> in kernel/ptrace.c with arch-defined syscall_(g)set_nr(). > >>>>> > >>>>> Since we should have the same interface on arm and arm64, we'd better implement > >>>>> ptrace(PTRACE_SET_SYSCALL) locally on arm64 for now (as I originally submitted). > >>>> > >>>> I think the regset approach is cleaner. We already do something similar for > >>>> TLS. That would be implemented under arch/arm64/ with it's own NT type. > >>> > >>> Okey, so arm64 goes its own way > >>> Or do you want to have a similar regset on arm, too? > >>> (In this case, NT_ARM_SYSTEM_CALL can be shared in uapi/linux/elf.h) > >> > >> Just do arm64. We already have the dedicated request for arch/arm/. > > > > I wonder if we should define NT_ARM64_SYSTEM_CALL to the same value > > as NT_S390_SYSTEM_CALL (0x307), or even define it as an architecture- > > independent NT_SYSTEM_CALL number with that value, so other architectures > > don't have to introduce new types when they also want to implement it. > > I digged into gdb code (gdb/bfd/elf.c): > https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/elf.c;h=8b207ad872a3992381e93bdfa0a75ef444651613;hb=HEAD > elf_parse_notes()->elfcore_grok_note()->elfcore_grok_s390_system_call() > > It seems to me that NT_S390_SYSTEM_CALL(=0x307) is recognized as a s390 specific > value (without checking for machine type). So thinking of potential conflict, it might not be > a good idea to use this value as a common number (of NT_SYSTEM_CALL). > It's very unlikely that a "note" section for NT_(S390_)SYSTEM_CALL appears in a coredump file, though. > > What do you think? (adding Ulrich and Andreas) This code was introduced by http://sourceware-org.1504.n7.nabble.com/rfa-s390-bfd-part-Support-extended-register-sets-td50072.html I have to admit that I don't really understand gdb internals, but from a first look I get the impression that it will just do the right thing if you reuse NT_S390_SYSTEM_CALL on ARM64 with the same semantics. If not, we should indeed have a different number for it and duplicate that code. Arnd
Arnd Bergmann <arnd@arndb.de> wrote on 13.11.2014 11:21:28: > I have to admit that I don't really understand gdb internals, but from > a first look I get the impression that it will just do the right thing > if you reuse NT_S390_SYSTEM_CALL on ARM64 with the same semantics. There's an interface between BFD and GDB proper involved here. BFD will detect the presence of register set notes in the core dump, and will translate them into virtual sections; GDB will then simply look up such sections under well-known names. In particular, the NT_S390_SYSTEM_CALL note will be translated by BFD into a virtual section named ".reg-s390-system-call"; GDB platform- specific code will look for sections of this particular name. So if you were to create notes using the same note type, by default it would do nothing on ARM64. You might add code to the ARM64 back-end to also look for a section ".reg-s390-system-call", but that would be somewhat confusing. Using a new, platform-specific note type for ARM64 would appear to fit better with existing precedent. Bye, Ulrich
On Thursday 13 November 2014 15:49:20 Ulrich Weigand wrote: > Arnd Bergmann <arnd@arndb.de> wrote on 13.11.2014 11:21:28: > > > I have to admit that I don't really understand gdb internals, but from > > a first look I get the impression that it will just do the right thing > > if you reuse NT_S390_SYSTEM_CALL on ARM64 with the same semantics. > > There's an interface between BFD and GDB proper involved here. BFD will > detect the presence of register set notes in the core dump, and will > translate them into virtual sections; GDB will then simply look up such > sections under well-known names. > > In particular, the NT_S390_SYSTEM_CALL note will be translated by BFD > into a virtual section named ".reg-s390-system-call"; GDB platform- > specific code will look for sections of this particular name. > > So if you were to create notes using the same note type, by default it > would do nothing on ARM64. You might add code to the ARM64 back-end > to also look for a section ".reg-s390-system-call", but that would be > somewhat confusing. Using a new, platform-specific note type for ARM64 > would appear to fit better with existing precedent. Ok, thanks a lot for your insight and for confirming what Takahiro AKASHI said. Let's use a new NT_ARM64_SYSTEM_CALL type with a different number then. Arnd
Ulrich, Arnd, thank you for your discussions: On 11/14/2014 07:25 AM, Arnd Bergmann wrote: > On Thursday 13 November 2014 15:49:20 Ulrich Weigand wrote: >> Arnd Bergmann <arnd@arndb.de> wrote on 13.11.2014 11:21:28: >> >>> I have to admit that I don't really understand gdb internals, but from >>> a first look I get the impression that it will just do the right thing >>> if you reuse NT_S390_SYSTEM_CALL on ARM64 with the same semantics. >> >> There's an interface between BFD and GDB proper involved here. BFD will >> detect the presence of register set notes in the core dump, and will >> translate them into virtual sections; GDB will then simply look up such >> sections under well-known names. >> >> In particular, the NT_S390_SYSTEM_CALL note will be translated by BFD >> into a virtual section named ".reg-s390-system-call"; GDB platform- >> specific code will look for sections of this particular name. >> >> So if you were to create notes using the same note type, by default it >> would do nothing on ARM64. You might add code to the ARM64 back-end >> to also look for a section ".reg-s390-system-call", but that would be >> somewhat confusing. Using a new, platform-specific note type for ARM64 >> would appear to fit better with existing precedent. I implemented a regset of NT_SYSTEM_CALL(=NT_S390_SYSTEM_CALL) experimentally, and checked a generated core file: >$ aarch64-linux-gnu-readelf -Wn <...>/tmp/nulltest/core > >Displaying notes found at file offset 0x000003c0 with length 0x00000a68: > Owner Data size Description > CORE 0x00000188 NT_PRSTATUS (prstatus structure) > CORE 0x00000088 NT_PRPSINFO (prpsinfo structure) > CORE 0x00000080 NT_SIGINFO (siginfo_t data) > CORE 0x00000130 NT_AUXV (auxiliary vector) > CORE 0x000001b4 NT_FILE (mapped files) > Page size: 4096 > Start End Page Offset >[snip]... > CORE 0x00000210 NT_FPREGSET (floating point registers) > LINUX 0x00000008 NT_ARM_TLS (AArch TLS registers) > LINUX 0x00000108 NT_ARM_HW_BREAK (AArch hardware breakpoint registers) > LINUX 0x00000108 NT_ARM_HW_WATCH (AArch hardware watchpoint registers) > LINUX 0x00000004 NT_S390_SYSTEM_CALL (s390 system call restart data) Looks funny:) > Ok, thanks a lot for your insight and for confirming what Takahiro AKASHI > said. Let's use a new NT_ARM64_SYSTEM_CALL type with a different > number then. We will use NT_ARM_SYSTEM_CALL(=0x404) as other NT_ARM_*, except NT_ARM_VFP, are also shared by arch/arm and arch/arm64. Anyhow, gdb (and/or binutils?) should be updated as well once my coming patch is merged. My next question is who should know this? Thanks, -Takahiro AKASHI > Arnd >
diff --git a/arch/arm/include/asm/syscall.h b/arch/arm/include/asm/syscall.h index e86c985..3e1d9c0 100644 --- a/arch/arm/include/asm/syscall.h +++ b/arch/arm/include/asm/syscall.h @@ -24,6 +24,13 @@ static inline int syscall_get_nr(struct task_struct *task, return task_thread_info(task)->syscall; } +static inline int syscall_set_nr(struct task_struct *task, + struct pt_regs *regs, int syscall) +{ + task_thread_info(task)->syscall = syscall; + return 0; +} + static inline void syscall_rollback(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/arm/kernel/ptrace.c b/arch/arm/kernel/ptrace.c index ef9119f..908bae8 100644 --- a/arch/arm/kernel/ptrace.c +++ b/arch/arm/kernel/ptrace.c @@ -853,11 +853,6 @@ long arch_ptrace(struct task_struct *child, long request, datap); break; - case PTRACE_SET_SYSCALL: - task_thread_info(child)->syscall = data; - ret = 0; - break; - #ifdef CONFIG_CRUNCH case PTRACE_GETCRUNCHREGS: ret = ptrace_getcrunchregs(child, datap); diff --git a/kernel/ptrace.c b/kernel/ptrace.c index 54e7522..d7048fa 100644 --- a/kernel/ptrace.c +++ b/kernel/ptrace.c @@ -1001,6 +1001,12 @@ int ptrace_request(struct task_struct *child, long request, break; } #endif + +#ifdef PTRACE_SET_SYSCALL + case PTRACE_SET_SYSCALL: + ret = syscall_set_nr(child, task_pt_regs(child), data); + break; +#endif default: break; }
This patch adds a new generic ptrace request, PTRACE_SET_SYSCALL. It can be used to change a system call number as follows: ret = ptrace(pid, PTRACE_SET_SYSCALL, null, new_syscall_no); 'new_syscall_no' can be -1 to skip this system call, you need to modify a register's value, in arch-specific way, as return value though. Please note that we can't define PTRACE_SET_SYSCALL macro in uapi/linux/ptrace.h partly because its value on arm, 23, is used as another request on sparc. This patch also contains an example of change on arch side, arm. Only syscall_set_nr() is required to be defined in asm/syscall.h. Currently only arm has this request, while arm64 would also have it once my patch series of seccomp for arm64 is merged. It will also be usable for most of other arches. See the discussions in lak-ml: http://lists.infradead.org/pipermail/linux-arm-kernel/2014-November/300167.html Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> --- arch/arm/include/asm/syscall.h | 7 +++++++ arch/arm/kernel/ptrace.c | 5 ----- kernel/ptrace.c | 6 ++++++ 3 files changed, 13 insertions(+), 5 deletions(-)