diff mbox series

[RFC,dwarves] syscall functions in BTF

Message ID ZAsBYpsBV0wvkhh0@krava (mailing list archive)
State RFC
Headers show
Series [RFC,dwarves] syscall functions in BTF | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Jiri Olsa March 10, 2023, 10:07 a.m. UTC
hi,
with latest pahole fixes we get rid of some syscall functions (with
__x64_sys_ prefix) and it seems to fall down to 2 cases:

- weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
  to be generated in BTF. The reason is the __COND_SYSCALL macro uses
  '__unused' for regs argument:

        #define __COND_SYSCALL(abi, name)                                      \
               __weak long __##abi##_##name(const struct pt_regs *__unused);   \
               __weak long __##abi##_##name(const struct pt_regs *__unused)    \
               {                                                               \
                       return sys_ni_syscall();                                \
               }

  and having weak function with different argument name will rule out the
  syscall from BTF functions

  the patch below workarounds this by using the same argument name,
  but I guess the real fix would be to check the whole type not just
  the argument name.. or ignore weak function if there's non weak one

  I guess there will be more cases like this in kernel


- we also do not get any syscall with no arguments, because they are
  generated as aliases to __do_<syscall> function:

        $ nm ./vmlinux | grep _sys_fork
        ffffffff81174890 t __do_sys_fork
        ffffffff81174890 T __ia32_sys_fork
        ffffffff81174880 T __pfx___x64_sys_fork
        ffffffff81174890 T __x64_sys_fork

  with:
        #define __SYS_STUB0(abi, name)                                          \
                long __##abi##_##name(const struct pt_regs *regs);              \
                ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
                long __##abi##_##name(const struct pt_regs *regs)               \
                        __alias(__do_##name);

  the problem seems to be that there's no DWARF data for aliased symbol,
  so pahole won't see any __x64_sys_fork record
  I'm not sure how to fix this one

  technically we can always connect to __do_sys_fork, but we'd need to
  have special cases for such syscalls.. would be great to have all with
  '__x64_sys_' prefix


thoughts?

thanks,
jirka


---

Comments

Alan Maguire March 10, 2023, 12:43 p.m. UTC | #1
On 10/03/2023 10:07, Jiri Olsa wrote:
> hi,
> with latest pahole fixes we get rid of some syscall functions (with
> __x64_sys_ prefix) and it seems to fall down to 2 cases:
> 
> - weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
>   to be generated in BTF. The reason is the __COND_SYSCALL macro uses
>   '__unused' for regs argument:
> 
>         #define __COND_SYSCALL(abi, name)                                      \
>                __weak long __##abi##_##name(const struct pt_regs *__unused);   \
>                __weak long __##abi##_##name(const struct pt_regs *__unused)    \
>                {                                                               \
>                        return sys_ni_syscall();                                \
>                }
> 
>   and having weak function with different argument name will rule out the
>   syscall from BTF functions
> 
>   the patch below workarounds this by using the same argument name,
>   but I guess the real fix would be to check the whole type not just
>   the argument name.. or ignore weak function if there's non weak one
> 
>   I guess there will be more cases like this in kernel
> 
>

Thanks for the report Jiri! I'm working on reusing the dwarves_fprintf.c
code to use string comparisons of function prototypes (minus parameter names!)
instead as a more robust comparison.  Hope to have something working soon..
 
> - we also do not get any syscall with no arguments, because they are
>   generated as aliases to __do_<syscall> function:
> 
>         $ nm ./vmlinux | grep _sys_fork
>         ffffffff81174890 t __do_sys_fork
>         ffffffff81174890 T __ia32_sys_fork
>         ffffffff81174880 T __pfx___x64_sys_fork
>         ffffffff81174890 T __x64_sys_fork
> 
>   with:
>         #define __SYS_STUB0(abi, name)                                          \
>                 long __##abi##_##name(const struct pt_regs *regs);              \
>                 ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
>                 long __##abi##_##name(const struct pt_regs *regs)               \
>                         __alias(__do_##name);
> 
>   the problem seems to be that there's no DWARF data for aliased symbol,
>   so pahole won't see any __x64_sys_fork record
>   I'm not sure how to fix this one
> 

Is this one a new issue, or did you just spot it when looking at the other case?

Thanks!

Alan

>   technically we can always connect to __do_sys_fork, but we'd need to
>   have special cases for such syscalls.. would be great to have all with
>   '__x64_sys_' prefix
> 
> 
> thoughts?
> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
> index fd2669b1cb2d..e02dab630577 100644
> --- a/arch/x86/include/asm/syscall_wrapper.h
> +++ b/arch/x86/include/asm/syscall_wrapper.h
> @@ -80,8 +80,8 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
>  	}
>  
>  #define __COND_SYSCALL(abi, name)					\
> -	__weak long __##abi##_##name(const struct pt_regs *__unused);	\
> -	__weak long __##abi##_##name(const struct pt_regs *__unused)	\
> +	__weak long __##abi##_##name(const struct pt_regs *regs);	\
> +	__weak long __##abi##_##name(const struct pt_regs *regs)	\
>  	{								\
>  		return sys_ni_syscall();				\
>  	}
>
Arnaldo Carvalho de Melo March 10, 2023, 3:03 p.m. UTC | #2
Em Fri, Mar 10, 2023 at 12:43:31PM +0000, Alan Maguire escreveu:
> On 10/03/2023 10:07, Jiri Olsa wrote:
> > hi,
> > with latest pahole fixes we get rid of some syscall functions (with
> > __x64_sys_ prefix) and it seems to fall down to 2 cases:
> > 
> > - weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
> >   to be generated in BTF. The reason is the __COND_SYSCALL macro uses
> >   '__unused' for regs argument:
> > 
> >         #define __COND_SYSCALL(abi, name)                                      \
> >                __weak long __##abi##_##name(const struct pt_regs *__unused);   \
> >                __weak long __##abi##_##name(const struct pt_regs *__unused)    \
> >                {                                                               \
> >                        return sys_ni_syscall();                                \
> >                }
> > 
> >   and having weak function with different argument name will rule out the
> >   syscall from BTF functions
> > 
> >   the patch below workarounds this by using the same argument name,
> >   but I guess the real fix would be to check the whole type not just
> >   the argument name.. or ignore weak function if there's non weak one
> > 
> >   I guess there will be more cases like this in kernel
> > 
> >
 
> Thanks for the report Jiri! I'm working on reusing the dwarves_fprintf.c
> code to use string comparisons of function prototypes (minus parameter names!)
> instead as a more robust comparison.  Hope to have something working soon..

Humm, that could be an option, a simple strcmp after snprintf'ing the
function prototype, but there is also the type__compare_members_types()
approach, used to order types in pahole, the same could be done for
function prototypes?

I.e. to compare a function prototype for functions with the same name we
would check its return value type, the number of arguments and then each
of the arguments, continuing to consider the names as an heuristic that
functions with all being so far equal having different argument names
may indicate different functions, but if there is no name in both
functions, look at its type instead, where we then would use
type__compare_members_types() for structs/unions?

- Arnaldo
  
> > - we also do not get any syscall with no arguments, because they are
> >   generated as aliases to __do_<syscall> function:
> > 
> >         $ nm ./vmlinux | grep _sys_fork
> >         ffffffff81174890 t __do_sys_fork
> >         ffffffff81174890 T __ia32_sys_fork
> >         ffffffff81174880 T __pfx___x64_sys_fork
> >         ffffffff81174890 T __x64_sys_fork
> > 
> >   with:
> >         #define __SYS_STUB0(abi, name)                                          \
> >                 long __##abi##_##name(const struct pt_regs *regs);              \
> >                 ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
> >                 long __##abi##_##name(const struct pt_regs *regs)               \
> >                         __alias(__do_##name);
> > 
> >   the problem seems to be that there's no DWARF data for aliased symbol,
> >   so pahole won't see any __x64_sys_fork record
> >   I'm not sure how to fix this one
> > 
> 
> Is this one a new issue, or did you just spot it when looking at the other case?
> 
> Thanks!
> 
> Alan
> 
> >   technically we can always connect to __do_sys_fork, but we'd need to
> >   have special cases for such syscalls.. would be great to have all with
> >   '__x64_sys_' prefix
> > 
> > 
> > thoughts?
> > 
> > thanks,
> > jirka
> > 
> > 
> > ---
> > diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
> > index fd2669b1cb2d..e02dab630577 100644
> > --- a/arch/x86/include/asm/syscall_wrapper.h
> > +++ b/arch/x86/include/asm/syscall_wrapper.h
> > @@ -80,8 +80,8 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
> >  	}
> >  
> >  #define __COND_SYSCALL(abi, name)					\
> > -	__weak long __##abi##_##name(const struct pt_regs *__unused);	\
> > -	__weak long __##abi##_##name(const struct pt_regs *__unused)	\
> > +	__weak long __##abi##_##name(const struct pt_regs *regs);	\
> > +	__weak long __##abi##_##name(const struct pt_regs *regs)	\
> >  	{								\
> >  		return sys_ni_syscall();				\
> >  	}
> >
Jiri Olsa March 10, 2023, 3:09 p.m. UTC | #3
On Fri, Mar 10, 2023 at 12:43:31PM +0000, Alan Maguire wrote:
> On 10/03/2023 10:07, Jiri Olsa wrote:
> > hi,
> > with latest pahole fixes we get rid of some syscall functions (with
> > __x64_sys_ prefix) and it seems to fall down to 2 cases:
> > 
> > - weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
> >   to be generated in BTF. The reason is the __COND_SYSCALL macro uses
> >   '__unused' for regs argument:
> > 
> >         #define __COND_SYSCALL(abi, name)                                      \
> >                __weak long __##abi##_##name(const struct pt_regs *__unused);   \
> >                __weak long __##abi##_##name(const struct pt_regs *__unused)    \
> >                {                                                               \
> >                        return sys_ni_syscall();                                \
> >                }
> > 
> >   and having weak function with different argument name will rule out the
> >   syscall from BTF functions
> > 
> >   the patch below workarounds this by using the same argument name,
> >   but I guess the real fix would be to check the whole type not just
> >   the argument name.. or ignore weak function if there's non weak one
> > 
> >   I guess there will be more cases like this in kernel
> > 
> >
> 
> Thanks for the report Jiri! I'm working on reusing the dwarves_fprintf.c
> code to use string comparisons of function prototypes (minus parameter names!)
> instead as a more robust comparison.  Hope to have something working soon..

great, I saw the patchset, will check

>  
> > - we also do not get any syscall with no arguments, because they are
> >   generated as aliases to __do_<syscall> function:
> > 
> >         $ nm ./vmlinux | grep _sys_fork
> >         ffffffff81174890 t __do_sys_fork
> >         ffffffff81174890 T __ia32_sys_fork
> >         ffffffff81174880 T __pfx___x64_sys_fork
> >         ffffffff81174890 T __x64_sys_fork
> > 
> >   with:
> >         #define __SYS_STUB0(abi, name)                                          \
> >                 long __##abi##_##name(const struct pt_regs *regs);              \
> >                 ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
> >                 long __##abi##_##name(const struct pt_regs *regs)               \
> >                         __alias(__do_##name);
> > 
> >   the problem seems to be that there's no DWARF data for aliased symbol,
> >   so pahole won't see any __x64_sys_fork record
> >   I'm not sure how to fix this one
> > 
> 
> Is this one a new issue, or did you just spot it when looking at the other case?

I was trying to attach to all syscalls and noticed some where missing,
it looks like the alias was used in this place for few years

thanks,
jirka
Alan Maguire March 10, 2023, 6:11 p.m. UTC | #4
On 10/03/2023 15:09, Jiri Olsa wrote:
> On Fri, Mar 10, 2023 at 12:43:31PM +0000, Alan Maguire wrote:
>> On 10/03/2023 10:07, Jiri Olsa wrote:
>>> hi,
>>> with latest pahole fixes we get rid of some syscall functions (with
>>> __x64_sys_ prefix) and it seems to fall down to 2 cases:
>>>
>>> - weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
>>>   to be generated in BTF. The reason is the __COND_SYSCALL macro uses
>>>   '__unused' for regs argument:
>>>
>>>         #define __COND_SYSCALL(abi, name)                                      \
>>>                __weak long __##abi##_##name(const struct pt_regs *__unused);   \
>>>                __weak long __##abi##_##name(const struct pt_regs *__unused)    \
>>>                {                                                               \
>>>                        return sys_ni_syscall();                                \
>>>                }
>>>
>>>   and having weak function with different argument name will rule out the
>>>   syscall from BTF functions
>>>
>>>   the patch below workarounds this by using the same argument name,
>>>   but I guess the real fix would be to check the whole type not just
>>>   the argument name.. or ignore weak function if there's non weak one
>>>
>>>   I guess there will be more cases like this in kernel
>>>
>>>
>>
>> Thanks for the report Jiri! I'm working on reusing the dwarves_fprintf.c
>> code to use string comparisons of function prototypes (minus parameter names!)
>> instead as a more robust comparison.  Hope to have something working soon..
> 
> great, I saw the patchset, will check
> 
>>  
>>> - we also do not get any syscall with no arguments, because they are
>>>   generated as aliases to __do_<syscall> function:
>>>
>>>         $ nm ./vmlinux | grep _sys_fork
>>>         ffffffff81174890 t __do_sys_fork
>>>         ffffffff81174890 T __ia32_sys_fork
>>>         ffffffff81174880 T __pfx___x64_sys_fork
>>>         ffffffff81174890 T __x64_sys_fork
>>>
>>>   with:
>>>         #define __SYS_STUB0(abi, name)                                          \
>>>                 long __##abi##_##name(const struct pt_regs *regs);              \
>>>                 ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
>>>                 long __##abi##_##name(const struct pt_regs *regs)               \
>>>                         __alias(__do_##name);
>>>
>>>   the problem seems to be that there's no DWARF data for aliased symbol,
>>>   so pahole won't see any __x64_sys_fork record
>>>   I'm not sure how to fix this one
>>>
>>
>> Is this one a new issue, or did you just spot it when looking at the other case?
> 
> I was trying to attach to all syscalls and noticed some where missing,
> it looks like the alias was used in this place for few years
> 

I looked into this one; in the DWARF associated with these functions I see:

<1><756d2>: Abbrev Number: 24 (DW_TAG_subprogram)
    <756d3>   DW_AT_external    : 1
    <756d3>   DW_AT_name        : (indirect string, offset: 0x8552d): __x64_sys_fork
    <756d7>   DW_AT_decl_file   : 7
    <756d8>   DW_AT_decl_line   : 58
    <756d9>   DW_AT_decl_column : 1
    <756da>   DW_AT_prototyped  : 1
    <756da>   DW_AT_type        : <0x73abd>
    <756de>   DW_AT_declaration : 1
    <756de>   DW_AT_sibling     : <0x756e8>
 <2><756e2>: Abbrev Number: 18 (DW_TAG_formal_parameter)
    <756e3>   DW_AT_type        : <0x73d42>

So it's marked as a declaration, and BTF encoding has skipped declarations
for a while now as they don't have parameter names; I checked out v1.24 of 
dwarves and only __do_sys_fork() is encoded in BTF there too.  Looks like 
the last relevant change around the syscall stubs was in 2020:

d2b5de495ee9 ("x86/entry: Refactor SYSCALL_DEFINE0 macros")

Alan
Alan Maguire March 10, 2023, 6:34 p.m. UTC | #5
On 10/03/2023 15:03, Arnaldo Carvalho de Melo wrote:
> Em Fri, Mar 10, 2023 at 12:43:31PM +0000, Alan Maguire escreveu:
>> On 10/03/2023 10:07, Jiri Olsa wrote:
>>> hi,
>>> with latest pahole fixes we get rid of some syscall functions (with
>>> __x64_sys_ prefix) and it seems to fall down to 2 cases:
>>>
>>> - weak syscall functions generated in kernel/sys_ni.c prevent these syscalls
>>>   to be generated in BTF. The reason is the __COND_SYSCALL macro uses
>>>   '__unused' for regs argument:
>>>
>>>         #define __COND_SYSCALL(abi, name)                                      \
>>>                __weak long __##abi##_##name(const struct pt_regs *__unused);   \
>>>                __weak long __##abi##_##name(const struct pt_regs *__unused)    \
>>>                {                                                               \
>>>                        return sys_ni_syscall();                                \
>>>                }
>>>
>>>   and having weak function with different argument name will rule out the
>>>   syscall from BTF functions
>>>
>>>   the patch below workarounds this by using the same argument name,
>>>   but I guess the real fix would be to check the whole type not just
>>>   the argument name.. or ignore weak function if there's non weak one
>>>
>>>   I guess there will be more cases like this in kernel
>>>
>>>
>  
>> Thanks for the report Jiri! I'm working on reusing the dwarves_fprintf.c
>> code to use string comparisons of function prototypes (minus parameter names!)
>> instead as a more robust comparison.  Hope to have something working soon..
> 
> Humm, that could be an option, a simple strcmp after snprintf'ing the
> function prototype, but there is also the type__compare_members_types()
> approach, used to order types in pahole, the same could be done for
> function prototypes?
> 
> I.e. to compare a function prototype for functions with the same name we
> would check its return value type, the number of arguments and then each
> of the arguments, continuing to consider the names as an heuristic that
> functions with all being so far equal having different argument names
> may indicate different functions, but if there is no name in both
> functions, look at its type instead, where we then would use
> type__compare_members_types() for structs/unions?
> 

We could do that too alright; the thing I like about stringifying the
prototype is we get a clear description when we hit a mismatch
(in verbose mode):

function mismatch for 'alloc_buffer'('alloc_buffer'): 'struct debug_buffer * ()(struct usb_bus *, ssize_t (*)(struct debug_buffer *))' != 'struct debug_buffer * ()(struct ohci_hcd *, ssize_t (*)(struct debug_buffer *))'

...and we can re-use the existing support for function display, which works great!
Luckily we don't have to do this very often; I counted 2113 prototype comparisons,
of which only 49 mismatches were detected. Thanks!

Alan

> - Arnaldo
>   
>>> - we also do not get any syscall with no arguments, because they are
>>>   generated as aliases to __do_<syscall> function:
>>>
>>>         $ nm ./vmlinux | grep _sys_fork
>>>         ffffffff81174890 t __do_sys_fork
>>>         ffffffff81174890 T __ia32_sys_fork
>>>         ffffffff81174880 T __pfx___x64_sys_fork
>>>         ffffffff81174890 T __x64_sys_fork
>>>
>>>   with:
>>>         #define __SYS_STUB0(abi, name)                                          \
>>>                 long __##abi##_##name(const struct pt_regs *regs);              \
>>>                 ALLOW_ERROR_INJECTION(__##abi##_##name, ERRNO);                 \
>>>                 long __##abi##_##name(const struct pt_regs *regs)               \
>>>                         __alias(__do_##name);
>>>
>>>   the problem seems to be that there's no DWARF data for aliased symbol,
>>>   so pahole won't see any __x64_sys_fork record
>>>   I'm not sure how to fix this one
>>>
>>
>> Is this one a new issue, or did you just spot it when looking at the other case?
>>
>> Thanks!
>>
>> Alan
>>
>>>   technically we can always connect to __do_sys_fork, but we'd need to
>>>   have special cases for such syscalls.. would be great to have all with
>>>   '__x64_sys_' prefix
>>>
>>>
>>> thoughts?
>>>
>>> thanks,
>>> jirka
>>>
>>>
>>> ---
>>> diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
>>> index fd2669b1cb2d..e02dab630577 100644
>>> --- a/arch/x86/include/asm/syscall_wrapper.h
>>> +++ b/arch/x86/include/asm/syscall_wrapper.h
>>> @@ -80,8 +80,8 @@ extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
>>>  	}
>>>  
>>>  #define __COND_SYSCALL(abi, name)					\
>>> -	__weak long __##abi##_##name(const struct pt_regs *__unused);	\
>>> -	__weak long __##abi##_##name(const struct pt_regs *__unused)	\
>>> +	__weak long __##abi##_##name(const struct pt_regs *regs);	\
>>> +	__weak long __##abi##_##name(const struct pt_regs *regs)	\
>>>  	{								\
>>>  		return sys_ni_syscall();				\
>>>  	}
>>>
>
diff mbox series

Patch

diff --git a/arch/x86/include/asm/syscall_wrapper.h b/arch/x86/include/asm/syscall_wrapper.h
index fd2669b1cb2d..e02dab630577 100644
--- a/arch/x86/include/asm/syscall_wrapper.h
+++ b/arch/x86/include/asm/syscall_wrapper.h
@@ -80,8 +80,8 @@  extern long __ia32_sys_ni_syscall(const struct pt_regs *regs);
 	}
 
 #define __COND_SYSCALL(abi, name)					\
-	__weak long __##abi##_##name(const struct pt_regs *__unused);	\
-	__weak long __##abi##_##name(const struct pt_regs *__unused)	\
+	__weak long __##abi##_##name(const struct pt_regs *regs);	\
+	__weak long __##abi##_##name(const struct pt_regs *regs)	\
 	{								\
 		return sys_ni_syscall();				\
 	}