modpost: remove use of non-standard strsep() in HOSTCC code
diff mbox series

Message ID 11c1e65b393b4c3ca6118515c77bbf19524dab11.1593074472.git.hns@goldelico.com
State New
Headers show
Series
  • modpost: remove use of non-standard strsep() in HOSTCC code
Related show

Commit Message

H. Nikolaus Schaller June 25, 2020, 8:41 a.m. UTC
strsep() is neither standard C nor POSIX and used outside
the kernel code here. Using it here requires that the
build host supports it out of the box which is e.g.
not true for a Darwin build host and using a cross-compiler.
This leads to:

scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
  return strsep(stringp, "\n");
  ^

and a segfault when running MODPOST.

See also: https://stackoverflow.com/a/7219504

So let's add some lines of code separating the string at the
next newline character instead of using strsep(). It does not
hurt kernel size or speed since this code is run on the build host.

Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
---
 scripts/mod/modpost.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Masahiro Yamada June 28, 2020, 5:51 a.m. UTC | #1
On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
> strsep() is neither standard C nor POSIX and used outside
> the kernel code here. Using it here requires that the
> build host supports it out of the box which is e.g.
> not true for a Darwin build host and using a cross-compiler.
> This leads to:
>
> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
>   return strsep(stringp, "\n");
>   ^
>
> and a segfault when running MODPOST.
>
> See also: https://stackoverflow.com/a/7219504
>
> So let's add some lines of code separating the string at the
> next newline character instead of using strsep(). It does not
> hurt kernel size or speed since this code is run on the build host.
>
> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
> ---
>  scripts/mod/modpost.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index 6aea65c65745..8fe63989c6e1 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
>
>  char *get_line(char **stringp)
>  {
> +       char *p;
>         /* do not return the unwanted extra line at EOF */
>         if (*stringp && **stringp == '\0')

This check does not make sense anymore.

Previously, get_line(NULL) returns NULL.

With your patch, get_line(NULL) crashes
due to NULL-pointer dereference.



>                 return NULL;
>
> -       return strsep(stringp, "\n");
> +       p = *stringp;
> +       while (**stringp != '\n')
> +               (*stringp)++;


Is this a safe conversion?

If the input file does not contain '\n' at all,
this while-loop continues running,
and results in the segmentation fault
due to buffer over-run.



> +       *(*stringp)++ = '\0';
> +       return p;
>  }



How about this?

char *get_line(char **stringp)
{
        char *orig = *stringp;
        char *next;

        /* do not return the unwanted extra line at EOF */
        if (!orig || *orig == '\0')
                return NULL;

        next = strchr(orig, '\n');
        if (next)
                *next++ = '\0';

        *stringp = next;

        return orig;
}




>  /* A list of all modules we processed */
> --
> 2.26.2
>


--
Best Regards
Masahiro Yamada
H. Nikolaus Schaller June 28, 2020, 6:17 a.m. UTC | #2
Hi,

> Am 28.06.2020 um 07:51 schrieb Masahiro Yamada <masahiroy@kernel.org>:
> 
> On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>> 
>> strsep() is neither standard C nor POSIX and used outside
>> the kernel code here. Using it here requires that the
>> build host supports it out of the box which is e.g.
>> not true for a Darwin build host and using a cross-compiler.
>> This leads to:
>> 
>> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
>>  return strsep(stringp, "\n");
>>  ^
>> 
>> and a segfault when running MODPOST.
>> 
>> See also: https://stackoverflow.com/a/7219504
>> 
>> So let's add some lines of code separating the string at the
>> next newline character instead of using strsep(). It does not
>> hurt kernel size or speed since this code is run on the build host.
>> 
>> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
>> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
>> ---
>> scripts/mod/modpost.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>> 
>> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
>> index 6aea65c65745..8fe63989c6e1 100644
>> --- a/scripts/mod/modpost.c
>> +++ b/scripts/mod/modpost.c
>> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
>> 
>> char *get_line(char **stringp)
>> {
>> +       char *p;
>>        /* do not return the unwanted extra line at EOF */
>>        if (*stringp && **stringp == '\0')
> 
> This check does not make sense anymore.
> 
> Previously, get_line(NULL) returns NULL.
> 
> With your patch, get_line(NULL) crashes
> due to NULL-pointer dereference.

Well, that is original code.

I have only replaced the strsep() function.
But yes, it looks to be better in addition to
my patch.

> 
> 
> 
>>                return NULL;
>> 
>> -       return strsep(stringp, "\n");
>> +       p = *stringp;
>> +       while (**stringp != '\n')
>> +               (*stringp)++;
> 
> 
> Is this a safe conversion?
> 
> If the input file does not contain '\n' at all,
> this while-loop continues running,
> and results in the segmentation fault
> due to buffer over-run.

Ah, yes, you are right.

We should use

+       while (**stringp && **stringp != '\n')

> 
> 
> 
>> +       *(*stringp)++ = '\0';
>> +       return p;
>> }
> 
> 
> 
> How about this?
> 
> char *get_line(char **stringp)
> {
>        char *orig = *stringp;

^^^ this still segfaults with get_line(NULL)

>        char *next;
> 
>        /* do not return the unwanted extra line at EOF */
>        if (!orig || *orig == '\0')
>                return NULL;
> 
>        next = strchr(orig, '\n');
>        if (next)
>                *next++ = '\0';
> 
>        *stringp = next;

Yes, this code is easier to understand than my while loop.
And strchr() is POSIX.

So should I submit an updated patch or do you want to submit
it (with a suggested-by: H. Nikolaus Schaller <hns@goldelico.com>)

BR and thanks,
Nikolaus Schaller
Masahiro Yamada June 28, 2020, 7:52 a.m. UTC | #3
On Sun, Jun 28, 2020 at 3:17 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
> Hi,
>
> > Am 28.06.2020 um 07:51 schrieb Masahiro Yamada <masahiroy@kernel.org>:
> >
> > On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>
> >> strsep() is neither standard C nor POSIX and used outside
> >> the kernel code here. Using it here requires that the
> >> build host supports it out of the box which is e.g.
> >> not true for a Darwin build host and using a cross-compiler.
> >> This leads to:
> >>
> >> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
> >>  return strsep(stringp, "\n");
> >>  ^
> >>
> >> and a segfault when running MODPOST.
> >>
> >> See also: https://stackoverflow.com/a/7219504
> >>
> >> So let's add some lines of code separating the string at the
> >> next newline character instead of using strsep(). It does not
> >> hurt kernel size or speed since this code is run on the build host.
> >>
> >> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
> >> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
> >> ---
> >> scripts/mod/modpost.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> >> index 6aea65c65745..8fe63989c6e1 100644
> >> --- a/scripts/mod/modpost.c
> >> +++ b/scripts/mod/modpost.c
> >> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
> >>
> >> char *get_line(char **stringp)
> >> {
> >> +       char *p;
> >>        /* do not return the unwanted extra line at EOF */
> >>        if (*stringp && **stringp == '\0')
> >
> > This check does not make sense anymore.
> >
> > Previously, get_line(NULL) returns NULL.
> >
> > With your patch, get_line(NULL) crashes
> > due to NULL-pointer dereference.
>
> Well, that is original code.


Sorry for confusion.

I meant this:

  char *s = NULL;
  get_line(&s);


In the current code, get_line(&s) returns NULL.
As 'man strsep' says this:
  "If *stringp is NULL, the strsep() function returns NULL
   and does nothing else."

With your patch, **stringp will cause
NULL-pointer dereference.






>
> I have only replaced the strsep() function.
> But yes, it looks to be better in addition to
> my patch.
>
> >
> >
> >
> >>                return NULL;
> >>
> >> -       return strsep(stringp, "\n");
> >> +       p = *stringp;
> >> +       while (**stringp != '\n')
> >> +               (*stringp)++;
> >
> >
> > Is this a safe conversion?
> >
> > If the input file does not contain '\n' at all,
> > this while-loop continues running,
> > and results in the segmentation fault
> > due to buffer over-run.
>
> Ah, yes, you are right.
>
> We should use
>
> +       while (**stringp && **stringp != '\n')
>
> >
> >
> >
> >> +       *(*stringp)++ = '\0';
> >> +       return p;
> >> }
> >
> >
> >
> > How about this?
> >
> > char *get_line(char **stringp)
> > {
> >        char *orig = *stringp;
>
> ^^^ this still segfaults with get_line(NULL)


This is OK.

get_line(NULL) should crash because we never expect
the case  ' stringp == NULL'.

We need to care about the case ' *stringp == NULL'.
In this case, get_line() should return NULL.




> >        char *next;
> >
> >        /* do not return the unwanted extra line at EOF */
> >        if (!orig || *orig == '\0')
> >                return NULL;
> >
> >        next = strchr(orig, '\n');
> >        if (next)
> >                *next++ = '\0';
> >
> >        *stringp = next;
>
> Yes, this code is easier to understand than my while loop.
> And strchr() is POSIX.
>
> So should I submit an updated patch or do you want to submit
> it (with a suggested-by: H. Nikolaus Schaller <hns@goldelico.com>)

Please send a patch.
(Co-developed-by if you want to give some credit to me)

> BR and thanks,
> Nikolaus Schaller
>
>


--
Best Regards
Masahiro Yamada
H. Nikolaus Schaller June 28, 2020, 8:20 a.m. UTC | #4
> Am 28.06.2020 um 09:52 schrieb Masahiro Yamada <masahiroy@kernel.org>:
> 
> On Sun, Jun 28, 2020 at 3:17 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>> 
>> Hi,
>> 
>>> Am 28.06.2020 um 07:51 schrieb Masahiro Yamada <masahiroy@kernel.org>:
>>> 
>>> On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>>>> 
>>>> strsep() is neither standard C nor POSIX and used outside
>>>> the kernel code here. Using it here requires that the
>>>> build host supports it out of the box which is e.g.
>>>> not true for a Darwin build host and using a cross-compiler.
>>>> This leads to:
>>>> 
>>>> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
>>>> return strsep(stringp, "\n");
>>>> ^
>>>> 
>>>> and a segfault when running MODPOST.
>>>> 
>>>> See also: https://stackoverflow.com/a/7219504
>>>> 
>>>> So let's add some lines of code separating the string at the
>>>> next newline character instead of using strsep(). It does not
>>>> hurt kernel size or speed since this code is run on the build host.
>>>> 
>>>> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
>>>> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
>>>> ---
>>>> scripts/mod/modpost.c | 7 ++++++-
>>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>> 
>>>> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
>>>> index 6aea65c65745..8fe63989c6e1 100644
>>>> --- a/scripts/mod/modpost.c
>>>> +++ b/scripts/mod/modpost.c
>>>> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
>>>> 
>>>> char *get_line(char **stringp)
>>>> {
>>>> +       char *p;
>>>>       /* do not return the unwanted extra line at EOF */
>>>>       if (*stringp && **stringp == '\0')
>>> 
>>> This check does not make sense anymore.
>>> 
>>> Previously, get_line(NULL) returns NULL.
>>> 
>>> With your patch, get_line(NULL) crashes
>>> due to NULL-pointer dereference.
>> 
>> Well, that is original code.
> 
> 
> Sorry for confusion.
> 
> I meant this:
> 
>  char *s = NULL;
>  get_line(&s);
> 
> 
> In the current code, get_line(&s) returns NULL.
> As 'man strsep' says this:
>  "If *stringp is NULL, the strsep() function returns NULL
>   and does nothing else."
> 
> With your patch, **stringp will cause
> NULL-pointer dereference.

Ah, now I see. strsep() has a special case that is not covered
by my patch.

On the other hand, get_line() is only called as get_line(&pos) and
pos = buf can not be NULL because that is checked before in read_dump().
This is why I did not observe a segfault.

But it is wise to make get_line() it more robust than needed. We do
never know who will copy this code fragment... And I am tempted to
handle the get_line(NULL) case as well.

>> I have only replaced the strsep() function.
>> But yes, it looks to be better in addition to
>> my patch.
>> 
>>> 
>>> 
>>> 
>>>>               return NULL;
>>>> 
>>>> -       return strsep(stringp, "\n");
>>>> +       p = *stringp;
>>>> +       while (**stringp != '\n')
>>>> +               (*stringp)++;
>>> 
>>> 
>>> Is this a safe conversion?
>>> 
>>> If the input file does not contain '\n' at all,
>>> this while-loop continues running,
>>> and results in the segmentation fault
>>> due to buffer over-run.
>> 
>> Ah, yes, you are right.
>> 
>> We should use
>> 
>> +       while (**stringp && **stringp != '\n')
>> 
>>> 
>>> 
>>> 
>>>> +       *(*stringp)++ = '\0';
>>>> +       return p;
>>>> }
>>> 
>>> 
>>> 
>>> How about this?
>>> 
>>> char *get_line(char **stringp)
>>> {
>>>       char *orig = *stringp;
>> 
>> ^^^ this still segfaults with get_line(NULL)
> 
> 
> This is OK.
> 
> get_line(NULL) should crash because we never expect
> the case  ' stringp == NULL'.
> 
> We need to care about the case ' *stringp == NULL'.
> In this case, get_line() should return NULL.
> 
> 
> 
> 
>>>       char *next;
>>> 
>>>       /* do not return the unwanted extra line at EOF */
>>>       if (!orig || *orig == '\0')
>>>               return NULL;
>>> 
>>>       next = strchr(orig, '\n');
>>>       if (next)
>>>               *next++ = '\0';
>>> 
>>>       *stringp = next;
>> 
>> Yes, this code is easier to understand than my while loop.
>> And strchr() is POSIX.
>> 
>> So should I submit an updated patch or do you want to submit
>> it (with a suggested-by: H. Nikolaus Schaller <hns@goldelico.com>)
> 
> Please send a patch.
> (Co-developed-by if you want to give some credit to me)

Yes, I will do in the next days.

BR and thanks,
Nikolaus Schaller
Masahiro Yamada June 28, 2020, 2:20 p.m. UTC | #5
On Sun, Jun 28, 2020 at 5:20 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
>
>
> > Am 28.06.2020 um 09:52 schrieb Masahiro Yamada <masahiroy@kernel.org>:
> >
> > On Sun, Jun 28, 2020 at 3:17 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>
> >> Hi,
> >>
> >>> Am 28.06.2020 um 07:51 schrieb Masahiro Yamada <masahiroy@kernel.org>:
> >>>
> >>> On Thu, Jun 25, 2020 at 5:47 PM H. Nikolaus Schaller <hns@goldelico.com> wrote:
> >>>>
> >>>> strsep() is neither standard C nor POSIX and used outside
> >>>> the kernel code here. Using it here requires that the
> >>>> build host supports it out of the box which is e.g.
> >>>> not true for a Darwin build host and using a cross-compiler.
> >>>> This leads to:
> >>>>
> >>>> scripts/mod/modpost.c:145:2: warning: implicit declaration of function 'strsep' [-Wimplicit-function-declaration]
> >>>> return strsep(stringp, "\n");
> >>>> ^
> >>>>
> >>>> and a segfault when running MODPOST.
> >>>>
> >>>> See also: https://stackoverflow.com/a/7219504
> >>>>
> >>>> So let's add some lines of code separating the string at the
> >>>> next newline character instead of using strsep(). It does not
> >>>> hurt kernel size or speed since this code is run on the build host.
> >>>>
> >>>> Fixes: ac5100f5432967 ("modpost: add read_text_file() and get_line() helpers")
> >>>> Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
> >>>> ---
> >>>> scripts/mod/modpost.c | 7 ++++++-
> >>>> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> >>>> index 6aea65c65745..8fe63989c6e1 100644
> >>>> --- a/scripts/mod/modpost.c
> >>>> +++ b/scripts/mod/modpost.c
> >>>> @@ -138,11 +138,16 @@ char *read_text_file(const char *filename)
> >>>>
> >>>> char *get_line(char **stringp)
> >>>> {
> >>>> +       char *p;
> >>>>       /* do not return the unwanted extra line at EOF */
> >>>>       if (*stringp && **stringp == '\0')
> >>>
> >>> This check does not make sense anymore.
> >>>
> >>> Previously, get_line(NULL) returns NULL.
> >>>
> >>> With your patch, get_line(NULL) crashes
> >>> due to NULL-pointer dereference.
> >>
> >> Well, that is original code.
> >
> >
> > Sorry for confusion.
> >
> > I meant this:
> >
> >  char *s = NULL;
> >  get_line(&s);
> >
> >
> > In the current code, get_line(&s) returns NULL.
> > As 'man strsep' says this:
> >  "If *stringp is NULL, the strsep() function returns NULL
> >   and does nothing else."
> >
> > With your patch, **stringp will cause
> > NULL-pointer dereference.
>
> Ah, now I see. strsep() has a special case that is not covered
> by my patch.
>
> On the other hand, get_line() is only called as get_line(&pos) and
> pos = buf can not be NULL because that is checked before in read_dump().
> This is why I did not observe a segfault.
>
> But it is wise to make get_line() it more robust than needed. We do
> never know who will copy this code fragment... And I am tempted to
> handle the get_line(NULL) case as well.

No.
(stringp == NULL) is a bug.
Better to not handle.


(*stringp == NULL) has a good reason to be handled.


*stringp is updated to point to the next.

But, there is no more delimiter, *stringp reaches
the end of a string (**stringp == '\0').
In this case, we cannot advance *stringp any more.
(We cannot point the next character of '\0';
it would cause buffer overrun).

So,*stringp is set to NULL to represent no more string.
In the next iteration, (*stringp == NULL) is input,
then NULL is returned.




>
> >> I have only replaced the strsep() function.
> >> But yes, it looks to be better in addition to
> >> my patch.
> >>
> >>>
> >>>
> >>>
> >>>>               return NULL;
> >>>>
> >>>> -       return strsep(stringp, "\n");
> >>>> +       p = *stringp;
> >>>> +       while (**stringp != '\n')
> >>>> +               (*stringp)++;
> >>>
> >>>
> >>> Is this a safe conversion?
> >>>
> >>> If the input file does not contain '\n' at all,
> >>> this while-loop continues running,
> >>> and results in the segmentation fault
> >>> due to buffer over-run.
> >>
> >> Ah, yes, you are right.
> >>
> >> We should use
> >>
> >> +       while (**stringp && **stringp != '\n')
> >>
> >>>
> >>>
> >>>
> >>>> +       *(*stringp)++ = '\0';
> >>>> +       return p;
> >>>> }
> >>>
> >>>
> >>>
> >>> How about this?
> >>>
> >>> char *get_line(char **stringp)
> >>> {
> >>>       char *orig = *stringp;
> >>
> >> ^^^ this still segfaults with get_line(NULL)
> >
> >
> > This is OK.
> >
> > get_line(NULL) should crash because we never expect
> > the case  ' stringp == NULL'.
> >
> > We need to care about the case ' *stringp == NULL'.
> > In this case, get_line() should return NULL.
> >
> >
> >
> >
> >>>       char *next;
> >>>
> >>>       /* do not return the unwanted extra line at EOF */
> >>>       if (!orig || *orig == '\0')
> >>>               return NULL;
> >>>
> >>>       next = strchr(orig, '\n');
> >>>       if (next)
> >>>               *next++ = '\0';
> >>>
> >>>       *stringp = next;
> >>
> >> Yes, this code is easier to understand than my while loop.
> >> And strchr() is POSIX.
> >>
> >> So should I submit an updated patch or do you want to submit
> >> it (with a suggested-by: H. Nikolaus Schaller <hns@goldelico.com>)
> >
> > Please send a patch.
> > (Co-developed-by if you want to give some credit to me)
>
> Yes, I will do in the next days.
>
> BR and thanks,
> Nikolaus Schaller
>

Patch
diff mbox series

diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index 6aea65c65745..8fe63989c6e1 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -138,11 +138,16 @@  char *read_text_file(const char *filename)
 
 char *get_line(char **stringp)
 {
+	char *p;
 	/* do not return the unwanted extra line at EOF */
 	if (*stringp && **stringp == '\0')
 		return NULL;
 
-	return strsep(stringp, "\n");
+	p = *stringp;
+	while (**stringp != '\n')
+		(*stringp)++;
+	*(*stringp)++ = '\0';
+	return p;
 }
 
 /* A list of all modules we processed */