diff mbox series

[v4] checkpatch: add check for snprintf to scnprintf

Message ID 20240408-snprintf-checkpatch-v4-1-8697c96ac94b@google.com (mailing list archive)
State Superseded
Headers show
Series [v4] checkpatch: add check for snprintf to scnprintf | expand

Commit Message

Justin Stitt April 8, 2024, 8:53 p.m. UTC
I am going to quote Lee Jones who has been doing some snprintf ->
scnprintf refactorings:

"There is a general misunderstanding amongst engineers that
{v}snprintf() returns the length of the data *actually* encoded into the
destination array.  However, as per the C99 standard {v}snprintf()
really returns the length of the data that *would have been* written if
there were enough space for it.  This misunderstanding has led to
buffer-overruns in the past.  It's generally considered safer to use the
{v}scnprintf() variants in their place (or even sprintf() in simple
cases).  So let's do that."

To help prevent new instances of snprintf() from popping up, let's add a
check to checkpatch.pl.

Suggested-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
---
Changes in v4:
- also check for vsnprintf variant (thanks Bill)
- Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com

Changes in v3:
- fix indentation
- add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
- Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com

Changes in v2:
- Had a vim moment and deleted a character before sending the patch.
- Replaced the character :)
- Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
---
From a discussion here [1].

[1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
---
 scripts/checkpatch.pl | 6 ++++++
 1 file changed, 6 insertions(+)


---
base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
change-id: 20240221-snprintf-checkpatch-a864ed67ebd0

Best regards,
--
Justin Stitt <justinstitt@google.com>

Comments

Kees Cook April 8, 2024, 11:20 p.m. UTC | #1
On Mon, Apr 08, 2024 at 08:53:33PM +0000, Justin Stitt wrote:
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."
> 
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
> 
> Suggested-by: Finn Thain <fthain@linux-m68k.org>
> Signed-off-by: Justin Stitt <justinstitt@google.com>

Reviewed-by: Kees Cook <keescook@chromium.org>
Lee Jones April 11, 2024, 6:57 a.m. UTC | #2
On Mon, 08 Apr 2024, Justin Stitt wrote:

> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."
> 
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
> 
> Suggested-by: Finn Thain <fthain@linux-m68k.org>
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
> Changes in v4:
> - also check for vsnprintf variant (thanks Bill)
> - Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com
> 
> Changes in v3:
> - fix indentation
> - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
> - Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com
> 
> Changes in v2:
> - Had a vim moment and deleted a character before sending the patch.
> - Replaced the character :)
> - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> ---
> From a discussion here [1].
> 
> [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> ---
>  scripts/checkpatch.pl | 6 ++++++
>  1 file changed, 6 insertions(+)

Reviewed-by: Lee Jones <lee@kernel.org>
Christophe JAILLET April 11, 2024, 8:01 p.m. UTC | #3
Le 08/04/2024 à 22:53, Justin Stitt a écrit :
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."
> 
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
> 
> Suggested-by: Finn Thain <fthain@linux-m68k.org>
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
> Changes in v4:
> - also check for vsnprintf variant (thanks Bill)
> - Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com
> 
> Changes in v3:
> - fix indentation
> - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
> - Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com
> 
> Changes in v2:
> - Had a vim moment and deleted a character before sending the patch.
> - Replaced the character :)
> - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> ---
>  From a discussion here [1].
> 
> [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> ---
>   scripts/checkpatch.pl | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 9c4c4a61bc83..a0fd681ea837 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -7012,6 +7012,12 @@ sub process {
>   			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
>   		}
>   
> +# {v}snprintf uses that should likely be {v}scnprintf
> +		if ($line =~ /\b(v|)snprintf\s*\(\s*/) {

Hi,

for my understanding, what is the purpose of the 2nd "\s*"?
IMHO, it could be just removed.

> +			WARN("SNPRINTF",
> +			     "Prefer {v}scnprintf over {v}snprintf - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);

Maybe $1 instead of {v} in both places, so that is displays the real 
function name that is and should be used?

CJ

> +		}
> +
>   # ethtool_sprintf uses that should likely be ethtool_puts
>   		if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) {
>   			if (WARN("PREFER_ETHTOOL_PUTS",
> 
> ---
> base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
> change-id: 20240221-snprintf-checkpatch-a864ed67ebd0
> 
> Best regards,
> --
> Justin Stitt <justinstitt@google.com>
> 
> 
>
Joe Perches April 11, 2024, 8:56 p.m. UTC | #4
On Thu, 2024-04-11 at 22:01 +0200, Christophe JAILLET wrote:
> Le 08/04/2024 à 22:53, Justin Stitt a écrit :
> > I am going to quote Lee Jones who has been doing some snprintf ->
> > scnprintf refactorings:
> > 
> > "There is a general misunderstanding amongst engineers that
> > {v}snprintf() returns the length of the data *actually* encoded into the
> > destination array.  However, as per the C99 standard {v}snprintf()
> > really returns the length of the data that *would have been* written if
> > there were enough space for it.  This misunderstanding has led to
> > buffer-overruns in the past.  It's generally considered safer to use the
> > {v}scnprintf() variants in their place (or even sprintf() in simple
> > cases).  So let's do that."
> > 
> > To help prevent new instances of snprintf() from popping up, let's add a
> > check to checkpatch.pl.
> > 
> > Suggested-by: Finn Thain <fthain@linux-m68k.org>
> > Signed-off-by: Justin Stitt <justinstitt@google.com>
> > ---
> > Changes in v4:
> > - also check for vsnprintf variant (thanks Bill)
> > - Link to v3: https://lore.kernel.org/r/20240315-snprintf-checkpatch-v3-1-a451e7664306@google.com
> > 
> > Changes in v3:
> > - fix indentation
> > - add reference link (https://github.com/KSPP/linux/issues/105) (thanks Joe)
> > - Link to v2: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com
> > 
> > Changes in v2:
> > - Had a vim moment and deleted a character before sending the patch.
> > - Replaced the character :)
> > - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> > ---
> >  From a discussion here [1].
> > 
> > [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> > ---
> >   scripts/checkpatch.pl | 6 ++++++
> >   1 file changed, 6 insertions(+)
> > 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > index 9c4c4a61bc83..a0fd681ea837 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -7012,6 +7012,12 @@ sub process {
> >   			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
> >   		}
> >   
> > +# {v}snprintf uses that should likely be {v}scnprintf
> > +		if ($line =~ /\b(v|)snprintf\s*\(\s*/) {
> 
> Hi,
> 
> for my understanding, what is the purpose of the 2nd "\s*"?
> IMHO, it could be just removed.

It could.

# {v}snprintf uses that should likely be {v}scnprintf
		if ($line =~ /\b((v?)snprintf)\s*\(/) {
			WARN("SNPRINTF",
			     "Prefer ${2}scnprintf over $1 - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);
		}



Though I also think it's better to use lore rather than github
Justin Stitt April 11, 2024, 10:10 p.m. UTC | #5
On Thu, Apr 11, 2024 at 1:56 PM Joe Perches <joe@perches.com> wrote:
> It could.
>
> # {v}snprintf uses that should likely be {v}scnprintf
>                 if ($line =~ /\b((v?)snprintf)\s*\(/) {
>                         WARN("SNPRINTF",
>                              "Prefer ${2}scnprintf over $1 - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);
>                 }
>
>
>
> Though I also think it's better to use lore rather than github

I am fine with making the UX change in v5 regarding using ${2} and $1
but I wish someone could have said something about the Github links
earlier, we already have a pattern going with these string api
changes:

     "Prefer strscpy over strcpy - see:
https://github.com/KSPP/linux/issues/88\n" . $herecurr);
}

...
# strlcpy uses that should likely be strscpy
if ($line =~ /\bstrlcpy\s*\(/) {
WARN("STRLCPY",
     "Prefer strscpy over strlcpy - see:
https://github.com/KSPP/linux/issues/89\n" . $herecurr);
}

# strncpy uses that should likely be strscpy or strscpy_pad
if ($line =~ /\bstrncpy\s*\(/) {
WARN("STRNCPY",
     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see:
https://github.com/KSPP/linux/issues/90\n" . $herecurr);
}

# {v}snprintf uses that should likely be {v}scnprintf
if ($line =~ /\b(v|)snprintf\s*\(/) {
WARN("SNPRINTF",
     "Prefer ${2}scnprintf over $1 - see:
https://github.com/KSPP/linux/issues/105\n" . $herecurr);
...


It should be noted that nowhere else is a lore link or github link
provided during a warning, so there really is no precedence. Joe what
should we do?

>
Kees Cook April 15, 2024, 6:06 p.m. UTC | #6
On Thu, Apr 11, 2024 at 03:10:57PM -0700, Justin Stitt wrote:
> On Thu, Apr 11, 2024 at 1:56 PM Joe Perches <joe@perches.com> wrote:
> > It could.
> >
> > # {v}snprintf uses that should likely be {v}scnprintf
> >                 if ($line =~ /\b((v?)snprintf)\s*\(/) {
> >                         WARN("SNPRINTF",
> >                              "Prefer ${2}scnprintf over $1 - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);
> >                 }
> >
> >
> >
> > Though I also think it's better to use lore rather than github
> 
> I am fine with making the UX change in v5 regarding using ${2} and $1
> but I wish someone could have said something about the Github links
> earlier, we already have a pattern going with these string api
> changes:
> 
>      "Prefer strscpy over strcpy - see:
> https://github.com/KSPP/linux/issues/88\n" . $herecurr);
> }

KSPP isn't going anywhere -- we've used these links before and we can
use them here too. I don't see any good reason to duplicate stuff into
lore, etc.
diff mbox series

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9c4c4a61bc83..a0fd681ea837 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -7012,6 +7012,12 @@  sub process {
 			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
 		}
 
+# {v}snprintf uses that should likely be {v}scnprintf
+		if ($line =~ /\b(v|)snprintf\s*\(\s*/) {
+			WARN("SNPRINTF",
+			     "Prefer {v}scnprintf over {v}snprintf - see: https://github.com/KSPP/linux/issues/105\n" . $herecurr);
+		}
+
 # ethtool_sprintf uses that should likely be ethtool_puts
 		if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) {
 			if (WARN("PREFER_ETHTOOL_PUTS",