[v2] checkpatch: add check for snprintf to scnprintf

Message ID 20240221-snprintf-checkpatch-v2-1-9baeb59dae30@google.com (mailing list archive)
State Superseded

Commit Message

Justin Stitt Feb. 21, 2024, 10:11 p.m. UTC
I am going to quote Lee Jones who has been doing some snprintf ->
scnprintf refactorings:

"There is a general misunderstanding amongst engineers that
{v}snprintf() returns the length of the data *actually* encoded into the
destination array.  However, as per the C99 standard {v}snprintf()
really returns the length of the data that *would have been* written if
there were enough space for it.  This misunderstanding has led to
buffer-overruns in the past.  It's generally considered safer to use the
{v}scnprintf() variants in their place (or even sprintf() in simple
cases).  So let's do that."

To help prevent new instances of snprintf() from popping up, let's add a
check to checkpatch.pl.

Suggested-by: Finn Thain <fthain@linux-m68k.org>
Signed-off-by: Justin Stitt <justinstitt@google.com>
---
Changes in v2:
- Had a vim moment and deleted a character before sending the patch.
- Replaced the character :)
- Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
---
From a discussion here [1].

[1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
---
 scripts/checkpatch.pl | 6 ++++++
 1 file changed, 6 insertions(+)


---
base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
change-id: 20240221-snprintf-checkpatch-a864ed67ebd0

Best regards,
--
Justin Stitt <justinstitt@google.com>
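
For readers who want to see the return-value difference quoted in the commit message, here is a minimal userspace sketch in C. scnprintf() only exists in the kernel, so demo_scnprintf() is a hypothetical stand-in written for this illustration. It is assumed to follow the documented behaviour (report the number of characters actually stored, not counting the trailing NUL) and is not the kernel implementation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical userspace stand-in for the kernel's scnprintf().
 * Returns the number of characters actually stored in buf, excluding
 * the trailing NUL, so the result is always less than size
 * (or 0 when size is 0).
 */
static int demo_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int ret;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	ret = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (ret < 0)
		return 0;		/* encoding error: a choice made for this sketch */
	if ((size_t)ret >= size)
		return (int)(size - 1);	/* output was truncated */
	return ret;
}
```

With an 8-byte buffer and the 10-character string "0123456789", snprintf() returns 10 (the length that would have been written), while demo_scnprintf() returns 7 (the length that was written).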

Comments

Kees Cook Feb. 21, 2024, 10:33 p.m. UTC | #1
On Wed, Feb 21, 2024 at 10:11:59PM +0000, Justin Stitt wrote:
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."
> 
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
> 
> Suggested-by: Finn Thain <fthain@linux-m68k.org>
> Signed-off-by: Justin Stitt <justinstitt@google.com>

Yes please! :)

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
> Changes in v2:
> - Had a vim moment and deleted a character before sending the patch.
> - Replaced the character :)
> - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> ---
> From a discussion here [1].
> 
> [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> ---
>  scripts/checkpatch.pl | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 9c4c4a61bc83..64025a6e6155 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -7012,6 +7012,12 @@ sub process {
>  			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
>  		}
>  
> +# snprintf uses that should likely be {v}scnprintf
> +		if ($line =~ /\bsnprintf\s*\(\s*/) {
> +				WARN("SNPRINTF",
> +				     "Prefer scnprintf over snprintf\n" . $herecurr);
> +		}
> +
>  # ethtool_sprintf uses that should likely be ethtool_puts
>  		if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) {
>  			if (WARN("PREFER_ETHTOOL_PUTS",
> 
> ---
> base-commit: b401b621758e46812da61fa58a67c3fd8d91de0d
> change-id: 20240221-snprintf-checkpatch-a864ed67ebd0
> 
> Best regards,
> --
> Justin Stitt <justinstitt@google.com>
>
Joe Perches Feb. 22, 2024, 1:05 a.m. UTC | #2
On Wed, 2024-02-21 at 22:11 +0000, Justin Stitt wrote:
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."
> 
> To help prevent new instances of snprintf() from popping up, let's add a
> check to checkpatch.pl.
> 
> Suggested-by: Finn Thain <fthain@linux-m68k.org>
> Signed-off-by: Justin Stitt <justinstitt@google.com>
> ---
> Changes in v2:
> - Had a vim moment and deleted a character before sending the patch.
> - Replaced the character :)
> - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> ---
> From a discussion here [1].
> 
> [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -7012,6 +7012,12 @@ sub process {
>  			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
>  		}
>  
> +# snprintf uses that should likely be {v}scnprintf
> +		if ($line =~ /\bsnprintf\s*\(\s*/) {
> +				WARN("SNPRINTF",
> +				     "Prefer scnprintf over snprintf\n" . $herecurr);

There really should be some sort of reference link here
similar to the one above this.

Also, I rather doubt _all_ of these should be changed just
for churn's sake.

Maybe add a test for some return value use like

		if (defined($stat) &&
		    $stat =~ /$Lval\s*=\s*snprintf\s*\(/) {
			etc...

Maybe offer to --fix it too.
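
To make the "return value use" case concrete, here is a small userspace sketch in C of the call-site shape that is risky: the result of snprintf() is assigned and then used as an offset. Both helper names are hypothetical and exist only for this illustration. The clamped variant is assumed to mimic how scnprintf() would report the same append.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Offset after one append when the snprintf() return value is trusted.
 * The caller must guarantee len < size. After truncation the returned
 * offset can be >= size, so using it for a further write would be a bug.
 */
static size_t next_offset_snprintf(char *buf, size_t size, size_t len,
				   const char *s)
{
	int ret = snprintf(buf + len, size - len, "%s", s);

	if (ret < 0)
		return len;
	return len + (size_t)ret;
}

/*
 * Same append, but the offset is clamped to what was actually stored,
 * so it always stays below size.
 */
static size_t next_offset_clamped(char *buf, size_t size, size_t len,
				  const char *s)
{
	int ret = snprintf(buf + len, size - len, "%s", s);

	if (ret < 0)
		return len;
	if ((size_t)ret >= size - len)
		return size - 1;	/* truncated: buffer is full */
	return len + (size_t)ret;
}
```

A call whose return value is discarded has neither problem, which is why a check restricted to assignments would produce fewer false positives than one that matches every snprintf() call.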
David Laight Feb. 22, 2024, 10:30 p.m. UTC | #3
From: Justin Stitt
> Sent: 21 February 2024 22:12
> 
> I am going to quote Lee Jones who has been doing some snprintf ->
> scnprintf refactorings:
> 
> "There is a general misunderstanding amongst engineers that
> {v}snprintf() returns the length of the data *actually* encoded into the
> destination array.  However, as per the C99 standard {v}snprintf()
> really returns the length of the data that *would have been* written if
> there were enough space for it.  This misunderstanding has led to
> buffer-overruns in the past.  It's generally considered safer to use the
> {v}scnprintf() variants in their place (or even sprintf() in simple
> cases).  So let's do that."

While generally true, there are places that really do want to
detect (and error) overflow.
That isn't possible with scnprintf().

I'm not sure what the solution is though.
Having a function that returns a negative value on overflow is also
likely to get misused.
seq_printf() (or whatever it is called) may let you check,
but it is hardly a cheap wrapper and a bit of a PITA to use.

	David
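
The overflow-detection case mentioned above can be sketched in userspace C as follows. build_path() and its -1 error return are hypothetical choices for this illustration, not an existing kernel helper. The point is only that the snprintf() return value lets the caller tell "fits exactly" apart from "truncated".

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "dir/file" into buf and fail instead of
 * silently truncating. Returns 0 on success, -1 if the result did not
 * fit (or on an encoding error).
 */
static int build_path(char *buf, size_t size, const char *dir,
		      const char *file)
{
	int ret = snprintf(buf, size, "%s/%s", dir, file);

	if (ret < 0 || (size_t)ret >= size)
		return -1;	/* would have needed more than size - 1 bytes */
	return 0;
}
```

With an 8-byte buffer, "abc/def" (7 characters) succeeds and "abcd/def" (8 characters) fails. A function that reports only the stored length would return 7 in both cases, so the two outcomes could not be distinguished from the return value alone.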

Lee Jones Feb. 23, 2024, 10:38 a.m. UTC | #4
On Wed, 21 Feb 2024, Joe Perches wrote:

> On Wed, 2024-02-21 at 22:11 +0000, Justin Stitt wrote:
> > I am going to quote Lee Jones who has been doing some snprintf ->
> > scnprintf refactorings:
> > 
> > "There is a general misunderstanding amongst engineers that
> > {v}snprintf() returns the length of the data *actually* encoded into the
> > destination array.  However, as per the C99 standard {v}snprintf()
> > really returns the length of the data that *would have been* written if
> > there were enough space for it.  This misunderstanding has led to
> > buffer-overruns in the past.  It's generally considered safer to use the
> > {v}scnprintf() variants in their place (or even sprintf() in simple
> > cases).  So let's do that."
> > 
> > To help prevent new instances of snprintf() from popping up, let's add a
> > check to checkpatch.pl.
> > 
> > Suggested-by: Finn Thain <fthain@linux-m68k.org>
> > Signed-off-by: Justin Stitt <justinstitt@google.com>
> > ---
> > Changes in v2:
> > - Had a vim moment and deleted a character before sending the patch.
> > - Replaced the character :)
> > - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> > ---
> > From a discussion here [1].
> > 
> > [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> 
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> []
> > @@ -7012,6 +7012,12 @@ sub process {
> >  			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
> >  		}
> >  
> > +# snprintf uses that should likely be {v}scnprintf
> > +		if ($line =~ /\bsnprintf\s*\(\s*/) {
> > +				WARN("SNPRINTF",
> > +				     "Prefer scnprintf over snprintf\n" . $herecurr);
> 
> There really should be some sort of reference link here
> similar to the one above this.
> 
> Also, I rather doubt _all_ of these should be changed just
> for churn's sake.

This is for new implementations only.

Kees is planning on changing all of the current instances kernel-wide.

> Maybe add a test for some return value use like
> 
> 		if (defined($stat) &&
> 		    $stat =~ /$Lval\s*=\s*snprintf\s*\(/) {
> 			etc...
> 
> Maybe offer to --fix it too.
>
Lee Jones Feb. 23, 2024, 10:41 a.m. UTC | #5
On Thu, 22 Feb 2024, David Laight wrote:

> From: Justin Stitt
> > Sent: 21 February 2024 22:12
> > 
> > I am going to quote Lee Jones who has been doing some snprintf ->
> > scnprintf refactorings:
> > 
> > "There is a general misunderstanding amongst engineers that
> > {v}snprintf() returns the length of the data *actually* encoded into the
> > destination array.  However, as per the C99 standard {v}snprintf()
> > really returns the length of the data that *would have been* written if
> > there were enough space for it.  This misunderstanding has led to
> > buffer-overruns in the past.  It's generally considered safer to use the
> > {v}scnprintf() variants in their place (or even sprintf() in simple
> > cases).  So let's do that."
> 
> While generally true, there are places that really do want to
> detect (and error) overflow.
> That isn't possible with scnprintf().
> 
> I'm not sure what the solution is though.
> Having a function that returns a negative value on overflow is also
> likely to get misused.
> seq_printf() (or whatever it is called) may let you check,
> but it is hardly a cheap wrapper and a bit of a PITA to use.

I agree.

ssprintf() [0] was my favorite solution, but it seems that the lib string
people don't like to accept new functionality, even if it's a clear
improvement over the currently available solutions.

[0] https://lore.kernel.org/all/20240130160953.766676-1-lee@kernel.org/
Joe Perches Feb. 23, 2024, 12:47 p.m. UTC | #6
On Fri, 2024-02-23 at 10:38 +0000, Lee Jones wrote:
> On Wed, 21 Feb 2024, Joe Perches wrote:
> 
> > On Wed, 2024-02-21 at 22:11 +0000, Justin Stitt wrote:
> > > I am going to quote Lee Jones who has been doing some snprintf ->
> > > scnprintf refactorings:
> > > 
> > > "There is a general misunderstanding amongst engineers that
> > > {v}snprintf() returns the length of the data *actually* encoded into the
> > > destination array.  However, as per the C99 standard {v}snprintf()
> > > really returns the length of the data that *would have been* written if
> > > there were enough space for it.  This misunderstanding has led to
> > > buffer-overruns in the past.  It's generally considered safer to use the
> > > {v}scnprintf() variants in their place (or even sprintf() in simple
> > > cases).  So let's do that."
> > > 
> > > To help prevent new instances of snprintf() from popping up, let's add a
> > > check to checkpatch.pl.
> > > 
> > > Suggested-by: Finn Thain <fthain@linux-m68k.org>
> > > Signed-off-by: Justin Stitt <justinstitt@google.com>
> > > ---
> > > Changes in v2:
> > > - Had a vim moment and deleted a character before sending the patch.
> > > - Replaced the character :)
> > > - Link to v1: https://lore.kernel.org/r/20240221-snprintf-checkpatch-v1-1-3ac5025b5961@google.com
> > > ---
> > > From a discussion here [1].
> > > 
> > > [1]: https://lore.kernel.org/all/0f9c95f9-2c14-eee6-7faf-635880edcea4@linux-m68k.org/
> > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> > []
> > > @@ -7012,6 +7012,12 @@ sub process {
> > >  			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
> > >  		}
> > >  
> > > +# snprintf uses that should likely be {v}scnprintf
> > > +		if ($line =~ /\bsnprintf\s*\(\s*/) {
> > > +				WARN("SNPRINTF",
> > > +				     "Prefer scnprintf over snprintf\n" . $herecurr);
> > 
> > There really should be some sort of reference link here
> > similar to the one above this.
> > 
> > Also, I rather doubt _all_ of these should be changed just
> > for churn's sake.
> 
> This is for new implementations only.
> 
> Kees is planning on changing all of the current instances kernel-wide.

I saw that.  I also saw pushback.
Not just my own.

Creating a cocci script is easy.
Getting Linus and others to run it isn't.

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 9c4c4a61bc83..64025a6e6155 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -7012,6 +7012,12 @@  sub process {
 			     "Prefer strscpy, strscpy_pad, or __nonstring over strncpy - see: https://github.com/KSPP/linux/issues/90\n" . $herecurr);
 		}
 
+# snprintf uses that should likely be {v}scnprintf
+		if ($line =~ /\bsnprintf\s*\(\s*/) {
+			WARN("SNPRINTF",
+			     "Prefer scnprintf over snprintf\n" . $herecurr);
+		}
+
 # ethtool_sprintf uses that should likely be ethtool_puts
 		if ($line =~ /\bethtool_sprintf\s*\(\s*$FuncArg\s*,\s*$FuncArg\s*\)/) {
 			if (WARN("PREFER_ETHTOOL_PUTS",