diff mbox

Re: [PATCH] checkpatch: add warning on %pk instead of %pK usage

Message ID 1486757549.2192.20.camel@perches.com (mailing list archive)
State New, archived
Headers show

Commit Message

Joe Perches Feb. 10, 2017, 8:12 p.m. UTC
On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> From: William Roberts <william.c.roberts@intel.com>
> 
> Sample output:
> WARNING: %pk is close to %pK, did you mean %pK?.
> \#20: FILE: drivers/char/applicom.c:230:
> +			printk(KERN_INFO "Could not allocate IRQ %d for PCI Applicom device. %pk\n", dev->irq, pci_get_class);

There isn't a single instance of this in the kernel tree.

Maybe if this is really useful, then all the %p<foo> extensions
should be enumerated and all unknown uses should have warnings.

Something like:

---
 scripts/checkpatch.pl | 9 +++++++++
 1 file changed, 9 insertions(+)

Comments

Roberts, William C Feb. 10, 2017, 10:14 p.m. UTC | #1
> -----Original Message-----
> From: Joe Perches [mailto:joe@perches.com]
> Sent: Friday, February 10, 2017 12:12 PM
> To: Roberts, William C <william.c.roberts@intel.com>; linux-
> kernel@vger.kernel.org; apw@canonical.com; Andew Morton <akpm@linux-
> foundation.org>
> Cc: keescook@chromium.org; kernel-hardening@lists.openwall.com
> Subject: Re: [PATCH] checkpatch: add warning on %pk instead of %pK usage
> 
> On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > From: William Roberts <william.c.roberts@intel.com>
> >
> > Sample output:
> > WARNING: %pk is close to %pK, did you mean %pK?.
> > \#20: FILE: drivers/char/applicom.c:230:
> > +			printk(KERN_INFO "Could not allocate IRQ %d for PCI
> Applicom
> > +device. %pk\n", dev->irq, pci_get_class);
> 
> There isn't a single instance of this in the kernel tree.
> 
> Maybe if this is really useful, then all the %p<foo> extensions should be
> enumerated and all unknown uses should have warnings.

I was thinking of doing that, but I figured I would start with the bare minimum patch.

> 
> Something like:
> 
> ---
>  scripts/checkpatch.pl | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index
> ad5ea5c545b2..8a90b457e8b5 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -5305,6 +5305,15 @@ sub process {
>  			}
>  		}
> 
> +# check for vsprintf extension %p<foo> misuses
> +		if ($line =~ /\b$logFunctions\s*\(.*$String/) {
> +			my $format = get_quoted_string($line, $rawline);
> +			if ($format =~
> /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGN]).)/) {
> +				WARN("VSPRINTF_POINTER_EXTENSION",
> +				     "Invalid vsprintf pointer extension '$1'\n" .
> $herecurr);
> +			}
> +		}
> +
>  # check for logging continuations
>  		if ($line =~ /\bprintk\s*\(\s*KERN_CONT\b|\bpr_cont\s*\(/) {
>  			WARN("LOGGING_CONTINUATION",
Roberts, William C Feb. 10, 2017, 10:26 p.m. UTC | #2
<snip>

> >
> > On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > > From: William Roberts <william.c.roberts@intel.com>
> > >
> > > Sample output:
> > > WARNING: %pk is close to %pK, did you mean %pK?.
> > > \#20: FILE: drivers/char/applicom.c:230:
> > > +			printk(KERN_INFO "Could not allocate IRQ %d for PCI
> > Applicom
> > > +device. %pk\n", dev->irq, pci_get_class);
> >
> > There isn't a single instance of this in the kernel tree.
> >
> > Maybe if this is really useful, then all the %p<foo> extensions should
> > be enumerated and all unknown uses should have warnings.
> 
> I was thinking of doing that, but I figured I would start with the bare minimum
> patch.
> 
> >
> > Something like:
> >
> > ---
> >  scripts/checkpatch.pl | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index
> > ad5ea5c545b2..8a90b457e8b5 100755
> > --- a/scripts/checkpatch.pl
> > +++ b/scripts/checkpatch.pl
> > @@ -5305,6 +5305,15 @@ sub process {
> >  			}
> >  		}
> >
> > +# check for vsprintf extension %p<foo> misuses
> > +		if ($line =~ /\b$logFunctions\s*\(.*$String/) {

I don't see the normal string formatting routines in that list... I think this is too restrictive.

> > +			my $format = get_quoted_string($line, $rawline);

Ahh thanks for that get_quoted_string().

> > +			if ($format =~
> > /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGN]).)/) {
> > +				WARN("VSPRINTF_POINTER_EXTENSION",
> > +				     "Invalid vsprintf pointer extension '$1'\n" .
> > $herecurr);

I think I'll send out a V2 with this part of the addition. I like that, and your wording.

> > +			}
> > +		}
> > +
> >  # check for logging continuations
> >  		if ($line =~ /\bprintk\s*\(\s*KERN_CONT\b|\bpr_cont\s*\(/) {
> >  			WARN("LOGGING_CONTINUATION",

I did a grep on some of the patters to see what it would match against
Joe Perches Feb. 10, 2017, 10:49 p.m. UTC | #3
On Fri, 2017-02-10 at 22:26 +0000, Roberts, William C wrote:
> <snip>
> 
> > > 
> > > On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > > > From: William Roberts <william.c.roberts@intel.com>
> > > > 
> > > > Sample output:
> > > > WARNING: %pk is close to %pK, did you mean %pK?.
> > > > \#20: FILE: drivers/char/applicom.c:230:
> > > > +			printk(KERN_INFO "Could not allocate IRQ %d for PCI
> > > 
> > > Applicom
> > > > +device. %pk\n", dev->irq, pci_get_class);
> > > 
> > > There isn't a single instance of this in the kernel tree.
> > > 
> > > Maybe if this is really useful, then all the %p<foo> extensions should
> > > be enumerated and all unknown uses should have warnings.
> > 
> > I was thinking of doing that, but I figured I would start with the bare minimum
> > patch.
> > 
> > > 
> > > Something like:
> > > 
> > > ---
> > >  scripts/checkpatch.pl | 9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index
> > > ad5ea5c545b2..8a90b457e8b5 100755
> > > --- a/scripts/checkpatch.pl
> > > +++ b/scripts/checkpatch.pl
> > > @@ -5305,6 +5305,15 @@ sub process {
> > >  			}
> > >  		}
> > > 
> > > +# check for vsprintf extension %p<foo> misuses
> > > +		if ($line =~ /\b$logFunctions\s*\(.*$String/) {
> 
> I don't see the normal string formatting routines in that list... I think this is too restrictive.

I don't.  There are no "normal" string formatting routines.
What do you think is missing?  sn?printf ? That's easy to add.
Joe Perches Feb. 10, 2017, 10:59 p.m. UTC | #4
On Fri, 2017-02-10 at 14:49 -0800, Joe Perches wrote:
> On Fri, 2017-02-10 at 22:26 +0000, Roberts, William C wrote:
> > <snip>
> > 
> > > > 
> > > > On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > > > > From: William Roberts <william.c.roberts@intel.com>
> > > > > 
> > > > > Sample output:
> > > > > WARNING: %pk is close to %pK, did you mean %pK?.
> > > > > \#20: FILE: drivers/char/applicom.c:230:
> > > > > +			printk(KERN_INFO "Could not allocate IRQ %d for PCI
> > > > 
> > > > Applicom
> > > > > +device. %pk\n", dev->irq, pci_get_class);
> > > > 
> > > > There isn't a single instance of this in the kernel tree.

Just in case anyone else wondered why this came up.

https://googleprojectzero.blogspot.com/2017/02/lifting-hyper-visor-bypassing-samsungs.html
Roberts, William C Feb. 10, 2017, 11:31 p.m. UTC | #5
> -----Original Message-----
> From: Joe Perches [mailto:joe@perches.com]
> Sent: Friday, February 10, 2017 2:50 PM
> To: Roberts, William C <william.c.roberts@intel.com>; linux-
> kernel@vger.kernel.org; apw@canonical.com; Andew Morton <akpm@linux-
> foundation.org>
> Cc: keescook@chromium.org; kernel-hardening@lists.openwall.com
> Subject: Re: [PATCH] checkpatch: add warning on %pk instead of %pK usage
> 
> On Fri, 2017-02-10 at 22:26 +0000, Roberts, William C wrote:
> > <snip>
> >
> > > >
> > > > On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > > > > From: William Roberts <william.c.roberts@intel.com>
> > > > >
> > > > > Sample output:
> > > > > WARNING: %pk is close to %pK, did you mean %pK?.
> > > > > \#20: FILE: drivers/char/applicom.c:230:
> > > > > +			printk(KERN_INFO "Could not allocate IRQ %d for
> PCI
> > > >
> > > > Applicom
> > > > > +device. %pk\n", dev->irq, pci_get_class);
> > > >
> > > > There isn't a single instance of this in the kernel tree.
> > > >
> > > > Maybe if this is really useful, then all the %p<foo> extensions
> > > > should be enumerated and all unknown uses should have warnings.
> > >
> > > I was thinking of doing that, but I figured I would start with the
> > > bare minimum patch.
> > >
> > > >
> > > > Something like:
> > > >
> > > > ---
> > > >  scripts/checkpatch.pl | 9 +++++++++
> > > >  1 file changed, 9 insertions(+)
> > > >
> > > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index
> > > > ad5ea5c545b2..8a90b457e8b5 100755
> > > > --- a/scripts/checkpatch.pl
> > > > +++ b/scripts/checkpatch.pl
> > > > @@ -5305,6 +5305,15 @@ sub process {
> > > >  			}
> > > >  		}
> > > >
> > > > +# check for vsprintf extension %p<foo> misuses
> > > > +		if ($line =~ /\b$logFunctions\s*\(.*$String/) {
> >
> > I don't see the normal string formatting routines in that list... I think this is too
> restrictive.
> 
> I don't.  There are no "normal" string formatting routines.

By "normal" I'm referring to things that call into pointer(), just casually looking
I see bstr_printf vsnprintf kvasprintf, which would be easy enough to add

> What do you think is missing?  sn?printf ? That's easy to add.

The problem starts to get hairy when we think of how often folks roll their own logging macros (see some small sampling at the end).

I think we would want to add DEBUG DBG and sn?printf and maybe consider dropping the \b on the regex so it's a bit more matchy but still shouldn't
end up matching on any ASM as you pointed out in the V2 nack.

Ill break this down into:
1. the patch as I know you'll take it, as you wrote it :-P
2. Adding to the logging macros
3. exploring making it less matchy

Data:
arch/alpha/kernel/pci_iommu.c:25:# define DBGA(args...)		printk(KERN_DEBUG args)
arch/alpha/kernel/pci_iommu.c:30:# define DBGA2(args...)		printk(KERN_DEBUG args)
arch/alpha/kernel/core_tsunami.c:50:# define DBG_CFG(args)	printk args
arch/alpha/kernel/core_titan.c:50:# define DBG_CFG(args)	printk args
arch/alpha/kernel/ptrace.c:34:#define DBG(fac,args)	{if ((fac) & DEBUG) printk args;}
arch/alpha/kernel/core_apecs.c:42:# define DBGC(args)	printk args
arch/alpha/kernel/core_irongate.c:38:# define DBG_CFG(args)	printk args
arch/alpha/kernel/core_wildfire.c:30:# define DBG_CFG(args)	printk args
arch/alpha/kernel/smc37c93x.c:18:# define DBG_DEVS(args)         printk args
arch/alpha/boot/misc.c:27:#define puts		srm_printk
arch/alpha/mm/numa.c:27:#define DBGDCONT(args...) printk(args)
arch/powerpc/sysdev/tsi108_pci.c:43:#define DBG(x...) printk(x)
arch/powerpc/sysdev/ge/ge_pic.c:31:#define DBG(fmt...) do { printk(KERN_DEBUG "gef_pic: " fmt); } while (0)
arch/powerpc/sysdev/tsi108_dev.c:34:#define DBG(fmt...) do { printk(fmt); } while(0)
arch/powerpc/sysdev/mpic.c:45:#define DBG(fmt...) printk(fmt)
arch/powerpc/kernel/process.c:69:#define TM_DEBUG(x...) printk(KERN_INFO x)
arch/powerpc/kernel/vdso.c:42:#define DBG(fmt...) printk(fmt)
arch/powerpc/kernel/legacy_serial.c:21:#define DBG(fmt...) do { printk(fmt); } while(0)
arch/powerpc/kernel/traps.c:89:#define TM_DEBUG(x...) printk(KERN_INFO x)
arch/powerpc/kernel/prom.c:65:#define DBG(fmt...) printk(KERN_ERR fmt)
arch/powerpc/kvm/book3s_paired_singles.c:33:#define dprintk printk
Joe Perches Feb. 10, 2017, 11:49 p.m. UTC | #6
(adding Emese Revfy and Julia Lawall)

On Fri, 2017-02-10 at 23:31 +0000, Roberts, William C wrote:
> The problem starts to get hairy when we think of how often folks roll their own logging macros (see some small sampling at the end).
> 
> I think we would want to add DEBUG DBG and sn?printf and maybe consider dropping the \b on the regex so it's a bit more matchy but still shouldn't
> end up matching on any ASM as you pointed out in the V2 nack.
> 
> Ill break this down into:
> 1. the patch as I know you'll take it, as you wrote it :-P
> 2. Adding to the logging macros
> 3. exploring making it less matchy

checkpatch is a line-oriented bunch of regexes
and doesn't know what is a __printf format.

It won't ever be "perfect" for this sort of
format verification checking.

Another way to do this is to write a gcc compiler
plugin that verifies the %p<foo> format types and
emits a warning/error.

That's probably the "best" solution.

Maybe coccinelle could help too.
Roberts, William C Feb. 10, 2017, 11:54 p.m. UTC | #7
> -----Original Message-----
> From: Roberts, William C
> Sent: Friday, February 10, 2017 3:32 PM
> To: 'Joe Perches' <joe@perches.com>; linux-kernel@vger.kernel.org;
> apw@canonical.com; Andew Morton <akpm@linux-foundation.org>
> Cc: keescook@chromium.org; kernel-hardening@lists.openwall.com
> Subject: RE: [PATCH] checkpatch: add warning on %pk instead of %pK usage
> 
> 
> 
> > -----Original Message-----
> > From: Joe Perches [mailto:joe@perches.com]
> > Sent: Friday, February 10, 2017 2:50 PM
> > To: Roberts, William C <william.c.roberts@intel.com>; linux-
> > kernel@vger.kernel.org; apw@canonical.com; Andew Morton <akpm@linux-
> > foundation.org>
> > Cc: keescook@chromium.org; kernel-hardening@lists.openwall.com
> > Subject: Re: [PATCH] checkpatch: add warning on %pk instead of %pK
> > usage
> >
> > On Fri, 2017-02-10 at 22:26 +0000, Roberts, William C wrote:
> > > <snip>
> > >
> > > > >
> > > > > On Fri, 2017-02-10 at 11:37 -0800, william.c.roberts@intel.com wrote:
> > > > > > From: William Roberts <william.c.roberts@intel.com>
> > > > > >
> > > > > > Sample output:
> > > > > > WARNING: %pk is close to %pK, did you mean %pK?.
> > > > > > \#20: FILE: drivers/char/applicom.c:230:
> > > > > > +			printk(KERN_INFO "Could not allocate IRQ %d for
> > PCI
> > > > >
> > > > > Applicom
> > > > > > +device. %pk\n", dev->irq, pci_get_class);
> > > > >
> > > > > There isn't a single instance of this in the kernel tree.
> > > > >
> > > > > Maybe if this is really useful, then all the %p<foo> extensions
> > > > > should be enumerated and all unknown uses should have warnings.
> > > >
> > > > I was thinking of doing that, but I figured I would start with the
> > > > bare minimum patch.
> > > >
> > > > >
> > > > > Something like:
> > > > >
> > > > > ---
> > > > >  scripts/checkpatch.pl | 9 +++++++++
> > > > >  1 file changed, 9 insertions(+)
> > > > >
> > > > > diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index
> > > > > ad5ea5c545b2..8a90b457e8b5 100755
> > > > > --- a/scripts/checkpatch.pl
> > > > > +++ b/scripts/checkpatch.pl
> > > > > @@ -5305,6 +5305,15 @@ sub process {
> > > > >  			}
> > > > >  		}
> > > > >
> > > > > +# check for vsprintf extension %p<foo> misuses
> > > > > +		if ($line =~ /\b$logFunctions\s*\(.*$String/) {
> > >
> > > I don't see the normal string formatting routines in that list... I
> > > think this is too
> > restrictive.
> >
> > I don't.  There are no "normal" string formatting routines.
> 
> By "normal" I'm referring to things that call into pointer(), just casually looking I
> see bstr_printf vsnprintf kvasprintf, which would be easy enough to add
> 
> > What do you think is missing?  sn?printf ? That's easy to add.
> 
> The problem starts to get hairy when we think of how often folks roll their own
> logging macros (see some small sampling at the end).
> 
> I think we would want to add DEBUG DBG and sn?printf and maybe consider
> dropping the \b on the regex so it's a bit more matchy but still shouldn't end up
> matching on any ASM as you pointed out in the V2 nack.
> 
> Ill break this down into:
> 1. the patch as I know you'll take it, as you wrote it :-P 2. Adding to the logging
> macros 3. exploring making it less matchy

Sent v3 --> Let me think on something better than items 2 and 3. We really want to
Know if were looking at at a string that is in a function or something there about.

Everyone has their own print routines... which is why I am in favor of neutering %p
within vsprintf itself.

> 
> Data:
> arch/alpha/kernel/pci_iommu.c:25:# define DBGA(args...)
> 	printk(KERN_DEBUG args)
> arch/alpha/kernel/pci_iommu.c:30:# define DBGA2(args...)
> 	printk(KERN_DEBUG args)
> arch/alpha/kernel/core_tsunami.c:50:# define DBG_CFG(args)	printk args
> arch/alpha/kernel/core_titan.c:50:# define DBG_CFG(args)	printk args
> arch/alpha/kernel/ptrace.c:34:#define DBG(fac,args)	{if ((fac) & DEBUG) printk
> args;}
> arch/alpha/kernel/core_apecs.c:42:# define DBGC(args)	printk args
> arch/alpha/kernel/core_irongate.c:38:# define DBG_CFG(args)	printk args
> arch/alpha/kernel/core_wildfire.c:30:# define DBG_CFG(args)	printk args
> arch/alpha/kernel/smc37c93x.c:18:# define DBG_DEVS(args)         printk args
> arch/alpha/boot/misc.c:27:#define puts		srm_printk
> arch/alpha/mm/numa.c:27:#define DBGDCONT(args...) printk(args)
> arch/powerpc/sysdev/tsi108_pci.c:43:#define DBG(x...) printk(x)
> arch/powerpc/sysdev/ge/ge_pic.c:31:#define DBG(fmt...) do {
> printk(KERN_DEBUG "gef_pic: " fmt); } while (0)
> arch/powerpc/sysdev/tsi108_dev.c:34:#define DBG(fmt...) do { printk(fmt); }
> while(0) arch/powerpc/sysdev/mpic.c:45:#define DBG(fmt...) printk(fmt)
> arch/powerpc/kernel/process.c:69:#define TM_DEBUG(x...) printk(KERN_INFO
> x) arch/powerpc/kernel/vdso.c:42:#define DBG(fmt...) printk(fmt)
> arch/powerpc/kernel/legacy_serial.c:21:#define DBG(fmt...) do { printk(fmt); }
> while(0) arch/powerpc/kernel/traps.c:89:#define TM_DEBUG(x...)
> printk(KERN_INFO x) arch/powerpc/kernel/prom.c:65:#define DBG(fmt...)
> printk(KERN_ERR fmt) arch/powerpc/kvm/book3s_paired_singles.c:33:#define
> dprintk printk
>
Joe Perches Feb. 11, 2017, 12:01 a.m. UTC | #8
On Fri, 2017-02-10 at 23:54 +0000, Roberts, William C wrote:
> > The problem starts to get hairy when we think of how often folks roll their own
> > logging macros (see some small sampling at the end).

It's not just the "hairy" local macros.

In its current form, checkpatch could not find uses like:

	netif_<foo>(x, y, z,
		    "some string with %pk",
		    args);
and
	some_logging_function(arg, "string 1" CONSTANT "string 2", etc...)

if string 2 or CONSTANT had the "%pk" use.

and a bunch of other styles.

This really needs to be verified by the compiler.
Roberts, William C Feb. 11, 2017, 1:32 a.m. UTC | #9
<snip>
> > By "normal" I'm referring to things that call into pointer(), just
> > casually looking I see bstr_printf vsnprintf kvasprintf, which would
> > be easy enough to add
> >
> > > What do you think is missing?  sn?printf ? That's easy to add.
> >
> > The problem starts to get hairy when we think of how often folks roll
> > their own logging macros (see some small sampling at the end).
> >
> > I think we would want to add DEBUG DBG and sn?printf and maybe
> > consider dropping the \b on the regex so it's a bit more matchy but
> > still shouldn't end up matching on any ASM as you pointed out in the V2 nack.
> >
> > Ill break this down into:
> > 1. the patch as I know you'll take it, as you wrote it :-P 2. Adding
> > to the logging macros 3. exploring making it less matchy

-Kees and Andrew they likely don't care about the rest of this...

I have been working up a regex (I suck at these) to match C functions that have an invalid
%p format string and take arguments:
http://www.regexr.com/3f92k

This could be a way to get better coverage in a more generic approach, thoughts?
diff mbox

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index ad5ea5c545b2..8a90b457e8b5 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5305,6 +5305,15 @@  sub process {
 			}
 		}
 
+# check for vsprintf extension %p<foo> misuses
+		if ($line =~ /\b$logFunctions\s*\(.*$String/) {
+			my $format = get_quoted_string($line, $rawline);
+			if ($format =~ /(\%[\*\d\.]*p(?![\WFfSsBKRraEhMmIiUDdgVCbGN]).)/) {
+				WARN("VSPRINTF_POINTER_EXTENSION",
+				     "Invalid vsprintf pointer extension '$1'\n" . $herecurr);
+			}
+		}
+
 # check for logging continuations
 		if ($line =~ /\bprintk\s*\(\s*KERN_CONT\b|\bpr_cont\s*\(/) {
 			WARN("LOGGING_CONTINUATION",