[v7] Makefile: Add clang-tidy and static analyzer support to makefile
diff mbox series

Message ID 20200728004736.3590053-1-nhuck@google.com
State New
Headers show
Series
  • [v7] Makefile: Add clang-tidy and static analyzer support to makefile
Related show

Commit Message

Nathan Huckleberry July 28, 2020, 12:47 a.m. UTC
This patch adds clang-tidy and the clang static-analyzer as make
targets. The goal of this patch is to make static analysis tools
usable and extendable by any developer or researcher who is familiar
with basic c++.

The current static analysis tools require intimate knowledge of the
internal workings of the static analysis. Clang-tidy and the clang
static analyzers expose an easy to use api and allow users unfamiliar
with clang to write new checks with relative ease.

===Clang-tidy===

Clang-tidy is an easily extendable 'linter' that runs on the AST.
Clang-tidy checks are easy to write and understand. A check consists of
two parts, a matcher and a checker. The matcher is created using a
domain specific language that acts on the AST
(https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
nodes are found by the matcher a callback is made to the checker. The
checker can then execute additional checks and issue warnings.

Here is an example clang-tidy check to report functions that have calls
to local_irq_disable without calls to local_irq_enable and vice-versa.
Functions flagged with __attribute((annotation("ignore_irq_balancing")))
are ignored for analysis. (https://reviews.llvm.org/D65828)

===Clang static analyzer===

The clang static analyzer is a more powerful static analysis tool that
uses symbolic execution to find bugs. Currently there is a check that
looks for potential security bugs from invalid uses of kmalloc and
kfree. There are several more general purpose checks that are useful for
the kernel.

The clang static analyzer is well documented and designed to be
extensible.
(https://clang-analyzer.llvm.org/checker_dev_manual.html)
(https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)

The main draw of the clang tools is how accessible they are. The clang
documentation is very nice and these tools are built specifically to be
easily extendable by any developer. They provide an accessible method of
bug-finding and research to people who are not overly familiar with the
kernel codebase.

Signed-off-by: Nathan Huckleberry <nhuck@google.com>
---
Changes v6->v7
* Fix issues with relative paths
* Additional style fixes
 MAINTAINERS                                   |  1 +
 Makefile                                      |  3 +
 scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
 .../{ => clang-tools}/gen_compile_commands.py |  0
 scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
 5 files changed, 101 insertions(+)
 create mode 100644 scripts/clang-tools/Makefile.clang-tools
 rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
 create mode 100755 scripts/clang-tools/run-clang-tools.py

Comments

Nick Desaulniers July 28, 2020, 8:35 p.m. UTC | #1
On Mon, Jul 27, 2020 at 5:47 PM 'Nathan Huckleberry' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
>
> This patch adds clang-tidy and the clang static-analyzer as make
> targets. The goal of this patch is to make static analysis tools
> usable and extendable by any developer or researcher who is familiar
> with basic c++.
>
> The current static analysis tools require intimate knowledge of the
> internal workings of the static analysis. Clang-tidy and the clang
> static analyzers expose an easy to use api and allow users unfamiliar
> with clang to write new checks with relative ease.
>
> ===Clang-tidy===
>
> Clang-tidy is an easily extendable 'linter' that runs on the AST.
> Clang-tidy checks are easy to write and understand. A check consists of
> two parts, a matcher and a checker. The matcher is created using a
> domain specific language that acts on the AST
> (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> nodes are found by the matcher a callback is made to the checker. The
> checker can then execute additional checks and issue warnings.
>
> Here is an example clang-tidy check to report functions that have calls
> to local_irq_disable without calls to local_irq_enable and vice-versa.
> Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> are ignored for analysis. (https://reviews.llvm.org/D65828)
>
> ===Clang static analyzer===
>
> The clang static analyzer is a more powerful static analysis tool that
> uses symbolic execution to find bugs. Currently there is a check that
> looks for potential security bugs from invalid uses of kmalloc and
> kfree. There are several more general purpose checks that are useful for
> the kernel.
>
> The clang static analyzer is well documented and designed to be
> extensible.
> (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
>
> The main draw of the clang tools is how accessible they are. The clang
> documentation is very nice and these tools are built specifically to be
> easily extendable by any developer. They provide an accessible method of
> bug-finding and research to people who are not overly familiar with the
> kernel codebase.
>
> Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> ---
> Changes v6->v7
> * Fix issues with relative paths
> * Additional style fixes
>  MAINTAINERS                                   |  1 +
>  Makefile                                      |  3 +
>  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
>  .../{ => clang-tools}/gen_compile_commands.py |  0
>  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
>  5 files changed, 101 insertions(+)
>  create mode 100644 scripts/clang-tools/Makefile.clang-tools
>  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
>  create mode 100755 scripts/clang-tools/run-clang-tools.py
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1d4aa7f942de..a444564e5572 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4198,6 +4198,7 @@ W:        https://clangbuiltlinux.github.io/
>  B:     https://github.com/ClangBuiltLinux/linux/issues
>  C:     irc://chat.freenode.net/clangbuiltlinux
>  F:     Documentation/kbuild/llvm.rst
> +F:     scripts/clang-tools/
>  K:     \b(?i:clang|llvm)\b
>
>  CLEANCACHE API
> diff --git a/Makefile b/Makefile
> index fe0164a654c7..3e2df010b342 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -747,6 +747,7 @@ KBUILD_CFLAGS       += $(call cc-option,-fno-allow-store-data-races)
>
>  include scripts/Makefile.kcov
>  include scripts/Makefile.gcc-plugins
> +include scripts/clang-tools/Makefile.clang-tools
>
>  ifdef CONFIG_READABLE_ASM
>  # Disable optimizations that make assembler listings hard to read.
> @@ -1543,6 +1544,8 @@ help:
>         @echo  '  export_report   - List the usages of all exported symbols'
>         @echo  '  headerdep       - Detect inclusion cycles in headers'
>         @echo  '  coccicheck      - Check with Coccinelle'
> +       @echo  '  clang-analyzer  - Check with clang static analyzer'
> +       @echo  '  clang-tidy      - Check with clang-tidy'
>         @echo  ''
>         @echo  'Tools:'
>         @echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> new file mode 100644
> index 000000000000..5c9d76f77595
> --- /dev/null
> +++ b/scripts/clang-tools/Makefile.clang-tools
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +PHONY += clang-tidy
> +clang-tidy:
> +ifdef CONFIG_CC_IS_CLANG
> +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> +else
> +       $(error clang-tidy requires CC=clang)
> +endif
> +
> +PHONY += clang-analyzer
> +clang-analyzer:
> +ifdef CONFIG_CC_IS_CLANG
> +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> +else
> +       $(error clang-analyzer requires CC=clang)
> +endif
> diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> similarity index 100%
> rename from scripts/gen_compile_commands.py
> rename to scripts/clang-tools/gen_compile_commands.py
> diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
> new file mode 100755
> index 000000000000..fa7655c7cec0
> --- /dev/null
> +++ b/scripts/clang-tools/run-clang-tools.py
> @@ -0,0 +1,74 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +"""A helper routine run clang-tidy and the clang static-analyzer on
> +compile_commands.json.
> +"""
> +
> +import argparse
> +import json
> +import multiprocessing
> +import os

^ unused import.  Maybe Masahiro would be so kind as to drop that line
for you when merging v7?

That said, I hammered on this in testing, and it now LGTM.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>

For testing, I did `make clean` for x86/arm/arm64/powerpc, then a
defconfig (pseries_defconfig for ppc) build, then:
$ make -j71 LLVM=1 clang-tidy 2> log.txt
$ grep -e linuxkernel- -e clang- log.txt | sort -u | cut -d '[' -f 2 |
sort | uniq -c
$ make -j71 LLVM=1 clang-analyzer 2> log.txt
$ grep -e linuxkernel- -e clang- log.txt | sort -u | cut -d '[' -f 2 |
sort | uniq -c

Looks like it's already finding a couple bugs.  I'll probably disable
clang-diagnostic-* in a follow up, as those seem pretty noisy, though
clang-diagnostic-incompatible-pointer-types will likely be of interest
for the CFI work.

Thanks for all of the work that went into this, and tolerating my pedantry.

> +import subprocess
> +import sys
> +
> +
> +def parse_arguments():
> +    """Set up and parses command-line arguments.
> +    Returns:
> +        args: Dict of parsed args
> +        Has keys: [path, type]
> +    """
> +    usage = """Run clang-tidy or the clang static-analyzer on a
> +        compilation database."""
> +    parser = argparse.ArgumentParser(description=usage)
> +
> +    type_help = "Type of analysis to be performed"
> +    parser.add_argument("type",
> +                        choices=["clang-tidy", "clang-analyzer"],
> +                        help=type_help)
> +    path_help = "Path to the compilation database to parse"
> +    parser.add_argument("path", type=str, help=path_help)
> +
> +    return parser.parse_args()
> +
> +
> +def init(l, a):
> +    global lock
> +    global args
> +    lock = l
> +    args = a
> +
> +
> +def run_analysis(entry):
> +    # Disable all checks, then re-enable the ones we want
> +    checks = "-checks=-*,"
> +    if args.type == "clang-tidy":
> +        checks += "linuxkernel-*"
> +    else:
> +        checks += "clang-analyzer-*"
> +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> +                       stdout=subprocess.PIPE,
> +                       stderr=subprocess.STDOUT,
> +                       cwd=entry["directory"])
> +    with lock:
> +        sys.stderr.buffer.write(p.stdout)
> +
> +
> +def main():
> +    args = parse_arguments()
> +
> +    lock = multiprocessing.Lock()
> +    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
> +    # Read JSON data into the datastore variable
> +    with open(args.path, "r") as f:
> +        datastore = json.load(f)
> +        pool.map(run_analysis, datastore)
> +
> +
> +if __name__ == "__main__":
> +    main()
> --
> 2.28.0.rc0.142.g3c755180ce-goog
>
> --
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200728004736.3590053-1-nhuck%40google.com.
Lukas Bulwahn Aug. 1, 2020, 7:23 p.m. UTC | #2
On Tue, 28 Jul 2020, 'Nathan Huckleberry' via Clang Built Linux wrote:

> This patch adds clang-tidy and the clang static-analyzer as make
> targets. The goal of this patch is to make static analysis tools
> usable and extendable by any developer or researcher who is familiar
> with basic c++.
> 
> The current static analysis tools require intimate knowledge of the
> internal workings of the static analysis. Clang-tidy and the clang
> static analyzers expose an easy to use api and allow users unfamiliar
> with clang to write new checks with relative ease.
> 
> ===Clang-tidy===
> 
> Clang-tidy is an easily extendable 'linter' that runs on the AST.
> Clang-tidy checks are easy to write and understand. A check consists of
> two parts, a matcher and a checker. The matcher is created using a
> domain specific language that acts on the AST
> (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> nodes are found by the matcher a callback is made to the checker. The
> checker can then execute additional checks and issue warnings.
> 
> Here is an example clang-tidy check to report functions that have calls
> to local_irq_disable without calls to local_irq_enable and vice-versa.
> Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> are ignored for analysis. (https://reviews.llvm.org/D65828)
> 
> ===Clang static analyzer===
> 
> The clang static analyzer is a more powerful static analysis tool that
> uses symbolic execution to find bugs. Currently there is a check that
> looks for potential security bugs from invalid uses of kmalloc and
> kfree. There are several more general purpose checks that are useful for
> the kernel.
> 
> The clang static analyzer is well documented and designed to be
> extensible.
> (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
> 
> The main draw of the clang tools is how accessible they are. The clang
> documentation is very nice and these tools are built specifically to be
> easily extendable by any developer. They provide an accessible method of
> bug-finding and research to people who are not overly familiar with the
> kernel codebase.
> 
> Signed-off-by: Nathan Huckleberry <nhuck@google.com>

Hi Nathan, Hi Nick,

I have been busy with other topics around the kernel and static analysis; 
but then, I read clang and static analysis in my mailbox in this patch.

So, I thought let me give this patch a try on the weekend.

I applied the patch on next-2020729; that worked.

Then:
$ make clang-tidy
scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires 
CC=clang.  Stop.

Okay, that is a good and clear error message.

Then:

$ make CC=clang-10 defconfig
$ make CC=clang-10 clang-tidy

python3 scripts/clang-tools/gen_compile_commands.py
WARNING: Found 8 entries. Have you compiled the kernel?
python3 scripts/clang-tools/run-clang-tools.py clang-tidy 
compile_commands.json
 
Then actually an error in clang-tidy.
Error: no checks enabled.
USAGE: clang-tidy [options] <source0> [... <sourceN>]
...

I will get to that later how I fixed that for my setup.

Okay, good, that is clear... I need to compile it first, got it.

$ make CC=clang-10
$ make CC=clang-10 clang-tidy

Okay, I run except for the fix I needed.

Where is the output from clang-tidy?

It prints:

python3 scripts/clang-tools/gen_compile_commands.py
python3 scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json

That is it. Does that mean 0 warnings, or where do I find the output?
The script suggests it should be in stderr once all the parallel runs 
collected it, right?

I was confused; maybe a short summary output might help here.

Then, I ran

$ make CC=clang-10 clang-analyzer

And I see a lot of warnings... I guess that is intended.

There is a lot of:

Suppressed XX warnings (XX in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

To an outsider, it is unclear if that is intended or if the tool is broken 
in this setup.

Is there are way to silent that meta-warning? Or is my setup broken?

In summary, it is pretty clear how to run clang-tidy and clang-analyzer 
and it was a pretty smooth experience, even with no documentation at hand.

It was fun for me. Keep up the good work!

Just one issue... see below.

> ---
> Changes v6->v7
> * Fix issues with relative paths
> * Additional style fixes
>  MAINTAINERS                                   |  1 +
>  Makefile                                      |  3 +
>  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
>  .../{ => clang-tools}/gen_compile_commands.py |  0
>  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
>  5 files changed, 101 insertions(+)
>  create mode 100644 scripts/clang-tools/Makefile.clang-tools
>  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
>  create mode 100755 scripts/clang-tools/run-clang-tools.py
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1d4aa7f942de..a444564e5572 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4198,6 +4198,7 @@ W:	https://clangbuiltlinux.github.io/
>  B:	https://github.com/ClangBuiltLinux/linux/issues
>  C:	irc://chat.freenode.net/clangbuiltlinux
>  F:	Documentation/kbuild/llvm.rst
> +F:	scripts/clang-tools/
>  K:	\b(?i:clang|llvm)\b
>  
>  CLEANCACHE API
> diff --git a/Makefile b/Makefile
> index fe0164a654c7..3e2df010b342 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -747,6 +747,7 @@ KBUILD_CFLAGS	+= $(call cc-option,-fno-allow-store-data-races)
>  
>  include scripts/Makefile.kcov
>  include scripts/Makefile.gcc-plugins
> +include scripts/clang-tools/Makefile.clang-tools
>  
>  ifdef CONFIG_READABLE_ASM
>  # Disable optimizations that make assembler listings hard to read.
> @@ -1543,6 +1544,8 @@ help:
>  	@echo  '  export_report   - List the usages of all exported symbols'
>  	@echo  '  headerdep       - Detect inclusion cycles in headers'
>  	@echo  '  coccicheck      - Check with Coccinelle'
> +	@echo  '  clang-analyzer  - Check with clang static analyzer'
> +	@echo  '  clang-tidy      - Check with clang-tidy'
>  	@echo  ''
>  	@echo  'Tools:'
>  	@echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> new file mode 100644
> index 000000000000..5c9d76f77595
> --- /dev/null
> +++ b/scripts/clang-tools/Makefile.clang-tools
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +PHONY += clang-tidy
> +clang-tidy:
> +ifdef CONFIG_CC_IS_CLANG
> +	$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +	$(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> +else
> +	$(error clang-tidy requires CC=clang)
> +endif
> +
> +PHONY += clang-analyzer
> +clang-analyzer:
> +ifdef CONFIG_CC_IS_CLANG
> +	$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +	$(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> +else
> +	$(error clang-analyzer requires CC=clang)
> +endif
> diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> similarity index 100%
> rename from scripts/gen_compile_commands.py
> rename to scripts/clang-tools/gen_compile_commands.py
> diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
> new file mode 100755
> index 000000000000..fa7655c7cec0
> --- /dev/null
> +++ b/scripts/clang-tools/run-clang-tools.py
> @@ -0,0 +1,74 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +"""A helper routine run clang-tidy and the clang static-analyzer on
> +compile_commands.json.
> +"""
> +
> +import argparse
> +import json
> +import multiprocessing
> +import os
> +import subprocess
> +import sys
> +
> +
> +def parse_arguments():
> +    """Set up and parses command-line arguments.
> +    Returns:
> +        args: Dict of parsed args
> +        Has keys: [path, type]
> +    """
> +    usage = """Run clang-tidy or the clang static-analyzer on a
> +        compilation database."""
> +    parser = argparse.ArgumentParser(description=usage)
> +
> +    type_help = "Type of analysis to be performed"
> +    parser.add_argument("type",
> +                        choices=["clang-tidy", "clang-analyzer"],
> +                        help=type_help)
> +    path_help = "Path to the compilation database to parse"
> +    parser.add_argument("path", type=str, help=path_help)
> +
> +    return parser.parse_args()
> +
> +
> +def init(l, a):
> +    global lock
> +    global args
> +    lock = l
> +    args = a
> +
> +
> +def run_analysis(entry):
> +    # Disable all checks, then re-enable the ones we want
> +    checks = "-checks=-*,"
> +    if args.type == "clang-tidy":
> +        checks += "linuxkernel-*"
> +    else:
> +        checks += "clang-analyzer-*"
> +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],

You hardcoded here: clang-tidy

But in my Ubuntu 18.04 setup, I got multiple versions of clang and 
clang-tidy installed; yeah, maybe my setup is broken, but maybe those from 
others are similar.

When I run:

  make CC=clang-10 clang-tidy

it picks up the "wrong" clang-tidy version...

My setup is:

$ which clang-tidy
/usr/bin/clang-tidy

$ which clang-tidy-10
/usr/bin/clang-tidy-10

$ clang-tidy --version
LLVM (http://llvm.org/):
  LLVM version 6.0.0
  
  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: znver1

$ clang-tidy-10 --version
LLVM (http://llvm.org/):
  LLVM version 10.0.1
  
  Optimized build.
  Default target: x86_64-pc-linux-gnu
  Host CPU: znver1

When I run make CC=clang-10 clang-tidy, I would expect it to use 
clang-tidy-10, not clang-tidy. (clang-tidy errors just because it is too 
old; I guess it does have the linuxkernel-* options.)

Now, I cannot fix that without touching your script. There is no way I can 
tell the build target to use clang-tidy-10.

With a quick touch:

-    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
+    p = subprocess.run(["clang-tidy-10", "-p", args.path, checks, entry["file"]],

I got it to work.

Maybe you have a good idea how to get make clang-tidy to pick 
up the intended version without touching the python script itself?

It is a minor issue, but it would be nice if that setting would work 
somehow.

Thanks a lot.

Best regards,

Lukas

> +                       stdout=subprocess.PIPE,
> +                       stderr=subprocess.STDOUT,
> +                       cwd=entry["directory"])
> +    with lock:
> +        sys.stderr.buffer.write(p.stdout)
> +
> +
> +def main():
> +    args = parse_arguments()
> +
> +    lock = multiprocessing.Lock()
> +    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
> +    # Read JSON data into the datastore variable
> +    with open(args.path, "r") as f:
> +        datastore = json.load(f)
> +        pool.map(run_analysis, datastore)
> +
> +
> +if __name__ == "__main__":
> +    main()
> -- 
> 2.28.0.rc0.142.g3c755180ce-goog
> 
> -- 
> You received this message because you are subscribed to the Google Groups "Clang Built Linux" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to clang-built-linux+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/clang-built-linux/20200728004736.3590053-1-nhuck%40google.com.
>
Nick Desaulniers Aug. 3, 2020, 6:49 p.m. UTC | #3
On Sat, Aug 1, 2020 at 12:23 PM Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
>
> Hi Nathan, Hi Nick,
>
> I have been busy with other topics around the kernel and static analysis;
> but then, I read clang and static analysis in my mailbox in this patch.
>
> So, I thought let me give this patch a try on the weekend.
>
> I applied the patch on next-2020729; that worked.
>
> Then:
> $ make clang-tidy
> scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> CC=clang.  Stop.
>
> Okay, that is a good and clear error message.
>
> Then:
>
> $ make CC=clang-10 defconfig
> $ make CC=clang-10 clang-tidy
>
> python3 scripts/clang-tools/gen_compile_commands.py
> WARNING: Found 8 entries. Have you compiled the kernel?
> python3 scripts/clang-tools/run-clang-tools.py clang-tidy
> compile_commands.json
>
> Then actually an error in clang-tidy.
> Error: no checks enabled.
> USAGE: clang-tidy [options] <source0> [... <sourceN>]
> ...
>
> I will get to that later how I fixed that for my setup.
>
> Okay, good, that is clear... I need to compile it first, got it.

Hi Lukas,
Thank you so much for taking the time to apply the patch and help test it.

For the case of not doing a build first: gen_compile_commands.py
parses the .*.d files to build the compilation database and warns if
not many were found.  I think it might be interesting for it to just
invoke a build if it sees that, or maybe for the clang-tidy and
clang-analyzer targets to somehow invoke the default make target.  The
issue there might be that you need to invoke `make clang-tidy` with
`make CC=clang LD=ld.lld ... clang-tidy` in order to trigger a build
successfully.

Also, I wonder if gen_compile_commands.py should set a return code in
that case so that callers can handle such an exceptional case?  In
that case, I'd consider that a papercut or potential improvement to
scripts/get_compile_commands.py orthogonal to this patch.

>
> $ make CC=clang-10
> $ make CC=clang-10 clang-tidy
>
> Okay, I run except for the fix I needed.
>
> Where is the output from clang-tidy?
>
> It prints:
>
> python3 scripts/clang-tools/gen_compile_commands.py
> python3 scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
>
> That is it. Does that mean 0 warnings, or where do I find the output?
> The script suggests it should be in stderr once all the parallel runs
> collected it, right?
>
> I was confused; maybe a short summary output might help here.

I was also caught by this; for x86 defconfig, the kernel is actually
clean of instances of linuxkernel-* clang-tidy checks (there was also
an issue with the CWD for x86 in v6, but was resolved in v7 of this
patch).  I had to add a case that should fail to make clang-tidy have
output, and the check only checks for unchecked "ERR_PTR", "PTR_ERR",
"IS_ERR", "IS_ERR_OR_NULL", "ERR_CAST", "PTR_ERR_OR_ZERO".
(Documentation for that should be improved.) So if you add a function
that just constructs an `ERR_PTR(0)` and call it from code that gets
compiled in, then you'll start to see warnings from clang-tidy for x86
defconfig.  For aarch64 and arm, you'll see there are some unchecked
cases that look like low hanging fruit to fix.

It probably can be improved to note that there was no output, but that
will require more machinery to track how much output there was.  I'd
prefer to follow up with additional polish on top of this once this
lands.

>
> Then, I ran
>
> $ make CC=clang-10 clang-analyzer
>
> And I see a lot of warnings... I guess that is intended.
>
> There is a lot of:
>
> Suppressed XX warnings (XX in non-user code).
> Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
>
> To an outsider, it is unclear if that is intended or if the tool is broken
> in this setup.
>
> Is there are way to silent that meta-warning? Or is my setup broken?

See also my comment about disabling the clang-diagnostic-* analyzer
checks. We haven't had time to sort out the cause of them yet, and for
now they just look like noise.

>
> In summary, it is pretty clear how to run clang-tidy and clang-analyzer
> and it was a pretty smooth experience, even with no documentation at hand.
>
> It was fun for me. Keep up the good work!
>
> Just one issue... see below.
>
> > +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
>
> You hardcoded here: clang-tidy
>
> But in my Ubuntu 18.04 setup, I got multiple versions of clang and
> clang-tidy installed; yeah, maybe my setup is broken, but maybe those from
> others are similar.
>
> When I run:
>
>   make CC=clang-10 clang-tidy
>
> it picks up the "wrong" clang-tidy version...
>
> My setup is:
>
> $ which clang-tidy
> /usr/bin/clang-tidy
>
> $ which clang-tidy-10
> /usr/bin/clang-tidy-10
>
> $ clang-tidy --version
> LLVM (http://llvm.org/):
>   LLVM version 6.0.0
>
>   Optimized build.
>   Default target: x86_64-pc-linux-gnu
>   Host CPU: znver1
>
> $ clang-tidy-10 --version
> LLVM (http://llvm.org/):
>   LLVM version 10.0.1
>
>   Optimized build.
>   Default target: x86_64-pc-linux-gnu
>   Host CPU: znver1
>
> When I run make CC=clang-10 clang-tidy, I would expect it to use
> clang-tidy-10, not clang-tidy. (clang-tidy errors just because it is too
> old; I guess it does have the linuxkernel-* options.)
>
> Now, I cannot fix that without touching your script. There is no way I can
> tell the build target to use clang-tidy-10.
>
> With a quick touch:
>
> -    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> +    p = subprocess.run(["clang-tidy-10", "-p", args.path, checks, entry["file"]],
>
> I got it to work.
>
> Maybe you have a good idea how to get make clang-tidy to pick
> up the intended version without touching the python script itself?
>
> It is a minor issue, but it would be nice if that setting would work
> somehow.

Ah right, sorry, I tend to forget about the use case of having
multiple versions of clang installed.

I think the best approach here might be for the user (you, in this
case) to ensure that list of PATHs in the path list has the path to
the intended version of clang-tidy you'd like to run listed before
others.  That is similar to the recommendation for the LLVM=1 patch
set. ie.
commit a0d1c951ef08e ("kbuild: support LLVM=1 to switch the default
tools to Clang/LLVM")
specifically this part of the commit message:

> the suffixed versions in /usr/bin/ are symlinks to binaries in
> /usr/lib/llvm-#/bin/, so this can also be handled by PATH.

If `clang-tidy` on your system points to an old version of clang-tidy,
it may be worthwhile to uninstall the old version, and update the
symlink to point to a newer version.  That may be simpler than trying
to support invoking `make clang-tidy` for arbitrary versions or binary
names of clang-tidy.  I can understand having multiple versions of a
compiler installed for quickly checking compatibility (though these
days I prefer godbolt.org for that) or if a particular codebase is
stuck on an old version of a toolchain for whatever reason; but having
multiple versions of clang-tidy installed and supporting all of them
is a little harder to justify.
Lukas Bulwahn Aug. 3, 2020, 8:29 p.m. UTC | #4
On Mon, 3 Aug 2020, Nick Desaulniers wrote:

> On Sat, Aug 1, 2020 at 12:23 PM Lukas Bulwahn <lukas.bulwahn@gmail.com> wrote:
> >
> > Hi Nathan, Hi Nick,
> >
> > I have been busy with other topics around the kernel and static analysis;
> > but then, I read clang and static analysis in my mailbox in this patch.
> >
> > So, I thought let me give this patch a try on the weekend.
> >
> > I applied the patch on next-2020729; that worked.
> >
> > Then:
> > $ make clang-tidy
> > scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> > CC=clang.  Stop.
> >
> > Okay, that is a good and clear error message.
> >
> > Then:
> >
> > $ make CC=clang-10 defconfig
> > $ make CC=clang-10 clang-tidy
> >
> > python3 scripts/clang-tools/gen_compile_commands.py
> > WARNING: Found 8 entries. Have you compiled the kernel?
> > python3 scripts/clang-tools/run-clang-tools.py clang-tidy
> > compile_commands.json
> >
> > Then actually an error in clang-tidy.
> > Error: no checks enabled.
> > USAGE: clang-tidy [options] <source0> [... <sourceN>]
> > ...
> >
> > I will get to that later how I fixed that for my setup.
> >
> > Okay, good, that is clear... I need to compile it first, got it.
> 
> Hi Lukas,
> Thank you so much for taking the time to apply the patch and help test it.
> 
> For the case of not doing a build first: gen_compile_commands.py
> parses the .*.d files to build the compilation database and warns if
> not many were found.  I think it might be interesting for it to just
> invoke a build if it sees that, or maybe for the clang-tidy and
> clang-analyzer targets to somehow invoke the default make target.  The
> issue there might be that you need to invoke `make clang-tidy` with
> `make CC=clang LD=ld.lld ... clang-tidy` in order to trigger a build
> successfully.
>

I think the workflow is good as it is.

This already indicates to me that the clang-tidy result will depend on the 
kernel config and the build, which is helpful to know.

The coccicheck target, e.g., ignores the kernel config; other basic 
'syntactic' checks in the Makefile, such as make includecheck, seem also 
to be config-independent, but you cannot run it without having a config...
that is a bit inconsistent in that case.
 
> Also, I wonder if gen_compile_commands.py should set a return code in
> that case so that callers can handle such an exceptional case?  In
> that case, I'd consider that a papercut or potential improvement to
> scripts/get_compile_commands.py orthogonal to this patch.
>

I cannot see the immediate use case yet, but maybe I can provide a 
specific use case once I try using clang-tidy in my attempt to work with 
Ericsson's CodeChecker Web UI. It is still a very investigation for me 
here with that.

> >
> > $ make CC=clang-10
> > $ make CC=clang-10 clang-tidy
> >
> > Okay, I run except for the fix I needed.
> >
> > Where is the output from clang-tidy?
> >
> > It prints:
> >
> > python3 scripts/clang-tools/gen_compile_commands.py
> > python3 scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> >
> > That is it. Does that mean 0 warnings, or where do I find the output?
> > The script suggests it should be in stderr once all the parallel runs
> > collected it, right?
> >
> > I was confused; maybe a short summary output might help here.
> 
> I was also caught by this; for x86 defconfig, the kernel is actually
> clean of instances of linuxkernel-* clang-tidy checks (there was also
> an issue with the CWD for x86 in v6, but was resolved in v7 of this
> patch).  I had to add a case that should fail to make clang-tidy have
> output, and the check only checks for unchecked "ERR_PTR", "PTR_ERR",
> "IS_ERR", "IS_ERR_OR_NULL", "ERR_CAST", "PTR_ERR_OR_ZERO".
> (Documentation for that should be improved.) So if you add a function
> that just constructs an `ERR_PTR(0)` and call it from code that gets
> compiled in, then you'll start to see warnings from clang-tidy for x86
> defconfig.  For aarch64 and arm, you'll see there are some unchecked
> cases that look like low hanging fruit to fix.
>
> It probably can be improved to note that there was no output, but that
> will require more machinery to track how much output there was.  I'd
> prefer to follow up with additional polish on top of this once this
> lands.
>

Agree, it is somehow unfortunate that clang-tidy cannot provide such 
summary. No big issue, though. You will just have further developers 
asking the same question when more developers are attracted...
 
> >
> > Then, I ran
> >
> > $ make CC=clang-10 clang-analyzer
> >
> > And I see a lot of warnings... I guess that is intended.
> >
> > There is a lot of:
> >
> > Suppressed XX warnings (XX in non-user code).
> > Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
> >
> > To an outsider, it is unclear if that is intended or if the tool is broken
> > in this setup.
> >
> > Is there are way to silent that meta-warning? Or is my setup broken?
> 
> See also my comment about disabling the clang-diagnostic-* analyzer
> checks. We haven't had time to sort out the cause of them yet, and for
> now they just look like noise.
>

Okay, good to know. So, my setup is not broken and the tool works :)

So, then I guess I finished with this result:

Tested-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>


> >
> > In summary, it is pretty clear how to run clang-tidy and clang-analyzer
> > and it was a pretty smooth experience, even with no documentation at hand.
> >
> > It was fun for me. Keep up the good work!
> >
> > Just one issue... see below.
> >
> > > +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> >
> > You hardcoded here: clang-tidy
> >
> > But in my Ubuntu 18.04 setup, I got multiple versions of clang and
> > clang-tidy installed; yeah, maybe my setup is broken, but maybe those from
> > others are similar.
> >
> > When I run:
> >
> >   make CC=clang-10 clang-tidy
> >
> > it picks up the "wrong" clang-tidy version...
> >
> > My setup is:
> >
> > $ which clang-tidy
> > /usr/bin/clang-tidy
> >
> > $ which clang-tidy-10
> > /usr/bin/clang-tidy-10
> >
> > $ clang-tidy --version
> > LLVM (http://llvm.org/):
> >   LLVM version 6.0.0
> >
> >   Optimized build.
> >   Default target: x86_64-pc-linux-gnu
> >   Host CPU: znver1
> >
> > $ clang-tidy-10 --version
> > LLVM (http://llvm.org/):
> >   LLVM version 10.0.1
> >
> >   Optimized build.
> >   Default target: x86_64-pc-linux-gnu
> >   Host CPU: znver1
> >
> > When I run make CC=clang-10 clang-tidy, I would expect it to use
> > clang-tidy-10, not clang-tidy. (clang-tidy errors just because it is too
> > old; I guess it does have the linuxkernel-* options.)
> >
> > Now, I cannot fix that without touching your script. There is no way I can
> > tell the build target to use clang-tidy-10.
> >
> > With a quick touch:
> >
> > -    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> > +    p = subprocess.run(["clang-tidy-10", "-p", args.path, checks, entry["file"]],
> >
> > I got it to work.
> >
> > Maybe you have a good idea how to get make clang-tidy to pick
> > up the intended version without touching the python script itself?
> >
> > It is a minor issue, but it would be nice if that setting would work
> > somehow.
> 
> Ah right, sorry, I tend to forget about the use case of having
> multiple versions of clang installed.
> 
> I think the best approach here might be for the user (you, in this
> case) to ensure that list of PATHs in the path list has the path to
> the intended version of clang-tidy you'd like to run listed before
> others.  That is similar to the recommendation for the LLVM=1 patch
> set. ie.
> commit a0d1c951ef08e ("kbuild: support LLVM=1 to switch the default
> tools to Clang/LLVM")
> specifically this part of the commit message:
> 
> > the suffixed versions in /usr/bin/ are symlinks to binaries in
> > /usr/lib/llvm-#/bin/, so this can also be handled by PATH.
> 
> If `clang-tidy` on your system points to an old version of clang-tidy,
> it may be worthwhile to uninstall the old version, and update the
> symlink to point to a newer version.  That may be simpler than trying
> to support invoking `make clang-tidy` for arbitrary versions or binary
> names of clang-tidy.  I can understand having multiple versions of a
> compiler installed for quickly checking compatibility (though these
> days I prefer godbolt.org for that) or if a particular codebase is
> stuck on an old version of a toolchain for whatever reason; but having
> multiple versions of clang-tidy installed and supporting all of them
> is a little harder to justify.

I can live with that. I can modify the script or modify the symbolic link 
of clang-tidy. No need to support arbitrary broken setups :)


Other question I came across while playing around with clang-tidy on the 
kernel sources.

Is it possible to simply have a .clang-tidy in the repository root to 
configure the clang-tidy invocation rather than having it in the python 
script? If not, why?

I tried it, but I could not really figure out what I really needed to do 
to get the same output behavior than Nathan's command-line invocation 
from the python script.

Here is my .clang-tidy attempt:

$ cat .clang-tidy 
Checks:		'-*,linuxkernel-*'

$ clang-tidy -dump-config
---
Checks:          'clang-diagnostic-*,clang-analyzer-*,-*,linuxkernel-*'
WarningsAsErrors: ''
HeaderFilterRegex: ''
AnalyzeTemporaryDtors: false
FormatStyle:     none
User:            lukas
CheckOptions:    
  - key:             
google-readability-braces-around-statements.ShortStatementLines
    value:           '1'
  - key:             google-readability-function-size.StatementThreshold
    value:           '800'
  - key:             
google-readability-namespace-comments.ShortNamespaceLines
    value:           '10'
  - key:             
google-readability-namespace-comments.SpacesBeforeComments
    value:           '2'
  - key:             modernize-loop-convert.MaxCopySize
    value:           '16'
  - key:             modernize-loop-convert.MinConfidence
    value:           reasonable
  - key:             modernize-loop-convert.NamingStyle
    value:           CamelCase
  - key:             modernize-pass-by-value.IncludeStyle
    value:           llvm
  - key:             modernize-replace-auto-ptr.IncludeStyle
    value:           llvm
  - key:             modernize-use-nullptr.NullMacros
    value:           'NULL'
...

I could not understand why 'clang-diagnostic-*,clang-analyzer-*' is still 
there when running clang-tidy. I think that actually leads to more 
(unwanted) findings, when I run for example:

$ find ./kernel/trace/ -name '*.c' -exec clang-tidy {} +

such as:

./arch/x86/include/asm/apic.h:107:2: error: expected '(' after 'asm' 
[clang-diagnostic-error]
        alternative_io("movl %0, %P1", "xchgl %0, %P1", X86_BUG_11AP,
        ^
./arch/x86/include/asm/alternative.h:240:2: note: expanded from macro 
'alternative_io'
        asm_inline volatile (ALTERNATIVE(oldinstr, newinstr, feature)   \
        ^
./include/linux/compiler_types.h:239:24: note: expanded from macro 
'asm_inline'
#define asm_inline asm __inline
                       ^

Is there no way to completely override the standard clang-tidy 
configuration with .clang-tidy file?

I expected that ',-*' deactivates everything that was activated to the 
left of this checks expression, but it seems not to be the case. 

Did you also try that and hence then settled for passing that checks as 
command-line argument because there is no way to have a sane .clang-tidy 
file in the root? Or did I stumble on some .clang-tidy file 
misunderstanding and we can actually place a .clang-tidy file for simple 
use cases as the example, find ... -exec clang-tidy {} +, above?


Lukas

> -- 
> Thanks,
> ~Nick Desaulniers
>
Masahiro Yamada Aug. 6, 2020, 8:43 a.m. UTC | #5
On Tue, Jul 28, 2020 at 9:47 AM Nathan Huckleberry <nhuck@google.com> wrote:
>
> This patch adds clang-tidy and the clang static-analyzer as make
> targets. The goal of this patch is to make static analysis tools
> usable and extendable by any developer or researcher who is familiar
> with basic c++.
>
> The current static analysis tools require intimate knowledge of the
> internal workings of the static analysis. Clang-tidy and the clang
> static analyzers expose an easy to use api and allow users unfamiliar
> with clang to write new checks with relative ease.
>
> ===Clang-tidy===
>
> Clang-tidy is an easily extendable 'linter' that runs on the AST.
> Clang-tidy checks are easy to write and understand. A check consists of
> two parts, a matcher and a checker. The matcher is created using a
> domain specific language that acts on the AST
> (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> nodes are found by the matcher a callback is made to the checker. The
> checker can then execute additional checks and issue warnings.
>
> Here is an example clang-tidy check to report functions that have calls
> to local_irq_disable without calls to local_irq_enable and vice-versa.
> Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> are ignored for analysis. (https://reviews.llvm.org/D65828)
>
> ===Clang static analyzer===
>
> The clang static analyzer is a more powerful static analysis tool that
> uses symbolic execution to find bugs. Currently there is a check that
> looks for potential security bugs from invalid uses of kmalloc and
> kfree. There are several more general purpose checks that are useful for
> the kernel.
>
> The clang static analyzer is well documented and designed to be
> extensible.
> (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
>
> The main draw of the clang tools is how accessible they are. The clang
> documentation is very nice and these tools are built specifically to be
> easily extendable by any developer. They provide an accessible method of
> bug-finding and research to people who are not overly familiar with the
> kernel codebase.
>
> Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> ---
> Changes v6->v7
> * Fix issues with relative paths
> * Additional style fixes
>  MAINTAINERS                                   |  1 +
>  Makefile                                      |  3 +
>  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
>  .../{ => clang-tools}/gen_compile_commands.py |  0
>  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
>  5 files changed, 101 insertions(+)
>  create mode 100644 scripts/clang-tools/Makefile.clang-tools
>  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
>  create mode 100755 scripts/clang-tools/run-clang-tools.py
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 1d4aa7f942de..a444564e5572 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -4198,6 +4198,7 @@ W:        https://clangbuiltlinux.github.io/
>  B:     https://github.com/ClangBuiltLinux/linux/issues
>  C:     irc://chat.freenode.net/clangbuiltlinux
>  F:     Documentation/kbuild/llvm.rst
> +F:     scripts/clang-tools/
>  K:     \b(?i:clang|llvm)\b
>
>  CLEANCACHE API
> diff --git a/Makefile b/Makefile
> index fe0164a654c7..3e2df010b342 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -747,6 +747,7 @@ KBUILD_CFLAGS       += $(call cc-option,-fno-allow-store-data-races)
>
>  include scripts/Makefile.kcov
>  include scripts/Makefile.gcc-plugins
> +include scripts/clang-tools/Makefile.clang-tools
>
>  ifdef CONFIG_READABLE_ASM
>  # Disable optimizations that make assembler listings hard to read.
> @@ -1543,6 +1544,8 @@ help:
>         @echo  '  export_report   - List the usages of all exported symbols'
>         @echo  '  headerdep       - Detect inclusion cycles in headers'
>         @echo  '  coccicheck      - Check with Coccinelle'
> +       @echo  '  clang-analyzer  - Check with clang static analyzer'
> +       @echo  '  clang-tidy      - Check with clang-tidy'
>         @echo  ''
>         @echo  'Tools:'
>         @echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> new file mode 100644
> index 000000000000..5c9d76f77595
> --- /dev/null
> +++ b/scripts/clang-tools/Makefile.clang-tools
> @@ -0,0 +1,23 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +PHONY += clang-tidy
> +clang-tidy:
> +ifdef CONFIG_CC_IS_CLANG
> +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> +else
> +       $(error clang-tidy requires CC=clang)
> +endif
> +
> +PHONY += clang-analyzer
> +clang-analyzer:
> +ifdef CONFIG_CC_IS_CLANG
> +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> +else
> +       $(error clang-analyzer requires CC=clang)
> +endif



You can unify the almost same two rules.

PHONY += clang-tidy clang-analyzer
clang-tidy clang-analyzer:
ifdef CONFIG_CC_IS_CLANG
        $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
        $(PYTHON3) scripts/clang-tools/run-clang-tools.py $@
compile_commands.json
else
        $(error $@ requires CC=clang)
endif




But, before we proceed, please tell me
what this check is intended for.





Case 1)
Build the kernel with CC=clang,
and then run clang-tidy without CC=clang.

$ make CC=clang defconfig
$ make CC=clang -j$(nproc)
$ make clang-tidy

scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
CC=clang.  Stop.




Case 2)
Build the kernel using GCC,
and then run clang-tidy with CC=clang.

$ make defconfig
$ make -j$(nproc)
$ make CC=clang clang-tidy

This patch happily runs clang-tidy
although compile_commands.json
contains GCC commands.





So, it checks if you have passed CC=clang
to "make clang-tidy", where I do not see
any user of the $(CC) variable.

It does not care whether you have built
the kernel with GCC or Clang.



What happens if you run clang-tidy against
compile_commands.json that contains GCC
commands?


I also care about stale commands
in compile_commands.json.


gen_compile_commands.py creates compile_commands.json
based on *.cmd files it found.

If you rebuild the kernel for various settings
using GCC/clang without "make clean",
stale .*.cmd files will grow.

compile_commands.json will pick up commands
from older compilation, i.e. the end up with
the mixture of GCC/Clang commands.

So, I'd like to know how clang-tidy handles
the GCC commands from compile_commands.json





> diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> similarity index 100%
> rename from scripts/gen_compile_commands.py
> rename to scripts/clang-tools/gen_compile_commands.py
> diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
> new file mode 100755
> index 000000000000..fa7655c7cec0
> --- /dev/null
> +++ b/scripts/clang-tools/run-clang-tools.py
> @@ -0,0 +1,74 @@
> +#!/usr/bin/env python
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Copyright (C) Google LLC, 2020
> +#
> +# Author: Nathan Huckleberry <nhuck@google.com>
> +#
> +"""A helper routine run clang-tidy and the clang static-analyzer on
> +compile_commands.json.
> +"""
> +
> +import argparse
> +import json
> +import multiprocessing
> +import os
> +import subprocess
> +import sys
> +
> +
> +def parse_arguments():
> +    """Set up and parses command-line arguments.
> +    Returns:
> +        args: Dict of parsed args
> +        Has keys: [path, type]
> +    """
> +    usage = """Run clang-tidy or the clang static-analyzer on a
> +        compilation database."""
> +    parser = argparse.ArgumentParser(description=usage)
> +
> +    type_help = "Type of analysis to be performed"
> +    parser.add_argument("type",
> +                        choices=["clang-tidy", "clang-analyzer"],
> +                        help=type_help)
> +    path_help = "Path to the compilation database to parse"
> +    parser.add_argument("path", type=str, help=path_help)
> +
> +    return parser.parse_args()
> +
> +
> +def init(l, a):
> +    global lock
> +    global args
> +    lock = l
> +    args = a
> +
> +
> +def run_analysis(entry):
> +    # Disable all checks, then re-enable the ones we want
> +    checks = "-checks=-*,"
> +    if args.type == "clang-tidy":
> +        checks += "linuxkernel-*"
> +    else:
> +        checks += "clang-analyzer-*"
> +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> +                       stdout=subprocess.PIPE,
> +                       stderr=subprocess.STDOUT,
> +                       cwd=entry["directory"])
> +    with lock:
> +        sys.stderr.buffer.write(p.stdout)
> +
> +
> +def main():
> +    args = parse_arguments()
> +
> +    lock = multiprocessing.Lock()
> +    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
> +    # Read JSON data into the datastore variable
> +    with open(args.path, "r") as f:
> +        datastore = json.load(f)
> +        pool.map(run_analysis, datastore)
> +
> +
> +if __name__ == "__main__":
> +    main()
> --
> 2.28.0.rc0.142.g3c755180ce-goog
>
Nathan Huckleberry Aug. 6, 2020, 9:42 p.m. UTC | #6
On Thu, Aug 6, 2020 at 3:44 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Tue, Jul 28, 2020 at 9:47 AM Nathan Huckleberry <nhuck@google.com> wrote:
> >
> > This patch adds clang-tidy and the clang static-analyzer as make
> > targets. The goal of this patch is to make static analysis tools
> > usable and extendable by any developer or researcher who is familiar
> > with basic c++.
> >
> > The current static analysis tools require intimate knowledge of the
> > internal workings of the static analysis. Clang-tidy and the clang
> > static analyzers expose an easy to use api and allow users unfamiliar
> > with clang to write new checks with relative ease.
> >
> > ===Clang-tidy===
> >
> > Clang-tidy is an easily extendable 'linter' that runs on the AST.
> > Clang-tidy checks are easy to write and understand. A check consists of
> > two parts, a matcher and a checker. The matcher is created using a
> > domain specific language that acts on the AST
> > (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> > nodes are found by the matcher a callback is made to the checker. The
> > checker can then execute additional checks and issue warnings.
> >
> > Here is an example clang-tidy check to report functions that have calls
> > to local_irq_disable without calls to local_irq_enable and vice-versa.
> > Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> > are ignored for analysis. (https://reviews.llvm.org/D65828)
> >
> > ===Clang static analyzer===
> >
> > The clang static analyzer is a more powerful static analysis tool that
> > uses symbolic execution to find bugs. Currently there is a check that
> > looks for potential security bugs from invalid uses of kmalloc and
> > kfree. There are several more general purpose checks that are useful for
> > the kernel.
> >
> > The clang static analyzer is well documented and designed to be
> > extensible.
> > (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> > (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
> >
> > The main draw of the clang tools is how accessible they are. The clang
> > documentation is very nice and these tools are built specifically to be
> > easily extendable by any developer. They provide an accessible method of
> > bug-finding and research to people who are not overly familiar with the
> > kernel codebase.
> >
> > Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> > ---
> > Changes v6->v7
> > * Fix issues with relative paths
> > * Additional style fixes
> >  MAINTAINERS                                   |  1 +
> >  Makefile                                      |  3 +
> >  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
> >  .../{ => clang-tools}/gen_compile_commands.py |  0
> >  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
> >  5 files changed, 101 insertions(+)
> >  create mode 100644 scripts/clang-tools/Makefile.clang-tools
> >  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
> >  create mode 100755 scripts/clang-tools/run-clang-tools.py
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 1d4aa7f942de..a444564e5572 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -4198,6 +4198,7 @@ W:        https://clangbuiltlinux.github.io/
> >  B:     https://github.com/ClangBuiltLinux/linux/issues
> >  C:     irc://chat.freenode.net/clangbuiltlinux
> >  F:     Documentation/kbuild/llvm.rst
> > +F:     scripts/clang-tools/
> >  K:     \b(?i:clang|llvm)\b
> >
> >  CLEANCACHE API
> > diff --git a/Makefile b/Makefile
> > index fe0164a654c7..3e2df010b342 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -747,6 +747,7 @@ KBUILD_CFLAGS       += $(call cc-option,-fno-allow-store-data-races)
> >
> >  include scripts/Makefile.kcov
> >  include scripts/Makefile.gcc-plugins
> > +include scripts/clang-tools/Makefile.clang-tools
> >
> >  ifdef CONFIG_READABLE_ASM
> >  # Disable optimizations that make assembler listings hard to read.
> > @@ -1543,6 +1544,8 @@ help:
> >         @echo  '  export_report   - List the usages of all exported symbols'
> >         @echo  '  headerdep       - Detect inclusion cycles in headers'
> >         @echo  '  coccicheck      - Check with Coccinelle'
> > +       @echo  '  clang-analyzer  - Check with clang static analyzer'
> > +       @echo  '  clang-tidy      - Check with clang-tidy'
> >         @echo  ''
> >         @echo  'Tools:'
> >         @echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> > diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> > new file mode 100644
> > index 000000000000..5c9d76f77595
> > --- /dev/null
> > +++ b/scripts/clang-tools/Makefile.clang-tools
> > @@ -0,0 +1,23 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Copyright (C) Google LLC, 2020
> > +#
> > +# Author: Nathan Huckleberry <nhuck@google.com>
> > +#
> > +PHONY += clang-tidy
> > +clang-tidy:
> > +ifdef CONFIG_CC_IS_CLANG
> > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> > +else
> > +       $(error clang-tidy requires CC=clang)
> > +endif
> > +
> > +PHONY += clang-analyzer
> > +clang-analyzer:
> > +ifdef CONFIG_CC_IS_CLANG
> > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> > +else
> > +       $(error clang-analyzer requires CC=clang)
> > +endif
>
>
>
> You can unify the almost same two rules.
>
> PHONY += clang-tidy clang-analyzer
> clang-tidy clang-analyzer:
> ifdef CONFIG_CC_IS_CLANG
>         $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
>         $(PYTHON3) scripts/clang-tools/run-clang-tools.py $@
> compile_commands.json
> else
>         $(error $@ requires CC=clang)
> endif
>

I like this.

>
>
>
> But, before we proceed, please tell me
> what this check is intended for.
>

Clang-tidy invokes clang using the command line
options specified in the compile_commands.json file.
Using gcc command line options causes a bunch of
errors for unknown options.

>
>
>
>
> Case 1)
> Build the kernel with CC=clang,
> and then run clang-tidy without CC=clang.
>
> $ make CC=clang defconfig
> $ make CC=clang -j$(nproc)
> $ make clang-tidy
>
> scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> CC=clang.  Stop.
>

I suppose this case could allow clang-tidy to
be run.

>
>
>
> Case 2)
> Build the kernel using GCC,
> and then run clang-tidy with CC=clang.
>
> $ make defconfig
> $ make -j$(nproc)
> $ make CC=clang clang-tidy
>
> This patch happily runs clang-tidy
> although compile_commands.json
> contains GCC commands.
>

This is the worst of the two cases. I'm not
sure how to prevent this other than parsing the
compiler invocation in run-clang-tools.py.

I'm open to better suggestions.

>
>
>
>
> So, it checks if you have passed CC=clang
> to "make clang-tidy", where I do not see
> any user of the $(CC) variable.
>
> It does not care whether you have built
> the kernel with GCC or Clang.
>
>
>
> What happens if you run clang-tidy against
> compile_commands.json that contains GCC
> commands?

Clang-tidy itself uses the command line options from
compile_commands.json to invoke clang. If you run
clang-tidy against GCC commands you get lots of
errors similar to this

Found compiler error(s).
12 warnings and 8 errors generated.
Error while processing /usr/local/google/home/nhuck/linux/arch/x86/lib/iomem.c.
error: unknown argument: '-fconserve-stack' [clang-diagnostic-error]
error: unknown argument: '-fno-var-tracking-assignments'
[clang-diagnostic-error]
error: unknown argument: '-mindirect-branch-register' [clang-diagnostic-error]
error: unknown argument: '-mindirect-branch=thunk-extern'
[clang-diagnostic-error]
error: unknown argument: '-mno-fp-ret-in-387' [clang-diagnostic-error]
error: unknown argument: '-mpreferred-stack-boundary=3' [clang-diagnostic-error]
error: unknown argument: '-mskip-rax-setup' [clang-diagnostic-error]

>
>
> I also care about stale commands
> in compile_commands.json.
>

I agree with this point, but it's more of a bug with
gen_compile_commands.py. Maybe gen_compile_commands.py
could emit a warning when stale commands are detected in the
.*.cmd files.

>
> gen_compile_commands.py creates compile_commands.json
> based on *.cmd files it found.
>
> If you rebuild the kernel for various settings
> using GCC/clang without "make clean",
> stale .*.cmd files will grow.
>
> compile_commands.json will pick up commands
> from older compilation, i.e. the end up with
> the mixture of GCC/Clang commands.
>
> So, I'd like to know how clang-tidy handles
> the GCC commands from compile_commands.json
>
>
>
>
>
> > diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> > similarity index 100%
> > rename from scripts/gen_compile_commands.py
> > rename to scripts/clang-tools/gen_compile_commands.py
> > diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
> > new file mode 100755
> > index 000000000000..fa7655c7cec0
> > --- /dev/null
> > +++ b/scripts/clang-tools/run-clang-tools.py
> > @@ -0,0 +1,74 @@
> > +#!/usr/bin/env python
> > +# SPDX-License-Identifier: GPL-2.0
> > +#
> > +# Copyright (C) Google LLC, 2020
> > +#
> > +# Author: Nathan Huckleberry <nhuck@google.com>
> > +#
> > +"""A helper routine run clang-tidy and the clang static-analyzer on
> > +compile_commands.json.
> > +"""
> > +
> > +import argparse
> > +import json
> > +import multiprocessing
> > +import os
> > +import subprocess
> > +import sys
> > +
> > +
> > +def parse_arguments():
> > +    """Set up and parses command-line arguments.
> > +    Returns:
> > +        args: Dict of parsed args
> > +        Has keys: [path, type]
> > +    """
> > +    usage = """Run clang-tidy or the clang static-analyzer on a
> > +        compilation database."""
> > +    parser = argparse.ArgumentParser(description=usage)
> > +
> > +    type_help = "Type of analysis to be performed"
> > +    parser.add_argument("type",
> > +                        choices=["clang-tidy", "clang-analyzer"],
> > +                        help=type_help)
> > +    path_help = "Path to the compilation database to parse"
> > +    parser.add_argument("path", type=str, help=path_help)
> > +
> > +    return parser.parse_args()
> > +
> > +
> > +def init(l, a):
> > +    global lock
> > +    global args
> > +    lock = l
> > +    args = a
> > +
> > +
> > +def run_analysis(entry):
> > +    # Disable all checks, then re-enable the ones we want
> > +    checks = "-checks=-*,"
> > +    if args.type == "clang-tidy":
> > +        checks += "linuxkernel-*"
> > +    else:
> > +        checks += "clang-analyzer-*"
> > +    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
> > +                       stdout=subprocess.PIPE,
> > +                       stderr=subprocess.STDOUT,
> > +                       cwd=entry["directory"])
> > +    with lock:
> > +        sys.stderr.buffer.write(p.stdout)
> > +
> > +
> > +def main():
> > +    args = parse_arguments()
> > +
> > +    lock = multiprocessing.Lock()
> > +    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
> > +    # Read JSON data into the datastore variable
> > +    with open(args.path, "r") as f:
> > +        datastore = json.load(f)
> > +        pool.map(run_analysis, datastore)
> > +
> > +
> > +if __name__ == "__main__":
> > +    main()
> > --
> > 2.28.0.rc0.142.g3c755180ce-goog
> >
>
>
> --
> Best Regards
> Masahiro Yamada
Masahiro Yamada Aug. 6, 2020, 10:10 p.m. UTC | #7
On Fri, Aug 7, 2020 at 6:42 AM 'Nathan Huckleberry' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
>
> On Thu, Aug 6, 2020 at 3:44 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> >
> > On Tue, Jul 28, 2020 at 9:47 AM Nathan Huckleberry <nhuck@google.com> wrote:
> > >
> > > This patch adds clang-tidy and the clang static-analyzer as make
> > > targets. The goal of this patch is to make static analysis tools
> > > usable and extendable by any developer or researcher who is familiar
> > > with basic c++.
> > >
> > > The current static analysis tools require intimate knowledge of the
> > > internal workings of the static analysis. Clang-tidy and the clang
> > > static analyzers expose an easy to use api and allow users unfamiliar
> > > with clang to write new checks with relative ease.
> > >
> > > ===Clang-tidy===
> > >
> > > Clang-tidy is an easily extendable 'linter' that runs on the AST.
> > > Clang-tidy checks are easy to write and understand. A check consists of
> > > two parts, a matcher and a checker. The matcher is created using a
> > > domain specific language that acts on the AST
> > > (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> > > nodes are found by the matcher a callback is made to the checker. The
> > > checker can then execute additional checks and issue warnings.
> > >
> > > Here is an example clang-tidy check to report functions that have calls
> > > to local_irq_disable without calls to local_irq_enable and vice-versa.
> > > Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> > > are ignored for analysis. (https://reviews.llvm.org/D65828)
> > >
> > > ===Clang static analyzer===
> > >
> > > The clang static analyzer is a more powerful static analysis tool that
> > > uses symbolic execution to find bugs. Currently there is a check that
> > > looks for potential security bugs from invalid uses of kmalloc and
> > > kfree. There are several more general purpose checks that are useful for
> > > the kernel.
> > >
> > > The clang static analyzer is well documented and designed to be
> > > extensible.
> > > (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> > > (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
> > >
> > > The main draw of the clang tools is how accessible they are. The clang
> > > documentation is very nice and these tools are built specifically to be
> > > easily extendable by any developer. They provide an accessible method of
> > > bug-finding and research to people who are not overly familiar with the
> > > kernel codebase.
> > >
> > > Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> > > ---
> > > Changes v6->v7
> > > * Fix issues with relative paths
> > > * Additional style fixes
> > >  MAINTAINERS                                   |  1 +
> > >  Makefile                                      |  3 +
> > >  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
> > >  .../{ => clang-tools}/gen_compile_commands.py |  0
> > >  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
> > >  5 files changed, 101 insertions(+)
> > >  create mode 100644 scripts/clang-tools/Makefile.clang-tools
> > >  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
> > >  create mode 100755 scripts/clang-tools/run-clang-tools.py
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index 1d4aa7f942de..a444564e5572 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -4198,6 +4198,7 @@ W:        https://clangbuiltlinux.github.io/
> > >  B:     https://github.com/ClangBuiltLinux/linux/issues
> > >  C:     irc://chat.freenode.net/clangbuiltlinux
> > >  F:     Documentation/kbuild/llvm.rst
> > > +F:     scripts/clang-tools/
> > >  K:     \b(?i:clang|llvm)\b
> > >
> > >  CLEANCACHE API
> > > diff --git a/Makefile b/Makefile
> > > index fe0164a654c7..3e2df010b342 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -747,6 +747,7 @@ KBUILD_CFLAGS       += $(call cc-option,-fno-allow-store-data-races)
> > >
> > >  include scripts/Makefile.kcov
> > >  include scripts/Makefile.gcc-plugins
> > > +include scripts/clang-tools/Makefile.clang-tools
> > >
> > >  ifdef CONFIG_READABLE_ASM
> > >  # Disable optimizations that make assembler listings hard to read.
> > > @@ -1543,6 +1544,8 @@ help:
> > >         @echo  '  export_report   - List the usages of all exported symbols'
> > >         @echo  '  headerdep       - Detect inclusion cycles in headers'
> > >         @echo  '  coccicheck      - Check with Coccinelle'
> > > +       @echo  '  clang-analyzer  - Check with clang static analyzer'
> > > +       @echo  '  clang-tidy      - Check with clang-tidy'
> > >         @echo  ''
> > >         @echo  'Tools:'
> > >         @echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> > > diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> > > new file mode 100644
> > > index 000000000000..5c9d76f77595
> > > --- /dev/null
> > > +++ b/scripts/clang-tools/Makefile.clang-tools
> > > @@ -0,0 +1,23 @@
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +#
> > > +# Copyright (C) Google LLC, 2020
> > > +#
> > > +# Author: Nathan Huckleberry <nhuck@google.com>
> > > +#
> > > +PHONY += clang-tidy
> > > +clang-tidy:
> > > +ifdef CONFIG_CC_IS_CLANG
> > > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> > > +else
> > > +       $(error clang-tidy requires CC=clang)
> > > +endif
> > > +
> > > +PHONY += clang-analyzer
> > > +clang-analyzer:
> > > +ifdef CONFIG_CC_IS_CLANG
> > > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> > > +else
> > > +       $(error clang-analyzer requires CC=clang)
> > > +endif
> >
> >
> >
> > You can unify the almost same two rules.
> >
> > PHONY += clang-tidy clang-analyzer
> > clang-tidy clang-analyzer:
> > ifdef CONFIG_CC_IS_CLANG
> >         $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> >         $(PYTHON3) scripts/clang-tools/run-clang-tools.py $@
> > compile_commands.json
> > else
> >         $(error $@ requires CC=clang)
> > endif
> >
>
> I like this.
>
> >
> >
> >
> > But, before we proceed, please tell me
> > what this check is intended for.
> >
>
> Clang-tidy invokes clang using the command line
> options specified in the compile_commands.json file.
> Using gcc command line options causes a bunch of
> errors for unknown options.
>
> >
> >
> >
> >
> > Case 1)
> > Build the kernel with CC=clang,
> > and then run clang-tidy without CC=clang.
> >
> > $ make CC=clang defconfig
> > $ make CC=clang -j$(nproc)
> > $ make clang-tidy
> >
> > scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> > CC=clang.  Stop.
> >
>
> I suppose this case could allow clang-tidy to
> be run.
>
> >
> >
> >
> > Case 2)
> > Build the kernel using GCC,
> > and then run clang-tidy with CC=clang.
> >
> > $ make defconfig
> > $ make -j$(nproc)
> > $ make CC=clang clang-tidy
> >
> > This patch happily runs clang-tidy
> > although compile_commands.json
> > contains GCC commands.
> >
>
> This is the worst of the two cases. I'm not
> sure how to prevent this other than parsing the
> compiler invocation in run-clang-tools.py.
>
> I'm open to better suggestions.
>
> >
> >
> >
> >
> > So, it checks if you have passed CC=clang
> > to "make clang-tidy", where I do not see
> > any user of the $(CC) variable.
> >
> > It does not care whether you have built
> > the kernel with GCC or Clang.
> >
> >
> >
> > What happens if you run clang-tidy against
> > compile_commands.json that contains GCC
> > commands?
>
> Clang-tidy itself uses the command line options from
> compile_commands.json to invoke clang. If you run
> clang-tidy against GCC commands you get lots of
> errors similar to this
>
> Found compiler error(s).
> 12 warnings and 8 errors generated.
> Error while processing /usr/local/google/home/nhuck/linux/arch/x86/lib/iomem.c.
> error: unknown argument: '-fconserve-stack' [clang-diagnostic-error]
> error: unknown argument: '-fno-var-tracking-assignments'
> [clang-diagnostic-error]
> error: unknown argument: '-mindirect-branch-register' [clang-diagnostic-error]
> error: unknown argument: '-mindirect-branch=thunk-extern'
> [clang-diagnostic-error]
> error: unknown argument: '-mno-fp-ret-in-387' [clang-diagnostic-error]
> error: unknown argument: '-mpreferred-stack-boundary=3' [clang-diagnostic-error]
> error: unknown argument: '-mskip-rax-setup' [clang-diagnostic-error]
>
> >
> >
> > I also care about stale commands
> > in compile_commands.json.
> >
>
> I agree with this point, but it's more of a bug with
> gen_compile_commands.py. Maybe gen_compile_commands.py
> could emit a warning when stale commands are detected in the
> .*.cmd files.


Nathan, thanks for your comments.

I can improve this
so compile_commands.json contains
only commands from the last build.

Working on a patch.
Nathan Huckleberry Aug. 12, 2020, 1:24 a.m. UTC | #8
Sounds good. Do you think this patch is ready to land then?

On Thu, Aug 6, 2020 at 5:10 PM Masahiro Yamada <masahiroy@kernel.org> wrote:
>
> On Fri, Aug 7, 2020 at 6:42 AM 'Nathan Huckleberry' via Clang Built
> Linux <clang-built-linux@googlegroups.com> wrote:
> >
> > On Thu, Aug 6, 2020 at 3:44 AM Masahiro Yamada <masahiroy@kernel.org> wrote:
> > >
> > > On Tue, Jul 28, 2020 at 9:47 AM Nathan Huckleberry <nhuck@google.com> wrote:
> > > >
> > > > This patch adds clang-tidy and the clang static-analyzer as make
> > > > targets. The goal of this patch is to make static analysis tools
> > > > usable and extendable by any developer or researcher who is familiar
> > > > with basic c++.
> > > >
> > > > The current static analysis tools require intimate knowledge of the
> > > > internal workings of the static analysis. Clang-tidy and the clang
> > > > static analyzers expose an easy to use api and allow users unfamiliar
> > > > with clang to write new checks with relative ease.
> > > >
> > > > ===Clang-tidy===
> > > >
> > > > Clang-tidy is an easily extendable 'linter' that runs on the AST.
> > > > Clang-tidy checks are easy to write and understand. A check consists of
> > > > two parts, a matcher and a checker. The matcher is created using a
> > > > domain specific language that acts on the AST
> > > > (https://clang.llvm.org/docs/LibASTMatchersReference.html).  When AST
> > > > nodes are found by the matcher a callback is made to the checker. The
> > > > checker can then execute additional checks and issue warnings.
> > > >
> > > > Here is an example clang-tidy check to report functions that have calls
> > > > to local_irq_disable without calls to local_irq_enable and vice-versa.
> > > > Functions flagged with __attribute((annotation("ignore_irq_balancing")))
> > > > are ignored for analysis. (https://reviews.llvm.org/D65828)
> > > >
> > > > ===Clang static analyzer===
> > > >
> > > > The clang static analyzer is a more powerful static analysis tool that
> > > > uses symbolic execution to find bugs. Currently there is a check that
> > > > looks for potential security bugs from invalid uses of kmalloc and
> > > > kfree. There are several more general purpose checks that are useful for
> > > > the kernel.
> > > >
> > > > The clang static analyzer is well documented and designed to be
> > > > extensible.
> > > > (https://clang-analyzer.llvm.org/checker_dev_manual.html)
> > > > (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf)
> > > >
> > > > The main draw of the clang tools is how accessible they are. The clang
> > > > documentation is very nice and these tools are built specifically to be
> > > > easily extendable by any developer. They provide an accessible method of
> > > > bug-finding and research to people who are not overly familiar with the
> > > > kernel codebase.
> > > >
> > > > Signed-off-by: Nathan Huckleberry <nhuck@google.com>
> > > > ---
> > > > Changes v6->v7
> > > > * Fix issues with relative paths
> > > > * Additional style fixes
> > > >  MAINTAINERS                                   |  1 +
> > > >  Makefile                                      |  3 +
> > > >  scripts/clang-tools/Makefile.clang-tools      | 23 ++++++
> > > >  .../{ => clang-tools}/gen_compile_commands.py |  0
> > > >  scripts/clang-tools/run-clang-tools.py        | 74 +++++++++++++++++++
> > > >  5 files changed, 101 insertions(+)
> > > >  create mode 100644 scripts/clang-tools/Makefile.clang-tools
> > > >  rename scripts/{ => clang-tools}/gen_compile_commands.py (100%)
> > > >  create mode 100755 scripts/clang-tools/run-clang-tools.py
> > > >
> > > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > > index 1d4aa7f942de..a444564e5572 100644
> > > > --- a/MAINTAINERS
> > > > +++ b/MAINTAINERS
> > > > @@ -4198,6 +4198,7 @@ W:        https://clangbuiltlinux.github.io/
> > > >  B:     https://github.com/ClangBuiltLinux/linux/issues
> > > >  C:     irc://chat.freenode.net/clangbuiltlinux
> > > >  F:     Documentation/kbuild/llvm.rst
> > > > +F:     scripts/clang-tools/
> > > >  K:     \b(?i:clang|llvm)\b
> > > >
> > > >  CLEANCACHE API
> > > > diff --git a/Makefile b/Makefile
> > > > index fe0164a654c7..3e2df010b342 100644
> > > > --- a/Makefile
> > > > +++ b/Makefile
> > > > @@ -747,6 +747,7 @@ KBUILD_CFLAGS       += $(call cc-option,-fno-allow-store-data-races)
> > > >
> > > >  include scripts/Makefile.kcov
> > > >  include scripts/Makefile.gcc-plugins
> > > > +include scripts/clang-tools/Makefile.clang-tools
> > > >
> > > >  ifdef CONFIG_READABLE_ASM
> > > >  # Disable optimizations that make assembler listings hard to read.
> > > > @@ -1543,6 +1544,8 @@ help:
> > > >         @echo  '  export_report   - List the usages of all exported symbols'
> > > >         @echo  '  headerdep       - Detect inclusion cycles in headers'
> > > >         @echo  '  coccicheck      - Check with Coccinelle'
> > > > +       @echo  '  clang-analyzer  - Check with clang static analyzer'
> > > > +       @echo  '  clang-tidy      - Check with clang-tidy'
> > > >         @echo  ''
> > > >         @echo  'Tools:'
> > > >         @echo  '  nsdeps          - Generate missing symbol namespace dependencies'
> > > > diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
> > > > new file mode 100644
> > > > index 000000000000..5c9d76f77595
> > > > --- /dev/null
> > > > +++ b/scripts/clang-tools/Makefile.clang-tools
> > > > @@ -0,0 +1,23 @@
> > > > +# SPDX-License-Identifier: GPL-2.0
> > > > +#
> > > > +# Copyright (C) Google LLC, 2020
> > > > +#
> > > > +# Author: Nathan Huckleberry <nhuck@google.com>
> > > > +#
> > > > +PHONY += clang-tidy
> > > > +clang-tidy:
> > > > +ifdef CONFIG_CC_IS_CLANG
> > > > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > > > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
> > > > +else
> > > > +       $(error clang-tidy requires CC=clang)
> > > > +endif
> > > > +
> > > > +PHONY += clang-analyzer
> > > > +clang-analyzer:
> > > > +ifdef CONFIG_CC_IS_CLANG
> > > > +       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > > > +       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
> > > > +else
> > > > +       $(error clang-analyzer requires CC=clang)
> > > > +endif
> > >
> > >
> > >
> > > You can unify the almost same two rules.
> > >
> > > PHONY += clang-tidy clang-analyzer
> > > clang-tidy clang-analyzer:
> > > ifdef CONFIG_CC_IS_CLANG
> > >         $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
> > >         $(PYTHON3) scripts/clang-tools/run-clang-tools.py $@
> > > compile_commands.json
> > > else
> > >         $(error $@ requires CC=clang)
> > > endif
> > >
> >
> > I like this.
> >
> > >
> > >
> > >
> > > But, before we proceed, please tell me
> > > what this check is intended for.
> > >
> >
> > Clang-tidy invokes clang using the command line
> > options specified in the compile_commands.json file.
> > Using gcc command line options causes a bunch of
> > errors for unknown options.
> >
> > >
> > >
> > >
> > >
> > > Case 1)
> > > Build the kernel with CC=clang,
> > > and then run clang-tidy without CC=clang.
> > >
> > > $ make CC=clang defconfig
> > > $ make CC=clang -j$(nproc)
> > > $ make clang-tidy
> > >
> > > scripts/clang-tools/Makefile.clang-tools:13: *** clang-tidy requires
> > > CC=clang.  Stop.
> > >
> >
> > I suppose this case could allow clang-tidy to
> > be run.
> >
> > >
> > >
> > >
> > > Case 2)
> > > Build the kernel using GCC,
> > > and then run clang-tidy with CC=clang.
> > >
> > > $ make defconfig
> > > $ make -j$(nproc)
> > > $ make CC=clang clang-tidy
> > >
> > > This patch happily runs clang-tidy
> > > although compile_commands.json
> > > contains GCC commands.
> > >
> >
> > This is the worst of the two cases. I'm not
> > sure how to prevent this other than parsing the
> > compiler invocation in run-clang-tools.py.
> >
> > I'm open to better suggestions.
> >
> > >
> > >
> > >
> > >
> > > So, it checks if you have passed CC=clang
> > > to "make clang-tidy", where I do not see
> > > any user of the $(CC) variable.
> > >
> > > It does not care whether you have built
> > > the kernel with GCC or Clang.
> > >
> > >
> > >
> > > What happens if you run clang-tidy against
> > > compile_commands.json that contains GCC
> > > commands?
> >
> > Clang-tidy itself uses the command line options from
> > compile_commands.json to invoke clang. If you run
> > clang-tidy against GCC commands you get lots of
> > errors similar to this
> >
> > Found compiler error(s).
> > 12 warnings and 8 errors generated.
> > Error while processing /usr/local/google/home/nhuck/linux/arch/x86/lib/iomem.c.
> > error: unknown argument: '-fconserve-stack' [clang-diagnostic-error]
> > error: unknown argument: '-fno-var-tracking-assignments'
> > [clang-diagnostic-error]
> > error: unknown argument: '-mindirect-branch-register' [clang-diagnostic-error]
> > error: unknown argument: '-mindirect-branch=thunk-extern'
> > [clang-diagnostic-error]
> > error: unknown argument: '-mno-fp-ret-in-387' [clang-diagnostic-error]
> > error: unknown argument: '-mpreferred-stack-boundary=3' [clang-diagnostic-error]
> > error: unknown argument: '-mskip-rax-setup' [clang-diagnostic-error]
> >
> > >
> > >
> > > I also care about stale commands
> > > in compile_commands.json.
> > >
> >
> > I agree with this point, but it's more of a bug with
> > gen_compile_commands.py. Maybe gen_compile_commands.py
> > could emit a warning when stale commands are detected in the
> > .*.cmd files.
>
>
> Nathan, thanks for your comments.
>
> I can improve this
> so compile_commands.json contains
> only commands from the last build.
>
> Working on a patch.
>
> --
> Best Regards
> Masahiro Yamada
Masahiro Yamada Aug. 12, 2020, 5:47 p.m. UTC | #9
On Wed, Aug 12, 2020 at 10:24 AM 'Nathan Huckleberry' via Clang Built
Linux <clang-built-linux@googlegroups.com> wrote:
>
> Sounds good. Do you think this patch is ready to land then?


I do not think so.

I pointed out the CC=clang check was not working.
I see false positive errors from GCC commands.



This patch does not use the benefit of Makefile.

Makefile is used to describe the dependency
between a target and its prerequisites,
and how to update the target.

Make compares the timestamps between the
targets and prerequisites, then determines
which targets need updating.


See your code.


clang-tidy:
ifdef CONFIG_CC_IS_CLANG
       $(PYTHON3) scripts/clang-tools/gen_compile_commands.py
       $(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy
compile_commands.json
else
       $(error clang-tidy requires CC=clang)
endif


This always runs two commands sequentially.
It rebuilds compile_commands.json even if
nothing in the source tree has been changed.

If you do this, there is no strong reason to use Make,
and actually you can rewrite it in a shell script:


clang_tidy () {
  if [ "$CONFIG_CC_IS_CLANG = "y" ]; then
    $PYTHON3 scripts/clang-tools/gen_compile_commands.py
    $PYTHON3 scripts/clang-tools/run-clang-tools.py clang-tidy
compile_commands.json
  else
    echo "clang-tidy requires CC=clang"
    exit 1
  fi
}




I changed the rules to Makefile-ish style.

https://patchwork.kernel.org/project/linux-kbuild/list/?series=331893


I will wait for comments for the new version.

Patch
diff mbox series

diff --git a/MAINTAINERS b/MAINTAINERS
index 1d4aa7f942de..a444564e5572 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4198,6 +4198,7 @@  W:	https://clangbuiltlinux.github.io/
 B:	https://github.com/ClangBuiltLinux/linux/issues
 C:	irc://chat.freenode.net/clangbuiltlinux
 F:	Documentation/kbuild/llvm.rst
+F:	scripts/clang-tools/
 K:	\b(?i:clang|llvm)\b
 
 CLEANCACHE API
diff --git a/Makefile b/Makefile
index fe0164a654c7..3e2df010b342 100644
--- a/Makefile
+++ b/Makefile
@@ -747,6 +747,7 @@  KBUILD_CFLAGS	+= $(call cc-option,-fno-allow-store-data-races)
 
 include scripts/Makefile.kcov
 include scripts/Makefile.gcc-plugins
+include scripts/clang-tools/Makefile.clang-tools
 
 ifdef CONFIG_READABLE_ASM
 # Disable optimizations that make assembler listings hard to read.
@@ -1543,6 +1544,8 @@  help:
 	@echo  '  export_report   - List the usages of all exported symbols'
 	@echo  '  headerdep       - Detect inclusion cycles in headers'
 	@echo  '  coccicheck      - Check with Coccinelle'
+	@echo  '  clang-analyzer  - Check with clang static analyzer'
+	@echo  '  clang-tidy      - Check with clang-tidy'
 	@echo  ''
 	@echo  'Tools:'
 	@echo  '  nsdeps          - Generate missing symbol namespace dependencies'
diff --git a/scripts/clang-tools/Makefile.clang-tools b/scripts/clang-tools/Makefile.clang-tools
new file mode 100644
index 000000000000..5c9d76f77595
--- /dev/null
+++ b/scripts/clang-tools/Makefile.clang-tools
@@ -0,0 +1,23 @@ 
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nhuck@google.com>
+#
+PHONY += clang-tidy
+clang-tidy:
+ifdef CONFIG_CC_IS_CLANG
+	$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+	$(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-tidy compile_commands.json
+else
+	$(error clang-tidy requires CC=clang)
+endif
+
+PHONY += clang-analyzer
+clang-analyzer:
+ifdef CONFIG_CC_IS_CLANG
+	$(PYTHON3) scripts/clang-tools/gen_compile_commands.py
+	$(PYTHON3) scripts/clang-tools/run-clang-tools.py clang-analyzer compile_commands.json
+else
+	$(error clang-analyzer requires CC=clang)
+endif
diff --git a/scripts/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
similarity index 100%
rename from scripts/gen_compile_commands.py
rename to scripts/clang-tools/gen_compile_commands.py
diff --git a/scripts/clang-tools/run-clang-tools.py b/scripts/clang-tools/run-clang-tools.py
new file mode 100755
index 000000000000..fa7655c7cec0
--- /dev/null
+++ b/scripts/clang-tools/run-clang-tools.py
@@ -0,0 +1,74 @@ 
+#!/usr/bin/env python
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (C) Google LLC, 2020
+#
+# Author: Nathan Huckleberry <nhuck@google.com>
+#
+"""A helper routine run clang-tidy and the clang static-analyzer on
+compile_commands.json.
+"""
+
+import argparse
+import json
+import multiprocessing
+import os
+import subprocess
+import sys
+
+
+def parse_arguments():
+    """Set up and parses command-line arguments.
+    Returns:
+        args: Dict of parsed args
+        Has keys: [path, type]
+    """
+    usage = """Run clang-tidy or the clang static-analyzer on a
+        compilation database."""
+    parser = argparse.ArgumentParser(description=usage)
+
+    type_help = "Type of analysis to be performed"
+    parser.add_argument("type",
+                        choices=["clang-tidy", "clang-analyzer"],
+                        help=type_help)
+    path_help = "Path to the compilation database to parse"
+    parser.add_argument("path", type=str, help=path_help)
+
+    return parser.parse_args()
+
+
+def init(l, a):
+    global lock
+    global args
+    lock = l
+    args = a
+
+
+def run_analysis(entry):
+    # Disable all checks, then re-enable the ones we want
+    checks = "-checks=-*,"
+    if args.type == "clang-tidy":
+        checks += "linuxkernel-*"
+    else:
+        checks += "clang-analyzer-*"
+    p = subprocess.run(["clang-tidy", "-p", args.path, checks, entry["file"]],
+                       stdout=subprocess.PIPE,
+                       stderr=subprocess.STDOUT,
+                       cwd=entry["directory"])
+    with lock:
+        sys.stderr.buffer.write(p.stdout)
+
+
+def main():
+    args = parse_arguments()
+
+    lock = multiprocessing.Lock()
+    pool = multiprocessing.Pool(initializer=init, initargs=(lock, args))
+    # Read JSON data into the datastore variable
+    with open(args.path, "r") as f:
+        datastore = json.load(f)
+        pool.map(run_analysis, datastore)
+
+
+if __name__ == "__main__":
+    main()