[RFC] kbuild: Add option to generate a Compilation Database
diff mbox series

Message ID 20190606203003.112040-1-rrangel@chromium.org
State New
Headers show
Series
  • [RFC] kbuild: Add option to generate a Compilation Database
Related show

Commit Message

Raul Rangel June 6, 2019, 8:30 p.m. UTC
Clang tooling requires a compilation database to figure out the build
options for each file. This enables tools like clang-tidy and
clang-check.

See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for more
information.

Normally cmake is used to generate the compilation database, but the
linux kernel uses make. Another option is using
[BEAR](https://github.com/rizsotto/Bear) which instruments
exec to find clang invocations and generate the database that way.

Clang 4.0.0 added the -MJ option to generate the json for each
compilation unit. https://reviews.llvm.org/D27140

This patch takes advantage of the -MJ option. So it only works for
Clang.

Signed-off-by: Raul E Rangel <rrangel@chromium.org>
---
I have a couple TODOs in the code that I would like some feedback on.
Specifically why extra-y doesn't seem to work in the root Makefile.
Also, is there a way to add the correct list of prerequisites to the
compile_commands.json target?

Thanks,
Raul


 Makefile               | 20 ++++++++++++++++++++
 lib/Kconfig.debug      |  7 +++++++
 scripts/Makefile.build |  9 ++++++++-
 3 files changed, 35 insertions(+), 1 deletion(-)

Comments

Tom Roeder June 6, 2019, 8:54 p.m. UTC | #1
On Thu, Jun 06, 2019 at 02:30:03PM -0600, Raul E Rangel wrote:
> Clang tooling requires a compilation database to figure out the build
> options for each file. This enables tools like clang-tidy and
> clang-check.
> 
> See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for more
> information.

I'm glad to see someone adding this to the Makefile directly. I added
scripts/gen_compile_commands.py in b302046 (in Dec 2018) when I was
working on using clang-check to look for bugs in KVM. That script
sidesteps the -MJ option because I found that trying to add it as an
extra option ended up adding entries to the database that didn't work
properly in some cases. This patch adds -MJ in a different way than I
was trying, so I hope it doesn't have the same problems.

I would much prefer to have this functionality integrated into the
Makefile system directly, so if this works with clang-check over all
files and doesn't lead to spurious entries in the database, I'm all for
it.

> 
> Normally cmake is used to generate the compilation database, but the
> linux kernel uses make. Another option is using
> [BEAR](https://github.com/rizsotto/Bear) which instruments
> exec to find clang invocations and generate the database that way.
> 
> Clang 4.0.0 added the -MJ option to generate the json for each
> compilation unit. https://reviews.llvm.org/D27140
> 
> This patch takes advantage of the -MJ option. So it only works for
> Clang.
> 
Can you please add details about how this was tested and compare
coverage with the existing script?

Tom
Nick Desaulniers June 6, 2019, 11:40 p.m. UTC | #2
On Thu, Jun 6, 2019 at 1:54 PM Tom Roeder <tmroeder@google.com> wrote:
>
> On Thu, Jun 06, 2019 at 02:30:03PM -0600, Raul E Rangel wrote:
> > Clang tooling requires a compilation database to figure out the build
> > options for each file. This enables tools like clang-tidy and
> > clang-check.
> >
> > See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for more
> > information.

I'm also super happy to see this!
https://nickdesaulniers.github.io/blog/2017/05/31/running-clang-tidy-on-the-linux-kernel/
I don't know enough about GNU Make/Kbuild to answer the questions, but
hopefully Masahiro can help there.

> I'm glad to see someone adding this to the Makefile directly. I added
> scripts/gen_compile_commands.py in b302046 (in Dec 2018) when I was

Heh, cool.  I had a script that basically did this; we recently
dropped it from the Android trees when doing an audit of out of tree
patches.

> working on using clang-check to look for bugs in KVM. That script

I'm very interested in this work; my summer intern is looking into
static analyses of the Linux kernel.  Can you maybe reach out to me
off thread to tell me more about what you found (or didn't)?

> > Normally cmake is used to generate the compilation database, but the
> > linux kernel uses make. Another option is using
> > [BEAR](https://github.com/rizsotto/Bear) which instruments
> > exec to find clang invocations and generate the database that way.

It's probably possible to get this to work w/ GCC if the additional
dependency of bear exists on the host's system (and may reduce the
number of implementations).  Downside is the additional host
dependency.

Sounds like it may also be possible to just run
scripts/gen_compile_commands.py at build time if this config is
enabled?

Maybe a comparison of the output of Tom's script and your patch might
reveal if one approach is incomplete?
Tom Roeder June 7, 2019, 3:32 p.m. UTC | #3
On Thu, Jun 06, 2019 at 04:40:00PM -0700, Nick Desaulniers wrote:
> On Thu, Jun 6, 2019 at 1:54 PM Tom Roeder <tmroeder@google.com> wrote:
> >
> > On Thu, Jun 06, 2019 at 02:30:03PM -0600, Raul E Rangel wrote:
> > > Clang tooling requires a compilation database to figure out the build
> > > options for each file. This enables tools like clang-tidy and
> > > clang-check.
> > >
> > > See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for more
> > > information.
> 
> I'm also super happy to see this!
> https://nickdesaulniers.github.io/blog/2017/05/31/running-clang-tidy-on-the-linux-kernel/
> I don't know enough about GNU Make/Kbuild to answer the questions, but
> hopefully Masahiro can help there.
> 
> > I'm glad to see someone adding this to the Makefile directly. I added
> > scripts/gen_compile_commands.py in b302046 (in Dec 2018) when I was
> 
> Heh, cool.  I had a script that basically did this; we recently
> dropped it from the Android trees when doing an audit of out of tree
> patches.
> 
> > working on using clang-check to look for bugs in KVM. That script
> 
> I'm very interested in this work; my summer intern is looking into
> static analyses of the Linux kernel.  Can you maybe reach out to me
> off thread to tell me more about what you found (or didn't)?
> 
> > > Normally cmake is used to generate the compilation database, but the
> > > linux kernel uses make. Another option is using
> > > [BEAR](https://github.com/rizsotto/Bear) which instruments
> > > exec to find clang invocations and generate the database that way.
> 
> It's probably possible to get this to work w/ GCC if the additional
> dependency of bear exists on the host's system (and may reduce the
> number of implementations).  Downside is the additional host
> dependency.
> 
> Sounds like it may also be possible to just run
> scripts/gen_compile_commands.py at build time if this config is
> enabled?

Yes, for scripts/gen_compile_commands.py, you run a build first with
whatever configuration you want, then call the script to produce the
compile_commands.json file.
Masahiro Yamada June 18, 2019, 3:16 p.m. UTC | #4
On Fri, Jun 7, 2019 at 5:30 AM Raul E Rangel <rrangel@chromium.org> wrote:
>
> Clang tooling requires a compilation database to figure out the build
> options for each file. This enables tools like clang-tidy and
> clang-check.
>
> See https://clang.llvm.org/docs/HowToSetupToolingForLLVM.html for more
> information.
>
> Normally cmake is used to generate the compilation database, but the
> linux kernel uses make. Another option is using
> [BEAR](https://github.com/rizsotto/Bear) which instruments
> exec to find clang invocations and generate the database that way.
>
> Clang 4.0.0 added the -MJ option to generate the json for each
> compilation unit. https://reviews.llvm.org/D27140
>
> This patch takes advantage of the -MJ option. So it only works for
> Clang.
>
> Signed-off-by: Raul E Rangel <rrangel@chromium.org>
> ---
> I have a couple TODOs in the code that I would like some feedback on.
> Specifically why extra-y doesn't seem to work in the root Makefile.
> Also, is there a way to add the correct list of prerequisites to the
> compile_commands.json target?
>
> Thanks,
> Raul
>
>
>  Makefile               | 20 ++++++++++++++++++++
>  lib/Kconfig.debug      |  7 +++++++
>  scripts/Makefile.build |  9 ++++++++-
>  3 files changed, 35 insertions(+), 1 deletion(-)
>
> diff --git a/Makefile b/Makefile
> index a61a95b6b38f7..06067ee18ff64 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1663,6 +1663,26 @@ quiet_cmd_tags = GEN     $@
>  tags TAGS cscope gtags: FORCE
>         $(call cmd,tags)
>
> +# Compilation Database
> +# ---------------------------------------------------------------------------
> +# Generates a compilation database that can be used with the LLVM tools
> +ifdef CONFIG_COMPILATION_DATABASE
> +
> +quiet_cmd_compilation_db = GEN   $@
> +cmd_compilation_db = (echo '['; \
> +       find "$(@D)" -mindepth 2 -iname '*.json' -print0 | xargs -0 cat; \
> +       echo ']') > "$(@D)/$(@F)"


Using 'find' has the same problem as
scripts/gen_compile_commands.py does.

Unless we start build from the clean source tree,
compile_commands.json will contain stale entries
from the previous building.

The motivation is
to replace scripts/gen_compile_commands.py entirely?

Generally, we do not need two ways to do the same thing...





> +# Make sure the database is built when calling `make` without a target.
> +# TODO: Using extra-y doesn't seem to work.
> +_all: $(obj)/compile_commands.json
> +
> +# TODO: Is there a variable that contains all the object files created by
> +# cmd_cc_o_c? Depending on `all` is kind of a hack
> +$(obj)/compile_commands.json: all FORCE
> +       $(call if_changed,compilation_db)
> +endif
> +
>  # Scripts to check various things for consistency
>  # ---------------------------------------------------------------------------
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index eae43952902eb..46fceb1fff3d9 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -238,6 +238,13 @@ config GDB_SCRIPTS
>           instance. See Documentation/dev-tools/gdb-kernel-debugging.rst
>           for further details.
>
> +config COMPILATION_DATABASE
> +       bool "Generate a compilation database"
> +       depends on CLANG_VERSION >= 40000
> +       help
> +         This creates a JSON Compilation Database (compile_commands.json)
> +         that is used by the clang tooling (clang-tidy, clang-check, etc).
> +
>  config ENABLE_MUST_CHECK
>         bool "Enable __must_check logic"
>         default y
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index ae9cf740633e1..0017bf397292d 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -145,8 +145,15 @@ $(obj)/%.ll: $(src)/%.c FORCE
>  # The C file is compiled and updated dependency information is generated.
>  # (See cmd_cc_o_c + relevant part of rule_cc_o_c)
>
> +ifdef CONFIG_COMPILATION_DATABASE
> +# TODO: Should we store the json in a temp variable and only copy it to the
> +# final name when the content is different? In theory we could avoid having to
> +# generate the compilation db if the json did not change.
> +compdb_flags = -MJ $(@D)/.$(@F).json
> +endif
> +
>  quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
> -      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<
> +      cmd_cc_o_c = $(CC) $(c_flags) $(compdb_flags) -c -o $@ $<
>
>  ifdef CONFIG_MODVERSIONS
>  # When module versioning is enabled the following steps are executed:
> --
> 2.22.0.rc1.311.g5d7573a151-goog
>

Patch
diff mbox series

diff --git a/Makefile b/Makefile
index a61a95b6b38f7..06067ee18ff64 100644
--- a/Makefile
+++ b/Makefile
@@ -1663,6 +1663,26 @@  quiet_cmd_tags = GEN     $@
 tags TAGS cscope gtags: FORCE
 	$(call cmd,tags)
 
+# Compilation Database
+# ---------------------------------------------------------------------------
+# Generates a compilation database that can be used with the LLVM tools
+ifdef CONFIG_COMPILATION_DATABASE
+
+quiet_cmd_compilation_db = GEN   $@
+cmd_compilation_db = (echo '['; \
+	find "$(@D)" -mindepth 2 -iname '*.json' -print0 | xargs -0 cat; \
+	echo ']') > "$(@D)/$(@F)"
+
+# Make sure the database is built when calling `make` without a target.
+# TODO: Using extra-y doesn't seem to work.
+_all: $(obj)/compile_commands.json
+
+# TODO: Is there a variable that contains all the object files created by
+# cmd_cc_o_c? Depending on `all` is kind of a hack
+$(obj)/compile_commands.json: all FORCE
+	$(call if_changed,compilation_db)
+endif
+
 # Scripts to check various things for consistency
 # ---------------------------------------------------------------------------
 
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index eae43952902eb..46fceb1fff3d9 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -238,6 +238,13 @@  config GDB_SCRIPTS
 	  instance. See Documentation/dev-tools/gdb-kernel-debugging.rst
 	  for further details.
 
+config COMPILATION_DATABASE
+	bool "Generate a compilation database"
+	depends on CLANG_VERSION >= 40000
+	help
+	  This creates a JSON Compilation Database (compile_commands.json)
+	  that is used by the clang tooling (clang-tidy, clang-check, etc).
+
 config ENABLE_MUST_CHECK
 	bool "Enable __must_check logic"
 	default y
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index ae9cf740633e1..0017bf397292d 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -145,8 +145,15 @@  $(obj)/%.ll: $(src)/%.c FORCE
 # The C file is compiled and updated dependency information is generated.
 # (See cmd_cc_o_c + relevant part of rule_cc_o_c)
 
+ifdef CONFIG_COMPILATION_DATABASE
+# TODO: Should we store the json in a temp variable and only copy it to the
+# final name when the content is different? In theory we could avoid having to
+# generate the compilation db if the json did not change.
+compdb_flags = -MJ $(@D)/.$(@F).json
+endif
+
 quiet_cmd_cc_o_c = CC $(quiet_modtag)  $@
-      cmd_cc_o_c = $(CC) $(c_flags) -c -o $@ $<
+      cmd_cc_o_c = $(CC) $(c_flags) $(compdb_flags) -c -o $@ $<
 
 ifdef CONFIG_MODVERSIONS
 # When module versioning is enabled the following steps are executed: