diff mbox series

[v2,1/7] tests: introduce tree-wide code style checking

Message ID 20220704152303.760983-2-berrange@redhat.com (mailing list archive)
State New, archived
Headers show
Series tests: introduce a tree-wide code style checking facility | expand

Commit Message

Daniel P. Berrangé July 4, 2022, 3:22 p.m. UTC
Historically QEMU has used the 'scripts/checkpatch.pl' script to
validate various style rules but there are a number of issues:

 - It requires the contributor to remember to run it as it
   is not wired into 'make check'

 - While it can be told to check whole files, in practice
   it is usually only used to check patch diffs, because the
   former would flag up pages of pre-existing violations that
   have never been fixed

 - It is said to be OK to ignore some things reported by the
   script, but don't record these exceptional cases anywere.
   Thus contributors looking at existing violations in tree
   are never sure whether they are intentional or historical
   unfixed technical debt.

 - There is no distinct reporting for each type of check
   performed and as a consequence there is also no way to
   mark particular files to be skipped for particular checks

This commit aims to give us a better approach to checking many types
of code style problems by introducing a flexible way to define whole
tree style checks.

The logic is essentially an import of the 'top/maint.mk' file from
GNULIB, with the following changes applied

 - Logic unrelated to the GNULIB syntax-check functionality
   is removed.

 - sc_excl is redefined, so that whitespace is turned into
   an '|' condition. This allows filename exclusion lists
   to span multiple lines making it more readable.

 - VC_LIST is changed to directly invoke git instead of
   indirectly via the vc-list script, since QEMU does not
   need portaility across many source control systems.

 - .DEFAULT_GOAL is set to 'syntax-check' since this is being
   used in a self-contained manner and thus doesn't need to
   play nice with our makefile rules QEMU has

This commit does the bare minimum introducing the basic infra:

 - tests/style-infra.mak - the cut down import of maint.mk
 - tests/style-excludes.mak - where the list of filename
   exclusions per test will be maintained (empty)
 - tests/style.mak - where the interesting tests live (empty)

As a general rule new checks can be implemented by merely defining
a regex matching the code pattern that should be blocked.

For example, consider we want to stop people using the 'bool' type
entirely. A rule starting with the prefix 'sc_' is defined in the
style.mak file:

  sc_prohibit_bool:
	prohibit='\<bool\>' \
	halt='do not use the bool type' \
	$(_sc_search_regexp)

Where '$(_sc_search_regexp)' (acquired from style-infra.mak)
contains all the magic to perform the tree-wide search for the
offending code pattern, reporting '$(halt)' as the error message
if violations are found.

If certain files need to be skipped for certain tests this can
be achieved by defining a variable with 'exclude_file_name_regexp--'
followed by the name of the check

  exclude_file_name_regexp--sc_prohibit_bool = \
        i-am-allowed-to-use-bool.c \
	and-so-am-i.c

Some checks can't be easily implemented by a simple per-line matching
regex. Since the checks are just implemented in make/shell, custom
rules can run essentially arbitrary logic.

Note that the checks require the use of 'git' to detect the list of
source files to search. Thus the check is skipped when not running
from a git repository.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 meson.build              |   3 +
 tests/Makefile.include   |   3 +-
 tests/meson.build        |  19 +++
 tests/style-excludes.mak |   4 +
 tests/style-infra.mak    | 265 +++++++++++++++++++++++++++++++++++++++
 tests/style.mak          |  24 ++++
 6 files changed, 317 insertions(+), 1 deletion(-)
 create mode 100644 tests/style-excludes.mak
 create mode 100644 tests/style-infra.mak
 create mode 100644 tests/style.mak

Comments

Peter Maydell July 4, 2022, 3:46 p.m. UTC | #1
On Mon, 4 Jul 2022 at 16:23, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> Historically QEMU has used the 'scripts/checkpatch.pl' script to
> validate various style rules but there are a number of issues:

>  meson.build              |   3 +
>  tests/Makefile.include   |   3 +-
>  tests/meson.build        |  19 +++
>  tests/style-excludes.mak |   4 +
>  tests/style-infra.mak    | 265 +++++++++++++++++++++++++++++++++++++++
>  tests/style.mak          |  24 ++++

From my point of view the main issue with checkpatch.pl is
that nobody in the QEMU developers particularly understands
it or is enthusiastic about trying to add more tests to it
or adjust the existing ones where QEMU style diverges from
the kernel style (but nor are we tracking and upgrading to
newer versions of the kernel's script).

This seems to be aiming to replace a complex and hard to
understand perl script with a complex and hard to understand
makefile. I can't say I'm terribly enthusiastic :-/

thanks
-- PMM
Daniel P. Berrangé July 4, 2022, 4:12 p.m. UTC | #2
On Mon, Jul 04, 2022 at 04:46:53PM +0100, Peter Maydell wrote:
> On Mon, 4 Jul 2022 at 16:23, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > Historically QEMU has used the 'scripts/checkpatch.pl' script to
> > validate various style rules but there are a number of issues:
> 
> >  meson.build              |   3 +
> >  tests/Makefile.include   |   3 +-
> >  tests/meson.build        |  19 +++
> >  tests/style-excludes.mak |   4 +
> >  tests/style-infra.mak    | 265 +++++++++++++++++++++++++++++++++++++++
> >  tests/style.mak          |  24 ++++
> 
> From my point of view the main issue with checkpatch.pl is
> that nobody in the QEMU developers particularly understands
> it or is enthusiastic about trying to add more tests to it
> or adjust the existing ones where QEMU style diverges from
> the kernel style (but nor are we tracking and upgrading to
> newer versions of the kernel's script).
> 
> This seems to be aiming to replace a complex and hard to
> understand perl script with a complex and hard to understand
> makefile. I can't say I'm terribly enthusiastic :-/

I think the downsides comapred here are rather different orders of
magnitude. The checkpatch.pl script is 3000 lines of code where we
have years of experiance that no one in QEMU likes touching it.

The makefile here is 265 lines of which 50% is comments of license
text.  In terms of what contributors most care about though, is
how you add new rules, and most of the time that's involves just
adding a 3 line make rule based off a regex to match the code
pattern you want to prohibit. Some of this is a bit crufty to
look at, but we've got years of experiance in libvirt with many
contributors frequently adding new tests.

It only gets hairy if the pattern you're trying to forbid needs
to match across multiple lines of text - hence the difference in
complexity between matching 'osdep.h' exists in .c, vs 'osdep.h'
exists as the very first #include.  IME, the single-line matches
are most typical need that is addressed.

So while I wont claim this proposal is perfect, IMHO this would
be a significant step fowards over checkpatch.pl.

With regards,
Daniel
Daniel P. Berrangé July 7, 2022, 4:43 p.m. UTC | #3
On Mon, Jul 04, 2022 at 04:46:53PM +0100, Peter Maydell wrote:
> On Mon, 4 Jul 2022 at 16:23, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > Historically QEMU has used the 'scripts/checkpatch.pl' script to
> > validate various style rules but there are a number of issues:
> 
> >  meson.build              |   3 +
> >  tests/Makefile.include   |   3 +-
> >  tests/meson.build        |  19 +++
> >  tests/style-excludes.mak |   4 +
> >  tests/style-infra.mak    | 265 +++++++++++++++++++++++++++++++++++++++
> >  tests/style.mak          |  24 ++++
> 
> From my point of view the main issue with checkpatch.pl is
> that nobody in the QEMU developers particularly understands
> it or is enthusiastic about trying to add more tests to it
> or adjust the existing ones where QEMU style diverges from
> the kernel style (but nor are we tracking and upgrading to
> newer versions of the kernel's script).
> 
> This seems to be aiming to replace a complex and hard to
> understand perl script with a complex and hard to understand
> makefile. I can't say I'm terribly enthusiastic :-/

Taking that feedback on board, I've proposed a new version which is
written in Python, and uses a plain yaml config file, which I admit
results in an easier to understand and more attractive impl than
this makefile based one.

With regards,
Daniel
diff mbox series

Patch

diff --git a/meson.build b/meson.build
index 65a885ea69..420102353e 100644
--- a/meson.build
+++ b/meson.build
@@ -18,6 +18,9 @@  config_host = keyval.load(meson.current_build_dir() / 'config-host.mak')
 enable_modules = 'CONFIG_MODULES' in config_host
 enable_static = 'CONFIG_STATIC' in config_host
 
+make = find_program(config_host['MAKE'])
+in_gitrepo = run_command('test', '-d', '.git', check: false).returncode() == 0
+
 # Allow both shared and static libraries unless --enable-static
 static_kwargs = enable_static ? {'static': true} : {}
 
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 3accb83b13..f7c1d2644e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -3,12 +3,13 @@ 
 .PHONY: check-help
 check-help:
 	@echo "Regression testing targets:"
-	@echo " $(MAKE) check                  Run block, qapi-schema, unit, softfloat, qtest and decodetree tests"
+	@echo " $(MAKE) check                  Run block, qapi-schema, unit, style, softfloat, qtest and decodetree tests"
 	@echo " $(MAKE) bench                  Run speed tests"
 	@echo
 	@echo "Individual test suites:"
 	@echo " $(MAKE) check-qtest-TARGET     Run qtest tests for given target"
 	@echo " $(MAKE) check-qtest            Run qtest tests"
+	@echo " $(MAKE) check-style            Run style checks"
 	@echo " $(MAKE) check-unit             Run qobject tests"
 	@echo " $(MAKE) check-qapi-schema      Run QAPI schema tests"
 	@echo " $(MAKE) check-block            Run block tests"
diff --git a/tests/meson.build b/tests/meson.build
index 8e318ec513..e26cadbc8a 100644
--- a/tests/meson.build
+++ b/tests/meson.build
@@ -89,6 +89,25 @@  if get_option('tcg').allowed()
   endif
 endif
 
+if in_gitrepo
+  rc = run_command(
+    'sed', '-n',
+    's/^sc_\\([a-zA-Z0-9_-]*\\):.*/\\1/p',
+    meson.current_source_dir() / 'style.mak',
+    check: true,
+  )
+
+  sc_tests = rc.stdout().strip().split()
+
+  foreach check: sc_tests
+     test(check,
+          make,
+          args: [ '-f', files('style.mak'), 'sc_' + check ],
+          workdir: meson.project_source_root(),
+          suite: 'style')
+  endforeach
+endif
+
 subdir('unit')
 subdir('qapi-schema')
 subdir('qtest')
diff --git a/tests/style-excludes.mak b/tests/style-excludes.mak
new file mode 100644
index 0000000000..32c0e9c328
--- /dev/null
+++ b/tests/style-excludes.mak
@@ -0,0 +1,4 @@ 
+# -*- makefile -*-
+#
+# Filenames that should be excluded from specific
+# style checks performed by style.mak
diff --git a/tests/style-infra.mak b/tests/style-infra.mak
new file mode 100644
index 0000000000..99229f0c3f
--- /dev/null
+++ b/tests/style-infra.mak
@@ -0,0 +1,265 @@ 
+# -*- makefile -*-
+#
+# Rules for simple code style checks applying across the
+# whole tree. Partially derived from GNULIB's 'maint.mk'
+#
+# Copyright (C) 2008-2019 Red Hat, Inc.
+# Copyright (C) 2003-2019 Free Software Foundation, Inc.
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see
+# <http://www.gnu.org/licenses/>.
+
+ME := tests/style.mak
+
+srcdir = .
+
+AWK ?= awk
+GREP ?= grep
+SED ?= sed
+
+# Helper variables.
+_empty =
+_sp = $(_empty) $(_empty)
+
+VC_LIST = (cd $(srcdir) && git ls-tree -r 'HEAD:' | \
+	sed -n "s|^100[^	]*.||p" )
+
+# You can override this variable in cfg.mk if your gnulib submodule lives
+# in a different location.
+gnulib_dir ?= $(shell if test -f $(srcdir)/gnulib/gnulib-tool; then \
+			echo $(srcdir)/gnulib; \
+		else \
+			echo ${GNULIB_SRCDIR}; \
+		fi)
+
+# You can override this variable in cfg.mk to set your own regexp
+# matching files to ignore.
+VC_LIST_ALWAYS_EXCLUDE_REGEX ?= ^$$
+
+# This is to preprocess robustly the output of $(VC_LIST), so that even
+# when $(srcdir) is a pathological name like "....", the leading sed command
+# removes only the intended prefix.
+_dot_escaped_srcdir = $(subst .,\.,$(srcdir))
+
+# Post-process $(VC_LIST) output, prepending $(srcdir)/, but only
+# when $(srcdir) is not ".".
+ifeq ($(srcdir),.)
+  _prepend_srcdir_prefix =
+else
+  _prepend_srcdir_prefix = | $(SED) 's|^|$(srcdir)/|'
+endif
+
+# In order to be able to consistently filter "."-relative names,
+# (i.e., with no $(srcdir) prefix), this definition is careful to
+# remove any $(srcdir) prefix, and to restore what it removes.
+_sc_excl = \
+  $(or $(subst $(_sp),|,$(exclude_file_name_regexp--$@)),^$$)
+
+VC_LIST_EXCEPT = \
+  $(VC_LIST) | $(SED) 's|^$(_dot_escaped_srcdir)/||' \
+	| if test -f $(srcdir)/.x-$@; then $(GREP) -vEf $(srcdir)/.x-$@; \
+	  else $(GREP) -Ev -e "$${VC_LIST_EXCEPT_DEFAULT-ChangeLog}"; fi \
+	| $(GREP) -Ev -e '($(VC_LIST_ALWAYS_EXCLUDE_REGEX)|$(_sc_excl))' \
+	$(_prepend_srcdir_prefix)
+
+
+# Prevent programs like 'sort' from considering distinct strings to be equal.
+# Doing it here saves us from having to set LC_ALL elsewhere in this file.
+export LC_ALL = C
+
+# Collect the names of rules starting with 'sc_'.
+syntax-check-rules := $(sort $(shell env LC_ALL=C $(SED) -n \
+   's/^\(sc_[a-zA-Z0-9_-]*\):.*/\1/p' $(srcdir)/$(ME)))
+.PHONY: $(syntax-check-rules)
+
+ifeq ($(shell $(VC_LIST) >/dev/null 2>&1; echo $$?),0)
+  local-checks-available += $(syntax-check-rules)
+else
+  local-checks-available += no-vc-detected
+no-vc-detected:
+	@echo "No version control files detected; skipping syntax check"
+endif
+.PHONY: $(local-checks-available)
+
+# Arrange to print the name of each syntax-checking rule just before running it.
+$(syntax-check-rules): %: %.m
+sc_m_rules_ = $(patsubst %, %.m, $(syntax-check-rules))
+.PHONY: $(sc_m_rules_)
+$(sc_m_rules_):
+	@echo $(patsubst sc_%.m, %, $@)
+	@date +%s.%N > .sc-start-$(basename $@)
+
+# Compute and print the elapsed time for each syntax-check rule.
+sc_z_rules_ = $(patsubst %, %.z, $(syntax-check-rules))
+.PHONY: $(sc_z_rules_)
+$(sc_z_rules_): %.z: %
+	@end=$$(date +%s.%N);						\
+	start=$$(cat .sc-start-$*);					\
+	rm -f .sc-start-$*;						\
+	$(AWK) -v s=$$start -v e=$$end					\
+	  'END {printf "%.2f $(patsubst sc_%,%,$*)\n", e - s}' < /dev/null
+
+# The patsubst here is to replace each sc_% rule with its sc_%.z wrapper
+# that computes and prints elapsed time.
+local-check :=								\
+  $(patsubst sc_%, sc_%.z,						\
+    $(filter-out $(local-checks-to-skip), $(local-checks-available)))
+
+syntax-check: $(local-check)
+
+.DEFAULT_GOAL := syntax-check
+
+# _sc_search_regexp
+#
+# This macro searches for a given construct in the selected files and
+# then takes some action.
+#
+# Parameters (shell variables):
+#
+#  prohibit | require
+#
+#     Regular expression (ERE) denoting either a forbidden construct
+#     or a required construct.  Those arguments are exclusive.
+#
+#  exclude
+#
+#     Regular expression (ERE) denoting lines to ignore that matched
+#     a prohibit construct.  For example, this can be used to exclude
+#     comments that mention why the nearby code uses an alternative
+#     construct instead of the simpler prohibited construct.
+#
+#  in_vc_files | in_files
+#
+#     grep-E-style regexp selecting the files to check.  For in_vc_files,
+#     the regexp is used to select matching files from the list of all
+#     version-controlled files; for in_files, it's from the names printed
+#     by "find $(srcdir)".  When neither is specified, use all files that
+#     are under version control.
+#
+#  containing | non_containing
+#
+#     Select the files (non) containing strings matching this regexp.
+#     If both arguments are specified then CONTAINING takes
+#     precedence.
+#
+#  with_grep_options
+#
+#     Extra options for grep.
+#
+#  ignore_case
+#
+#     Ignore case.
+#
+#  halt
+#
+#     Message to display before to halting execution.
+#
+# Finally, you may exempt files based on an ERE matching file names.
+# For example, to exempt from the sc_space_tab check all files with the
+# .diff suffix, set this Make variable:
+#
+# exclude_file_name_regexp--sc_space_tab = \.diff$
+#
+# Note that while this functionality is mostly inherited via VC_LIST_EXCEPT,
+# when filtering by name via in_files, we explicitly filter out matching
+# names here as well.
+
+# Initialize each, so that envvar settings cannot interfere.
+export require =
+export prohibit =
+export exclude =
+export in_vc_files =
+export in_files =
+export containing =
+export non_containing =
+export halt =
+export with_grep_options =
+
+# By default, _sc_search_regexp does not ignore case.
+export ignore_case =
+_ignore_case = $$(test -n "$$ignore_case" && printf %s -i || :)
+
+define _sc_say_and_exit
+   dummy=; : so we do not need a semicolon before each use;		\
+   { printf '%s\n' "$(ME): $$msg" 1>&2; exit 1; };
+endef
+
+define _sc_search_regexp
+   dummy=; : so we do not need a semicolon before each use;		\
+									\
+   : Check arguments;							\
+   test -n "$$prohibit" && test -n "$$require"				\
+     && { msg='Cannot specify both prohibit and require'		\
+          $(_sc_say_and_exit) } || :;					\
+   test -z "$$prohibit" && test -z "$$require"				\
+     && { msg='Should specify either prohibit or require'		\
+          $(_sc_say_and_exit) } || :;					\
+   test -z "$$prohibit" && test -n "$$exclude"				\
+     && { msg='Use of exclude requires a prohibit pattern'		\
+          $(_sc_say_and_exit) } || :;					\
+   test -n "$$in_vc_files" && test -n "$$in_files"			\
+     && { msg='Cannot specify both in_vc_files and in_files'		\
+          $(_sc_say_and_exit) } || :;					\
+   test "x$$halt" != x							\
+     || { msg='halt not defined' $(_sc_say_and_exit) };			\
+									\
+   : Filter by file name;						\
+   if test -n "$$in_files"; then					\
+     files=$$(find $(srcdir) | $(GREP) -E "$$in_files"			\
+              | $(GREP) -Ev '$(_sc_excl)');				\
+   else									\
+     files=$$($(VC_LIST_EXCEPT));					\
+     if test -n "$$in_vc_files"; then					\
+       files=$$(echo "$$files" | $(GREP) -E "$$in_vc_files");		\
+     fi;								\
+   fi;									\
+									\
+   : Filter by content;							\
+   test -n "$$files"							\
+     && test -n "$$containing"						\
+     && { files=$$(echo "$$files" | xargs $(GREP) -l "$$containing"); }	\
+     || :;								\
+   test -n "$$files"							\
+     && test -n "$$non_containing"					\
+     && { files=$$(echo "$$files" | xargs $(GREP) -vl "$$non_containing"); } \
+     || :;								\
+									\
+   : Check for the construct;						\
+   if test -n "$$files"; then						\
+     if test -n "$$prohibit"; then					\
+       echo "$$files"							\
+         | xargs $(GREP) $$with_grep_options $(_ignore_case) -nE	\
+		"$$prohibit" /dev/null					\
+         | $(GREP) -vE "$${exclude:-^$$}"				\
+         && { msg="$$halt" $(_sc_say_and_exit) }			\
+         || :;								\
+     else								\
+       echo "$$files"							\
+         | xargs							\
+             $(GREP) $$with_grep_options $(_ignore_case) -LE "$$require" \
+         | $(GREP) .							\
+         && { msg="$$halt" $(_sc_say_and_exit) }			\
+         || :;								\
+     fi									\
+   else :;								\
+   fi || :;
+endef
+
+# Perl block to convert a match to FILE_NAME:LINENO:TEST
+perl_filename_lineno_text_ =						\
+    -e '  {'								\
+    -e '    $$n = ($$` =~ tr/\n/\n/ + 1);'				\
+    -e '    ($$v = $$&) =~ s/\n/\\n/g;'					\
+    -e '    print "$$ARGV:$$n:$$v\n";'					\
+    -e '  }'
diff --git a/tests/style.mak b/tests/style.mak
new file mode 100644
index 0000000000..32c7e706ba
--- /dev/null
+++ b/tests/style.mak
@@ -0,0 +1,24 @@ 
+# -*- makefile -*-
+#
+# Rules for simple code style checks applying across the
+# whole tree. Partially derived from GNULIB's 'maint.mk'
+#
+# Copyright (C) 2008-2019 Red Hat, Inc.
+# Copyright (C) 2003-2019 Free Software Foundation, Inc.
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see
+# <http://www.gnu.org/licenses/>.
+
+include tests/style-infra.mak
+include tests/style-excludes.mak