Message ID | afe5ed943414f7ec3044c1547503b9941686a867.1709629403.git.jani.nikula@intel.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm: enable W=1 warnings by default across the subsystem |
Jani Nikula <jani.nikula@intel.com> writes:

Hello Jani,

> Add kconfig to enable -Werror subsystem wide. This is useful for
> development and CI to keep the subsystem warning free, while avoiding
> issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
> hit.
>
> v2: Don't depend on COMPILE_TEST
>
> Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> # v1
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> ---

Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
On Tue, Mar 05, 2024 at 11:07:36AM +0200, Jani Nikula wrote:
> Add kconfig to enable -Werror subsystem wide. This is useful for
> development and CI to keep the subsystem warning free, while avoiding
> issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
> hit.
>
> v2: Don't depend on COMPILE_TEST
>
> Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> # v1
> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> ---
>  drivers/gpu/drm/Kconfig  | 13 +++++++++++++
>  drivers/gpu/drm/Makefile |  3 +++
>  2 files changed, 16 insertions(+)
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 6e853acf15da..c08e18108c2a 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
>  config DRM_PRIVACY_SCREEN
>  	bool
>  	default n
> +
> +config DRM_WERROR
> +	bool "Compile the drm subsystem with warnings as errors"
> +	depends on EXPERT
> +	default n
> +	help
> +	  A kernel build should not cause any compiler warnings, and this
> +	  enables the '-Werror' flag to enforce that rule in the drm subsystem.
> +
> +	  The drm subsystem enables more warnings than the kernel default, so
> +	  this config option is disabled by default.
> +
> +	  If in doubt, say N.

While I understand the desire for an easy switch that maintainers and
developers can use to ensure that their changes are warning free for the
drm subsystem specifically, I think subsystem specific configuration
options like this are actively detrimental to developers and continuous
integration systems that build test the entire kernel. For example, we
turned off CONFIG_WERROR for our Hexagon builds because of warnings that
appear with -Wextra that are legitimate but require treewide changes to
resolve in a manner sufficient for Linus:

  https://github.com/ClangBuiltLinux/linux/issues/1285
  https://lore.kernel.org/all/CAHk-=wg80je=K7madF4e7WrRNp37e3qh6y10Svhdc7O8SZ_-8g@mail.gmail.com/
  https://lore.kernel.org/all/20230522105049.1467313-1-schnelle@linux.ibm.com/

But now, due to CONFIG_DRM_WERROR getting enabled by all{mod,yes}config
and -Wextra being unconditionally enabled for DRM, those warnings hard
break the build despite CONFIG_WERROR=n...

  https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2eEBDGEqfmMZjGg3ZvDx2af2pde/build.log

Same thing with PowerPC allmodconfig, where we see -Wframe-larger-than,
usually because allmodconfig enables CONFIG_KASAN or CONFIG_KCSAN:

  https://storage.tuxsuite.com/public/clangbuiltlinux/continuous-integration2/builds/2eE2HDsODudQGqkMKAPQnId7pRd/build.log

I don't know what the solution for this conflict is though. I guess it
is just the nature of the kernel being a federation of independent
subsystems that want to have their own policies. I suppose we can just
set CONFIG_DRM_WERROR=n and be done with it but I would like to keep
this issue from spreading to other subsystems because it does not scale
for folks like us who do many builds across many trees.

It would be nice if there was something like CONFIG_WERROR_DIRS that
could take a set of directories that should have -Werror enabled, so
that you could do something like CONFIG_WERROR_DIRS="drivers/gpu/drm"
and have -Werror automatically added to all commands within that
directory like subdir-ccflags-y, but explicitly opt in on the part of
the developer/tester, rather than just happening to get enabled due to
all{mod,yes}config. No idea if that is feasible or not though.

> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index ea456f057e8a..a73c04d2d7a3 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -30,6 +30,9 @@ subdir-ccflags-y += -Wno-sign-compare
>  endif
>  # --- end copy-paste
>
> +# Enable -Werror in CI and development
> +subdir-ccflags-$(CONFIG_DRM_WERROR) += -Werror
> +
>  drm-y := \
>  	drm_aperture.o \
>  	drm_atomic.o \
> --
> 2.39.2
>
Hi,

On Tue, Mar 26, 2024 at 03:56:50PM -0700, Nathan Chancellor wrote:
> On Tue, Mar 05, 2024 at 11:07:36AM +0200, Jani Nikula wrote:
> > Add kconfig to enable -Werror subsystem wide. This is useful for
> > development and CI to keep the subsystem warning free, while avoiding
> > issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
> > hit.
> >
> > v2: Don't depend on COMPILE_TEST
> >
> > Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> # v1
> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> > ---
> >  drivers/gpu/drm/Kconfig  | 13 +++++++++++++
> >  drivers/gpu/drm/Makefile |  3 +++
> >  2 files changed, 16 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 6e853acf15da..c08e18108c2a 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
> >  config DRM_PRIVACY_SCREEN
> >  	bool
> >  	default n
> > +
> > +config DRM_WERROR
> > +	bool "Compile the drm subsystem with warnings as errors"
> > +	depends on EXPERT
> > +	default n
> > +	help
> > +	  A kernel build should not cause any compiler warnings, and this
> > +	  enables the '-Werror' flag to enforce that rule in the drm subsystem.
> > +
> > +	  The drm subsystem enables more warnings than the kernel default, so
> > +	  this config option is disabled by default.
> > +
> > +	  If in doubt, say N.
>
> While I understand the desire for an easy switch that maintainers and
> developers can use to ensure that their changes are warning free for the
> drm subsystem specifically, I think subsystem specific configuration
> options like this are actively detrimental to developers and continuous
> integration systems that build test the entire kernel. For example, we
> turned off CONFIG_WERROR for our Hexagon builds because of warnings that
> appear with -Wextra that are legitimate but require treewide changes to
> resolve in a manner sufficient for Linus:
>
> https://github.com/ClangBuiltLinux/linux/issues/1285
> https://lore.kernel.org/all/CAHk-=wg80je=K7madF4e7WrRNp37e3qh6y10Svhdc7O8SZ_-8g@mail.gmail.com/
> https://lore.kernel.org/all/20230522105049.1467313-1-schnelle@linux.ibm.com/
>
> But now, due to CONFIG_DRM_WERROR getting enabled by all{mod,yes}config
> and -Wextra being unconditionally enabled for DRM, those warnings hard
> break the build despite CONFIG_WERROR=n...

Would making DRM_WERROR depend on WERROR address your concerns?

Maxime
On Wed, 27 Mar 2024, Maxime Ripard <mripard@kernel.org> wrote:
> Hi,
>
> On Tue, Mar 26, 2024 at 03:56:50PM -0700, Nathan Chancellor wrote:
>> On Tue, Mar 05, 2024 at 11:07:36AM +0200, Jani Nikula wrote:
>> > Add kconfig to enable -Werror subsystem wide. This is useful for
>> > development and CI to keep the subsystem warning free, while avoiding
>> > issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
>> > hit.
>> >
>> > v2: Don't depend on COMPILE_TEST
>> >
>> > Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> # v1
>> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
>> > ---
>> >  drivers/gpu/drm/Kconfig  | 13 +++++++++++++
>> >  drivers/gpu/drm/Makefile |  3 +++
>> >  2 files changed, 16 insertions(+)
>> >
>> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>> > index 6e853acf15da..c08e18108c2a 100644
>> > --- a/drivers/gpu/drm/Kconfig
>> > +++ b/drivers/gpu/drm/Kconfig
>> > @@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
>> >  config DRM_PRIVACY_SCREEN
>> >  	bool
>> >  	default n
>> > +
>> > +config DRM_WERROR
>> > +	bool "Compile the drm subsystem with warnings as errors"
>> > +	depends on EXPERT
>> > +	default n
>> > +	help
>> > +	  A kernel build should not cause any compiler warnings, and this
>> > +	  enables the '-Werror' flag to enforce that rule in the drm subsystem.
>> > +
>> > +	  The drm subsystem enables more warnings than the kernel default, so
>> > +	  this config option is disabled by default.
>> > +
>> > +	  If in doubt, say N.
>>
>> While I understand the desire for an easy switch that maintainers and
>> developers can use to ensure that their changes are warning free for the
>> drm subsystem specifically, I think subsystem specific configuration
>> options like this are actively detrimental to developers and continuous
>> integration systems that build test the entire kernel. For example, we
>> turned off CONFIG_WERROR for our Hexagon builds because of warnings that
>> appear with -Wextra that are legitimate but require treewide changes to
>> resolve in a manner sufficient for Linus:
>>
>> https://github.com/ClangBuiltLinux/linux/issues/1285
>> https://lore.kernel.org/all/CAHk-=wg80je=K7madF4e7WrRNp37e3qh6y10Svhdc7O8SZ_-8g@mail.gmail.com/
>> https://lore.kernel.org/all/20230522105049.1467313-1-schnelle@linux.ibm.com/
>>
>> But now, due to CONFIG_DRM_WERROR getting enabled by all{mod,yes}config
>> and -Wextra being unconditionally enabled for DRM, those warnings hard
>> break the build despite CONFIG_WERROR=n...
>
> Would making DRM_WERROR depend on WERROR address your concerns?

But then what would be the point of having DRM_WERROR at all? For me the
point is, "werror in drm, ignore the rest, they're someone else's
problem".

An alternative would be to "depends on !COMPILE_TEST" that we have in
i915, but then some folks want to have COMPILE_TEST in drm, because some
drivers are otherwise hard for people to build.

Nathan, we do want to fix any issues swiftly. Are you hitting specific
build problems?

BR,
Jani.

>
> Maxime
On Wed, Mar 27, 2024 at 09:59:01AM +0200, Jani Nikula wrote:
> On Wed, 27 Mar 2024, Maxime Ripard <mripard@kernel.org> wrote:
> > On Tue, Mar 26, 2024 at 03:56:50PM -0700, Nathan Chancellor wrote:
> >> On Tue, Mar 05, 2024 at 11:07:36AM +0200, Jani Nikula wrote:
> >> > Add kconfig to enable -Werror subsystem wide. This is useful for
> >> > development and CI to keep the subsystem warning free, while avoiding
> >> > issues outside of the subsystem that kernel wide CONFIG_WERROR=y might
> >> > hit.
> >> >
> >> > v2: Don't depend on COMPILE_TEST
> >> >
> >> > Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com> # v1
> >> > Signed-off-by: Jani Nikula <jani.nikula@intel.com>
> >> > ---
> >> >  drivers/gpu/drm/Kconfig  | 13 +++++++++++++
> >> >  drivers/gpu/drm/Makefile |  3 +++
> >> >  2 files changed, 16 insertions(+)
> >> >
> >> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> >> > index 6e853acf15da..c08e18108c2a 100644
> >> > --- a/drivers/gpu/drm/Kconfig
> >> > +++ b/drivers/gpu/drm/Kconfig
> >> > @@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
> >> >  config DRM_PRIVACY_SCREEN
> >> >  	bool
> >> >  	default n
> >> > +
> >> > +config DRM_WERROR
> >> > +	bool "Compile the drm subsystem with warnings as errors"
> >> > +	depends on EXPERT
> >> > +	default n
> >> > +	help
> >> > +	  A kernel build should not cause any compiler warnings, and this
> >> > +	  enables the '-Werror' flag to enforce that rule in the drm subsystem.
> >> > +
> >> > +	  The drm subsystem enables more warnings than the kernel default, so
> >> > +	  this config option is disabled by default.
> >> > +
> >> > +	  If in doubt, say N.
> >>
> >> While I understand the desire for an easy switch that maintainers and
> >> developers can use to ensure that their changes are warning free for the
> >> drm subsystem specifically, I think subsystem specific configuration
> >> options like this are actively detrimental to developers and continuous
> >> integration systems that build test the entire kernel. For example, we
> >> turned off CONFIG_WERROR for our Hexagon builds because of warnings that
> >> appear with -Wextra that are legitimate but require treewide changes to
> >> resolve in a manner sufficient for Linus:
> >>
> >> https://github.com/ClangBuiltLinux/linux/issues/1285
> >> https://lore.kernel.org/all/CAHk-=wg80je=K7madF4e7WrRNp37e3qh6y10Svhdc7O8SZ_-8g@mail.gmail.com/
> >> https://lore.kernel.org/all/20230522105049.1467313-1-schnelle@linux.ibm.com/
> >>
> >> But now, due to CONFIG_DRM_WERROR getting enabled by all{mod,yes}config
> >> and -Wextra being unconditionally enabled for DRM, those warnings hard
> >> break the build despite CONFIG_WERROR=n...
> >
> > Would making DRM_WERROR depend on WERROR address your concerns?
>
> But then what would be the point of having DRM_WERROR at all? For me the
> point is, "werror in drm, ignore the rest, they're someone else's
> problem".

Right, I do think this is a valid view point and one I am sympathetic
to, especially since it is in the pursuit of increased code quality. I
do not want to disrupt that.

> An alternative would be to "depends on !COMPILE_TEST" that we have in
> i915, but then some folks want to have COMPILE_TEST in drm, because some
> drivers are otherwise hard for people to build.

Right. I think it is unfortunate how (at least in my opinion)
CONFIG_COMPILE_TEST has two meanings: genuinely just compile testing or
"allmodconfig". For the first case, we would want CONFIG_DRM_WERROR=y
but for the second case, it would be nice for CONFIG_DRM_WERROR to
default to off (because CONFIG_WERROR is enabled) but allow developers
to turn it on explicitly.

Another lofty/wistful idea to solve this would be to implement something
similar to compiler diagnostic groups for Kconfig, where there would be
a hierarchy like

  CONFIG_WERROR
    CONFIG_DRM_WERROR
    CONFIG_SUBSYSTEM_A_WERROR
    CONFIG_SUBSYSTEM_B_WERROR

where the value of CONFIG_WERROR is the same value for all
subconfigurations under it but they could still be enabled individually
without any additional dependencies (ala something like '-Wno-unused
-Wunused-variable'), which would allow my use case of CONFIG_WERROR=n
removing all instances of -Werror to continue to work but allow other
developers and CI systems to just set their specific -Werror
configuration and be done with it. I don't think something like that
exists but maybe I don't know Kconfig as well as I think I do :)

> Nathan, we do want to fix any issues swiftly. Are you hitting specific
> build problems?

Yes, I see three distinct sets of problems from our CI as a direct
result of this series. I already covered two in the prior mail but I'll
be a little more expansive below.

1. Instances of -Wunused-but-set-variable from variables that only have
   unary operations applied to them. Clang can warn in this case where
   GCC cannot:

   https://godbolt.org/z/d368q3coP

     int main(void)
     {
         int a = 0;
         a++;
         return 0;
     }

   which shows up in a few drm drivers. Most have a patch on the mailing
   list that has not been applied.

  drivers/gpu/drm/qxl/qxl_cmd.c:424:6: error: variable 'count' set but not used [-Werror,-Wunused-but-set-variable]
    424 |         int count = 0;
        |             ^

  https://lore.kernel.org/all/20230408165023.2706235-1-trix@redhat.com/
  (almost a year old)

  drivers/gpu/drm/qxl/qxl_ioctl.c:148:14: error: variable 'num_relocs' set but not used [-Werror,-Wunused-but-set-variable]
    148 |         int i, ret, num_relocs;
        |                     ^

  https://lore.kernel.org/all/20240307104119.1980621-1-colin.i.king@gmail.com/

  drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c:843:6: error: variable 'out' set but not used [-Werror,-Wunused-but-set-variable]
    843 |         u64 out = dumper->iova + A6XX_CD_DATA_OFFSET;
        |             ^

  https://lore.kernel.org/all/20240326212324.185832-1-ojeda@kernel.org/
  (recent patch)

  drivers/gpu/drm/panthor/panthor_sched.c:2048:6: error: variable 'csg_mod_mask' set but not used [-Werror,-Wunused-but-set-variable]
   2048 |         u32 csg_mod_mask = 0, free_csg_slots = 0;
        |             ^

  No patch, new driver, not reported yet it seems.

2. High stack usage in AMDGPU files for ARCH=powerpc allmodconfig. This
   might be a compiler issue but until now, there have been more
   important fires.

  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:1096:12: error: stack frame size (2064) exceeds limit (2048) in 'vcn_v3_0_start' [-Werror,-Wframe-larger-than]
   1096 | static int vcn_v3_0_start(struct amdgpu_device *adev)
        |            ^
  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:955:12: error: stack frame size (2112) exceeds limit (2048) in 'vcn_v4_0_5_start' [-Werror,-Wframe-larger-than]
    955 | static int vcn_v4_0_5_start(struct amdgpu_device *adev)
        |            ^
  drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:713:12: error: stack frame size (2992) exceeds limit (2048) in 'vcn_v5_0_0_start' [-Werror,-Wframe-larger-than]
    713 | static int vcn_v5_0_0_start(struct amdgpu_device *adev)
        |            ^

   Taking a brief look at it while writing this email, it appears
   related to CONFIG_UBSAN_BOUNDS, as none of the warnings appear when
   that is disabled on top of allmodconfig. I suspect that the sanitizer
   instrumentation and inlining might be messing something up here, it
   has happened with other sanitizers like KASAN and KCSAN in the past.
   Without CONFIG_UBSAN_BOUNDS, the stack usage of these functions does
   not seem too bad:

  drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:1096:12: warning: stack frame size (816) exceeds limit (512) in 'vcn_v3_0_start' [-Wframe-larger-than]
  drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c:955:12: warning: stack frame size (800) exceeds limit (512) in 'vcn_v4_0_5_start' [-Wframe-larger-than]
  drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:713:12: warning: stack frame size (1040) exceeds limit (512) in 'vcn_v5_0_0_start' [-Wframe-larger-than]

3. -Wnull-pointer-arithmetic from IO port accessors on architectures
   that do not have them (such as hexagon and s390). For example:

  In file included from drivers/gpu/drm/virtio/virtgpu_plane.c:26:
  In file included from include/drm/drm_atomic_helper.h:31:
  In file included from include/drm/drm_crtc.h:32:
  In file included from include/drm/drm_modes.h:33:
  In file included from include/drm/drm_connector.h:32:
  In file included from include/drm/drm_util.h:35:
  In file included from include/linux/interrupt.h:11:
  In file included from include/linux/hardirq.h:11:
  In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
  In file included from include/asm-generic/hardirq.h:17:
  In file included from include/linux/irq.h:20:
  In file included from include/linux/io.h:13:
  In file included from arch/hexagon/include/asm/io.h:328:
  include/asm-generic/io.h:584:33: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
    584 |         __raw_writeb(value, PCI_IOBASE + addr);
        |                             ~~~~~~~~~~ ^
  include/asm-generic/io.h:594:59: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
    594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
        |                                                       ~~~~~~~~~~ ^
  include/asm-generic/io.h:604:59: error: performing pointer arithmetic on a null pointer has undefined behavior [-Werror,-Wnull-pointer-arithmetic]
    604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
        |                                                       ~~~~~~~~~~ ^

   These warnings are numerous throughout drivers/gpu/drm/ because this
   warning is enabled with -Wextra. Again, this is not exactly your
   problem and it should eventually be fixed by [1] (it appears that
   Niklas is working on a new version at [2]) but it is exacerbated by
   the default combo of W=1 + -Werror for DRM with allmodconfig now,
   even with CONFIG_WERROR=n.

Hopefully that helps clear things up. I am more than happy to send
patches or work towards a solution that satisfies everyone (or at least
a majority/consensus). Wider testing with clang never hurts as well but
I understand increasing build matrices is not always an easy sell.

[1]: https://lore.kernel.org/all/20230522105049.1467313-45-schnelle@linux.ibm.com/
[2]: https://git.kernel.org/niks/l/has_ioport_v6

Cheers,
Nathan
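For illustration only, problem 1 above and the usual shape of its fix can be sketched in plain C. The helper names first_ready_before() and first_ready_after() are hypothetical and are not taken from any of the drivers listed; the linked patches remain the reference for what each driver actually changes. The counter below is only ever incremented, which is the pattern the message describes clang as flagging with -Wunused-but-set-variable, and which becomes a hard error once -Werror is in effect:

```c
#include <stddef.h>

/*
 * Hypothetical helper (not driver code): return the index of the first
 * non-zero entry in status[], or -1 if there is none.
 *
 * 'polls' is incremented but its value is never read. This is the
 * "set but not used" pattern from problem 1.
 */
int first_ready_before(const int *status, size_t len)
{
	int polls = 0;	/* set but not used */
	size_t i;

	for (i = 0; i < len; i++) {
		polls++;
		if (status[i])
			return (int)i;
	}

	return -1;
}

/*
 * Same behaviour with the dead counter removed, so there is no
 * set-but-unused variable left for the compiler to report.
 */
int first_ready_after(const int *status, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (status[i])
			return (int)i;
	}

	return -1;
}
```

Both functions return the same result for the same input; only the unused counter differs, which is why fixes of this kind are typically small and mechanical.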
On Wed, 27 Mar 2024, Nathan Chancellor <nathan@kernel.org> wrote:
> On Wed, Mar 27, 2024 at 09:59:01AM +0200, Jani Nikula wrote:
>> An alternative would be to "depends on !COMPILE_TEST" that we have in
>> i915, but then some folks want to have COMPILE_TEST in drm, because some
>> drivers are otherwise hard for people to build.
>
> Right. I think it is unfortunate how (at least in my opinion)
> CONFIG_COMPILE_TEST has two meanings: genuinely just compile testing or
> "allmodconfig". For the first case, we would want CONFIG_DRM_WERROR=y
> but for the second case, it would be nice for CONFIG_DRM_WERROR to
> default to off (because CONFIG_WERROR is enabled) but allow developers
> to turn it on explicitly.

Yes, CONFIG_COMPILE_TEST has become overloaded.

> Another lofty/wistful idea to solve this would be to implement something
> similar to compiler diagnostic groups for Kconfig, where there would be
> a hierarchy like
>
>   CONFIG_WERROR
>     CONFIG_DRM_WERROR
>     CONFIG_SUBSYSTEM_A_WERROR
>     CONFIG_SUBSYSTEM_B_WERROR
>
> where the value of CONFIG_WERROR is the same value for all
> subconfigurations under it but they could still be enabled individually
> without any additional dependencies (ala something like '-Wno-unused
> -Wunused-variable'), which would allow my use case of CONFIG_WERROR=n
> removing all instances of -Werror to continue to work but allow other
> developers and CI systems to just set their specific -Werror
> configuration and be done with it. I don't think something like that
> exists but maybe I don't know Kconfig as well as I think I do :)

Yet another idea is to have a way to mark a config option "manual", that
is, never enable this automatically under any circumstances, not in make
allyesconfig or allmodconfig, don't ask in make oldconfig, don't allow
selects. The only way to enable is to toggle it manually. If you want it
and enable it and see problems, it's on you.

CONFIG_WERROR and CONFIG_DRM_WERROR could both be like this. The problem
with them is that they're not so much different configurations, they are
about how to deal with build errors, and that's not really what, say,
make allyesconfig should be about.

BR,
Jani.
diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 6e853acf15da..c08e18108c2a 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -416,3 +416,16 @@ config DRM_LIB_RANDOM
 config DRM_PRIVACY_SCREEN
 	bool
 	default n
+
+config DRM_WERROR
+	bool "Compile the drm subsystem with warnings as errors"
+	depends on EXPERT
+	default n
+	help
+	  A kernel build should not cause any compiler warnings, and this
+	  enables the '-Werror' flag to enforce that rule in the drm subsystem.
+
+	  The drm subsystem enables more warnings than the kernel default, so
+	  this config option is disabled by default.
+
+	  If in doubt, say N.
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index ea456f057e8a..a73c04d2d7a3 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -30,6 +30,9 @@ subdir-ccflags-y += -Wno-sign-compare
 endif
 # --- end copy-paste
 
+# Enable -Werror in CI and development
+subdir-ccflags-$(CONFIG_DRM_WERROR) += -Werror
+
 drm-y := \
 	drm_aperture.o \
 	drm_atomic.o \