Message ID | 20200820170951.v4.3.I066d89f39023956c47fb0a42edf196b3950ffbf7@changeid (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4,1/3,RESEND] media: camss: vfe: Use trace_printk for debugging only | expand |
On Thu, 20 Aug 2020 17:14:12 +0800 Nicolas Boichat <drinkcat@chromium.org> wrote: > Technically, we could only initialize the trace_printk buffers > when the print env is switched, to avoid the build error and > unconditional boot-time warning, but I assume this printing > framework will eventually get removed when the driver moves out > of staging? Perhaps this should be converting into a trace event. Look at what bpf did for their bpf_trace_printk(). The more I think about it, the less I like this series. -- Steve
On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Thu, 20 Aug 2020 17:14:12 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > Technically, we could only initialize the trace_printk buffers > > when the print env is switched, to avoid the build error and > > unconditional boot-time warning, but I assume this printing > > framework will eventually get removed when the driver moves out > > of staging? > > Perhaps this should be converting into a trace event. Look at what bpf > did for their bpf_trace_printk(). > > The more I think about it, the less I like this series. To make it clear, the primary goal of this series is to get rid of trace_printk sprinkled in the kernel by making sure some randconfig builds fail. Since my v2, there already has been one more added (the one that this patch removes), so I'd like to land 2/3 ASAP to prevent even more from being added. Looking at your reply on 1/3, I think we are aligned on that goal? Is there some other approach you'd recommend? Now, I'm not pretending my fixes are the best possible ones, but I would much rather have the burden of converting to trace events on the respective driver maintainers. (btw is there a short documentation/tutorial that I could link to in these patches, to help developers understand what is the recommended way now?) Thanks, > > -- Steve
On Fri, 21 Aug 2020 08:13:00 +0800 Nicolas Boichat <drinkcat@chromium.org> wrote: > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > Technically, we could only initialize the trace_printk buffers > > > when the print env is switched, to avoid the build error and > > > unconditional boot-time warning, but I assume this printing > > > framework will eventually get removed when the driver moves out > > > of staging? > > > > Perhaps this should be converting into a trace event. Look at what bpf > > did for their bpf_trace_printk(). > > > > The more I think about it, the less I like this series. > > To make it clear, the primary goal of this series is to get rid of > trace_printk sprinkled in the kernel by making sure some randconfig > builds fail. Since my v2, there already has been one more added (the > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > even more from being added. > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > there some other approach you'd recommend? > > Now, I'm not pretending my fixes are the best possible ones, but I > would much rather have the burden of converting to trace events on the > respective driver maintainers. (btw is there a short > documentation/tutorial that I could link to in these patches, to help > developers understand what is the recommended way now?) > I like the goal, but I guess I never articulated the problem I have with the methodology. trace_printk() is meant to be a debugging tool. Something that people can and do sprinkle all over the kernel to help them find a bug in areas that are called quite often (where printk() is way too slow). The last thing I want them to deal with is adding a trace_printk() with their distro's config (or a config from someone that triggered the bug) only to have the build to fail, because they also need to add a config value. I add to the Cc a few developers I know that use trace_printk() in this fashion. I'd like to hear their view on having to add a config option to make trace_printk work before they test a config that is sent to them. -- Steve
On Fri, Aug 21, 2020 at 8:36 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Fri, 21 Aug 2020 08:13:00 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > when the print env is switched, to avoid the build error and > > > > unconditional boot-time warning, but I assume this printing > > > > framework will eventually get removed when the driver moves out > > > > of staging? > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > did for their bpf_trace_printk(). > > > > > > The more I think about it, the less I like this series. > > > > To make it clear, the primary goal of this series is to get rid of > > trace_printk sprinkled in the kernel by making sure some randconfig > > builds fail. Since my v2, there already has been one more added (the > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > even more from being added. > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > there some other approach you'd recommend? > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > would much rather have the burden of converting to trace events on the > > respective driver maintainers. (btw is there a short > > documentation/tutorial that I could link to in these patches, to help > > developers understand what is the recommended way now?) > > > > I like the goal, but I guess I never articulated the problem I have > with the methodology. > > trace_printk() is meant to be a debugging tool. Something that people > can and do sprinkle all over the kernel to help them find a bug in > areas that are called quite often (where printk() is way too slow). > > The last thing I want them to deal with is adding a trace_printk() with > their distro's config (or a config from someone that triggered the bug) > only to have the build to fail, because they also need to add a config > value. > > I add to the Cc a few developers I know that use trace_printk() in this > fashion. I'd like to hear their view on having to add a config option > to make trace_printk work before they test a config that is sent to > them. Gotcha, thanks. I have also used trace_printk in the past, as uncommitted changes (and understand the usefulness ,-)). And in Chrome OS team here, developers have also raised this concern: how do we make the developer flow convenient so that we can add trace_printk to our code for debugging, without having to flip back that config option, and _yet_ make sure that no trace_printk ever makes it into our production kernels. We have creative ways of making that work (portage USE flags and stuff). But I'm not sure about other flows, and your concern is totally valid... Some other approaches/ideas: 1. Filter all lkml messages that contain trace_printk. Already found 1 instance, and I can easily reply to those with a semi-canned answer, if I remember to check that filter regularly (not sustainable in the long run...). 2. Integration into some kernel test robot? (I will not roll my own for this ,-)) It may be a bit difficult as some debug config options do enable trace_printk, and that's ok. 3. In Chromium OS, I can add a unit test (i.e. something outside of the normal kernel build system), but that'll only catch regressions downstream (or when we happen to backport patches). Down the line, #3 catches what I care about the most (Chromium OS issues: we had production kernels for a few days/weeks showing that splat on boot), but it'd be nice to have something upstream that benefits everyone. Thanks, > > -- Steve
On Fri, 21 Aug 2020 09:39:19 +0800 Nicolas Boichat <drinkcat@chromium.org> wrote: > On Fri, Aug 21, 2020 at 8:36 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > On Fri, 21 Aug 2020 08:13:00 +0800 > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > > when the print env is switched, to avoid the build error and > > > > > unconditional boot-time warning, but I assume this printing > > > > > framework will eventually get removed when the driver moves out > > > > > of staging? > > > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > > did for their bpf_trace_printk(). > > > > > > > > The more I think about it, the less I like this series. > > > > > > To make it clear, the primary goal of this series is to get rid of > > > trace_printk sprinkled in the kernel by making sure some randconfig > > > builds fail. Since my v2, there already has been one more added (the > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > > even more from being added. > > > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > > there some other approach you'd recommend? > > > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > > would much rather have the burden of converting to trace events on the > > > respective driver maintainers. (btw is there a short > > > documentation/tutorial that I could link to in these patches, to help > > > developers understand what is the recommended way now?) > > > > > > > I like the goal, but I guess I never articulated the problem I have > > with the methodology. > > > > trace_printk() is meant to be a debugging tool. Something that people > > can and do sprinkle all over the kernel to help them find a bug in > > areas that are called quite often (where printk() is way too slow). > > > > The last thing I want them to deal with is adding a trace_printk() with > > their distro's config (or a config from someone that triggered the bug) > > only to have the build to fail, because they also need to add a config > > value. > > > > I add to the Cc a few developers I know that use trace_printk() in this > > fashion. I'd like to hear their view on having to add a config option > > to make trace_printk work before they test a config that is sent to > > them. > > Gotcha, thanks. I have also used trace_printk in the past, as > uncommitted changes (and understand the usefulness ,-)). And in Chrome > OS team here, developers have also raised this concern: how do we make > the developer flow convenient so that we can add trace_printk to our > code for debugging, without having to flip back that config option, > and _yet_ make sure that no trace_printk ever makes it into our > production kernels. We have creative ways of making that work (portage > USE flags and stuff). But I'm not sure about other flows, and your > concern is totally valid... > > Some other approaches/ideas: > 1. Filter all lkml messages that contain trace_printk. Already found > 1 instance, and I can easily reply to those with a semi-canned answer, > if I remember to check that filter regularly (not sustainable in the > long run...). Added Joe Perches to the thread. We can update checkpatch.pl to complain about a trace_printk() that it finds in the added code. > 2. Integration into some kernel test robot? (I will not roll my own > for this ,-)) It may be a bit difficult as some debug config options > do enable trace_printk, and that's ok. > 3. In Chromium OS, I can add a unit test (i.e. something outside of > the normal kernel build system), but that'll only catch regressions > downstream (or when we happen to backport patches). > > Down the line, #3 catches what I care about the most (Chromium OS > issues: we had production kernels for a few days/weeks showing that > splat on boot), but it'd be nice to have something upstream that > benefits everyone. > What about an opposite config. That is, not have a config to enable it. But one to disable it. If it is disabled and a trace_printk is found, it will fail the build. This way your builds will not allow your kernel to get out the door with one. #ifdef CONFIG_DISABLE_TRACE_PRINTK #define trace_printk __this_function_is_disabled #endif ? -- Steve
On Thu, 2020-08-20 at 21:57 -0400, Steven Rostedt wrote: > On Fri, 21 Aug 2020 09:39:19 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: [] > > Some other approaches/ideas: > > 1. Filter all lkml messages that contain trace_printk. Already found > > 1 instance, and I can easily reply to those with a semi-canned answer, > > if I remember to check that filter regularly (not sustainable in the > > long run...). > > Added Joe Perches to the thread. > > We can update checkpatch.pl to complain about a trace_printk() that it > finds in the added code. Why? I don't see much value in a trace_printk checkpatch warning. tracing is still dependent on CONFIG_TRACING otherwise trace_printk is an if (0) ELI5 please.
On Fri, Aug 21, 2020 at 9:57 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Fri, 21 Aug 2020 09:39:19 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > On Fri, Aug 21, 2020 at 8:36 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > On Fri, 21 Aug 2020 08:13:00 +0800 > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > > > when the print env is switched, to avoid the build error and > > > > > > unconditional boot-time warning, but I assume this printing > > > > > > framework will eventually get removed when the driver moves out > > > > > > of staging? > > > > > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > > > did for their bpf_trace_printk(). > > > > > > > > > > The more I think about it, the less I like this series. > > > > > > > > To make it clear, the primary goal of this series is to get rid of > > > > trace_printk sprinkled in the kernel by making sure some randconfig > > > > builds fail. Since my v2, there already has been one more added (the > > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > > > even more from being added. > > > > > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > > > there some other approach you'd recommend? > > > > > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > > > would much rather have the burden of converting to trace events on the > > > > respective driver maintainers. (btw is there a short > > > > documentation/tutorial that I could link to in these patches, to help > > > > developers understand what is the recommended way now?) > > > > > > > > > > I like the goal, but I guess I never articulated the problem I have > > > with the methodology. > > > > > > trace_printk() is meant to be a debugging tool. Something that people > > > can and do sprinkle all over the kernel to help them find a bug in > > > areas that are called quite often (where printk() is way too slow). > > > > > > The last thing I want them to deal with is adding a trace_printk() with > > > their distro's config (or a config from someone that triggered the bug) > > > only to have the build to fail, because they also need to add a config > > > value. > > > > > > I add to the Cc a few developers I know that use trace_printk() in this > > > fashion. I'd like to hear their view on having to add a config option > > > to make trace_printk work before they test a config that is sent to > > > them. > > > > Gotcha, thanks. I have also used trace_printk in the past, as > > uncommitted changes (and understand the usefulness ,-)). And in Chrome > > OS team here, developers have also raised this concern: how do we make > > the developer flow convenient so that we can add trace_printk to our > > code for debugging, without having to flip back that config option, > > and _yet_ make sure that no trace_printk ever makes it into our > > production kernels. We have creative ways of making that work (portage > > USE flags and stuff). But I'm not sure about other flows, and your > > concern is totally valid... > > > > Some other approaches/ideas: > > 1. Filter all lkml messages that contain trace_printk. Already found > > 1 instance, and I can easily reply to those with a semi-canned answer, > > if I remember to check that filter regularly (not sustainable in the > > long run...). > > Added Joe Perches to the thread. > > We can update checkpatch.pl to complain about a trace_printk() that it > finds in the added code. Oh, that's a good and simple idea. > > > 2. Integration into some kernel test robot? (I will not roll my own > > for this ,-)) It may be a bit difficult as some debug config options > > do enable trace_printk, and that's ok. > > 3. In Chromium OS, I can add a unit test (i.e. something outside of > > the normal kernel build system), but that'll only catch regressions > > downstream (or when we happen to backport patches). > > > > Down the line, #3 catches what I care about the most (Chromium OS > > issues: we had production kernels for a few days/weeks showing that > > splat on boot), but it'd be nice to have something upstream that > > benefits everyone. > > > > What about an opposite config. That is, not have a config to enable it. > But one to disable it. If it is disabled and a trace_printk is found, > it will fail the build. This way your builds will not allow your kernel > to get out the door with one. > > #ifdef CONFIG_DISABLE_TRACE_PRINTK > #define trace_printk __this_function_is_disabled > #endif I'm not sure how that helps? I mean, the use case you have in mind is somebody reusing a distro/random config and not being able to use trace_printk, right? If that config has CONFIG_DISABLE_TRACE_PRINTK=y, then the developer will still need to flip that back. Note that the option I'm added has default=y (_allow_ trace_printk), so I don't think default y or default n really matters? > > ? > > -- Steve
On Fri, Aug 21, 2020 at 10:36 AM Joe Perches <joe@perches.com> wrote: > > On Thu, 2020-08-20 at 21:57 -0400, Steven Rostedt wrote: > > On Fri, 21 Aug 2020 09:39:19 +0800 > > Nicolas Boichat <drinkcat@chromium.org> wrote: > [] > > > Some other approaches/ideas: > > > 1. Filter all lkml messages that contain trace_printk. Already found > > > 1 instance, and I can easily reply to those with a semi-canned answer, > > > if I remember to check that filter regularly (not sustainable in the > > > long run...). > > > > Added Joe Perches to the thread. > > > > We can update checkpatch.pl to complain about a trace_printk() that it > > finds in the added code. > > Why? > > I don't see much value in a trace_printk checkpatch warning. > tracing is still dependent on CONFIG_TRACING otherwise > trace_printk is an if (0) > > ELI5 please. This is my "new" canned answer to this: Please do not use trace_printk in production code [1,2], it is only meant for debug use. Consider using trace events, or dev_dbg. [1] https://elixir.bootlin.com/linux/v5.8/source/kernel/trace/trace.c#L3158 [2] https://elixir.bootlin.com/linux/v5.8/source/include/linux/kernel.h#L766 I also had arguments in patch 2/3 notes: There's at least 3 reasons that I can come up with: 1. trace_printk introduces some overhead. [some users, e.g. Android/Chrome OS, want CONFIG_TRACING but _not_ that extra overhead] 2. If the kernel keeps adding always-enabled trace_printk, it will be much harder for developers to make use of trace_printk for debugging. 3. People may assume that trace_printk is for debugging only, and may accidentally output sensitive data (theoretical at this stage). (we'll need to summarize that somehow if we want to add to checkpatch.pl)
On Thu, 20 Aug 2020 19:36:19 -0700 Joe Perches <joe@perches.com> wrote: > On Thu, 2020-08-20 at 21:57 -0400, Steven Rostedt wrote: > > On Fri, 21 Aug 2020 09:39:19 +0800 > > Nicolas Boichat <drinkcat@chromium.org> wrote: > [] > > > Some other approaches/ideas: > > > 1. Filter all lkml messages that contain trace_printk. Already found > > > 1 instance, and I can easily reply to those with a semi-canned answer, > > > if I remember to check that filter regularly (not sustainable in the > > > long run...). > > > > Added Joe Perches to the thread. > > > > We can update checkpatch.pl to complain about a trace_printk() that it > > finds in the added code. > > Why? > > I don't see much value in a trace_printk checkpatch warning. > tracing is still dependent on CONFIG_TRACING otherwise > trace_printk is an if (0) > > ELI5 please. > Because no production code should contain trace_printk(). It should be deleted before going to Linus. If you have trace_printk() in your code, you will be greeted by the following banner in your dmesg: ********************************************************** ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** ** ** ** trace_printk() being used. Allocating extra memory. ** ** ** ** This means that this is a DEBUG kernel and it is ** ** unsafe for production use. ** ** ** ** If you see this message and you are not debugging ** ** the kernel, report this immediately to your vendor! ** ** ** ** NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE ** ********************************************************** -- Steve
On Fri, 2020-08-21 at 10:42 +0800, Nicolas Boichat wrote: > On Fri, Aug 21, 2020 at 10:36 AM Joe Perches <joe@perches.com> wrote: > > On Thu, 2020-08-20 at 21:57 -0400, Steven Rostedt wrote: > > > On Fri, 21 Aug 2020 09:39:19 +0800 > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > [] > > > > Some other approaches/ideas: > > > > 1. Filter all lkml messages that contain trace_printk. Already found > > > > 1 instance, and I can easily reply to those with a semi-canned answer, > > > > if I remember to check that filter regularly (not sustainable in the > > > > long run...). > > > > > > Added Joe Perches to the thread. > > > > > > We can update checkpatch.pl to complain about a trace_printk() that it > > > finds in the added code. > > > > Why? > > > > I don't see much value in a trace_printk checkpatch warning. > > tracing is still dependent on CONFIG_TRACING otherwise > > trace_printk is an if (0) > > > > ELI5 please. > > This is my "new" canned answer to this: > > Please do not use trace_printk in production code [1,2], it is only > meant for debug use. Consider using trace events, or dev_dbg. > [1] https://elixir.bootlin.com/linux/v5.8/source/kernel/trace/trace.c#L3158 > [2] https://elixir.bootlin.com/linux/v5.8/source/include/linux/kernel.h#L766 > > I also had arguments in patch 2/3 notes: > > There's at least 3 reasons that I can come up with: > 1. trace_printk introduces some overhead. [some users, e.g. > Android/Chrome OS, want CONFIG_TRACING but _not_ that extra overhead] > 2. If the kernel keeps adding always-enabled trace_printk, it will be > much harder for developers to make use of trace_printk for debugging. > 3. People may assume that trace_printk is for debugging only, and may > accidentally output sensitive data (theoretical at this stage). Perhaps make trace_printk dependent on #define DEBUG? Something like: --- include/linux/kernel.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 500def620d8f..6ca8f958df73 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -717,6 +717,7 @@ do { \ * let gcc optimize the rest. */ +#ifdef DEBUG #define trace_printk(fmt, ...) \ do { \ char _______STR[] = __stringify((__VA_ARGS__)); \ @@ -725,6 +726,12 @@ do { \ else \ trace_puts(fmt); \ } while (0) +#else +#define trace_printk(fmt, ...) \ +do { \ + __trace_printk_check_format(fmt, ##args); \ +} while (0) +#endif #define do_trace_printk(fmt, args...) \ do { \
On Fri, 21 Aug 2020 10:39:02 +0800 Nicolas Boichat <drinkcat@chromium.org> wrote: > I'm not sure how that helps? I mean, the use case you have in mind is > somebody reusing a distro/random config and not being able to use > trace_printk, right? If that config has CONFIG_DISABLE_TRACE_PRINTK=y, > then the developer will still need to flip that back. > > Note that the option I'm added has default=y (_allow_ trace_printk), > so I don't think default y or default n really matters? Ideally, the production system doesn't have it set. It only sets it to make sure that it's clean before sending out. But then it can add it back before production. Yeah, it's pretty much cutting hairs between the two. I don't like either one. Really, if you are worried about this, just add your patch to your local tree. I'm not sure this is something that can be fixed upstream. Another idea is to add something like below, and build with: make CHECK_TRACE_PRINT=1 This way it is a build command line option that causes the build to fail if trace_printk() is added. This way production systems can add this to make sure their kernels are free of trace_printk() but it doesn't affect the config that is used. -- Steve [ Not even compiled tested! ] diff --git a/Makefile b/Makefile index 2057c92a6205..5714a738879d 100644 --- a/Makefile +++ b/Makefile @@ -91,6 +91,13 @@ else Q = @ endif +ifeq ("$(origin CHECK_TRACE_PRINTK)", "command line") + KBUILD_NO_TRACE_PRINTK = $(NO_TRACE_PRINTK) +endif +ifndef KBUILD_NO_TRACE_PRINTK + KBUILD_NO_TRACE_PRINTK = 0 +endif + # If the user is running make -s (silent mode), suppress echoing of # commands @@ -839,6 +846,10 @@ KBUILD_AFLAGS += -gz=zlib KBUILD_LDFLAGS += --compress-debug-sections=zlib endif +ifeq ($(KBUILD_NO_TRACE_PRINTK),1) +KBUILD_CFLAGS += -DNO_TRACE_PRINTK +endif + KBUILD_CFLAGS += $(DEBUG_CFLAGS) export DEBUG_CFLAGS diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 500def620d8f..bee432547d26 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -680,11 +680,14 @@ extern void tracing_stop(void); static inline __printf(1, 2) void ____trace_printk_check_format(const char *fmt, ...) { +#ifdef NO_TRACE_PRINTK + extern void __no_trace_printk_on_build(void); + __no_trace_printk_on_build(); +#endif } #define __trace_printk_check_format(fmt, args...) \ do { \ - if (0) \ - ____trace_printk_check_format(fmt, ##args); \ + ____trace_printk_check_format(fmt, ##args); \ } while (0) /**
On Thu, 20 Aug 2020 19:49:59 -0700 Joe Perches <joe@perches.com> wrote: > Perhaps make trace_printk dependent on #define DEBUG? This is basically what Nicolas's patch series does in this very patch! And no, I hate it. We are currently discussing ways of not having to modify the config in order to allow trace_printk() to be used. We don't want to burden the developer to take a config, add a bunch of trace_printks() and find that it's compiled out! Thus, this is a NAK. -- Steve > > Something like: > --- > include/linux/kernel.h | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index 500def620d8f..6ca8f958df73 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -717,6 +717,7 @@ do { \ > * let gcc optimize the rest. > */ > > +#ifdef DEBUG > #define trace_printk(fmt, ...) \ > do { \ > char _______STR[] = __stringify((__VA_ARGS__)); \ > @@ -725,6 +726,12 @@ do { \ > else \ > trace_puts(fmt); \ > } while (0) > +#else > +#define trace_printk(fmt, ...) \ > +do { \ > + __trace_printk_check_format(fmt, ##args); \ > +} while (0) > +#endif > > #define do_trace_printk(fmt, args...) \ > do { \ >
On Thu, 20 Aug 2020 23:04:59 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Thu, 20 Aug 2020 19:49:59 -0700 > Joe Perches <joe@perches.com> wrote: > > > Perhaps make trace_printk dependent on #define DEBUG? > > This is basically what Nicolas's patch series does in this very patch! > > And no, I hate it. We are currently discussing ways of not having to > modify the config in order to allow trace_printk() to be used. > > We don't want to burden the developer to take a config, add a bunch of > trace_printks() and find that it's compiled out! > This also breaks another use case. You may be working on a module for a production kernel. It is fine to include trace_printk() in your module, and load it on the production kernel. You will get that banner when you load your module, but that's OK because it is still under development. But something like this change will prevent that from happening. -- Steve
From: Steven Rostedt > Sent: 21 August 2020 01:36 > On Fri, 21 Aug 2020 08:13:00 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > when the print env is switched, to avoid the build error and > > > > unconditional boot-time warning, but I assume this printing > > > > framework will eventually get removed when the driver moves out > > > > of staging? > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > did for their bpf_trace_printk(). > > > > > > The more I think about it, the less I like this series. > > > > To make it clear, the primary goal of this series is to get rid of > > trace_printk sprinkled in the kernel by making sure some randconfig > > builds fail. Since my v2, there already has been one more added (the > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > even more from being added. > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > there some other approach you'd recommend? > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > would much rather have the burden of converting to trace events on the > > respective driver maintainers. (btw is there a short > > documentation/tutorial that I could link to in these patches, to help > > developers understand what is the recommended way now?) > > > > I like the goal, but I guess I never articulated the problem I have > with the methodology. > > trace_printk() is meant to be a debugging tool. Something that people > can and do sprinkle all over the kernel to help them find a bug in > areas that are called quite often (where printk() is way too slow). > > The last thing I want them to deal with is adding a trace_printk() with > their distro's config (or a config from someone that triggered the bug) > only to have the build to fail, because they also need to add a config > value. > > I add to the Cc a few developers I know that use trace_printk() in this > fashion. I'd like to hear their view on having to add a config option > to make trace_printk work before they test a config that is sent to > them. Is it worth having three compile-time options: 1) trace_printk() ignored. 2) trace_printk() enabled. 3) trace_printk() generates a compile time error. Normal kernel builds would ignore calls. Either a config option or a module option (etc) would enable it. A config option that 'rand-config' builds would turn on would generate compile-time errors. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Aug 21, 2020 at 4:48 PM David Laight <David.Laight@aculab.com> wrote: > > From: Steven Rostedt > > Sent: 21 August 2020 01:36 > > On Fri, 21 Aug 2020 08:13:00 +0800 > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > > when the print env is switched, to avoid the build error and > > > > > unconditional boot-time warning, but I assume this printing > > > > > framework will eventually get removed when the driver moves out > > > > > of staging? > > > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > > did for their bpf_trace_printk(). > > > > > > > > The more I think about it, the less I like this series. > > > > > > To make it clear, the primary goal of this series is to get rid of > > > trace_printk sprinkled in the kernel by making sure some randconfig > > > builds fail. Since my v2, there already has been one more added (the > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > > even more from being added. > > > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > > there some other approach you'd recommend? > > > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > > would much rather have the burden of converting to trace events on the > > > respective driver maintainers. (btw is there a short > > > documentation/tutorial that I could link to in these patches, to help > > > developers understand what is the recommended way now?) > > > > > > > I like the goal, but I guess I never articulated the problem I have > > with the methodology. > > > > trace_printk() is meant to be a debugging tool. Something that people > > can and do sprinkle all over the kernel to help them find a bug in > > areas that are called quite often (where printk() is way too slow). > > > > The last thing I want them to deal with is adding a trace_printk() with > > their distro's config (or a config from someone that triggered the bug) > > only to have the build to fail, because they also need to add a config > > value. > > > > I add to the Cc a few developers I know that use trace_printk() in this > > fashion. I'd like to hear their view on having to add a config option > > to make trace_printk work before they test a config that is sent to > > them. > > Is it worth having three compile-time options: > 1) trace_printk() ignored. CONFIG_TRACE=n (now) > 2) trace_printk() enabled. CONFIG_TRACE=y (now) > 3) trace_printk() generates a compile time error. CONFIG_TRACE=y and CONFIG_TRACING_ALLOW_PRINTK=n (my patch) > > Normal kernel builds would ignore calls. > Either a config option or a module option (etc) would enable it. > A config option that 'rand-config' builds would turn on would > generate compile-time errors. Yes, a config option is exactly what my patch 2/2 does. And see Steven's argument. > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales) >
From: Nicolas Boichat > Sent: 21 August 2020 11:28 > > On Fri, Aug 21, 2020 at 4:48 PM David Laight <David.Laight@aculab.com> wrote: > > > > From: Steven Rostedt > > > Sent: 21 August 2020 01:36 > > > On Fri, 21 Aug 2020 08:13:00 +0800 > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > > > when the print env is switched, to avoid the build error and > > > > > > unconditional boot-time warning, but I assume this printing > > > > > > framework will eventually get removed when the driver moves out > > > > > > of staging? > > > > > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > > > did for their bpf_trace_printk(). > > > > > > > > > > The more I think about it, the less I like this series. > > > > > > > > To make it clear, the primary goal of this series is to get rid of > > > > trace_printk sprinkled in the kernel by making sure some randconfig > > > > builds fail. Since my v2, there already has been one more added (the > > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > > > even more from being added. > > > > > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > > > there some other approach you'd recommend? > > > > > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > > > would much rather have the burden of converting to trace events on the > > > > respective driver maintainers. (btw is there a short > > > > documentation/tutorial that I could link to in these patches, to help > > > > developers understand what is the recommended way now?) > > > > > > > > > > I like the goal, but I guess I never articulated the problem I have > > > with the methodology. > > > > > > trace_printk() is meant to be a debugging tool. Something that people > > > can and do sprinkle all over the kernel to help them find a bug in > > > areas that are called quite often (where printk() is way too slow). > > > > > > The last thing I want them to deal with is adding a trace_printk() with > > > their distro's config (or a config from someone that triggered the bug) > > > only to have the build to fail, because they also need to add a config > > > value. > > > > > > I add to the Cc a few developers I know that use trace_printk() in this > > > fashion. I'd like to hear their view on having to add a config option > > > to make trace_printk work before they test a config that is sent to > > > them. > > > > Is it worth having three compile-time options: > > 1) trace_printk() ignored. > > CONFIG_TRACE=n (now) > > > 2) trace_printk() enabled. > > CONFIG_TRACE=y (now) > > > 3) trace_printk() generates a compile time error. > > CONFIG_TRACE=y and CONFIG_TRACING_ALLOW_PRINTK=n (my patch) > > > > > Normal kernel builds would ignore calls. > > Either a config option or a module option (etc) would enable it. > > A config option that 'rand-config' builds would turn on would > > generate compile-time errors. > > Yes, a config option is exactly what my patch 2/2 does. And see > Steven's argument. But you were adding #ifs to you code to enable the traces. That is just horrid. What you want is CONFIG_HJHJVLKHCVKIYVKIIYVYKIYVLUCLUCL=y (default n) that would only ever get set by a 'rand-config' build and would never be tested in any source code. You might also want a #define that can set temporarily to enable traces in a specific file/module even though CONFIG_TRACE=n. (But rand-config builds would still fail if they enabled the 'special' option.) David. - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Aug 21, 2020 at 7:32 PM David Laight <David.Laight@aculab.com> wrote: > > From: Nicolas Boichat > > Sent: 21 August 2020 11:28 > > > > On Fri, Aug 21, 2020 at 4:48 PM David Laight <David.Laight@aculab.com> wrote: > > > > > > From: Steven Rostedt > > > > Sent: 21 August 2020 01:36 > > > > On Fri, 21 Aug 2020 08:13:00 +0800 > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > > > > > > > > On Thu, 20 Aug 2020 17:14:12 +0800 > > > > > > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > > > > > > > > > > > Technically, we could only initialize the trace_printk buffers > > > > > > > when the print env is switched, to avoid the build error and > > > > > > > unconditional boot-time warning, but I assume this printing > > > > > > > framework will eventually get removed when the driver moves out > > > > > > > of staging? > > > > > > > > > > > > Perhaps this should be converting into a trace event. Look at what bpf > > > > > > did for their bpf_trace_printk(). > > > > > > > > > > > > The more I think about it, the less I like this series. > > > > > > > > > > To make it clear, the primary goal of this series is to get rid of > > > > > trace_printk sprinkled in the kernel by making sure some randconfig > > > > > builds fail. Since my v2, there already has been one more added (the > > > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent > > > > > even more from being added. > > > > > > > > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is > > > > > there some other approach you'd recommend? > > > > > > > > > > Now, I'm not pretending my fixes are the best possible ones, but I > > > > > would much rather have the burden of converting to trace events on the > > > > > respective driver maintainers. (btw is there a short > > > > > documentation/tutorial that I could link to in these patches, to help > > > > > developers understand what is the recommended way now?) > > > > > > > > > > > > > I like the goal, but I guess I never articulated the problem I have > > > > with the methodology. > > > > > > > > trace_printk() is meant to be a debugging tool. Something that people > > > > can and do sprinkle all over the kernel to help them find a bug in > > > > areas that are called quite often (where printk() is way too slow). > > > > > > > > The last thing I want them to deal with is adding a trace_printk() with > > > > their distro's config (or a config from someone that triggered the bug) > > > > only to have the build to fail, because they also need to add a config > > > > value. > > > > > > > > I add to the Cc a few developers I know that use trace_printk() in this > > > > fashion. I'd like to hear their view on having to add a config option > > > > to make trace_printk work before they test a config that is sent to > > > > them. > > > > > > Is it worth having three compile-time options: > > > 1) trace_printk() ignored. > > > > CONFIG_TRACE=n (now) > > > > > 2) trace_printk() enabled. > > > > CONFIG_TRACE=y (now) > > > > > 3) trace_printk() generates a compile time error. > > > > CONFIG_TRACE=y and CONFIG_TRACING_ALLOW_PRINTK=n (my patch) > > > > > > > > Normal kernel builds would ignore calls. > > > Either a config option or a module option (etc) would enable it. > > > A config option that 'rand-config' builds would turn on would > > > generate compile-time errors. > > > > Yes, a config option is exactly what my patch 2/2 does. And see > > Steven's argument. > > But you were adding #ifs to you code to enable the traces. > That is just horrid. Yeah this patch 3/3 is not the best (if I understand what you mean), 1/3 type of #ifdef is actually fairly common in the kernel (an ifdef that you can enable manually in the same file, _not_ a config option). Steven's point is that both 1/3 and 3/3 should be replaced by trace events anyway. > > What you want is CONFIG_HJHJVLKHCVKIYVKIIYVYKIYVLUCLUCL=y (default n) > that would only ever get set by a 'rand-config' build and would > never be tested in any source code. What I have now is CONFIG_TRACING_ALLOW_PRINTK=n (default y). That's the same thing? Also, I'd want to enable this option on Chromium OS (i.e. set your CONFIG_HJHJVLKHCVKIYVKIIYVYKIYVLUCLUCL=y, or my CONFIG_TRACING_ALLOW_PRINTK=n), and it's likely some distros would make that choice (because they'd also do not want to see that big trace_printk warning on their production kernels). And then you end up with Steven's point (developers inconvenience when trying to add trace_printk using a config that somebody else provided to them) ,-( > > You might also want a #define that can set temporarily > to enable traces in a specific file/module even though > CONFIG_TRACE=n. I don't understand how traces are supposed to work with CONFIG_TRACE=n? > (But rand-config builds would still fail if they enabled the > 'special' option.) > > David. > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales)
From: Nicolas Boichat > Sent: 21 August 2020 13:07 ... > > You might also want a #define that can set temporarily > > to enable traces in a specific file/module even though > > CONFIG_TRACE=n. > > I don't understand how traces are supposed to work with CONFIG_TRACE=n? Probably because I meant something different :-) You want the kernel built so that there are no (expanded) calls to trace_printf() but with support for modules that contain them. Then I can load a module into a distro kernel that contains trace_printf() calls for debug testing. Which is why I was suggesting a config option that only rand-config builds would ever set that would cause the calls to generate compile-time errors. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Aug 21, 2020 at 11:01 AM Steven Rostedt <rostedt@goodmis.org> wrote: > > On Fri, 21 Aug 2020 10:39:02 +0800 > Nicolas Boichat <drinkcat@chromium.org> wrote: > > > I'm not sure how that helps? I mean, the use case you have in mind is > > somebody reusing a distro/random config and not being able to use > > trace_printk, right? If that config has CONFIG_DISABLE_TRACE_PRINTK=y, > > then the developer will still need to flip that back. > > > > Note that the option I'm added has default=y (_allow_ trace_printk), > > so I don't think default y or default n really matters? > > Ideally, the production system doesn't have it set. It only sets it to > make sure that it's clean before sending out. But then it can add it > back before production. Yeah, it's pretty much cutting hairs between > the two. I don't like either one. > > Really, if you are worried about this, just add your patch to your > local tree. I'm not sure this is something that can be fixed upstream. > > Another idea is to add something like below, and build with: > > make CHECK_TRACE_PRINT=1 > > This way it is a build command line option that causes the build to > fail if trace_printk() is added. > > This way production systems can add this to make sure their kernels are > free of trace_printk() but it doesn't affect the config that is used. (for some reason I missed this reply in the thread ,-P) Thanks, that's quite a nice idea, I'll try it out (not sure if we still want that build_assert trick). We'd lose the automatic detection of issues through randconfig kernel test robot, but hopefully if enough distros enable it they'll start filing reports about issues. And maybe we can use that in combination with a checkpatch.pl check. (it turns out I'm having a hard time finding a good spot for this test in our Chrome OS build infra, so an upstream-friendly solution would be much better ,-P) > > -- Steve > > [ Not even compiled tested! ] > > diff --git a/Makefile b/Makefile > index 2057c92a6205..5714a738879d 100644 > --- a/Makefile > +++ b/Makefile > @@ -91,6 +91,13 @@ else > Q = @ > endif > > +ifeq ("$(origin CHECK_TRACE_PRINTK)", "command line") > + KBUILD_NO_TRACE_PRINTK = $(NO_TRACE_PRINTK) > +endif > +ifndef KBUILD_NO_TRACE_PRINTK > + KBUILD_NO_TRACE_PRINTK = 0 > +endif > + > # If the user is running make -s (silent mode), suppress echoing of > # commands > > @@ -839,6 +846,10 @@ KBUILD_AFLAGS += -gz=zlib > KBUILD_LDFLAGS += --compress-debug-sections=zlib > endif > > +ifeq ($(KBUILD_NO_TRACE_PRINTK),1) > +KBUILD_CFLAGS += -DNO_TRACE_PRINTK > +endif > + > KBUILD_CFLAGS += $(DEBUG_CFLAGS) > export DEBUG_CFLAGS > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index 500def620d8f..bee432547d26 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -680,11 +680,14 @@ extern void tracing_stop(void); > static inline __printf(1, 2) > void ____trace_printk_check_format(const char *fmt, ...) > { > +#ifdef NO_TRACE_PRINTK > + extern void __no_trace_printk_on_build(void); > + __no_trace_printk_on_build(); > +#endif > } > #define __trace_printk_check_format(fmt, args...) \ > do { \ > - if (0) \ > - ____trace_printk_check_format(fmt, ##args); \ > + ____trace_printk_check_format(fmt, ##args); \ > } while (0) > > /**
On Fri, Aug 21, 2020 at 8:18 PM David Laight <David.Laight@aculab.com> wrote: > > From: Nicolas Boichat > > Sent: 21 August 2020 13:07 > ... > > > You might also want a #define that can set temporarily > > > to enable traces in a specific file/module even though > > > CONFIG_TRACE=n. > > > > I don't understand how traces are supposed to work with CONFIG_TRACE=n? > > Probably because I meant something different :-) > > You want the kernel built so that there are no (expanded) > calls to trace_printf() but with support for modules that > contain them. > > Then I can load a module into a distro kernel that > contains trace_printf() calls for debug testing. Gotcha. I think it already works this way ,-) So if you have CONFIG_TRACE=y, but no trace_printk in your vmlinux/kernel, no memory is used, and no warning splat (https://elixir.bootlin.com/linux/v5.8/source/kernel/trace/trace.c#L3160) is displayed. But then when you load a module with trace_printk, the buffers are allocated and the warning splat is printed. The magic is here: https://elixir.bootlin.com/linux/v5.8/source/kernel/trace/trace_printk.c#L53 My option wouldn't really change that. I mean, if you have CONFIG_TRACING_ALLOW_PRINTK=n when you compile your module, it'd fail at build time, but if you set it to =y, your module could happily build and load (with the big warning splat), no matter how you built your kernel (I mean, you still need CONFIG_TRACE=y, but CONFIG_TRACING_ALLOW_PRINTK doesn't matter). > Which is why I was suggesting a config option that > only rand-config builds would ever set that would > cause the calls to generate compile-time errors. I think I already answered that one above. We'd want that config option enabled on Chrome OS and we're not a rand-config build (I mean, we're a very carefully selected random config ,-P). Thanks, > > David > > - > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK > Registration No: 1397386 (Wales)
diff --git a/drivers/staging/media/atomisp/pci/atomisp_compat_css20.c b/drivers/staging/media/atomisp/pci/atomisp_compat_css20.c index 1b2b2c68025b4cc..020519dca1324ab 100644 --- a/drivers/staging/media/atomisp/pci/atomisp_compat_css20.c +++ b/drivers/staging/media/atomisp/pci/atomisp_compat_css20.c @@ -165,11 +165,13 @@ static int atomisp_css2_dbg_print(const char *fmt, va_list args) return 0; } +#ifdef CONFIG_TRACING_ALLOW_PRINTK static int atomisp_css2_dbg_ftrace_print(const char *fmt, va_list args) { ftrace_vprintk(fmt, args); return 0; } +#endif static int atomisp_css2_err_print(const char *fmt, va_list args) { @@ -865,9 +867,11 @@ static inline int __set_css_print_env(struct atomisp_device *isp, int opt) if (opt == 0) isp->css_env.isp_css_env.print_env.debug_print = NULL; +#ifdef CONFIG_TRACING_ALLOW_PRINTK else if (opt == 1) isp->css_env.isp_css_env.print_env.debug_print = atomisp_css2_dbg_ftrace_print; +#endif else if (opt == 2) isp->css_env.isp_css_env.print_env.debug_print = atomisp_css2_dbg_print;
We added a config option CONFIG_TRACING_ALLOW_PRINTK to make sure that no extra trace_printk gets added to the kernel, let's use that in this driver to guard the trace_printk call. Signed-off-by: Nicolas Boichat <drinkcat@chromium.org> --- Technically, we could only initialize the trace_printk buffers when the print env is switched, to avoid the build error and unconditional boot-time warning, but I assume this printing framework will eventually get removed when the driver moves out of staging? drivers/staging/media/atomisp/pci/atomisp_compat_css20.c | 4 ++++ 1 file changed, 4 insertions(+)