diff mbox series

drm: Funnel drm logs to tracepoints

Message ID 20191010204823.195540-1-sean@poorly.run (mailing list archive)
State New, archived
Headers show
Series drm: Funnel drm logs to tracepoints | expand

Commit Message

Sean Paul Oct. 10, 2019, 8:48 p.m. UTC
From: Sean Paul <seanpaul@chromium.org>

*Record scratch* You read that subject correctly, I bet you're wondering
how we got here. At least hear me out before you flame :-)

For a long while now, we (ChromeOS) have been struggling getting any
value out of user feedback reports of display failures (notably external
displays not working). The problem is that all logging, even fatal
errors (well, fatal in the sense that a display won't light up) are
logged at DEBUG log level. So in order to extract these logs, you need
to be able to turn on logging, and reproduce the issue with debug
enabled. Unfortunately, this isn't really something we can ask CrOS users
I spoke with airlied about this and RHEL has similar issues.

This is the point where you ask me, "So Sean, why don't you just enable
DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
severity:
 1- People aren't consistent with their categories, so we'd have to
    enable a bunch to get proper coverage
 2- We don't want to overwhelm syslog with drm spam, others have to use
    it too
 3- Console logging is slow

Hopefully you're with me so far. I have a problem that has no existing
solution. What I really want is a ringbuffer of the most recent logs
(in the categories I'm interested in) exposed via debugfs so I can
extract it when users file feedback.

It just so happens that there is something which does _exactly_ this! I
can dump the most recent logs into tracepoints, turn them on and off
depending on which category I want, and pull them from debugfs on demand.

"What about trace_printk()?" You'll say. It doesn't give us the control we
get from using tracepoints and it's not meant to be left sprinkled around
in code.

So that is how we got here, now it's time for you to tell me why this is
a horrible idea :-)

Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
---
 drivers/gpu/drm/drm_print.c      | 143 ++++++++++++++++++++++++++-----
 include/trace/events/drm_print.h | 116 +++++++++++++++++++++++++
 2 files changed, 239 insertions(+), 20 deletions(-)
 create mode 100644 include/trace/events/drm_print.h

Comments

Daniel Vetter Oct. 14, 2019, 8:52 a.m. UTC | #1
On Thu, Oct 10, 2019 at 04:48:08PM -0400, Sean Paul wrote:
> From: Sean Paul <seanpaul@chromium.org>
> 
> *Record scratch* You read that subject correctly, I bet you're wondering
> how we got here. At least hear me out before you flame :-)
> 
> For a long while now, we (ChromeOS) have been struggling getting any
> value out of user feedback reports of display failures (notably external
> displays not working). The problem is that all logging, even fatal
> errors (well, fatal in the sense that a display won't light up) are
> logged at DEBUG log level. So in order to extract these logs, you need
> to be able to turn on logging, and reproduce the issue with debug
> enabled. Unfortunately, this isn't really something we can ask CrOS users
> I spoke with airlied about this and RHEL has similar issues.
> 
> This is the point where you ask me, "So Sean, why don't you just enable
> DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> severity:
>  1- People aren't consistent with their categories, so we'd have to
>     enable a bunch to get proper coverage
>  2- We don't want to overwhelm syslog with drm spam, others have to use
>     it too
>  3- Console logging is slow
> 
> Hopefully you're with me so far. I have a problem that has no existing
> solution. What I really want is a ringbuffer of the most recent logs
> (in the categories I'm interested in) exposed via debugfs so I can
> extract it when users file feedback.
> 
> It just so happens that there is something which does _exactly_ this! I
> can dump the most recent logs into tracepoints, turn them on and off
> depending on which category I want, and pull them from debugfs on demand.
> 
> "What about trace_printk()?" You'll say. It doesn't give us the control we
> get from using tracepoints and it's not meant to be left sprinkled around
> in code.
> 
> So that is how we got here, now it's time for you to tell me why this is
> a horrible idea :-)
> 
> Cc: David Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Sean Paul <seanpaul@chromium.org>

I think if this comes with example script for how to use it (enable the
right tracepoints in debugfs, capture the trace) in the commit message or
somewhere else (igt/tools maybe?) then this is ok with me. If you can also
get some other distro/OS people to ack it, then we're good imo.

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/drm_print.c      | 143 ++++++++++++++++++++++++++-----
>  include/trace/events/drm_print.h | 116 +++++++++++++++++++++++++
>  2 files changed, 239 insertions(+), 20 deletions(-)
>  create mode 100644 include/trace/events/drm_print.h
> 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 9a25d73c155c..f591292811aa 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -27,11 +27,15 @@
>  
>  #include <stdarg.h>
>  
> +#include <linux/bitops.h>
>  #include <linux/io.h>
>  #include <linux/moduleparam.h>
>  #include <linux/seq_file.h>
>  #include <linux/slab.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/drm_print.h>
> +
>  #include <drm/drm.h>
>  #include <drm/drm_drv.h>
>  #include <drm/drm_print.h>
> @@ -241,10 +245,10 @@ void drm_dev_printk(const struct device *dev, const char *level,
>  	struct va_format vaf;
>  	va_list args;
>  
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> +	va_start(args, format);
>  	if (dev)
>  		dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
>  			   __builtin_return_address(0), &vaf);
> @@ -253,49 +257,145 @@ void drm_dev_printk(const struct device *dev, const char *level,
>  		       level, __builtin_return_address(0), &vaf);
>  
>  	va_end(args);
> +
> +	va_start(args, format);
> +	trace_drm_log(level, dev, &vaf);
> +	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_dev_printk);
>  
> +static unsigned int drm_trace_enabled(unsigned int category)
> +{
> +	unsigned int bit;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		switch (BIT(bit)) {
> +		case DRM_UT_NONE:
> +			return trace_drm_dbg_none_enabled();
> +		case DRM_UT_CORE:
> +			return trace_drm_dbg_core_enabled();
> +		case DRM_UT_DRIVER:
> +			return trace_drm_dbg_driver_enabled();
> +		case DRM_UT_KMS:
> +			return trace_drm_dbg_kms_enabled();
> +		case DRM_UT_PRIME:
> +			return trace_drm_dbg_prime_enabled();
> +		case DRM_UT_ATOMIC:
> +			return trace_drm_dbg_atomic_enabled();
> +		case DRM_UT_VBL:
> +			return trace_drm_dbg_vbl_enabled();
> +		case DRM_UT_STATE:
> +			return trace_drm_dbg_state_enabled();
> +		case DRM_UT_LEASE:
> +			return trace_drm_dbg_lease_enabled();
> +		case DRM_UT_DP:
> +			return trace_drm_dbg_dp_enabled();
> +		default:
> +			return trace_drm_dbg_unknown_enabled();
> +		}
> +	}
> +	return false;
> +}
> +
> +static void drm_do_trace(const struct device *dev, unsigned int category,
> +			 struct va_format *vaf)
> +{
> +	WARN_ON(hweight32(category) > 1);
> +
> +	switch (category) {
> +	case DRM_UT_NONE:
> +		trace_drm_dbg_none(dev, vaf);
> +		break;
> +	case DRM_UT_CORE:
> +		trace_drm_dbg_core(dev, vaf);
> +		break;
> +	case DRM_UT_DRIVER:
> +		trace_drm_dbg_driver(dev, vaf);
> +		break;
> +	case DRM_UT_KMS:
> +		trace_drm_dbg_kms(dev, vaf);
> +		break;
> +	case DRM_UT_PRIME:
> +		trace_drm_dbg_prime(dev, vaf);
> +		break;
> +	case DRM_UT_ATOMIC:
> +		trace_drm_dbg_atomic(dev, vaf);
> +		break;
> +	case DRM_UT_VBL:
> +		trace_drm_dbg_vbl(dev, vaf);
> +		break;
> +	case DRM_UT_STATE:
> +		trace_drm_dbg_state(dev, vaf);
> +		break;
> +	case DRM_UT_LEASE:
> +		trace_drm_dbg_lease(dev, vaf);
> +		break;
> +	case DRM_UT_DP:
> +		trace_drm_dbg_dp(dev, vaf);
> +		break;
> +	default:
> +		trace_drm_dbg_unknown(dev, vaf);
> +		break;
> +	}
> +}
> +
>  void drm_dev_dbg(const struct device *dev, unsigned int category,
>  		 const char *format, ...)
>  {
>  	struct va_format vaf;
> +	unsigned int bit;
>  	va_list args;
>  
> -	if (!drm_debug_enabled(category))
> -		return;
> -
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> -	if (dev)
> -		dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> -			   __builtin_return_address(0), &vaf);
> -	else
> -		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> -		       __builtin_return_address(0), &vaf);
> +	if (drm_debug_enabled(category)) {
> +		va_start(args, format);
> +		if (dev)
> +			dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> +				   __builtin_return_address(0), &vaf);
> +		else
> +			printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> +			       __builtin_return_address(0), &vaf);
> +		va_end(args);
> +	}
> +
> +	if (!drm_trace_enabled(category))
> +		return;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		va_start(args, format);
> +		drm_do_trace(dev, BIT(bit), &vaf);
> +		va_end(args);
> +	}
>  
> -	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_dev_dbg);
>  
>  void drm_dbg(unsigned int category, const char *format, ...)
>  {
>  	struct va_format vaf;
> +	unsigned int bit;
>  	va_list args;
>  
> -	if (!drm_debug_enabled(category))
> -		return;
> -
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> -	printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> -	       __builtin_return_address(0), &vaf);
> +	if (drm_debug_enabled(category)) {
> +		va_start(args, format);
> +		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> +		       __builtin_return_address(0), &vaf);
> +		va_end(args);
> +	}
>  
> -	va_end(args);
> +	if (!drm_trace_enabled(category))
> +		return;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		va_start(args, format);
> +		drm_do_trace(NULL, BIT(bit), &vaf);
> +		va_end(args);
> +	}
>  }
>  EXPORT_SYMBOL(drm_dbg);
>  
> @@ -304,13 +404,16 @@ void drm_err(const char *format, ...)
>  	struct va_format vaf;
>  	va_list args;
>  
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> +	va_start(args, format);
>  	printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
>  	       __builtin_return_address(0), &vaf);
> +	va_end(args);
>  
> +	va_start(args, format);
> +	trace_drm_err(NULL, &vaf);
>  	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_err);
> diff --git a/include/trace/events/drm_print.h b/include/trace/events/drm_print.h
> new file mode 100644
> index 000000000000..cc93c69f3fe4
> --- /dev/null
> +++ b/include/trace/events/drm_print.h
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright (C) 2019 Google, Inc.
> + *
> + * Authors:
> + * Sean Paul <seanpaul@chromium.org>
> + */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM drm_print
> +
> +#if !defined(_TRACE_DRM_PRINT_H) || defined(TRACE_HEADER_MULTI_READ)
> +
> +#include <linux/device.h>
> +#include <linux/tracepoint.h>
> +
> +#define DRM_PRINT_MAX 256
> +
> +#define _TRACE_DRM_PRINT_H
> +
> +TRACE_EVENT(drm_log,
> +	TP_PROTO(const char * level, const struct device *dev,
> +		 struct va_format *vaf),
> +	TP_ARGS(level, dev, vaf),
> +	TP_STRUCT__entry(
> +		__field(const char *,		level			)
> +		__field(const struct device *,	dev			)
> +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> +		__field(int,			len			)
> +	),
> +	TP_fast_assign(
> +		__entry->level = level;
> +		__entry->dev = dev;
> +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> +					 vaf->fmt, *vaf->va);
> +		if (__entry->len > 0 &&
> +		    __get_str(msg)[__entry->len - 1] == '\n')
> +			__get_str(msg)[--__entry->len] = '\0';
> +	),
> +	TP_printk("%s %s %s: %s", __entry->level,
> +		  __entry->dev ? dev_driver_string(__entry->dev) : "(NULL):",
> +		  __entry->dev ? dev_name(__entry->dev) : "", __get_str(msg))
> +);
> +
> +TRACE_EVENT(class_drm_category_log,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf),
> +	TP_STRUCT__entry(
> +		__field(const struct device *,	dev			)
> +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> +		__field(int,			len			)
> +	),
> +	TP_fast_assign(
> +		__entry->dev = dev;
> +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> +					 vaf->fmt, *vaf->va);
> +		if (__entry->len > 0 &&
> +		    __get_str(msg)[__entry->len - 1] == '\n')
> +			__get_str(msg)[--__entry->len] = '\0';
> +	),
> +	TP_printk("%s %s%s%s",
> +		  __entry->dev ? dev_driver_string(__entry->dev) : "",
> +		  __entry->dev ? dev_name(__entry->dev) : "",
> +		  __entry->dev ? ": " : "", __get_str(msg))
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_err,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_none,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_core,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_driver,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_kms,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_prime,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_atomic,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_vbl,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_state,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_lease,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_dp,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_unknown,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +#endif
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> -- 
> Sean Paul, Software Engineer, Google / Chromium OS
>
Joonas Lahtinen Oct. 15, 2019, 11:18 a.m. UTC | #2
Quoting Sean Paul (2019-10-10 23:48:08)
> From: Sean Paul <seanpaul@chromium.org>
> 
> *Record scratch* You read that subject correctly, I bet you're wondering
> how we got here. At least hear me out before you flame :-)
> 
> For a long while now, we (ChromeOS) have been struggling getting any
> value out of user feedback reports of display failures (notably external
> displays not working). The problem is that all logging, even fatal
> errors (well, fatal in the sense that a display won't light up) are
> logged at DEBUG log level. So in order to extract these logs, you need
> to be able to turn on logging, and reproduce the issue with debug
> enabled. Unfortunately, this isn't really something we can ask CrOS users
> I spoke with airlied about this and RHEL has similar issues.
> 
> This is the point where you ask me, "So Sean, why don't you just enable
> DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> severity:
>  1- People aren't consistent with their categories, so we'd have to
>     enable a bunch to get proper coverage
>  2- We don't want to overwhelm syslog with drm spam, others have to use
>     it too
>  3- Console logging is slow
> 
> Hopefully you're with me so far. I have a problem that has no existing
> solution. What I really want is a ringbuffer of the most recent logs
> (in the categories I'm interested in) exposed via debugfs so I can
> extract it when users file feedback.

A nitpick, tracepoints are no longer in debugfs but tracefs. I'm being
told the reason is because production environments want to use them and
expect them to be stable.

> It just so happens that there is something which does _exactly_ this! I
> can dump the most recent logs into tracepoints, turn them on and off
> depending on which category I want, and pull them from debugfs on demand.
> 
> "What about trace_printk()?" You'll say. It doesn't give us the control we
> get from using tracepoints and it's not meant to be left sprinkled around
> in code.
> 
> So that is how we got here, now it's time for you to tell me why this is
> a horrible idea :-)

Being devil's advocate; How long until all our debugging messages will be
kernel ABI?

In the context of the DRM subsystem level unified tracing
(Message-Id: 20190121232040.26114-1-chris@chris-wilson.co.uk)
we already struggled to find the sweet spot of exposing only
information we can maintain long term.

I can imagine this de-railing into to a direction when
the userspace debugging information of interest is extracted
from the kernel debug messages. When that message format
changes and breaks the userspace tool, you probably know
the drill.

I like the base idea, but implementation through tracepoints
has great potential to become maintenance nightmare. So maybe
something actually in debugfs might be the right solution?

Regards, Joonas
Thomas Zimmermann Oct. 15, 2019, 12:07 p.m. UTC | #3
Hi

Am 10.10.19 um 22:48 schrieb Sean Paul:
> From: Sean Paul <seanpaul@chromium.org>
> 
> *Record scratch* You read that subject correctly, I bet you're wondering
> how we got here. At least hear me out before you flame :-)
> 
> For a long while now, we (ChromeOS) have been struggling getting any
> value out of user feedback reports of display failures (notably external
> displays not working). The problem is that all logging, even fatal
> errors (well, fatal in the sense that a display won't light up) are
> logged at DEBUG log level. So in order to extract these logs, you need
> to be able to turn on logging, and reproduce the issue with debug
> enabled. Unfortunately, this isn't really something we can ask CrOS users
> I spoke with airlied about this and RHEL has similar issues.
> 
> This is the point where you ask me, "So Sean, why don't you just enable
> DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> severity:
>  1- People aren't consistent with their categories, so we'd have to
>     enable a bunch to get proper coverage
>  2- We don't want to overwhelm syslog with drm spam, others have to use
>     it too
>  3- Console logging is slow
> 
> Hopefully you're with me so far. I have a problem that has no existing
> solution. What I really want is a ringbuffer of the most recent logs
> (in the categories I'm interested in) exposed via debugfs so I can
> extract it when users file feedback.

For bug reports, I don't want categories or anything else that users can
switch on/off. All I'd want is a simple way of retrieving the last ~100
messages from DRM (ala: "please attach the content of the file at
/sys/debug...").

> It just so happens that there is something which does _exactly_ this! I
> can dump the most recent logs into tracepoints, turn them on and off
> depending on which category I want, and pull them from debugfs on demand.
> 
> "What about trace_printk()?" You'll say. It doesn't give us the control we
> get from using tracepoints and it's not meant to be left sprinkled around
> in code.
> 
> So that is how we got here, now it's time for you to tell me why this is
> a horrible idea :-)

Tracepoints are considered stable uapi, right? As a distro person (SUSE)
I don't want us to have to maintain debugging messages forever.


The problem itself doesn't seem related to DRM. Do other subsystems have
similar requirements?

Best regards
Thomas

> Cc: David Airlie <airlied@gmail.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Signed-off-by: Sean Paul <seanpaul@chromium.org>
> ---
>  drivers/gpu/drm/drm_print.c      | 143 ++++++++++++++++++++++++++-----
>  include/trace/events/drm_print.h | 116 +++++++++++++++++++++++++
>  2 files changed, 239 insertions(+), 20 deletions(-)
>  create mode 100644 include/trace/events/drm_print.h
> 
> diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> index 9a25d73c155c..f591292811aa 100644
> --- a/drivers/gpu/drm/drm_print.c
> +++ b/drivers/gpu/drm/drm_print.c
> @@ -27,11 +27,15 @@
>  
>  #include <stdarg.h>
>  
> +#include <linux/bitops.h>
>  #include <linux/io.h>
>  #include <linux/moduleparam.h>
>  #include <linux/seq_file.h>
>  #include <linux/slab.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/drm_print.h>
> +
>  #include <drm/drm.h>
>  #include <drm/drm_drv.h>
>  #include <drm/drm_print.h>
> @@ -241,10 +245,10 @@ void drm_dev_printk(const struct device *dev, const char *level,
>  	struct va_format vaf;
>  	va_list args;
>  
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> +	va_start(args, format);
>  	if (dev)
>  		dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
>  			   __builtin_return_address(0), &vaf);
> @@ -253,49 +257,145 @@ void drm_dev_printk(const struct device *dev, const char *level,
>  		       level, __builtin_return_address(0), &vaf);
>  
>  	va_end(args);
> +
> +	va_start(args, format);
> +	trace_drm_log(level, dev, &vaf);
> +	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_dev_printk);
>  
> +static unsigned int drm_trace_enabled(unsigned int category)
> +{
> +	unsigned int bit;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		switch (BIT(bit)) {
> +		case DRM_UT_NONE:
> +			return trace_drm_dbg_none_enabled();
> +		case DRM_UT_CORE:
> +			return trace_drm_dbg_core_enabled();
> +		case DRM_UT_DRIVER:
> +			return trace_drm_dbg_driver_enabled();
> +		case DRM_UT_KMS:
> +			return trace_drm_dbg_kms_enabled();
> +		case DRM_UT_PRIME:
> +			return trace_drm_dbg_prime_enabled();
> +		case DRM_UT_ATOMIC:
> +			return trace_drm_dbg_atomic_enabled();
> +		case DRM_UT_VBL:
> +			return trace_drm_dbg_vbl_enabled();
> +		case DRM_UT_STATE:
> +			return trace_drm_dbg_state_enabled();
> +		case DRM_UT_LEASE:
> +			return trace_drm_dbg_lease_enabled();
> +		case DRM_UT_DP:
> +			return trace_drm_dbg_dp_enabled();
> +		default:
> +			return trace_drm_dbg_unknown_enabled();
> +		}
> +	}
> +	return false;
> +}
> +
> +static void drm_do_trace(const struct device *dev, unsigned int category,
> +			 struct va_format *vaf)
> +{
> +	WARN_ON(hweight32(category) > 1);
> +
> +	switch (category) {
> +	case DRM_UT_NONE:
> +		trace_drm_dbg_none(dev, vaf);
> +		break;
> +	case DRM_UT_CORE:
> +		trace_drm_dbg_core(dev, vaf);
> +		break;
> +	case DRM_UT_DRIVER:
> +		trace_drm_dbg_driver(dev, vaf);
> +		break;
> +	case DRM_UT_KMS:
> +		trace_drm_dbg_kms(dev, vaf);
> +		break;
> +	case DRM_UT_PRIME:
> +		trace_drm_dbg_prime(dev, vaf);
> +		break;
> +	case DRM_UT_ATOMIC:
> +		trace_drm_dbg_atomic(dev, vaf);
> +		break;
> +	case DRM_UT_VBL:
> +		trace_drm_dbg_vbl(dev, vaf);
> +		break;
> +	case DRM_UT_STATE:
> +		trace_drm_dbg_state(dev, vaf);
> +		break;
> +	case DRM_UT_LEASE:
> +		trace_drm_dbg_lease(dev, vaf);
> +		break;
> +	case DRM_UT_DP:
> +		trace_drm_dbg_dp(dev, vaf);
> +		break;
> +	default:
> +		trace_drm_dbg_unknown(dev, vaf);
> +		break;
> +	}
> +}
> +
>  void drm_dev_dbg(const struct device *dev, unsigned int category,
>  		 const char *format, ...)
>  {
>  	struct va_format vaf;
> +	unsigned int bit;
>  	va_list args;
>  
> -	if (!drm_debug_enabled(category))
> -		return;
> -
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> -	if (dev)
> -		dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> -			   __builtin_return_address(0), &vaf);
> -	else
> -		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> -		       __builtin_return_address(0), &vaf);
> +	if (drm_debug_enabled(category)) {
> +		va_start(args, format);
> +		if (dev)
> +			dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> +				   __builtin_return_address(0), &vaf);
> +		else
> +			printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> +			       __builtin_return_address(0), &vaf);
> +		va_end(args);
> +	}
> +
> +	if (!drm_trace_enabled(category))
> +		return;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		va_start(args, format);
> +		drm_do_trace(dev, BIT(bit), &vaf);
> +		va_end(args);
> +	}
>  
> -	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_dev_dbg);
>  
>  void drm_dbg(unsigned int category, const char *format, ...)
>  {
>  	struct va_format vaf;
> +	unsigned int bit;
>  	va_list args;
>  
> -	if (!drm_debug_enabled(category))
> -		return;
> -
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> -	printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> -	       __builtin_return_address(0), &vaf);
> +	if (drm_debug_enabled(category)) {
> +		va_start(args, format);
> +		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> +		       __builtin_return_address(0), &vaf);
> +		va_end(args);
> +	}
>  
> -	va_end(args);
> +	if (!drm_trace_enabled(category))
> +		return;
> +
> +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> +		va_start(args, format);
> +		drm_do_trace(NULL, BIT(bit), &vaf);
> +		va_end(args);
> +	}
>  }
>  EXPORT_SYMBOL(drm_dbg);
>  
> @@ -304,13 +404,16 @@ void drm_err(const char *format, ...)
>  	struct va_format vaf;
>  	va_list args;
>  
> -	va_start(args, format);
>  	vaf.fmt = format;
>  	vaf.va = &args;
>  
> +	va_start(args, format);
>  	printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
>  	       __builtin_return_address(0), &vaf);
> +	va_end(args);
>  
> +	va_start(args, format);
> +	trace_drm_err(NULL, &vaf);
>  	va_end(args);
>  }
>  EXPORT_SYMBOL(drm_err);
> diff --git a/include/trace/events/drm_print.h b/include/trace/events/drm_print.h
> new file mode 100644
> index 000000000000..cc93c69f3fe4
> --- /dev/null
> +++ b/include/trace/events/drm_print.h
> @@ -0,0 +1,116 @@
> +// SPDX-License-Identifier: MIT
> +/*
> + * Copyright (C) 2019 Google, Inc.
> + *
> + * Authors:
> + * Sean Paul <seanpaul@chromium.org>
> + */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM drm_print
> +
> +#if !defined(_TRACE_DRM_PRINT_H) || defined(TRACE_HEADER_MULTI_READ)
> +
> +#include <linux/device.h>
> +#include <linux/tracepoint.h>
> +
> +#define DRM_PRINT_MAX 256
> +
> +#define _TRACE_DRM_PRINT_H
> +
> +TRACE_EVENT(drm_log,
> +	TP_PROTO(const char * level, const struct device *dev,
> +		 struct va_format *vaf),
> +	TP_ARGS(level, dev, vaf),
> +	TP_STRUCT__entry(
> +		__field(const char *,		level			)
> +		__field(const struct device *,	dev			)
> +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> +		__field(int,			len			)
> +	),
> +	TP_fast_assign(
> +		__entry->level = level;
> +		__entry->dev = dev;
> +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> +					 vaf->fmt, *vaf->va);
> +		if (__entry->len > 0 &&
> +		    __get_str(msg)[__entry->len - 1] == '\n')
> +			__get_str(msg)[--__entry->len] = '\0';
> +	),
> +	TP_printk("%s %s %s: %s", __entry->level,
> +		  __entry->dev ? dev_driver_string(__entry->dev) : "(NULL):",
> +		  __entry->dev ? dev_name(__entry->dev) : "", __get_str(msg))
> +);
> +
> +TRACE_EVENT(class_drm_category_log,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf),
> +	TP_STRUCT__entry(
> +		__field(const struct device *,	dev			)
> +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> +		__field(int,			len			)
> +	),
> +	TP_fast_assign(
> +		__entry->dev = dev;
> +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> +					 vaf->fmt, *vaf->va);
> +		if (__entry->len > 0 &&
> +		    __get_str(msg)[__entry->len - 1] == '\n')
> +			__get_str(msg)[--__entry->len] = '\0';
> +	),
> +	TP_printk("%s %s%s%s",
> +		  __entry->dev ? dev_driver_string(__entry->dev) : "",
> +		  __entry->dev ? dev_name(__entry->dev) : "",
> +		  __entry->dev ? ": " : "", __get_str(msg))
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_err,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_none,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_core,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_driver,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_kms,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_prime,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_atomic,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_vbl,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_state,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_lease,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_dp,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +DEFINE_EVENT(class_drm_category_log, drm_dbg_unknown,
> +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> +	TP_ARGS(dev, vaf)
> +);
> +#endif
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
>
Rob Clark Oct. 15, 2019, 4:18 p.m. UTC | #4
On Tue, Oct 15, 2019 at 5:07 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 10.10.19 um 22:48 schrieb Sean Paul:
> > From: Sean Paul <seanpaul@chromium.org>
> >
> > *Record scratch* You read that subject correctly, I bet you're wondering
> > how we got here. At least hear me out before you flame :-)
> >
> > For a long while now, we (ChromeOS) have been struggling getting any
> > value out of user feedback reports of display failures (notably external
> > displays not working). The problem is that all logging, even fatal
> > errors (well, fatal in the sense that a display won't light up) are
> > logged at DEBUG log level. So in order to extract these logs, you need
> > to be able to turn on logging, and reproduce the issue with debug
> > enabled. Unfortunately, this isn't really something we can ask CrOS users
> > I spoke with airlied about this and RHEL has similar issues.
> >
> > This is the point where you ask me, "So Sean, why don't you just enable
> > DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> > severity:
> >  1- People aren't consistent with their categories, so we'd have to
> >     enable a bunch to get proper coverage
> >  2- We don't want to overwhelm syslog with drm spam, others have to use
> >     it too
> >  3- Console logging is slow
> >
> > Hopefully you're with me so far. I have a problem that has no existing
> > solution. What I really want is a ringbuffer of the most recent logs
> > (in the categories I'm interested in) exposed via debugfs so I can
> > extract it when users file feedback.
>
> For bug reports, I don't want categories or anything else that users can
> switch on/off. All I'd want is a simple way of retrieving the last ~100
> messages from DRM (ala: "please attach the content of the file at
> /sys/debug...").
>
> > It just so happens that there is something which does _exactly_ this! I
> > can dump the most recent logs into tracepoints, turn them on and off
> > depending on which category I want, and pull them from debugfs on demand.
> >
> > "What about trace_printk()?" You'll say. It doesn't give us the control we
> > get from using tracepoints and it's not meant to be left sprinkled around
> > in code.
> >
> > So that is how we got here, now it's time for you to tell me why this is
> > a horrible idea :-)
>
> Tracepoints are considered stable uapi, right? As a distro person (SUSE)
> I don't want us to have to maintain debugging messages forever.
>

it does seem like there is somehow some room for unstable/debug
tracepoints, so we can re-use the tracepoint mechanism for things like
the

BR,
-R

>
> The problem itself doesn't seem related to DRM. Do other subsystems have
> similar requirements?
>
> Best regards
> Thomas
>
> > Cc: David Airlie <airlied@gmail.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Sean Paul <seanpaul@chromium.org>
> > ---
> >  drivers/gpu/drm/drm_print.c      | 143 ++++++++++++++++++++++++++-----
> >  include/trace/events/drm_print.h | 116 +++++++++++++++++++++++++
> >  2 files changed, 239 insertions(+), 20 deletions(-)
> >  create mode 100644 include/trace/events/drm_print.h
> >
> > diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> > index 9a25d73c155c..f591292811aa 100644
> > --- a/drivers/gpu/drm/drm_print.c
> > +++ b/drivers/gpu/drm/drm_print.c
> > @@ -27,11 +27,15 @@
> >
> >  #include <stdarg.h>
> >
> > +#include <linux/bitops.h>
> >  #include <linux/io.h>
> >  #include <linux/moduleparam.h>
> >  #include <linux/seq_file.h>
> >  #include <linux/slab.h>
> >
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/drm_print.h>
> > +
> >  #include <drm/drm.h>
> >  #include <drm/drm_drv.h>
> >  #include <drm/drm_print.h>
> > @@ -241,10 +245,10 @@ void drm_dev_printk(const struct device *dev, const char *level,
> >       struct va_format vaf;
> >       va_list args;
> >
> > -     va_start(args, format);
> >       vaf.fmt = format;
> >       vaf.va = &args;
> >
> > +     va_start(args, format);
> >       if (dev)
> >               dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
> >                          __builtin_return_address(0), &vaf);
> > @@ -253,49 +257,145 @@ void drm_dev_printk(const struct device *dev, const char *level,
> >                      level, __builtin_return_address(0), &vaf);
> >
> >       va_end(args);
> > +
> > +     va_start(args, format);
> > +     trace_drm_log(level, dev, &vaf);
> > +     va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_dev_printk);
> >
> > +static unsigned int drm_trace_enabled(unsigned int category)
> > +{
> > +     unsigned int bit;
> > +
> > +     for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +             switch (BIT(bit)) {
> > +             case DRM_UT_NONE:
> > +                     return trace_drm_dbg_none_enabled();
> > +             case DRM_UT_CORE:
> > +                     return trace_drm_dbg_core_enabled();
> > +             case DRM_UT_DRIVER:
> > +                     return trace_drm_dbg_driver_enabled();
> > +             case DRM_UT_KMS:
> > +                     return trace_drm_dbg_kms_enabled();
> > +             case DRM_UT_PRIME:
> > +                     return trace_drm_dbg_prime_enabled();
> > +             case DRM_UT_ATOMIC:
> > +                     return trace_drm_dbg_atomic_enabled();
> > +             case DRM_UT_VBL:
> > +                     return trace_drm_dbg_vbl_enabled();
> > +             case DRM_UT_STATE:
> > +                     return trace_drm_dbg_state_enabled();
> > +             case DRM_UT_LEASE:
> > +                     return trace_drm_dbg_lease_enabled();
> > +             case DRM_UT_DP:
> > +                     return trace_drm_dbg_dp_enabled();
> > +             default:
> > +                     return trace_drm_dbg_unknown_enabled();
> > +             }
> > +     }
> > +     return false;
> > +}
> > +
> > +static void drm_do_trace(const struct device *dev, unsigned int category,
> > +                      struct va_format *vaf)
> > +{
> > +     WARN_ON(hweight32(category) > 1);
> > +
> > +     switch (category) {
> > +     case DRM_UT_NONE:
> > +             trace_drm_dbg_none(dev, vaf);
> > +             break;
> > +     case DRM_UT_CORE:
> > +             trace_drm_dbg_core(dev, vaf);
> > +             break;
> > +     case DRM_UT_DRIVER:
> > +             trace_drm_dbg_driver(dev, vaf);
> > +             break;
> > +     case DRM_UT_KMS:
> > +             trace_drm_dbg_kms(dev, vaf);
> > +             break;
> > +     case DRM_UT_PRIME:
> > +             trace_drm_dbg_prime(dev, vaf);
> > +             break;
> > +     case DRM_UT_ATOMIC:
> > +             trace_drm_dbg_atomic(dev, vaf);
> > +             break;
> > +     case DRM_UT_VBL:
> > +             trace_drm_dbg_vbl(dev, vaf);
> > +             break;
> > +     case DRM_UT_STATE:
> > +             trace_drm_dbg_state(dev, vaf);
> > +             break;
> > +     case DRM_UT_LEASE:
> > +             trace_drm_dbg_lease(dev, vaf);
> > +             break;
> > +     case DRM_UT_DP:
> > +             trace_drm_dbg_dp(dev, vaf);
> > +             break;
> > +     default:
> > +             trace_drm_dbg_unknown(dev, vaf);
> > +             break;
> > +     }
> > +}
> > +
> >  void drm_dev_dbg(const struct device *dev, unsigned int category,
> >                const char *format, ...)
> >  {
> >       struct va_format vaf;
> > +     unsigned int bit;
> >       va_list args;
> >
> > -     if (!drm_debug_enabled(category))
> > -             return;
> > -
> > -     va_start(args, format);
> >       vaf.fmt = format;
> >       vaf.va = &args;
> >
> > -     if (dev)
> > -             dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> > -                        __builtin_return_address(0), &vaf);
> > -     else
> > -             printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > -                    __builtin_return_address(0), &vaf);
> > +     if (drm_debug_enabled(category)) {
> > +             va_start(args, format);
> > +             if (dev)
> > +                     dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> > +                                __builtin_return_address(0), &vaf);
> > +             else
> > +                     printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > +                            __builtin_return_address(0), &vaf);
> > +             va_end(args);
> > +     }
> > +
> > +     if (!drm_trace_enabled(category))
> > +             return;
> > +
> > +     for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +             va_start(args, format);
> > +             drm_do_trace(dev, BIT(bit), &vaf);
> > +             va_end(args);
> > +     }
> >
> > -     va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_dev_dbg);
> >
> >  void drm_dbg(unsigned int category, const char *format, ...)
> >  {
> >       struct va_format vaf;
> > +     unsigned int bit;
> >       va_list args;
> >
> > -     if (!drm_debug_enabled(category))
> > -             return;
> > -
> > -     va_start(args, format);
> >       vaf.fmt = format;
> >       vaf.va = &args;
> >
> > -     printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > -            __builtin_return_address(0), &vaf);
> > +     if (drm_debug_enabled(category)) {
> > +             va_start(args, format);
> > +             printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > +                    __builtin_return_address(0), &vaf);
> > +             va_end(args);
> > +     }
> >
> > -     va_end(args);
> > +     if (!drm_trace_enabled(category))
> > +             return;
> > +
> > +     for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +             va_start(args, format);
> > +             drm_do_trace(NULL, BIT(bit), &vaf);
> > +             va_end(args);
> > +     }
> >  }
> >  EXPORT_SYMBOL(drm_dbg);
> >
> > @@ -304,13 +404,16 @@ void drm_err(const char *format, ...)
> >       struct va_format vaf;
> >       va_list args;
> >
> > -     va_start(args, format);
> >       vaf.fmt = format;
> >       vaf.va = &args;
> >
> > +     va_start(args, format);
> >       printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
> >              __builtin_return_address(0), &vaf);
> > +     va_end(args);
> >
> > +     va_start(args, format);
> > +     trace_drm_err(NULL, &vaf);
> >       va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_err);
> > diff --git a/include/trace/events/drm_print.h b/include/trace/events/drm_print.h
> > new file mode 100644
> > index 000000000000..cc93c69f3fe4
> > --- /dev/null
> > +++ b/include/trace/events/drm_print.h
> > @@ -0,0 +1,116 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright (C) 2019 Google, Inc.
> > + *
> > + * Authors:
> > + * Sean Paul <seanpaul@chromium.org>
> > + */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM drm_print
> > +
> > +#if !defined(_TRACE_DRM_PRINT_H) || defined(TRACE_HEADER_MULTI_READ)
> > +
> > +#include <linux/device.h>
> > +#include <linux/tracepoint.h>
> > +
> > +#define DRM_PRINT_MAX 256
> > +
> > +#define _TRACE_DRM_PRINT_H
> > +
> > +TRACE_EVENT(drm_log,
> > +     TP_PROTO(const char * level, const struct device *dev,
> > +              struct va_format *vaf),
> > +     TP_ARGS(level, dev, vaf),
> > +     TP_STRUCT__entry(
> > +             __field(const char *,           level                   )
> > +             __field(const struct device *,  dev                     )
> > +             __dynamic_array(char,           msg,    DRM_PRINT_MAX   )
> > +             __field(int,                    len                     )
> > +     ),
> > +     TP_fast_assign(
> > +             __entry->level = level;
> > +             __entry->dev = dev;
> > +             __entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> > +                                      vaf->fmt, *vaf->va);
> > +             if (__entry->len > 0 &&
> > +                 __get_str(msg)[__entry->len - 1] == '\n')
> > +                     __get_str(msg)[--__entry->len] = '\0';
> > +     ),
> > +     TP_printk("%s %s %s: %s", __entry->level,
> > +               __entry->dev ? dev_driver_string(__entry->dev) : "(NULL):",
> > +               __entry->dev ? dev_name(__entry->dev) : "", __get_str(msg))
> > +);
> > +
> > +TRACE_EVENT(class_drm_category_log,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf),
> > +     TP_STRUCT__entry(
> > +             __field(const struct device *,  dev                     )
> > +             __dynamic_array(char,           msg,    DRM_PRINT_MAX   )
> > +             __field(int,                    len                     )
> > +     ),
> > +     TP_fast_assign(
> > +             __entry->dev = dev;
> > +             __entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> > +                                      vaf->fmt, *vaf->va);
> > +             if (__entry->len > 0 &&
> > +                 __get_str(msg)[__entry->len - 1] == '\n')
> > +                     __get_str(msg)[--__entry->len] = '\0';
> > +     ),
> > +     TP_printk("%s %s%s%s",
> > +               __entry->dev ? dev_driver_string(__entry->dev) : "",
> > +               __entry->dev ? dev_name(__entry->dev) : "",
> > +               __entry->dev ? ": " : "", __get_str(msg))
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_err,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_none,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_core,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_driver,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_kms,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_prime,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_atomic,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_vbl,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_state,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_lease,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_dp,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_unknown,
> > +     TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +     TP_ARGS(dev, vaf)
> > +);
> > +#endif
> > +
> > +/* This part must be outside protection */
> > +#include <trace/define_trace.h>
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Sean Paul Oct. 15, 2019, 7:11 p.m. UTC | #5
On Tue, Oct 15, 2019 at 02:07:00PM +0200, Thomas Zimmermann wrote:
> Hi
> 

Hi Thomas (and Daniel/Joonas/Rob),
Thanks for your feedback

> Am 10.10.19 um 22:48 schrieb Sean Paul:
> > From: Sean Paul <seanpaul@chromium.org>
> > 
> > *Record scratch* You read that subject correctly, I bet you're wondering
> > how we got here. At least hear me out before you flame :-)
> > 
> > For a long while now, we (ChromeOS) have been struggling getting any
> > value out of user feedback reports of display failures (notably external
> > displays not working). The problem is that all logging, even fatal
> > errors (well, fatal in the sense that a display won't light up) are
> > logged at DEBUG log level. So in order to extract these logs, you need
> > to be able to turn on logging, and reproduce the issue with debug
> > enabled. Unfortunately, this isn't really something we can ask CrOS users
> > I spoke with airlied about this and RHEL has similar issues.
> > 
> > This is the point where you ask me, "So Sean, why don't you just enable
> > DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> > severity:
> >  1- People aren't consistent with their categories, so we'd have to
> >     enable a bunch to get proper coverage
> >  2- We don't want to overwhelm syslog with drm spam, others have to use
> >     it too
> >  3- Console logging is slow
> > 
> > Hopefully you're with me so far. I have a problem that has no existing
> > solution. What I really want is a ringbuffer of the most recent logs
> > (in the categories I'm interested in) exposed via debugfs so I can
> > extract it when users file feedback.
> 
> For bug reports, I don't want categories or anything else that users can
> switch on/off. All I'd want is a simple way of retrieving the last ~100
> messages from DRM (ala: "please attach the content of the file at
> /sys/debug...").
> 

Yeah, this is basically what I'm looking for (although I'd filter out a few
categories that are particularly noisy and bump N to as big as I can get away
with).

> > It just so happens that there is something which does _exactly_ this! I
> > can dump the most recent logs into tracepoints, turn them on and off
> > depending on which category I want, and pull them from debugfs on demand.
> > 
> > "What about trace_printk()?" You'll say. It doesn't give us the control we
> > get from using tracepoints and it's not meant to be left sprinkled around
> > in code.
> > 
> > So that is how we got here, now it's time for you to tell me why this is
> > a horrible idea :-)
> 
> Tracepoints are considered stable uapi, right? As a distro person (SUSE)
> I don't want us to have to maintain debugging messages forever.
> 

This is something I wasn't aware of (that's what reviews are for, right?!). So
this idea is perhaps dead in the water. I could probably integrate with
ringbuffer at a lower level and expose a drm-specific debugfs node without _too_
much extra goo, but [ab]using tracepoints seemed pretty natural. The
other benefit to this approach is that backporting it into distro kernels is
relatively easy.

> 
> The problem itself doesn't seem related to DRM. Do other subsystems have
> similar requirements?

I think other subsystems are chattier than ours (ie: when an error occurs they
actually log something). Some of our drivers in drm are more helpful than others
as far as complaining when something unexpected happens. The core in particular
is really conservative about issuing errors (which makes sense since a lot of
the failures are expected (atomic testing is messy), only the compositor really
knows which ones are noteworthy).

I've been holding onto this patch for a couple weeks now because although it
does what I want (and it seems others want), it does feel pretty greasy.

One half-baked idea I had (when I was thinking about what to do after this
particular approach was inevitably nak'd :) would be to audit some of the more
important messages and re-categorize them (or rather make use of category being
a bit field to add a new category). The new category would be much lower volume
and could be turned on in production. It would hopefully glean a little more use
from the console logs. Examples of uses would be dp link training messages,
modeset failures, bandwidth errors, etc). The downside to this is that it's a
ton of work, lots of churn, hard to enforce as new driver support is added, and
a complete horror-show to backport. So, not sure about embarking down that path.

I'm desperate for suggestions, if anyone has better ideas!

Sean

PS- Something I didn't address is how this approach differs from just enabling
    dynamic debug. We could probably print out the drm category in
    drm_[dev_]printk and then use a filter on dynamic debug to enable a single
    category. However we would still be using the [slow] console and most
    categories would overwhelm the console with drm spam. So not a good fit for
    what I'm after.

> 
> Best regards
> Thomas
> 
> > Cc: David Airlie <airlied@gmail.com>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Signed-off-by: Sean Paul <seanpaul@chromium.org>
> > ---
> >  drivers/gpu/drm/drm_print.c      | 143 ++++++++++++++++++++++++++-----
> >  include/trace/events/drm_print.h | 116 +++++++++++++++++++++++++
> >  2 files changed, 239 insertions(+), 20 deletions(-)
> >  create mode 100644 include/trace/events/drm_print.h
> > 
> > diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
> > index 9a25d73c155c..f591292811aa 100644
> > --- a/drivers/gpu/drm/drm_print.c
> > +++ b/drivers/gpu/drm/drm_print.c
> > @@ -27,11 +27,15 @@
> >  
> >  #include <stdarg.h>
> >  
> > +#include <linux/bitops.h>
> >  #include <linux/io.h>
> >  #include <linux/moduleparam.h>
> >  #include <linux/seq_file.h>
> >  #include <linux/slab.h>
> >  
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/drm_print.h>
> > +
> >  #include <drm/drm.h>
> >  #include <drm/drm_drv.h>
> >  #include <drm/drm_print.h>
> > @@ -241,10 +245,10 @@ void drm_dev_printk(const struct device *dev, const char *level,
> >  	struct va_format vaf;
> >  	va_list args;
> >  
> > -	va_start(args, format);
> >  	vaf.fmt = format;
> >  	vaf.va = &args;
> >  
> > +	va_start(args, format);
> >  	if (dev)
> >  		dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
> >  			   __builtin_return_address(0), &vaf);
> > @@ -253,49 +257,145 @@ void drm_dev_printk(const struct device *dev, const char *level,
> >  		       level, __builtin_return_address(0), &vaf);
> >  
> >  	va_end(args);
> > +
> > +	va_start(args, format);
> > +	trace_drm_log(level, dev, &vaf);
> > +	va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_dev_printk);
> >  
> > +static unsigned int drm_trace_enabled(unsigned int category)
> > +{
> > +	unsigned int bit;
> > +
> > +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +		switch (BIT(bit)) {
> > +		case DRM_UT_NONE:
> > +			return trace_drm_dbg_none_enabled();
> > +		case DRM_UT_CORE:
> > +			return trace_drm_dbg_core_enabled();
> > +		case DRM_UT_DRIVER:
> > +			return trace_drm_dbg_driver_enabled();
> > +		case DRM_UT_KMS:
> > +			return trace_drm_dbg_kms_enabled();
> > +		case DRM_UT_PRIME:
> > +			return trace_drm_dbg_prime_enabled();
> > +		case DRM_UT_ATOMIC:
> > +			return trace_drm_dbg_atomic_enabled();
> > +		case DRM_UT_VBL:
> > +			return trace_drm_dbg_vbl_enabled();
> > +		case DRM_UT_STATE:
> > +			return trace_drm_dbg_state_enabled();
> > +		case DRM_UT_LEASE:
> > +			return trace_drm_dbg_lease_enabled();
> > +		case DRM_UT_DP:
> > +			return trace_drm_dbg_dp_enabled();
> > +		default:
> > +			return trace_drm_dbg_unknown_enabled();
> > +		}
> > +	}
> > +	return false;
> > +}
> > +
> > +static void drm_do_trace(const struct device *dev, unsigned int category,
> > +			 struct va_format *vaf)
> > +{
> > +	WARN_ON(hweight32(category) > 1);
> > +
> > +	switch (category) {
> > +	case DRM_UT_NONE:
> > +		trace_drm_dbg_none(dev, vaf);
> > +		break;
> > +	case DRM_UT_CORE:
> > +		trace_drm_dbg_core(dev, vaf);
> > +		break;
> > +	case DRM_UT_DRIVER:
> > +		trace_drm_dbg_driver(dev, vaf);
> > +		break;
> > +	case DRM_UT_KMS:
> > +		trace_drm_dbg_kms(dev, vaf);
> > +		break;
> > +	case DRM_UT_PRIME:
> > +		trace_drm_dbg_prime(dev, vaf);
> > +		break;
> > +	case DRM_UT_ATOMIC:
> > +		trace_drm_dbg_atomic(dev, vaf);
> > +		break;
> > +	case DRM_UT_VBL:
> > +		trace_drm_dbg_vbl(dev, vaf);
> > +		break;
> > +	case DRM_UT_STATE:
> > +		trace_drm_dbg_state(dev, vaf);
> > +		break;
> > +	case DRM_UT_LEASE:
> > +		trace_drm_dbg_lease(dev, vaf);
> > +		break;
> > +	case DRM_UT_DP:
> > +		trace_drm_dbg_dp(dev, vaf);
> > +		break;
> > +	default:
> > +		trace_drm_dbg_unknown(dev, vaf);
> > +		break;
> > +	}
> > +}
> > +
> >  void drm_dev_dbg(const struct device *dev, unsigned int category,
> >  		 const char *format, ...)
> >  {
> >  	struct va_format vaf;
> > +	unsigned int bit;
> >  	va_list args;
> >  
> > -	if (!drm_debug_enabled(category))
> > -		return;
> > -
> > -	va_start(args, format);
> >  	vaf.fmt = format;
> >  	vaf.va = &args;
> >  
> > -	if (dev)
> > -		dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> > -			   __builtin_return_address(0), &vaf);
> > -	else
> > -		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > -		       __builtin_return_address(0), &vaf);
> > +	if (drm_debug_enabled(category)) {
> > +		va_start(args, format);
> > +		if (dev)
> > +			dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
> > +				   __builtin_return_address(0), &vaf);
> > +		else
> > +			printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > +			       __builtin_return_address(0), &vaf);
> > +		va_end(args);
> > +	}
> > +
> > +	if (!drm_trace_enabled(category))
> > +		return;
> > +
> > +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +		va_start(args, format);
> > +		drm_do_trace(dev, BIT(bit), &vaf);
> > +		va_end(args);
> > +	}
> >  
> > -	va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_dev_dbg);
> >  
> >  void drm_dbg(unsigned int category, const char *format, ...)
> >  {
> >  	struct va_format vaf;
> > +	unsigned int bit;
> >  	va_list args;
> >  
> > -	if (!drm_debug_enabled(category))
> > -		return;
> > -
> > -	va_start(args, format);
> >  	vaf.fmt = format;
> >  	vaf.va = &args;
> >  
> > -	printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > -	       __builtin_return_address(0), &vaf);
> > +	if (drm_debug_enabled(category)) {
> > +		va_start(args, format);
> > +		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
> > +		       __builtin_return_address(0), &vaf);
> > +		va_end(args);
> > +	}
> >  
> > -	va_end(args);
> > +	if (!drm_trace_enabled(category))
> > +		return;
> > +
> > +	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
> > +		va_start(args, format);
> > +		drm_do_trace(NULL, BIT(bit), &vaf);
> > +		va_end(args);
> > +	}
> >  }
> >  EXPORT_SYMBOL(drm_dbg);
> >  
> > @@ -304,13 +404,16 @@ void drm_err(const char *format, ...)
> >  	struct va_format vaf;
> >  	va_list args;
> >  
> > -	va_start(args, format);
> >  	vaf.fmt = format;
> >  	vaf.va = &args;
> >  
> > +	va_start(args, format);
> >  	printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
> >  	       __builtin_return_address(0), &vaf);
> > +	va_end(args);
> >  
> > +	va_start(args, format);
> > +	trace_drm_err(NULL, &vaf);
> >  	va_end(args);
> >  }
> >  EXPORT_SYMBOL(drm_err);
> > diff --git a/include/trace/events/drm_print.h b/include/trace/events/drm_print.h
> > new file mode 100644
> > index 000000000000..cc93c69f3fe4
> > --- /dev/null
> > +++ b/include/trace/events/drm_print.h
> > @@ -0,0 +1,116 @@
> > +// SPDX-License-Identifier: MIT
> > +/*
> > + * Copyright (C) 2019 Google, Inc.
> > + *
> > + * Authors:
> > + * Sean Paul <seanpaul@chromium.org>
> > + */
> > +#undef TRACE_SYSTEM
> > +#define TRACE_SYSTEM drm_print
> > +
> > +#if !defined(_TRACE_DRM_PRINT_H) || defined(TRACE_HEADER_MULTI_READ)
> > +
> > +#include <linux/device.h>
> > +#include <linux/tracepoint.h>
> > +
> > +#define DRM_PRINT_MAX 256
> > +
> > +#define _TRACE_DRM_PRINT_H
> > +
> > +TRACE_EVENT(drm_log,
> > +	TP_PROTO(const char * level, const struct device *dev,
> > +		 struct va_format *vaf),
> > +	TP_ARGS(level, dev, vaf),
> > +	TP_STRUCT__entry(
> > +		__field(const char *,		level			)
> > +		__field(const struct device *,	dev			)
> > +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> > +		__field(int,			len			)
> > +	),
> > +	TP_fast_assign(
> > +		__entry->level = level;
> > +		__entry->dev = dev;
> > +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> > +					 vaf->fmt, *vaf->va);
> > +		if (__entry->len > 0 &&
> > +		    __get_str(msg)[__entry->len - 1] == '\n')
> > +			__get_str(msg)[--__entry->len] = '\0';
> > +	),
> > +	TP_printk("%s %s %s: %s", __entry->level,
> > +		  __entry->dev ? dev_driver_string(__entry->dev) : "(NULL):",
> > +		  __entry->dev ? dev_name(__entry->dev) : "", __get_str(msg))
> > +);
> > +
> > +TRACE_EVENT(class_drm_category_log,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf),
> > +	TP_STRUCT__entry(
> > +		__field(const struct device *,	dev			)
> > +		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
> > +		__field(int,			len			)
> > +	),
> > +	TP_fast_assign(
> > +		__entry->dev = dev;
> > +		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
> > +					 vaf->fmt, *vaf->va);
> > +		if (__entry->len > 0 &&
> > +		    __get_str(msg)[__entry->len - 1] == '\n')
> > +			__get_str(msg)[--__entry->len] = '\0';
> > +	),
> > +	TP_printk("%s %s%s%s",
> > +		  __entry->dev ? dev_driver_string(__entry->dev) : "",
> > +		  __entry->dev ? dev_name(__entry->dev) : "",
> > +		  __entry->dev ? ": " : "", __get_str(msg))
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_err,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_none,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_core,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_driver,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_kms,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_prime,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_atomic,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_vbl,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_state,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_lease,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_dp,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +DEFINE_EVENT(class_drm_category_log, drm_dbg_unknown,
> > +	TP_PROTO(const struct device *dev, struct va_format *vaf),
> > +	TP_ARGS(dev, vaf)
> > +);
> > +#endif
> > +
> > +/* This part must be outside protection */
> > +#include <trace/define_trace.h>
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
Ville Syrjälä Oct. 15, 2019, 8:02 p.m. UTC | #6
On Tue, Oct 15, 2019 at 03:11:56PM -0400, Sean Paul wrote:
> On Tue, Oct 15, 2019 at 02:07:00PM +0200, Thomas Zimmermann wrote:
> > Hi
> > 
> 
> Hi Thomas (and Daniel/Joonas/Rob),
> Thanks for your feedback
> 
> > Am 10.10.19 um 22:48 schrieb Sean Paul:
> > > From: Sean Paul <seanpaul@chromium.org>
> > > 
> > > *Record scratch* You read that subject correctly, I bet you're wondering
> > > how we got here. At least hear me out before you flame :-)
> > > 
> > > For a long while now, we (ChromeOS) have been struggling getting any
> > > value out of user feedback reports of display failures (notably external
> > > displays not working). The problem is that all logging, even fatal
> > > errors (well, fatal in the sense that a display won't light up) are
> > > logged at DEBUG log level. So in order to extract these logs, you need
> > > to be able to turn on logging, and reproduce the issue with debug
> > > enabled. Unfortunately, this isn't really something we can ask CrOS users
> > > I spoke with airlied about this and RHEL has similar issues.
> > > 
> > > This is the point where you ask me, "So Sean, why don't you just enable
> > > DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> > > severity:
> > >  1- People aren't consistent with their categories, so we'd have to
> > >     enable a bunch to get proper coverage
> > >  2- We don't want to overwhelm syslog with drm spam, others have to use
> > >     it too
> > >  3- Console logging is slow
> > > 
> > > Hopefully you're with me so far. I have a problem that has no existing
> > > solution. What I really want is a ringbuffer of the most recent logs
> > > (in the categories I'm interested in) exposed via debugfs so I can
> > > extract it when users file feedback.
> > 
> > For bug reports, I don't want categories or anything else that users can
> > switch on/off. All I'd want is a simple way of retrieving the last ~100
> > messages from DRM (ala: "please attach the content of the file at
> > /sys/debug...").
> > 
> 
> Yeah, this is basically what I'm looking for (although I'd filter out a few
> categories that are particularly noisy and bump N to as big as I can get away
> with).
> 
> > > It just so happens that there is something which does _exactly_ this! I
> > > can dump the most recent logs into tracepoints, turn them on and off
> > > depending on which category I want, and pull them from debugfs on demand.
> > > 
> > > "What about trace_printk()?" You'll say. It doesn't give us the control we
> > > get from using tracepoints and it's not meant to be left sprinkled around
> > > in code.
> > > 
> > > So that is how we got here, now it's time for you to tell me why this is
> > > a horrible idea :-)
> > 
> > Tracepoints are considered stable uapi, right? As a distro person (SUSE)
> > I don't want us to have to maintain debugging messages forever.
> > 
> 
> This is something I wasn't aware of (that's what reviews are for, right?!). So
> this idea is perhaps dead in the water. I could probably integrate with
> ringbuffer at a lower level and expose a drm-specific debugfs node without _too_
> much extra goo, but [ab]using tracepoints seemed pretty natural. The
> other benefit to this approach is that backporting it into distro kernels is
> relatively easy.
> 
> > 
> > The problem itself doesn't seem related to DRM. Do other subsystems have
> > similar requirements?
> 
> I think other subsystems are chattier than ours (ie: when an error occurs they
> actually log something). Some of our drivers in drm are more helpful than others
> as far as complaining when something unexpected happens. The core in particular
> is really conservative about issuing errors (which makes sense since a lot of
> the failures are expected (atomic testing is messy), only the compositor really
> knows which ones are noteworthy).

Apart from the "let's not spam the logs if userspace is fuzzing our
uapi" issue I think the other main reason for conservatism is hotplug,
or rather unplug. When the other end of the cable is flapping around
in the breeze it's hard to talk to whoever you think you're talking
to. And it doesn't help that often the other end decides to hang up
and go for a nap even if the cable is superglued on.

If we didn't have hotpluggable displays it would totally make sense
to turn all commit phase debugs into errors. As it stands we've had
to do the exact oppposite or CI in particular would drown in noise.
If only those DP connector hooks were motorized and we could lock
them from the driver...
Daniel Vetter Oct. 15, 2019, 10:35 p.m. UTC | #7
On Tue, Oct 15, 2019 at 10:02 PM Ville Syrjälä
<ville.syrjala@linux.intel.com> wrote:
>
> On Tue, Oct 15, 2019 at 03:11:56PM -0400, Sean Paul wrote:
> > On Tue, Oct 15, 2019 at 02:07:00PM +0200, Thomas Zimmermann wrote:
> > > Hi
> > >
> >
> > Hi Thomas (and Daniel/Joonas/Rob),
> > Thanks for your feedback
> >
> > > Am 10.10.19 um 22:48 schrieb Sean Paul:
> > > > From: Sean Paul <seanpaul@chromium.org>
> > > >
> > > > *Record scratch* You read that subject correctly, I bet you're wondering
> > > > how we got here. At least hear me out before you flame :-)
> > > >
> > > > For a long while now, we (ChromeOS) have been struggling getting any
> > > > value out of user feedback reports of display failures (notably external
> > > > displays not working). The problem is that all logging, even fatal
> > > > errors (well, fatal in the sense that a display won't light up) are
> > > > logged at DEBUG log level. So in order to extract these logs, you need
> > > > to be able to turn on logging, and reproduce the issue with debug
> > > > enabled. Unfortunately, this isn't really something we can ask CrOS users
> > > > I spoke with airlied about this and RHEL has similar issues.
> > > >
> > > > This is the point where you ask me, "So Sean, why don't you just enable
> > > > DRM_UT_BLAH?". Good question! Here are the reasons in ascending order of
> > > > severity:
> > > >  1- People aren't consistent with their categories, so we'd have to
> > > >     enable a bunch to get proper coverage
> > > >  2- We don't want to overwhelm syslog with drm spam, others have to use
> > > >     it too
> > > >  3- Console logging is slow
> > > >
> > > > Hopefully you're with me so far. I have a problem that has no existing
> > > > solution. What I really want is a ringbuffer of the most recent logs
> > > > (in the categories I'm interested in) exposed via debugfs so I can
> > > > extract it when users file feedback.
> > >
> > > For bug reports, I don't want categories or anything else that users can
> > > switch on/off. All I'd want is a simple way of retrieving the last ~100
> > > messages from DRM (ala: "please attach the content of the file at
> > > /sys/debug...").
> > >
> >
> > Yeah, this is basically what I'm looking for (although I'd filter out a few
> > categories that are particularly noisy and bump N to as big as I can get away
> > with).
> >
> > > > It just so happens that there is something which does _exactly_ this! I
> > > > can dump the most recent logs into tracepoints, turn them on and off
> > > > depending on which category I want, and pull them from debugfs on demand.
> > > >
> > > > "What about trace_printk()?" You'll say. It doesn't give us the control we
> > > > get from using tracepoints and it's not meant to be left sprinkled around
> > > > in code.
> > > >
> > > > So that is how we got here, now it's time for you to tell me why this is
> > > > a horrible idea :-)
> > >
> > > Tracepoints are considered stable uapi, right? As a distro person (SUSE)
> > > I don't want us to have to maintain debugging messages forever.
> > >
> >
> > This is something I wasn't aware of (that's what reviews are for, right?!). So
> > this idea is perhaps dead in the water. I could probably integrate with
> > ringbuffer at a lower level and expose a drm-specific debugfs node without _too_
> > much extra goo, but [ab]using tracepoints seemed pretty natural. The
> > other benefit to this approach is that backporting it into distro kernels is
> > relatively easy.
> >
> > >
> > > The problem itself doesn't seem related to DRM. Do other subsystems have
> > > similar requirements?
> >
> > I think other subsystems are chattier than ours (ie: when an error occurs they
> > actually log something). Some of our drivers in drm are more helpful than others
> > as far as complaining when something unexpected happens. The core in particular
> > is really conservative about issuing errors (which makes sense since a lot of
> > the failures are expected (atomic testing is messy), only the compositor really
> > knows which ones are noteworthy).
>
> Apart from the "let's not spam the logs if userspace is fuzzing our
> uapi" issue I think the other main reason for conservatism is hotplug,
> or rather unplug. When the other end of the cable is flapping around
> in the breeze it's hard to talk to whoever you think you're talking
> to. And it doesn't help that often the other end decides to hang up
> and go for a nap even if the cable is superglued on.
>
> If we didn't have hotpluggable displays it would totally make sense
> to turn all commit phase debugs into errors. As it stands we've had
> to do the exact oppposite or CI in particular would drown in noise.
> If only those DP connector hooks were motorized and we could lock
> them from the driver...

Yeah I don't think tuning the spam level will ever work. What we need
is some external input (most likely from the user clicking the "my
external screen doesn't work" button, or maybe the compositor
realizing something that should work didn't, or some other thing that
indicates trouble), and then retroactively capture all
debug/informational message leading up to doom.

But without that external "houston we have a problem" input all the
debug spam is really just spam and unwanted. btw even if we don't spam
dmesg if we enable too much we might have simply trouble with all the
printk formatting work we do for nothing. So maybe we need something
like trace_printk which iirc delays the formatting until the stuff
actually gets read from the log buffer. Plus trace_printk might make
it clear enough that it's not stable uapi ... so maybe we do want
trace_printk in the end?

Just not really looking forward to reimplementing half the tracing
infrastructure just for this ...
-Daniel
Pekka Paalanen Oct. 16, 2019, 1:05 p.m. UTC | #8
On Wed, 16 Oct 2019 00:35:39 +0200
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> Yeah I don't think tuning the spam level will ever work. What we need
> is some external input (most likely from the user clicking the "my
> external screen doesn't work" button, or maybe the compositor
> realizing something that should work didn't, or some other thing that
> indicates trouble), and then retroactively capture all
> debug/informational message leading up to doom.
> 
> But without that external "houston we have a problem" input all the
> debug spam is really just spam and unwanted. btw even if we don't spam
> dmesg if we enable too much we might have simply trouble with all the
> printk formatting work we do for nothing. So maybe we need something
> like trace_printk which iirc delays the formatting until the stuff
> actually gets read from the log buffer. Plus trace_printk might make
> it clear enough that it's not stable uapi ... so maybe we do want
> trace_printk in the end?
> 
> Just not really looking forward to reimplementing half the tracing
> infrastructure just for this ...

Hi,

a thought about the UAPI:

Debugfs is not good because it's not supposed to be touched or even
present in production, right? But we want something that will
specifically be available in production. So a new file in some fs
somewhere it should be, and userspace in production can read it at will
to attach to a bug report.

Those semantics, "only use this content for attaching into a bug
report" should be made very clear in the UAPI. Not just the kernel
function names, but in the UAPI that an end user might
see. /proc/driver/drm/bug-report-aid or whatever. And that file, while
a ring buffer, could be aggressively big, intended to be compressed
before sent out with a bug report. Or maybe it comes out as readily
compressed.

The file name itself would be stable UAPI, the contents not at all.

I believe it has to be a ring buffer that is being continuously written
also during normal operations, so that we don't have to ask end users
to reproduce the issue again just to get some logs. Maybe the issue
happens once in a fortnight. The information must be extractable after
the fact, without before-hand preparations.


Thanks,
pq
Thomas Zimmermann Oct. 16, 2019, 1:23 p.m. UTC | #9
Hi

Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> On Wed, 16 Oct 2019 00:35:39 +0200
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> 
>> Yeah I don't think tuning the spam level will ever work. What we need
>> is some external input (most likely from the user clicking the "my
>> external screen doesn't work" button, or maybe the compositor
>> realizing something that should work didn't, or some other thing that
>> indicates trouble), and then retroactively capture all
>> debug/informational message leading up to doom.
>>
>> But without that external "houston we have a problem" input all the
>> debug spam is really just spam and unwanted. btw even if we don't spam
>> dmesg if we enable too much we might have simply trouble with all the
>> printk formatting work we do for nothing. So maybe we need something
>> like trace_printk which iirc delays the formatting until the stuff
>> actually gets read from the log buffer. Plus trace_printk might make
>> it clear enough that it's not stable uapi ... so maybe we do want
>> trace_printk in the end?
>>
>> Just not really looking forward to reimplementing half the tracing
>> infrastructure just for this ...
> 
> Hi,
> 
> a thought about the UAPI:
> 
> Debugfs is not good because it's not supposed to be touched or even
> present in production, right?

I'm running Tumbleweed where debugfs is mounted by default for root. I
could live having the user to mount debugfs to get the file's content.

> specifically be available in production. So a new file in some fs
> somewhere it should be, and userspace in production can read it at will
> to attach to a bug report.
> 
> Those semantics, "only use this content for attaching into a bug
> report" should be made very clear in the UAPI.

Has this ever worked? As soon as a userspace program starts depending on
the content of this file, it becomes kabi. From the incidents I know,
Linus has always been quite strict about this. Even for broken interfaces.

> I believe it has to be a ring buffer that is being continuously written
> also during normal operations, so that we don't have to ask end users
> to reproduce the issue again just to get some logs. Maybe the issue
> happens once in a fortnight. The information must be extractable after
> the fact, without before-hand preparations.

Agreed.

Best regards
Thomas

> 
> 
> Thanks,
> pq
>
Sean Paul Oct. 16, 2019, 3:55 p.m. UTC | #10
On Wed, Oct 16, 2019 at 03:23:45PM +0200, Thomas Zimmermann wrote:
> Hi
> 
> Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> > On Wed, 16 Oct 2019 00:35:39 +0200
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > 
> >> Yeah I don't think tuning the spam level will ever work. What we need
> >> is some external input (most likely from the user clicking the "my
> >> external screen doesn't work" button, or maybe the compositor
> >> realizing something that should work didn't, or some other thing that
> >> indicates trouble), and then retroactively capture all
> >> debug/informational message leading up to doom.
> >>
> >> But without that external "houston we have a problem" input all the
> >> debug spam is really just spam and unwanted. btw even if we don't spam
> >> dmesg if we enable too much we might have simply trouble with all the
> >> printk formatting work we do for nothing. So maybe we need something
> >> like trace_printk which iirc delays the formatting until the stuff
> >> actually gets read from the log buffer. Plus trace_printk might make
> >> it clear enough that it's not stable uapi ... so maybe we do want
> >> trace_printk in the end?
> >>
> >> Just not really looking forward to reimplementing half the tracing
> >> infrastructure just for this ...
> > 
> > Hi,
> > 
> > a thought about the UAPI:
> > 
> > Debugfs is not good because it's not supposed to be touched or even
> > present in production, right?
> 
> I'm running Tumbleweed where debugfs is mounted by default for root. I
> could live having the user to mount debugfs to get the file's content.
> 
> > specifically be available in production. So a new file in some fs
> > somewhere it should be, and userspace in production can read it at will
> > to attach to a bug report.
> > 
> > Those semantics, "only use this content for attaching into a bug
> > report" should be made very clear in the UAPI.
> 
> Has this ever worked? As soon as a userspace program starts depending on
> the content of this file, it becomes kabi. From the incidents I know,
> Linus has always been quite strict about this. Even for broken interfaces.
> 

I think at this point I've convinced myself to spend the time to add proper
stable event traces to the atomic core + helpers (dp/self refresh/hdcp/etc).
It's going to be a pain (I really hate how much boilerplate is involved in
adding just one event), but I think there's enough interest that it'll be
worth it.

If it turns out to be useful, we can dig into some of the drivers as well.

Sean

> > I believe it has to be a ring buffer that is being continuously written
> > also during normal operations, so that we don't have to ask end users
> > to reproduce the issue again just to get some logs. Maybe the issue
> > happens once in a fortnight. The information must be extractable after
> > the fact, without before-hand preparations.
> 
> Agreed.
> 
> Best regards
> Thomas
> 
> > 
> > 
> > Thanks,
> > pq
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>
Pekka Paalanen Oct. 17, 2019, 7:22 a.m. UTC | #11
On Wed, 16 Oct 2019 15:23:45 +0200
Thomas Zimmermann <tzimmermann@suse.de> wrote:

> Hi
> 
> Am 16.10.19 um 15:05 schrieb Pekka Paalanen:

> > specifically be available in production. So a new file in some fs
> > somewhere it should be, and userspace in production can read it at will
> > to attach to a bug report.
> > 
> > Those semantics, "only use this content for attaching into a bug
> > report" should be made very clear in the UAPI.  
> 
> Has this ever worked? As soon as a userspace program starts depending on
> the content of this file, it becomes kabi. From the incidents I know,
> Linus has always been quite strict about this. Even for broken interfaces.

The kernel log content is not kabi, is it? I've seen it change plenty
during the years. This would be just another similar log with free-form
text.


Thanks,
pq
Sean Paul Dec. 3, 2019, 7:10 p.m. UTC | #12
On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:
> On Wed, 16 Oct 2019 15:23:45 +0200
> Thomas Zimmermann <tzimmermann@suse.de> wrote:
> 
> > Hi
> > 
> > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> 
> > > specifically be available in production. So a new file in some fs
> > > somewhere it should be, and userspace in production can read it at will
> > > to attach to a bug report.
> > > 
> > > Those semantics, "only use this content for attaching into a bug
> > > report" should be made very clear in the UAPI.  
> > 
> > Has this ever worked? As soon as a userspace program starts depending on
> > the content of this file, it becomes kabi. From the incidents I know,
> > Linus has always been quite strict about this. Even for broken interfaces.
> 
> The kernel log content is not kabi, is it? I've seen it change plenty
> during the years. This would be just another similar log with free-form
> text.
> 

Ok, so given the more structured version of this set [1] was not well received,
are we all comfortable going with the freeform approach in this version?

Sean

[1]- https://lists.freedesktop.org/archives/dri-devel/2019-November/243230.html

> 
> Thanks,
> pq
Daniel Vetter Dec. 3, 2019, 9:20 p.m. UTC | #13
On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:
>
> On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:
> > On Wed, 16 Oct 2019 15:23:45 +0200
> > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> >
> > > Hi
> > >
> > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> >
> > > > specifically be available in production. So a new file in some fs
> > > > somewhere it should be, and userspace in production can read it at will
> > > > to attach to a bug report.
> > > >
> > > > Those semantics, "only use this content for attaching into a bug
> > > > report" should be made very clear in the UAPI.
> > >
> > > Has this ever worked? As soon as a userspace program starts depending on
> > > the content of this file, it becomes kabi. From the incidents I know,
> > > Linus has always been quite strict about this. Even for broken interfaces.
> >
> > The kernel log content is not kabi, is it? I've seen it change plenty
> > during the years. This would be just another similar log with free-form
> > text.
> >
>
> Ok, so given the more structured version of this set [1] was not well received,
> are we all comfortable going with the freeform approach in this version?

Imo yes. It's still uabi, so someone will have regrets about it. But
given that dmesg has been around forever, and causes rather little
breakage, I think we should be fairly ok.

I still think that figuring out the drm_dev logging bikeshed might be
good, while we noodle around in here.
-Daniel

>
> Sean
>
> [1]- https://lists.freedesktop.org/archives/dri-devel/2019-November/243230.html
>
> >
> > Thanks,
> > pq
>
>
>
> --
> Sean Paul, Software Engineer, Google / Chromium OS
Pekka Paalanen Dec. 4, 2019, 7:33 a.m. UTC | #14
On Tue, 3 Dec 2019 22:20:14 +0100
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:
> >
> > On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:  
> > > On Wed, 16 Oct 2019 15:23:45 +0200
> > > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > >  
> > > > Hi
> > > >
> > > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:  
> > >  
> > > > > specifically be available in production. So a new file in some fs
> > > > > somewhere it should be, and userspace in production can read it at will
> > > > > to attach to a bug report.
> > > > >
> > > > > Those semantics, "only use this content for attaching into a bug
> > > > > report" should be made very clear in the UAPI.  
> > > >
> > > > Has this ever worked? As soon as a userspace program starts depending on
> > > > the content of this file, it becomes kabi. From the incidents I know,
> > > > Linus has always been quite strict about this. Even for broken interfaces.  
> > >
> > > The kernel log content is not kabi, is it? I've seen it change plenty
> > > during the years. This would be just another similar log with free-form
> > > text.
> > >  
> >
> > Ok, so given the more structured version of this set [1] was not well received,
> > are we all comfortable going with the freeform approach in this version?  
> 
> Imo yes. It's still uabi, so someone will have regrets about it. But
> given that dmesg has been around forever, and causes rather little
> breakage, I think we should be fairly ok.
> 
> I still think that figuring out the drm_dev logging bikeshed might be
> good, while we noodle around in here.

Hi,

one more wacky idea: have a flight recorder buffer(s) in the kernel,
but do not expose them as is to userspace. Instead, create a trigger
somewhere (/proc?) that causes the flight recorder buffers to be
flushed into dmesg. That way the amount of new UABI is reduced to just
the trigger. Obviously this spams dmesg and would need the rights to
access dmesg to actually collect the logs. I'm not sure if that's good
or bad, but it would re-use dmesg.


Thanks,
pq
Daniel Vetter Dec. 4, 2019, 9:14 a.m. UTC | #15
On Wed, Dec 4, 2019 at 8:33 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Tue, 3 Dec 2019 22:20:14 +0100
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> > On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:
> > >
> > > On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:
> > > > On Wed, 16 Oct 2019 15:23:45 +0200
> > > > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> > > >
> > > > > > specifically be available in production. So a new file in some fs
> > > > > > somewhere it should be, and userspace in production can read it at will
> > > > > > to attach to a bug report.
> > > > > >
> > > > > > Those semantics, "only use this content for attaching into a bug
> > > > > > report" should be made very clear in the UAPI.
> > > > >
> > > > > Has this ever worked? As soon as a userspace program starts depending on
> > > > > the content of this file, it becomes kabi. From the incidents I know,
> > > > > Linus has always been quite strict about this. Even for broken interfaces.
> > > >
> > > > The kernel log content is not kabi, is it? I've seen it change plenty
> > > > during the years. This would be just another similar log with free-form
> > > > text.
> > > >
> > >
> > > Ok, so given the more structured version of this set [1] was not well received,
> > > are we all comfortable going with the freeform approach in this version?
> >
> > Imo yes. It's still uabi, so someone will have regrets about it. But
> > given that dmesg has been around forever, and causes rather little
> > breakage, I think we should be fairly ok.
> >
> > I still think that figuring out the drm_dev logging bikeshed might be
> > good, while we noodle around in here.
>
> Hi,
>
> one more wacky idea: have a flight recorder buffer(s) in the kernel,
> but do not expose them as is to userspace. Instead, create a trigger
> somewhere (/proc?) that causes the flight recorder buffers to be
> flushed into dmesg. That way the amount of new UABI is reduced to just
> the trigger. Obviously this spams dmesg and would need the rights to
> access dmesg to actually collect the logs. I'm not sure if that's good
> or bad, but it would re-use dmesg.

That's roughly how we ended up here, since trace buffer dumping is
something that exists already (you can e.g. dump it on an Oops too, we
do that in our CI with a few i915 tracepoints enabled). I think at
that point a section in drm-uapi.rst explaining what you
should/shouldn't do with these tracepoints is about as dmesg.
-Daniel
Pekka Paalanen Dec. 4, 2019, 9:54 a.m. UTC | #16
On Wed, 4 Dec 2019 10:14:11 +0100
Daniel Vetter <daniel.vetter@ffwll.ch> wrote:

> On Wed, Dec 4, 2019 at 8:33 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Tue, 3 Dec 2019 22:20:14 +0100
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >  
> > > On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:  
> > > >
> > > > On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:  
> > > > > On Wed, 16 Oct 2019 15:23:45 +0200
> > > > > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > > >  
> > > > > > Hi
> > > > > >
> > > > > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:  
> > > > >  
> > > > > > > specifically be available in production. So a new file in some fs
> > > > > > > somewhere it should be, and userspace in production can read it at will
> > > > > > > to attach to a bug report.
> > > > > > >
> > > > > > > Those semantics, "only use this content for attaching into a bug
> > > > > > > report" should be made very clear in the UAPI.  
> > > > > >
> > > > > > Has this ever worked? As soon as a userspace program starts depending on
> > > > > > the content of this file, it becomes kabi. From the incidents I know,
> > > > > > Linus has always been quite strict about this. Even for broken interfaces.  
> > > > >
> > > > > The kernel log content is not kabi, is it? I've seen it change plenty
> > > > > during the years. This would be just another similar log with free-form
> > > > > text.
> > > > >  
> > > >
> > > > Ok, so given the more structured version of this set [1] was not well received,
> > > > are we all comfortable going with the freeform approach in this version?  
> > >
> > > Imo yes. It's still uabi, so someone will have regrets about it. But
> > > given that dmesg has been around forever, and causes rather little
> > > breakage, I think we should be fairly ok.
> > >
> > > I still think that figuring out the drm_dev logging bikeshed might be
> > > good, while we noodle around in here.  
> >
> > Hi,
> >
> > one more wacky idea: have a flight recorder buffer(s) in the kernel,
> > but do not expose them as is to userspace. Instead, create a trigger
> > somewhere (/proc?) that causes the flight recorder buffers to be
> > flushed into dmesg. That way the amount of new UABI is reduced to just
> > the trigger. Obviously this spams dmesg and would need the rights to
> > access dmesg to actually collect the logs. I'm not sure if that's good
> > or bad, but it would re-use dmesg.  
> 
> That's roughly how we ended up here, since trace buffer dumping is
> something that exists already (you can e.g. dump it on an Oops too, we
> do that in our CI with a few i915 tracepoints enabled). I think at
> that point a section in drm-uapi.rst explaining what you
> should/shouldn't do with these tracepoints is about as dmesg.

Sorry?

Is there already a kernel flight recorder buffer (a ring buffer
continuously overwritten with DRM debug messages) and some userspace
trigger to flush it out after the fact, without any preparations
needed beforehand from userspace?

I specifically mean not like "you need to enable a thousand tracepoints
manually from userspace and they don't have any stable names so you
can't even do that". The whole point is to have a single one-bit stable
UAPI: "flush out" and nothing else, into dmesg (maybe as debug level
prints). Not any tracepoint enabling hassle that userspace would need
to take care of, since it cannot.

Preferably as much tied to debugfs as dmesg is: not at all.


Thanks,
pq
Daniel Vetter Dec. 4, 2019, 1:02 p.m. UTC | #17
On Wed, Dec 4, 2019 at 10:54 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
>
> On Wed, 4 Dec 2019 10:14:11 +0100
> Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> > On Wed, Dec 4, 2019 at 8:33 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > >
> > > On Tue, 3 Dec 2019 22:20:14 +0100
> > > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > >
> > > > On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:
> > > > >
> > > > > On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:
> > > > > > On Wed, 16 Oct 2019 15:23:45 +0200
> > > > > > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > > > >
> > > > > > > Hi
> > > > > > >
> > > > > > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> > > > > >
> > > > > > > > specifically be available in production. So a new file in some fs
> > > > > > > > somewhere it should be, and userspace in production can read it at will
> > > > > > > > to attach to a bug report.
> > > > > > > >
> > > > > > > > Those semantics, "only use this content for attaching into a bug
> > > > > > > > report" should be made very clear in the UAPI.
> > > > > > >
> > > > > > > Has this ever worked? As soon as a userspace program starts depending on
> > > > > > > the content of this file, it becomes kabi. From the incidents I know,
> > > > > > > Linus has always been quite strict about this. Even for broken interfaces.
> > > > > >
> > > > > > The kernel log content is not kabi, is it? I've seen it change plenty
> > > > > > during the years. This would be just another similar log with free-form
> > > > > > text.
> > > > > >
> > > > >
> > > > > Ok, so given the more structured version of this set [1] was not well received,
> > > > > are we all comfortable going with the freeform approach in this version?
> > > >
> > > > Imo yes. It's still uabi, so someone will have regrets about it. But
> > > > given that dmesg has been around forever, and causes rather little
> > > > breakage, I think we should be fairly ok.
> > > >
> > > > I still think that figuring out the drm_dev logging bikeshed might be
> > > > good, while we noodle around in here.
> > >
> > > Hi,
> > >
> > > one more wacky idea: have a flight recorder buffer(s) in the kernel,
> > > but do not expose them as is to userspace. Instead, create a trigger
> > > somewhere (/proc?) that causes the flight recorder buffers to be
> > > flushed into dmesg. That way the amount of new UABI is reduced to just
> > > the trigger. Obviously this spams dmesg and would need the rights to
> > > access dmesg to actually collect the logs. I'm not sure if that's good
> > > or bad, but it would re-use dmesg.
> >
> > That's roughly how we ended up here, since trace buffer dumping is
> > something that exists already (you can e.g. dump it on an Oops too, we
> > do that in our CI with a few i915 tracepoints enabled). I think at
> > that point a section in drm-uapi.rst explaining what you
> > should/shouldn't do with these tracepoints is about as dmesg.
>
> Sorry?
>
> Is there already a kernel flight recorder buffer (a ring buffer
> continuously overwritten with DRM debug messages) and some userspace
> trigger to flush it out after the fact, without any preparations
> needed beforehand from userspace?

Nope. But what you describe is why we ended up with tracepoints, since
those already do flight recorder with on-demand dumping. Engineering
our own thing when there's one already there seems a bit silly, and
the debug log stuff might also be useful in some gpu trace
visualizers. You're not going to get that with some dmesg thing.

> I specifically mean not like "you need to enable a thousand tracepoints
> manually from userspace and they don't have any stable names so you
> can't even do that". The whole point is to have a single one-bit stable
> UAPI: "flush out" and nothing else, into dmesg (maybe as debug level
> prints). Not any tracepoint enabling hassle that userspace would need
> to take care of, since it cannot.

I thought the idea is you only enable that one drm debug tracepoint.
Not thousands of them.

> Preferably as much tied to debugfs as dmesg is: not at all.

tracepoints moved out of dmesg too.
-Daniel
Sean Paul Dec. 4, 2019, 1:52 p.m. UTC | #18
On Wed, Dec 04, 2019 at 02:02:51PM +0100, Daniel Vetter wrote:
> On Wed, Dec 4, 2019 at 10:54 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> >
> > On Wed, 4 Dec 2019 10:14:11 +0100
> > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >
> > > On Wed, Dec 4, 2019 at 8:33 AM Pekka Paalanen <ppaalanen@gmail.com> wrote:
> > > >
> > > > On Tue, 3 Dec 2019 22:20:14 +0100
> > > > Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> > > >
> > > > > On Tue, Dec 3, 2019 at 8:10 PM Sean Paul <sean@poorly.run> wrote:
> > > > > >
> > > > > > On Thu, Oct 17, 2019 at 10:22:16AM +0300, Pekka Paalanen wrote:
> > > > > > > On Wed, 16 Oct 2019 15:23:45 +0200
> > > > > > > Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > > > > >
> > > > > > > > Hi
> > > > > > > >
> > > > > > > > Am 16.10.19 um 15:05 schrieb Pekka Paalanen:
> > > > > > >
> > > > > > > > > specifically be available in production. So a new file in some fs
> > > > > > > > > somewhere it should be, and userspace in production can read it at will
> > > > > > > > > to attach to a bug report.
> > > > > > > > >
> > > > > > > > > Those semantics, "only use this content for attaching into a bug
> > > > > > > > > report" should be made very clear in the UAPI.
> > > > > > > >
> > > > > > > > Has this ever worked? As soon as a userspace program starts depending on
> > > > > > > > the content of this file, it becomes kabi. From the incidents I know,
> > > > > > > > Linus has always been quite strict about this. Even for broken interfaces.
> > > > > > >
> > > > > > > The kernel log content is not kabi, is it? I've seen it change plenty
> > > > > > > during the years. This would be just another similar log with free-form
> > > > > > > text.
> > > > > > >
> > > > > >
> > > > > > Ok, so given the more structured version of this set [1] was not well received,
> > > > > > are we all comfortable going with the freeform approach in this version?
> > > > >
> > > > > Imo yes. It's still uabi, so someone will have regrets about it. But
> > > > > given that dmesg has been around forever, and causes rather little
> > > > > breakage, I think we should be fairly ok.
> > > > >
> > > > > I still think that figuring out the drm_dev logging bikeshed might be
> > > > > good, while we noodle around in here.

Yeah, we might want to take a closer look at how logs are categorized (for
instance, it might be nice to be able to separate DP aux dumps from DP training
failures, and distinguish between atomic test failures and state dumps). All the
circuitry is in place to classify a message in multiple log categories since log
category is a bitfield.


> > > >
> > > > Hi,
> > > >
> > > > one more wacky idea: have a flight recorder buffer(s) in the kernel,
> > > > but do not expose them as is to userspace. Instead, create a trigger
> > > > somewhere (/proc?) that causes the flight recorder buffers to be
> > > > flushed into dmesg. That way the amount of new UABI is reduced to just
> > > > the trigger. Obviously this spams dmesg and would need the rights to
> > > > access dmesg to actually collect the logs. I'm not sure if that's good
> > > > or bad, but it would re-use dmesg.
> > >
> > > That's roughly how we ended up here, since trace buffer dumping is
> > > something that exists already (you can e.g. dump it on an Oops too, we
> > > do that in our CI with a few i915 tracepoints enabled). I think at
> > > that point a section in drm-uapi.rst explaining what you
> > > should/shouldn't do with these tracepoints is about as dmesg.
> >
> > Sorry?
> >
> > Is there already a kernel flight recorder buffer (a ring buffer
> > continuously overwritten with DRM debug messages) and some userspace
> > trigger to flush it out after the fact, without any preparations
> > needed beforehand from userspace?
> 
> Nope. But what you describe is why we ended up with tracepoints, since
> those already do flight recorder with on-demand dumping. Engineering
> our own thing when there's one already there seems a bit silly, and
> the debug log stuff might also be useful in some gpu trace
> visualizers. You're not going to get that with some dmesg thing.
> 
> > I specifically mean not like "you need to enable a thousand tracepoints
> > manually from userspace and they don't have any stable names so you
> > can't even do that". The whole point is to have a single one-bit stable
> > UAPI: "flush out" and nothing else, into dmesg (maybe as debug level
> > prints). Not any tracepoint enabling hassle that userspace would need
> > to take care of, since it cannot.
> 
> I thought the idea is you only enable that one drm debug tracepoint.
> Not thousands of them.

Correct. You can enable _all_ log categories by writing a 1 into the top-level
drm_print enable file. Alternatively, you can selectively enable log categories
by writing a 1 into that category's enable file (ie: drm_dbg_kms/enable). Finer
granularity is available using a trace events filter, but that would be
inadvisable since we don't want these messages to be relied upon.

> 
> > Preferably as much tied to debugfs as dmesg is: not at all.
> 
> tracepoints moved out of dmesg too.

Well, copied out, they're still available in debugfs.

Sean

> -Daniel
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_print.c b/drivers/gpu/drm/drm_print.c
index 9a25d73c155c..f591292811aa 100644
--- a/drivers/gpu/drm/drm_print.c
+++ b/drivers/gpu/drm/drm_print.c
@@ -27,11 +27,15 @@ 
 
 #include <stdarg.h>
 
+#include <linux/bitops.h>
 #include <linux/io.h>
 #include <linux/moduleparam.h>
 #include <linux/seq_file.h>
 #include <linux/slab.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/drm_print.h>
+
 #include <drm/drm.h>
 #include <drm/drm_drv.h>
 #include <drm/drm_print.h>
@@ -241,10 +245,10 @@  void drm_dev_printk(const struct device *dev, const char *level,
 	struct va_format vaf;
 	va_list args;
 
-	va_start(args, format);
 	vaf.fmt = format;
 	vaf.va = &args;
 
+	va_start(args, format);
 	if (dev)
 		dev_printk(level, dev, "[" DRM_NAME ":%ps] %pV",
 			   __builtin_return_address(0), &vaf);
@@ -253,49 +257,145 @@  void drm_dev_printk(const struct device *dev, const char *level,
 		       level, __builtin_return_address(0), &vaf);
 
 	va_end(args);
+
+	va_start(args, format);
+	trace_drm_log(level, dev, &vaf);
+	va_end(args);
 }
 EXPORT_SYMBOL(drm_dev_printk);
 
+static unsigned int drm_trace_enabled(unsigned int category)
+{
+	unsigned int bit;
+
+	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
+		switch (BIT(bit)) {
+		case DRM_UT_NONE:
+			return trace_drm_dbg_none_enabled();
+		case DRM_UT_CORE:
+			return trace_drm_dbg_core_enabled();
+		case DRM_UT_DRIVER:
+			return trace_drm_dbg_driver_enabled();
+		case DRM_UT_KMS:
+			return trace_drm_dbg_kms_enabled();
+		case DRM_UT_PRIME:
+			return trace_drm_dbg_prime_enabled();
+		case DRM_UT_ATOMIC:
+			return trace_drm_dbg_atomic_enabled();
+		case DRM_UT_VBL:
+			return trace_drm_dbg_vbl_enabled();
+		case DRM_UT_STATE:
+			return trace_drm_dbg_state_enabled();
+		case DRM_UT_LEASE:
+			return trace_drm_dbg_lease_enabled();
+		case DRM_UT_DP:
+			return trace_drm_dbg_dp_enabled();
+		default:
+			return trace_drm_dbg_unknown_enabled();
+		}
+	}
+	return false;
+}
+
+static void drm_do_trace(const struct device *dev, unsigned int category,
+			 struct va_format *vaf)
+{
+	WARN_ON(hweight32(category) > 1);
+
+	switch (category) {
+	case DRM_UT_NONE:
+		trace_drm_dbg_none(dev, vaf);
+		break;
+	case DRM_UT_CORE:
+		trace_drm_dbg_core(dev, vaf);
+		break;
+	case DRM_UT_DRIVER:
+		trace_drm_dbg_driver(dev, vaf);
+		break;
+	case DRM_UT_KMS:
+		trace_drm_dbg_kms(dev, vaf);
+		break;
+	case DRM_UT_PRIME:
+		trace_drm_dbg_prime(dev, vaf);
+		break;
+	case DRM_UT_ATOMIC:
+		trace_drm_dbg_atomic(dev, vaf);
+		break;
+	case DRM_UT_VBL:
+		trace_drm_dbg_vbl(dev, vaf);
+		break;
+	case DRM_UT_STATE:
+		trace_drm_dbg_state(dev, vaf);
+		break;
+	case DRM_UT_LEASE:
+		trace_drm_dbg_lease(dev, vaf);
+		break;
+	case DRM_UT_DP:
+		trace_drm_dbg_dp(dev, vaf);
+		break;
+	default:
+		trace_drm_dbg_unknown(dev, vaf);
+		break;
+	}
+}
+
 void drm_dev_dbg(const struct device *dev, unsigned int category,
 		 const char *format, ...)
 {
 	struct va_format vaf;
+	unsigned int bit;
 	va_list args;
 
-	if (!drm_debug_enabled(category))
-		return;
-
-	va_start(args, format);
 	vaf.fmt = format;
 	vaf.va = &args;
 
-	if (dev)
-		dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
-			   __builtin_return_address(0), &vaf);
-	else
-		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
-		       __builtin_return_address(0), &vaf);
+	if (drm_debug_enabled(category)) {
+		va_start(args, format);
+		if (dev)
+			dev_printk(KERN_DEBUG, dev, "[" DRM_NAME ":%ps] %pV",
+				   __builtin_return_address(0), &vaf);
+		else
+			printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
+			       __builtin_return_address(0), &vaf);
+		va_end(args);
+	}
+
+	if (!drm_trace_enabled(category))
+		return;
+
+	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
+		va_start(args, format);
+		drm_do_trace(dev, BIT(bit), &vaf);
+		va_end(args);
+	}
 
-	va_end(args);
 }
 EXPORT_SYMBOL(drm_dev_dbg);
 
 void drm_dbg(unsigned int category, const char *format, ...)
 {
 	struct va_format vaf;
+	unsigned int bit;
 	va_list args;
 
-	if (!drm_debug_enabled(category))
-		return;
-
-	va_start(args, format);
 	vaf.fmt = format;
 	vaf.va = &args;
 
-	printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
-	       __builtin_return_address(0), &vaf);
+	if (drm_debug_enabled(category)) {
+		va_start(args, format);
+		printk(KERN_DEBUG "[" DRM_NAME ":%ps] %pV",
+		       __builtin_return_address(0), &vaf);
+		va_end(args);
+	}
 
-	va_end(args);
+	if (!drm_trace_enabled(category))
+		return;
+
+	for_each_set_bit(bit, (unsigned long*)&category, sizeof(category) * 8) {
+		va_start(args, format);
+		drm_do_trace(NULL, BIT(bit), &vaf);
+		va_end(args);
+	}
 }
 EXPORT_SYMBOL(drm_dbg);
 
@@ -304,13 +404,16 @@  void drm_err(const char *format, ...)
 	struct va_format vaf;
 	va_list args;
 
-	va_start(args, format);
 	vaf.fmt = format;
 	vaf.va = &args;
 
+	va_start(args, format);
 	printk(KERN_ERR "[" DRM_NAME ":%ps] *ERROR* %pV",
 	       __builtin_return_address(0), &vaf);
+	va_end(args);
 
+	va_start(args, format);
+	trace_drm_err(NULL, &vaf);
 	va_end(args);
 }
 EXPORT_SYMBOL(drm_err);
diff --git a/include/trace/events/drm_print.h b/include/trace/events/drm_print.h
new file mode 100644
index 000000000000..cc93c69f3fe4
--- /dev/null
+++ b/include/trace/events/drm_print.h
@@ -0,0 +1,116 @@ 
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright (C) 2019 Google, Inc.
+ *
+ * Authors:
+ * Sean Paul <seanpaul@chromium.org>
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM drm_print
+
+#if !defined(_TRACE_DRM_PRINT_H) || defined(TRACE_HEADER_MULTI_READ)
+
+#include <linux/device.h>
+#include <linux/tracepoint.h>
+
+#define DRM_PRINT_MAX 256
+
+#define _TRACE_DRM_PRINT_H
+
+TRACE_EVENT(drm_log,
+	TP_PROTO(const char * level, const struct device *dev,
+		 struct va_format *vaf),
+	TP_ARGS(level, dev, vaf),
+	TP_STRUCT__entry(
+		__field(const char *,		level			)
+		__field(const struct device *,	dev			)
+		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
+		__field(int,			len			)
+	),
+	TP_fast_assign(
+		__entry->level = level;
+		__entry->dev = dev;
+		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
+					 vaf->fmt, *vaf->va);
+		if (__entry->len > 0 &&
+		    __get_str(msg)[__entry->len - 1] == '\n')
+			__get_str(msg)[--__entry->len] = '\0';
+	),
+	TP_printk("%s %s %s: %s", __entry->level,
+		  __entry->dev ? dev_driver_string(__entry->dev) : "(NULL):",
+		  __entry->dev ? dev_name(__entry->dev) : "", __get_str(msg))
+);
+
+TRACE_EVENT(class_drm_category_log,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf),
+	TP_STRUCT__entry(
+		__field(const struct device *,	dev			)
+		__dynamic_array(char,		msg,	DRM_PRINT_MAX	)
+		__field(int,			len			)
+	),
+	TP_fast_assign(
+		__entry->dev = dev;
+		__entry->len = vsnprintf(__get_str(msg), DRM_PRINT_MAX,
+					 vaf->fmt, *vaf->va);
+		if (__entry->len > 0 &&
+		    __get_str(msg)[__entry->len - 1] == '\n')
+			__get_str(msg)[--__entry->len] = '\0';
+	),
+	TP_printk("%s %s%s%s",
+		  __entry->dev ? dev_driver_string(__entry->dev) : "",
+		  __entry->dev ? dev_name(__entry->dev) : "",
+		  __entry->dev ? ": " : "", __get_str(msg))
+);
+DEFINE_EVENT(class_drm_category_log, drm_err,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_none,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_core,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_driver,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_kms,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_prime,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_atomic,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_vbl,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_state,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_lease,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_dp,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+DEFINE_EVENT(class_drm_category_log, drm_dbg_unknown,
+	TP_PROTO(const struct device *dev, struct va_format *vaf),
+	TP_ARGS(dev, vaf)
+);
+#endif
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>