Message ID | 20240314044301.2108650-1-alison.schofield@intel.com |
---|---|
State | New, archived |
Headers | show |
Series | cxl/trace: Initialize cxl_poison region name to NULL | expand |
alison.schofield@ wrote: > From: Alison Schofield <alison.schofield@intel.com> > > The TP_STRUCT__entry that gets assigned the region name, or an > empty string if no region is present, is erroneously initialized > to the cxl_region pointer. It needs to be initialized to NULL > otherwise its length is wrong and garbage chars can appear in > the kernel trace output: /sys/kernel/tracing/trace > > Impact is that tooling depending on that trace data can miss > picking up a valid event when searching by region name. The > TP_printk() output, if enabled, does emit the correct region > names in the dmesg log. > > This was found during testing of the cxl-list option to report > media-errors for a region. > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > --- > drivers/cxl/core/trace.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > index bdf117a33744..bc5ca4d530d1 100644 > --- a/drivers/cxl/core/trace.h > +++ b/drivers/cxl/core/trace.h > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > __string(host, dev_name(cxlmd->dev.parent)) > __field(u64, serial) > __field(u8, trace_type) > - __string(region, region) > + __string(region, NULL) Couldn't this be "" instead of NULL then remove the __assign_str() if region is NULL? Ira > __field(u64, overflow_ts) > __field(u64, hpa) > __field(u64, dpa) > > base-commit: e8f897f4afef0031fe618a8e94127a0934896aba > -- > 2.37.3 > k
On Thu, Mar 14, 2024 at 06:53:47AM -0700, Ira Weiny wrote: > alison.schofield@ wrote: > > From: Alison Schofield <alison.schofield@intel.com> > > > > The TP_STRUCT__entry that gets assigned the region name, or an > > empty string if no region is present, is erroneously initialized > > to the cxl_region pointer. It needs to be initialized to NULL > > otherwise its length is wrong and garbage chars can appear in > > the kernel trace output: /sys/kernel/tracing/trace > > > > Impact is that tooling depending on that trace data can miss > > picking up a valid event when searching by region name. The > > TP_printk() output, if enabled, does emit the correct region > > names in the dmesg log. > > > > This was found during testing of the cxl-list option to report > > media-errors for a region. > > > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > --- > > drivers/cxl/core/trace.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > index bdf117a33744..bc5ca4d530d1 100644 > > --- a/drivers/cxl/core/trace.h > > +++ b/drivers/cxl/core/trace.h > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > __string(host, dev_name(cxlmd->dev.parent)) > > __field(u64, serial) > > __field(u8, trace_type) > > - __string(region, region) > > + __string(region, NULL) > > Couldn't this be "" instead of NULL then remove the __assign_str() if > region is NULL? > > Ira Thanks for the review Ira. That doesn't work because it inits with too short of a length and the region names all get truncated. Adding Steve to this thread. He has upcoming changes to this space, but I also should note that I think this fix may want to be applied backwards, ie to stable, so we cannot count on Steves upcoming changes. > > > __field(u64, overflow_ts) > > __field(u64, hpa) > > __field(u64, dpa) > > > > base-commit: e8f897f4afef0031fe618a8e94127a0934896aba > > -- > > 2.37.3 > > > > k
Ira Weiny wrote: > alison.schofield@ wrote: > > From: Alison Schofield <alison.schofield@intel.com> > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h [snip] > > index bdf117a33744..bc5ca4d530d1 100644 > > --- a/drivers/cxl/core/trace.h > > +++ b/drivers/cxl/core/trace.h > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > __string(host, dev_name(cxlmd->dev.parent)) > > __field(u64, serial) > > __field(u8, trace_type) > > - __string(region, region) > > + __string(region, NULL) > > Couldn't this be "" instead of NULL then remove the __assign_str() if > region is NULL? > > Ira > > > __field(u64, overflow_ts) > > __field(u64, hpa) > > __field(u64, dpa) > > > > base-commit: e8f897f4afef0031fe618a8e94127a0934896aba > > -- > > 2.37.3 > > > > k Apologies... This is just a lost 'k' looking for it's home. (emacs users don't hate on me please... :-) Ira
Alison Schofield wrote: > On Thu, Mar 14, 2024 at 06:53:47AM -0700, Ira Weiny wrote: > > alison.schofield@ wrote: > > > From: Alison Schofield <alison.schofield@intel.com> > > > > > > The TP_STRUCT__entry that gets assigned the region name, or an > > > empty string if no region is present, is erroneously initialized > > > to the cxl_region pointer. It needs to be initialized to NULL > > > otherwise its length is wrong and garbage chars can appear in > > > the kernel trace output: /sys/kernel/tracing/trace > > > > > > Impact is that tooling depending on that trace data can miss > > > picking up a valid event when searching by region name. The > > > TP_printk() output, if enabled, does emit the correct region > > > names in the dmesg log. > > > > > > This was found during testing of the cxl-list option to report > > > media-errors for a region. > > > > > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > > --- > > > drivers/cxl/core/trace.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > > index bdf117a33744..bc5ca4d530d1 100644 > > > --- a/drivers/cxl/core/trace.h > > > +++ b/drivers/cxl/core/trace.h > > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > > __string(host, dev_name(cxlmd->dev.parent)) > > > __field(u64, serial) > > > __field(u8, trace_type) > > > - __string(region, region) > > > + __string(region, NULL) > > > > Couldn't this be "" instead of NULL then remove the __assign_str() if > > region is NULL? > > > > Ira > > Thanks for the review Ira. > > That doesn't work because it inits with too short of a length and the > region names all get truncated. Ah ok. With that added detail. Reviewed-by: Ira Weiny <ira.weiny@intel.com> > > Adding Steve to this thread. He has upcoming changes to this space, but > I also should note that I think this fix may want to be applied > backwards, ie to stable, so we cannot count on Steves upcoming changes. >
On Thu, 14 Mar 2024 10:04:14 -0700 Alison Schofield <alison.schofield@intel.com> wrote: > On Thu, Mar 14, 2024 at 06:53:47AM -0700, Ira Weiny wrote: > > alison.schofield@ wrote: > > > From: Alison Schofield <alison.schofield@intel.com> > > > > > > The TP_STRUCT__entry that gets assigned the region name, or an > > > empty string if no region is present, is erroneously initialized > > > to the cxl_region pointer. It needs to be initialized to NULL > > > otherwise its length is wrong and garbage chars can appear in > > > the kernel trace output: /sys/kernel/tracing/trace > > > > > > Impact is that tooling depending on that trace data can miss > > > picking up a valid event when searching by region name. The > > > TP_printk() output, if enabled, does emit the correct region > > > names in the dmesg log. > > > > > > This was found during testing of the cxl-list option to report > > > media-errors for a region. > > > > > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > > --- > > > drivers/cxl/core/trace.h | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > > index bdf117a33744..bc5ca4d530d1 100644 > > > --- a/drivers/cxl/core/trace.h > > > +++ b/drivers/cxl/core/trace.h > > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > > __string(host, dev_name(cxlmd->dev.parent)) > > > __field(u64, serial) > > > __field(u8, trace_type) > > > - __string(region, region) > > > + __string(region, NULL) > > > > Couldn't this be "" instead of NULL then remove the __assign_str() if > > region is NULL? That will not work either. __string() allocates the space on the ring buffer. Getting rid of __assign_str() still leaves the buffer allocated with garbage. As I said in the other email thread [1], __string() and __assign_str() are tightly coupled and are the equivalent of: #define __string(buf, str) len = strlen(str); buf = malloc(len) #define __assign_str(buf, str) strcpy(buf, str) I also mentioned that __string() does much more. It can handle NULL pointers, so it's more like: #define __string(buf, str) len = str ? strlen(str) : strlen("null"); buf = malloc(len) #define __assign_str(buf, str) if (str) strcpy(buf, str); else strcpy(buf, "null") So if the string is NULL, then you want to copy it. That said, looking at this trace-event, it's still broken, because region isn't a string, so you are basically doing: len = strlen(region); Which doesn't make sense. What you want is: __string(region, region ? dev_name(®ion->dev) : NULL) and then you can add the __assign_str(region, NULL) to the else path. -- Steve [1] https://lore.kernel.org/all/20240314143406.6289a060@gandalf.local.home/
On Thu, 14 Mar 2024 14:44:05 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > #define __string(buf, str) len = strlen(str); buf = malloc(len) > #define __assign_str(buf, str) strcpy(buf, str) > > I also mentioned that __string() does much more. It can handle NULL > pointers, so it's more like: > > #define __string(buf, str) len = str ? strlen(str) : strlen("null"); buf = malloc(len) > #define __assign_str(buf, str) if (str) strcpy(buf, str); else strcpy(buf, "null") BTW, the above is just pseudo code, I'm ignoring the bug of the missing "+ 1" in malloc ;-) -- Steve
On Thu, Mar 14, 2024 at 02:44:05PM -0400, Steven Rostedt wrote: > On Thu, 14 Mar 2024 10:04:14 -0700 > Alison Schofield <alison.schofield@intel.com> wrote: > > > On Thu, Mar 14, 2024 at 06:53:47AM -0700, Ira Weiny wrote: > > > alison.schofield@ wrote: > > > > From: Alison Schofield <alison.schofield@intel.com> > > > > > > > > The TP_STRUCT__entry that gets assigned the region name, or an > > > > empty string if no region is present, is erroneously initialized > > > > to the cxl_region pointer. It needs to be initialized to NULL > > > > otherwise its length is wrong and garbage chars can appear in > > > > the kernel trace output: /sys/kernel/tracing/trace > > > > > > > > Impact is that tooling depending on that trace data can miss > > > > picking up a valid event when searching by region name. The > > > > TP_printk() output, if enabled, does emit the correct region > > > > names in the dmesg log. > > > > > > > > This was found during testing of the cxl-list option to report > > > > media-errors for a region. > > > > > > > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > > > --- > > > > drivers/cxl/core/trace.h | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > > > index bdf117a33744..bc5ca4d530d1 100644 > > > > --- a/drivers/cxl/core/trace.h > > > > +++ b/drivers/cxl/core/trace.h > > > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > > > __string(host, dev_name(cxlmd->dev.parent)) > > > > __field(u64, serial) > > > > __field(u8, trace_type) > > > > - __string(region, region) > > > > + __string(region, NULL) > > > > > > Couldn't this be "" instead of NULL then remove the __assign_str() if > > > region is NULL? > > That will not work either. > > __string() allocates the space on the ring buffer. > > Getting rid of __assign_str() still leaves the buffer allocated with garbage. > > As I said in the other email thread [1], __string() and __assign_str() are > tightly coupled and are the equivalent of: > > #define __string(buf, str) len = strlen(str); buf = malloc(len) > #define __assign_str(buf, str) strcpy(buf, str) > > I also mentioned that __string() does much more. It can handle NULL > pointers, so it's more like: > > #define __string(buf, str) len = str ? strlen(str) : strlen("null"); buf = malloc(len) > #define __assign_str(buf, str) if (str) strcpy(buf, str); else strcpy(buf, "null") > > So if the string is NULL, then you want to copy it. That said, looking at > this trace-event, it's still broken, because region isn't a string, so you > are basically doing: > > len = strlen(region); > Thanks Steve - > Which doesn't make sense. What you want is: > > __string(region, region ? dev_name(®ion->dev) : NULL) Compiler didn't like that NULL. warning: argument 1 null where non-null expected [-Wnonnull] 50 | strlen((src) ? (const char *)(src) : "(null)") + 1) Compiler was fine with, and it works with "" > > and then you can add the __assign_str(region, NULL) to the else path. The compiler was fine w NULL here but gave me garbage in the output when region was NULL. "" worked. I'm posting a v2 that functions. Please take another look. - Alison > > -- Steve > > > [1] https://lore.kernel.org/all/20240314143406.6289a060@gandalf.local.home/ >
On Thu, 14 Mar 2024 13:08:02 -0700 Alison Schofield <alison.schofield@intel.com> wrote: > > Which doesn't make sense. What you want is: > > > > __string(region, region ? dev_name(®ion->dev) : NULL) > > Compiler didn't like that NULL. > warning: argument 1 null where non-null expected [-Wnonnull] > 50 | strlen((src) ? (const char *)(src) : "(null)") + 1) > > Compiler was fine with, and it works with "" So the above should have been: strlen((region ? dev_name(&dev->dev) : NULL) ? (const char *)(region ? dev_name(&dev->dev) : NULL) : "(null)") + 1 And the compiler didn't like that? What compiler are you using? Clang? I believe the above is valid, and shouldn't give a warning, as how does NULL get returned unless the compiler got confused. > > > > > and then you can add the __assign_str(region, NULL) to the else path. > The compiler was fine w NULL here but gave me garbage in the output when > region was NULL. "" worked. > > I'm posting a v2 that functions. Please take another look. I will. -- Steve
Steven Rostedt wrote: > On Thu, 14 Mar 2024 10:04:14 -0700 > Alison Schofield <alison.schofield@intel.com> wrote: > > > On Thu, Mar 14, 2024 at 06:53:47AM -0700, Ira Weiny wrote: > > > alison.schofield@ wrote: > > > > From: Alison Schofield <alison.schofield@intel.com> > > > > > > > > The TP_STRUCT__entry that gets assigned the region name, or an > > > > empty string if no region is present, is erroneously initialized > > > > to the cxl_region pointer. It needs to be initialized to NULL > > > > otherwise its length is wrong and garbage chars can appear in > > > > the kernel trace output: /sys/kernel/tracing/trace > > > > > > > > Impact is that tooling depending on that trace data can miss > > > > picking up a valid event when searching by region name. The > > > > TP_printk() output, if enabled, does emit the correct region > > > > names in the dmesg log. > > > > > > > > This was found during testing of the cxl-list option to report > > > > media-errors for a region. > > > > > > > > Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") > > > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > > > > --- > > > > drivers/cxl/core/trace.h | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h > > > > index bdf117a33744..bc5ca4d530d1 100644 > > > > --- a/drivers/cxl/core/trace.h > > > > +++ b/drivers/cxl/core/trace.h > > > > @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, > > > > __string(host, dev_name(cxlmd->dev.parent)) > > > > __field(u64, serial) > > > > __field(u8, trace_type) > > > > - __string(region, region) > > > > + __string(region, NULL) > > > > > > Couldn't this be "" instead of NULL then remove the __assign_str() if > > > region is NULL? > > That will not work either. > > __string() allocates the space on the ring buffer. Ah! Thanks, good to know. > > Getting rid of __assign_str() still leaves the buffer allocated with garbage. > > As I said in the other email thread [1], __string() and __assign_str() are > tightly coupled and are the equivalent of: > > #define __string(buf, str) len = strlen(str); buf = malloc(len) > #define __assign_str(buf, str) strcpy(buf, str) > > I also mentioned that __string() does much more. It can handle NULL > pointers, so it's more like: > > #define __string(buf, str) len = str ? strlen(str) : strlen("null"); buf = malloc(len) > #define __assign_str(buf, str) if (str) strcpy(buf, str); else strcpy(buf, "null") > > So if the string is NULL, then you want to copy it. That said, looking at > this trace-event, it's still broken, because region isn't a string, so you > are basically doing: > > len = strlen(region); > > Which doesn't make sense. What you want is: > > __string(region, region ? dev_name(®ion->dev) : NULL) Yea this looks much cleaner. Thanks! Ira > > and then you can add the __assign_str(region, NULL) to the else path. > > -- Steve > > > [1] https://lore.kernel.org/all/20240314143406.6289a060@gandalf.local.home/ >
diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h index bdf117a33744..bc5ca4d530d1 100644 --- a/drivers/cxl/core/trace.h +++ b/drivers/cxl/core/trace.h @@ -657,7 +657,7 @@ TRACE_EVENT(cxl_poison, __string(host, dev_name(cxlmd->dev.parent)) __field(u64, serial) __field(u8, trace_type) - __string(region, region) + __string(region, NULL) __field(u64, overflow_ts) __field(u64, hpa) __field(u64, dpa)