Message ID | 1395985981-20476-2-git-send-email-gong.chen@linux.intel.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
On Fri, Mar 28, 2014 at 01:52:57AM -0400, Chen, Gong wrote:
> To avoid the confusion of usage for RAS related trace event, add
> a unified RAS trace event stub.
>
> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
> ---
>  drivers/Kconfig          |  2 ++
>  drivers/Makefile         |  1 +
>  drivers/edac/edac_mc.c   |  3 ---
>  drivers/ras/Kconfig      |  4 ++++
>  drivers/ras/Makefile     |  1 +
>  drivers/ras/ras-traces.c | 12 ++++++++++++
>  6 files changed, 20 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/ras/Kconfig
>  create mode 100644 drivers/ras/Makefile
>  create mode 100644 drivers/ras/ras-traces.c
>
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index b3138fb..d70f7ba 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -170,4 +170,6 @@ source "drivers/phy/Kconfig"
>
>  source "drivers/powercap/Kconfig"
>
> +source "drivers/ras/Kconfig"
> +
>  endmenu
> diff --git a/drivers/Makefile b/drivers/Makefile
> index 8e3b8b0..10aaab0 100644
> --- a/drivers/Makefile
> +++ b/drivers/Makefile
> @@ -155,3 +155,4 @@ obj-$(CONFIG_IPACK_BUS) += ipack/
>  obj-$(CONFIG_NTB) += ntb/
>  obj-$(CONFIG_FMC) += fmc/
>  obj-$(CONFIG_POWERCAP) += powercap/
> +obj-$(CONFIG_RAS_TRACE) += ras/
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 33edd67..28c1695 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -33,9 +33,6 @@
>  #include <asm/edac.h>
>  #include "edac_core.h"
>  #include "edac_module.h"
> -
> -#define CREATE_TRACE_POINTS
> -#define TRACE_INCLUDE_PATH ../../include/ras
>  #include <ras/ras_event.h>
>
>  /* lock to memory controller's control array */
> diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
> new file mode 100644
> index 0000000..6e4aec5
> --- /dev/null
> +++ b/drivers/ras/Kconfig
> @@ -0,0 +1,4 @@
> +# RAS_TRACE always gets selected by whoever wants it.
> +config RAS_TRACE
> +	def_bool y
> +	depends on EDAC_MM_EDAC

This should actually be

menuconfig RAS
	bool "Reliability, Availability, Serviceability features"
	help
	 <A nice text about what this is going to contain, i.e. RAS stuff

...

if RAS
config RAS_TRACE
	def_bool y
	depends on ...

See drivers/edac/Kconfig for an example.

RAS_TRACE should actually depend on all the code that uses it, or, it
should be selected by them. EDAC_MM_EDAC is not accurate enough and will
enable RAS_TRACE even for drivers which don't use the tracepoint(s).

Thanks.

On Wed, Apr 09, 2014 at 09:46:54PM +0200, Borislav Petkov wrote:
[...]
>
> This should actually be
>
> menuconfig RAS
> 	bool "Reliability, Availability, Serviceability features"
> 	help
> 	 <A nice text about what this is going to contain, i.e. RAS stuff
>
> ...
>
> if RAS
> config RAS_TRACE
> 	def_bool y
> 	depends on ...
>
> See drivers/edac/Kconfig for an example.

I don't use an explicit menu for RAS because I'm not sure if it is
worth adding such a *heavy hammer* to the kernel tree. Since you
suggest it, it is fine with me.

>
> RAS_TRACE should actually depend on all the code that uses it, or, it
> should be selected by them. EDAC_MM_EDAC is not accurate enough and will
> enable RAS_TRACE even for drivers which don't use the tracepoint(s).

Maybe some drivers don't call trace_mc_event directly/indirectly, but
edac_mc.c is the core of EDAC and must exist before any other drivers
are loaded, which means whether the drivers call trace_mc_event or
not, the trace interface in edac_mc should be there in advance. Am I
missing something?

On Sun, Apr 13, 2014 at 11:20:58PM -0400, Chen, Gong wrote:
> I don't use an explicit menu for RAS because I'm not sure if it is
> worth adding such a *heavy hammer* to the kernel tree.

We're going to be adding more stuff to it so a full menu will come
sooner rather than later.

> Maybe some drivers don't call trace_mc_event directly/indirectly, but
> edac_mc.c is the core of EDAC and must exist before any other drivers
> are loaded, which means whether the drivers call trace_mc_event or
> not, the trace interface in edac_mc should be there in advance. Am I
> missing something?

No, you're fine. I missed the fact that you've moved the mc_event
tracepoint, sorry.

Thanks.
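
Taken together, the two points settled above (an explicit RAS menu, and RAS_TRACE keeping its EDAC dependency) suggest a drivers/ras/Kconfig roughly like the sketch below. It is only an illustration of the proposed layout, not necessarily the version that was later merged. The help text is a placeholder, the closing `endif` is added because Kconfig requires every `if` block to be closed, and `depends on EDAC_MM_EDAC` is carried over from the posted patch on the assumption that the dependency stays as it is.

```kconfig
menuconfig RAS
	bool "Reliability, Availability, Serviceability features"
	help
	  Placeholder text: describe the RAS features grouped under this menu.

if RAS

# Assumption: the dependency is kept exactly as in the posted patch.
config RAS_TRACE
	def_bool y
	depends on EDAC_MM_EDAC

endif
```

With this layout RAS_TRACE can only be enabled when RAS itself is enabled. Because the patch moves CREATE_TRACE_POINTS out of edac_mc.c, code that calls trace_mc_event would presumably need RAS switched on as well (for example via `select`); the thread does not settle how that is wired up.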

On Wed, Apr 09, 2014 at 09:46:54PM +0200, Borislav Petkov wrote:
>
> menuconfig RAS
> 	bool "Reliability, Availability, Serviceability features"
> 	help
> 	 <A nice text about what this is going to contain, i.e. RAS stuff
>

How about this:

Reliability, availability, and serviceability (RAS) is a computer
hardware engineering term. Computers designed with higher levels
of RAS have a multitude of features that protect data integrity
and help them stay available for long periods of time without
failure.

Reliability can be defined as the probability that it will produce
correct outputs up to some given time. Reliability is enhanced by
features that help to avoid, detect and repair hardware faults.

Availability is the probability a system is operational at a given
time, i.e. the amount of time a device is actually operating as the
percentage of total time it should be operating.

Serviceability or maintainability is the simplicity and speed with
which a system can be repaired or maintained; if the time to repair
a failed system increases, then availability will decrease.

Note that reliability and availability are distinct concepts:
Reliability is a measure of the ability of a system to function
correctly, including avoiding data corruption, whereas availability
measures how often it is available for use, even though it may not
be functioning correctly. For example, a server may run forever and
so have ideal availability, but may be unreliable, with frequent
data corruption.

On Wed, Apr 16, 2014 at 02:33:01AM -0400, Chen, Gong wrote:
> On Wed, Apr 09, 2014 at 09:46:54PM +0200, Borislav Petkov wrote:
> >
> > menuconfig RAS
> > 	bool "Reliability, Availability, Serviceability features"
> > 	help
> > 	 <A nice text about what this is going to contain, i.e. RAS stuff
> >
> How about this:

Good. Just nitpicks below:

> Reliability, availability, and serviceability (RAS) is a computer
> hardware engineering term. Computers designed with higher levels
> of RAS have a multitude of features that protect data integrity
> and help them stay available for long periods of time without
> failure.
>
> Reliability can be defined as the probability that it will produce

s/it/the system/. "it" is kinda misleading as to what we refer to.

> correct outputs up to some given time. Reliability is enhanced by
> features that help to avoid, detect and repair hardware faults.
>
> Availability is the probability a system is operational at a given
> time, i.e. the amount of time a device is actually operating as the
> percentage of total time it should be operating.
>
> Serviceability or maintainability is the simplicity and speed with
> which a system can be repaired or maintained; if the time to repair
> a failed system increases, then availability will decrease.

Nice!

> Note that reliability and availability are distinct concepts:

capitalized: ... that Reliability and Availability are ...

> Reliability is a measure of the ability of a system to function
> correctly, including avoiding data corruption, whereas availability

ditto: Availability

> measures how often it is available for use, even though it may not
> be functioning correctly. For example, a server may run forever and
> so have ideal availability, but may be unreliable, with frequent
> data corruption.

Very good description! :-)
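
For reference, the proposed help text with the three nitpicks above applied ("it" replaced by "the system", and Reliability and Availability capitalized in the final paragraph) would read roughly as follows. This is a sketch assembled from the two messages, with lines rewrapped to fit; it is not necessarily the wording that was finally committed.

```kconfig
menuconfig RAS
	bool "Reliability, Availability, Serviceability features"
	help
	  Reliability, availability, and serviceability (RAS) is a computer
	  hardware engineering term. Computers designed with higher levels
	  of RAS have a multitude of features that protect data integrity
	  and help them stay available for long periods of time without
	  failure.

	  Reliability can be defined as the probability that the system will
	  produce correct outputs up to some given time. Reliability is
	  enhanced by features that help to avoid, detect and repair
	  hardware faults.

	  Availability is the probability a system is operational at a given
	  time, i.e. the amount of time a device is actually operating as the
	  percentage of total time it should be operating.

	  Serviceability or maintainability is the simplicity and speed with
	  which a system can be repaired or maintained; if the time to repair
	  a failed system increases, then availability will decrease.

	  Note that Reliability and Availability are distinct concepts:
	  Reliability is a measure of the ability of a system to function
	  correctly, including avoiding data corruption, whereas Availability
	  measures how often it is available for use, even though it may not
	  be functioning correctly. For example, a server may run forever and
	  so have ideal availability, but may be unreliable, with frequent
	  data corruption.
```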

diff --git a/drivers/Kconfig b/drivers/Kconfig
index b3138fb..d70f7ba 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -170,4 +170,6 @@ source "drivers/phy/Kconfig"
 
 source "drivers/powercap/Kconfig"
 
+source "drivers/ras/Kconfig"
+
 endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index 8e3b8b0..10aaab0 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -155,3 +155,4 @@ obj-$(CONFIG_IPACK_BUS) += ipack/
 obj-$(CONFIG_NTB) += ntb/
 obj-$(CONFIG_FMC) += fmc/
 obj-$(CONFIG_POWERCAP) += powercap/
+obj-$(CONFIG_RAS_TRACE) += ras/
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 33edd67..28c1695 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -33,9 +33,6 @@
 #include <asm/edac.h>
 #include "edac_core.h"
 #include "edac_module.h"
-
-#define CREATE_TRACE_POINTS
-#define TRACE_INCLUDE_PATH ../../include/ras
 #include <ras/ras_event.h>
 
 /* lock to memory controller's control array */
diff --git a/drivers/ras/Kconfig b/drivers/ras/Kconfig
new file mode 100644
index 0000000..6e4aec5
--- /dev/null
+++ b/drivers/ras/Kconfig
@@ -0,0 +1,4 @@
+# RAS_TRACE always gets selected by whoever wants it.
+config RAS_TRACE
+	def_bool y
+	depends on EDAC_MM_EDAC
diff --git a/drivers/ras/Makefile b/drivers/ras/Makefile
new file mode 100644
index 0000000..826afc6
--- /dev/null
+++ b/drivers/ras/Makefile
@@ -0,0 +1 @@
+obj-$(CONFIG_RAS_TRACE) += ras-traces.o
diff --git a/drivers/ras/ras-traces.c b/drivers/ras/ras-traces.c
new file mode 100644
index 0000000..b0c6ed1
--- /dev/null
+++ b/drivers/ras/ras-traces.c
@@ -0,0 +1,12 @@
+/*
+ * Copyright (C) 2014 Intel Corporation
+ *
+ * Authors:
+ *	Chen, Gong <gong.chen@linux.intel.com>
+ */
+
+#define CREATE_TRACE_POINTS
+#define TRACE_INCLUDE_PATH ../../include/ras
+#include <ras/ras_event.h>
+
+EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);

To avoid the confusion of usage for RAS related trace event, add
a unified RAS trace event stub.

Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
---
 drivers/Kconfig          |  2 ++
 drivers/Makefile         |  1 +
 drivers/edac/edac_mc.c   |  3 ---
 drivers/ras/Kconfig      |  4 ++++
 drivers/ras/Makefile     |  1 +
 drivers/ras/ras-traces.c | 12 ++++++++++++
 6 files changed, 20 insertions(+), 3 deletions(-)
 create mode 100644 drivers/ras/Kconfig
 create mode 100644 drivers/ras/Makefile
 create mode 100644 drivers/ras/ras-traces.c