diff mbox

[V15,03/11] cper: add timestamp print to CPER status printing

Message ID 20170421122150.76cce2cfrt767glv@pd.tnic (mailing list archive)
State New, archived
Headers show

Commit Message

Borislav Petkov April 21, 2017, 12:21 p.m. UTC
On Tue, Apr 18, 2017 at 05:05:15PM -0600, Tyler Baicar wrote:
> The ACPI 6.1 spec added a timestamp to the HEST generic data

HEST?

I see the timestamp in

Table 18-343 Generic Error Data Entry

where those things are "One or more Generic Error Data Entry structures
may be recorded in the Generic Error Data Entries field of the Generic
Error Status Block structure."

And those are part of the "18.3.2.7 Generic Hardware Error Source",
i.e., GHES. So why do you say "HEST" above?

> structure. Print the timestamp out when printing out the error
> status information.
> 
> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
> Reviewed-by: James Morse <james.morse@arm.com>
> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

Remove those Reviewed-by:s

> ---
>  drivers/firmware/efi/cper.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 8328a6f..46585f9 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -32,6 +32,8 @@
>  #include <linux/acpi.h>
>  #include <linux/pci.h>
>  #include <linux/aer.h>
> +#include <linux/printk.h>
> +#include <linux/bcd.h>
>  #include <acpi/ghes.h>
>  
>  #define INDENT_SP	" "
> @@ -387,6 +389,29 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>  	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
>  }
>  
> +static void cper_estatus_timestamp(const char *pfx,

cper_print_tstamp()

> +				   struct acpi_hest_generic_data_v300 *gdata)
> +{
> +	__u8 hour, min, sec, day, mon, year, century, *timestamp;
> +
> +	if (gdata->validation_bits & ACPI_HEST_GEN_VALID_TIMESTAMP) {
> +		timestamp = (__u8 *)&(gdata->time_stamp);
> +		sec       = bcd2bin(timestamp[0]);
> +		min       = bcd2bin(timestamp[1]);
> +		hour      = bcd2bin(timestamp[2]);
> +		day       = bcd2bin(timestamp[4]);
> +		mon       = bcd2bin(timestamp[5]);
> +		year      = bcd2bin(timestamp[6]);
> +		century   = bcd2bin(timestamp[7]);
> +
> +		if (*(timestamp + 3) & 0x1)
> +			printk("%stimestamp is precise\n", pfx);
> +
> +		printk("%stime: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
> +			century, year, mon, day, hour, min, sec);

Yeah, about the precise tstamp, you can do something like this:

---

Comments

Tyler Baicar April 21, 2017, 4:04 p.m. UTC | #1
On 4/21/2017 6:21 AM, Borislav Petkov wrote:
> On Tue, Apr 18, 2017 at 05:05:15PM -0600, Tyler Baicar wrote:
>> The ACPI 6.1 spec added a timestamp to the HEST generic data
> HEST?
>
> I see the timestamp in
>
> Table 18-343 Generic Error Data Entry
>
> where those things are "One or more Generic Error Data Entry structures
> may be recorded in the Generic Error Data Entries field of the Generic
> Error Status Block structure."
>
> And those are part of the "18.3.2.7 Generic Hardware Error Source",
> i.e., GHES. So why do you say "HEST" above?
Ah yes, this should be GHES no HEST. There are too many acronyms 
involved here :)
>
>> structure. Print the timestamp out when printing out the error
>> status information.
>>
>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>> CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
>> Reviewed-by: James Morse <james.morse@arm.com>
>> Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Remove those Reviewed-by:s
>
>> ---
>>   drivers/firmware/efi/cper.c | 28 ++++++++++++++++++++++++++++
>>   1 file changed, 28 insertions(+)
>>
>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
>> index 8328a6f..46585f9 100644
>> --- a/drivers/firmware/efi/cper.c
>> +++ b/drivers/firmware/efi/cper.c
>> @@ -32,6 +32,8 @@
>>   #include <linux/acpi.h>
>>   #include <linux/pci.h>
>>   #include <linux/aer.h>
>> +#include <linux/printk.h>
>> +#include <linux/bcd.h>
>>   #include <acpi/ghes.h>
>>   
>>   #define INDENT_SP	" "
>> @@ -387,6 +389,29 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>>   	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
>>   }
>>   
>> +static void cper_estatus_timestamp(const char *pfx,
> cper_print_tstamp()
>
>> +				   struct acpi_hest_generic_data_v300 *gdata)
>> +{
>> +	__u8 hour, min, sec, day, mon, year, century, *timestamp;
>> +
>> +	if (gdata->validation_bits & ACPI_HEST_GEN_VALID_TIMESTAMP) {
>> +		timestamp = (__u8 *)&(gdata->time_stamp);
>> +		sec       = bcd2bin(timestamp[0]);
>> +		min       = bcd2bin(timestamp[1]);
>> +		hour      = bcd2bin(timestamp[2]);
>> +		day       = bcd2bin(timestamp[4]);
>> +		mon       = bcd2bin(timestamp[5]);
>> +		year      = bcd2bin(timestamp[6]);
>> +		century   = bcd2bin(timestamp[7]);
>> +
>> +		if (*(timestamp + 3) & 0x1)
>> +			printk("%stimestamp is precise\n", pfx);
>> +
>> +		printk("%stime: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
>> +			century, year, mon, day, hour, min, sec);
> Yeah, about the precise tstamp, you can do something like this:
>
> ---
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 46585f92b741..a649884e2264 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -404,10 +404,8 @@ static void cper_estatus_timestamp(const char *pfx,
>   		year      = bcd2bin(timestamp[6]);
>   		century   = bcd2bin(timestamp[7]);
>   
> -		if (*(timestamp + 3) & 0x1)
> -			printk("%stimestamp is precise\n", pfx);
> -
> -		printk("%stime: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
> +		printk("%s%ststamp: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
> +			(timestamp[3] & 0x1 ? "precise " : ""),
>   			century, year, mon, day, hour, min, sec);
>   	}
>   }
This is basically what I already had in v14...you asked to move it into 
a different if-statement? https://lkml.org/lkml/2017/4/12/397

Thanks,
Tyler
Borislav Petkov April 21, 2017, 5:26 p.m. UTC | #2
On Fri, Apr 21, 2017 at 10:04:35AM -0600, Baicar, Tyler wrote:
> This is basically what I already had in v14...you asked to move it into a
> different if-statement? https://lkml.org/lkml/2017/4/12/397

Well, clearly I've been smoking some nasty potent sh*t. :-\

/me goes and looks at the spec:

"Bit 0 – Timestamp is precise if this bit is set and correlates to the
time of the error event."

So why are we even printing the timestamp when !precise?

IOW, I think we should do:

	if (!(timestamp[3] & 0x1))
		printk("%stimestamp imprecise\n", pfx);
	else {
		sec = ..
		min = ...

		...
	}

and print the actual values only when the timestamp is precise.
Otherwise it has *some* values which could just as well be completely
random. And it's not like we're reporting the error tomorrow - it is
mostly a couple of seconds from logging to the fw pushing it out...
Tyler Baicar April 21, 2017, 6:08 p.m. UTC | #3
On 4/21/2017 11:26 AM, Borislav Petkov wrote:
> On Fri, Apr 21, 2017 at 10:04:35AM -0600, Baicar, Tyler wrote:
>> This is basically what I already had in v14...you asked to move it into a
>> different if-statement? https://lkml.org/lkml/2017/4/12/397
> Well, clearly I've been smoking some nasty potent sh*t. :-\
>
> /me goes and looks at the spec:
>
> "Bit 0 – Timestamp is precise if this bit is set and correlates to the
> time of the error event."
>
> So why are we even printing the timestamp when !precise?
>
> IOW, I think we should do:
>
> 	if (!(timestamp[3] & 0x1))
> 		printk("%stimestamp imprecise\n", pfx);
> 	else {
> 		sec = ..
> 		min = ...
>
> 		...
> 	}
>
> and print the actual values only when the timestamp is precise.
> Otherwise it has *some* values which could just as well be completely
> random. And it's not like we're reporting the error tomorrow - it is
> mostly a couple of seconds from logging to the fw pushing it out...
The timestamp may still be useful when it is imprecise. In the polling 
case, you may only poll every minute or so, so the time may be useful. 
Also, I imagine there could be interrupt based errors happening much 
faster than the FW/OS handshake can happen. Maybe we can just use what I 
had before but also specify imprecise so that it is clear:

         printk("%s%ststamp: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
             (timestamp[3] & 0x1 ? "precise " : "imprecise "),
              century, year, mon, day, hour, min, sec);

Thanks,
Tyler
Borislav Petkov April 21, 2017, 6:12 p.m. UTC | #4
On Fri, Apr 21, 2017 at 12:08:43PM -0600, Baicar, Tyler wrote:
> The timestamp may still be useful when it is imprecise. In the polling case,
> you may only poll every minute or so, so the time may be useful.

Well, what is in the timestamp when !precise? Some random time or some
timestamp from a couple of seconds ago? How do you differentiate what
timestamp is bollocks and what is from a while ago?

Is the imprecise tstamp really close to the time the error happened or
pointing at 1970 - the beginning of unix time? :-)

I'm sure you've picked up by now that we don't trust the firmware one
bit.

> Also, I imagine there could be interrupt based errors happening much faster than the
> FW/OS handshake can happen. Maybe we can just use what I had before but also
> specify imprecise so that it is clear:
> 
>         printk("%s%ststamp: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
>             (timestamp[3] & 0x1 ? "precise " : "imprecise "),
>              century, year, mon, day, hour, min, sec);

I guess.
diff mbox

Patch

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 46585f92b741..a649884e2264 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -404,10 +404,8 @@  static void cper_estatus_timestamp(const char *pfx,
 		year      = bcd2bin(timestamp[6]);
 		century   = bcd2bin(timestamp[7]);
 
-		if (*(timestamp + 3) & 0x1)
-			printk("%stimestamp is precise\n", pfx);
-
-		printk("%stime: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
+		printk("%s%ststamp: %02d%02d-%02d-%02d %02d:%02d:%02d\n", pfx,
+			(timestamp[3] & 0x1 ? "precise " : ""), 
 			century, year, mon, day, hour, min, sec);
 	}
 }