[V12,07/10] efi: print unrecognized CPER section

Message ID 1488833103-21082-8-git-send-email-tbaicar@codeaurora.org (mailing list archive)
State New, archived

Commit Message

Tyler Baicar March 6, 2017, 8:45 p.m. UTC
The UEFI spec allows for non-standard sections in a Common Platform Error
Record. This is defined in section N.2.3 of UEFI version 2.5.

Currently, if the CPER section's type (UUID) does not match one of
the section types that the kernel knows how to parse, the section is
skipped. The user therefore cannot see such CPER data, for instance
the error record of a non-standard section.

For this case, this change prints the raw data in hex to the dmesg
buffer. The data length is taken from the Error Data Length field of
the Generic Error Data Entry.

Following is a sample output from dmesg:
[  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
[  115.787456] {1}[Hardware Error]: event severity: corrected
[  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
[  115.798415] {1}[Hardware Error]:  fru_id: 00000000-0000-0000-0000-000000000000
[  115.805596] {1}[Hardware Error]:  fru_text:
[  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
[  115.823880] {1}[Hardware Error]:  section length: 88
[  115.828779] {1}[Hardware Error]:   00000000: 01000001 00000002 5f434345 525f4543
[  115.836153] {1}[Hardware Error]:   00000010: 0000574d 00000000 00000000 00000000
[  115.843531] {1}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
[  115.850908] {1}[Hardware Error]:   00000030: 00000000 00000000 00000000 00000000
[  115.858288] {1}[Hardware Error]:   00000040: fe800000 00000000 00000004 5f434345
[  115.865665] {1}[Hardware Error]:   00000050: 525f4543 0000574d

The raw data from the error can then be decoded using vendor-specific
tools.
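
As an illustration of what such a decoding tool has to do first, here is a minimal user-space sketch that turns one row of the dump back into bytes. It is not part of the patch. The helper name dump_row_to_bytes() is hypothetical, and the sketch assumes the dump was produced on a little-endian machine and that the row contains only the hex words (no offset prefix, no ASCII column).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the hex words of one dump row (the text
 * after the "00000010: " offset) into bytes.
 * Assumption: the dump came from a little-endian host, so the least
 * significant byte of each 32-bit word was first in memory.
 * Returns the number of bytes written to out.
 */
static size_t dump_row_to_bytes(const char *row, uint8_t *out, size_t out_len)
{
	size_t n = 0;
	char *end;

	assert(row && out);

	while (n + 4 <= out_len) {
		unsigned long word = strtoul(row, &end, 16);

		if (end == row)		/* no more hex words on this row */
			break;

		out[n++] = word & 0xff;
		out[n++] = (word >> 8) & 0xff;
		out[n++] = (word >> 16) & 0xff;
		out[n++] = (word >> 24) & 0xff;
		row = end;
	}

	return n;
}
```

Under the little-endian assumption, the words 5f434345 525f4543 0000574d from the sample output decode to the ASCII bytes "ECC_CE_RMW" followed by two zero bytes.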

Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
CC: Jonathan (Zhixiong) Zhang <zjzhang@codeaurora.org>
Reviewed-by: James Morse <james.morse@arm.com>
---
 drivers/firmware/efi/cper.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

Comments

Joe Perches March 6, 2017, 9:05 p.m. UTC | #1
On Mon, 2017-03-06 at 13:45 -0700, Tyler Baicar wrote:
> UEFI spec allows for non-standard section in Common Platform Error
> Record. This is defined in section N.2.3 of UEFI version 2.5.
> 
> Currently if the CPER section's type (UUID) does not match with
> one of the section types that the kernel knows how to parse, the
> section is skipped. Therefore, user is not able to see
> such CPER data, for instance, error record of non-standard section.
> 
> For above mentioned case, this change prints out the raw data in
> hex in dmesg buffer. Data length is taken from Error Data length
> field of Generic Error Data Entry.

Hi Tyler.

Trivia: (probably not worth resubmitting for this)

There's a slight mismatch between logging output and commit
message.  Now there's an ASCII block after the output.

Another suggestion below.

> Following is a sample output from dmesg:
> [  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
> [  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
> [  115.787456] {1}[Hardware Error]: event severity: corrected
> [  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
> [  115.798415] {1}[Hardware Error]:  fru_id: 00000000-0000-0000-0000-000000000000
> [  115.805596] {1}[Hardware Error]:  fru_text:
> [  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
> [  115.823880] {1}[Hardware Error]:  section length: 88
> [  115.828779] {1}[Hardware Error]:   00000000: 01000001 00000002 5f434345 525f4543
> [  115.836153] {1}[Hardware Error]:   00000010: 0000574d 00000000 00000000 00000000
> [  115.843531] {1}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
> [  115.850908] {1}[Hardware Error]:   00000030: 00000000 00000000 00000000 00000000
> [  115.858288] {1}[Hardware Error]:   00000040: fe800000 00000000 00000004 5f434345
> [  115.865665] {1}[Hardware Error]:   00000050: 525f4543 0000574d
[]
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
[]
> @@ -591,8 +591,16 @@ static void cper_estatus_print_section(
>  			cper_print_proc_arm(newpfx, arm_err);
>  		else
>  			goto err_section_too_small;
> -	} else
> -		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
> +	} else {
> +		const void *unknown_err;
> +
> +		unknown_err = acpi_hest_generic_data_payload(gdata);
> +		printk("%ssection type: unknown, %pUl\n", newpfx, sec_type);
> +		printk("%ssection length: %d\n", newpfx,
> +		       gdata->error_data_length);

It might be nice to output this as

		printk("%ssection length: %d (%#x)\n",
		       newpfx, gdata->error_data_length, gdata->error_data_length);

so it's easy to know the appropriate hex buffer length too.
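
For reference, a user-space sketch of what the suggested format string produces. The helper format_section_length() is a hypothetical stand-in for the printk() call, and it uses %u rather than %d on the assumption that the length is an unsigned value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space stand-in for the suggested printk():
 * prints the length in decimal and, in parentheses, in hex.
 * Returns what snprintf() returns.
 */
static int format_section_length(char *buf, size_t size,
				 const char *pfx, unsigned int len)
{
	assert(buf && pfx);

	return snprintf(buf, size, "%ssection length: %u (%#x)\n",
			pfx, len, len);
}
```

For the 88-byte section in the sample output this yields "section length: 88 (0x58)", which matches the last row offset 00000050 plus its 8 remaining bytes.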

> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
> +			       unknown_err, gdata->error_data_length, true);
> +	}
>  
>  	return;
>

Tyler Baicar March 7, 2017, 4:39 p.m. UTC | #2
On 3/6/2017 2:05 PM, Joe Perches wrote:
> On Mon, 2017-03-06 at 13:45 -0700, Tyler Baicar wrote:
>> UEFI spec allows for non-standard section in Common Platform Error
>> Record. This is defined in section N.2.3 of UEFI version 2.5.
>>
>> Currently if the CPER section's type (UUID) does not match with
>> one of the section types that the kernel knows how to parse, the
>> section is skipped. Therefore, user is not able to see
>> such CPER data, for instance, error record of non-standard section.
>>
>> For above mentioned case, this change prints out the raw data in
>> hex in dmesg buffer. Data length is taken from Error Data length
>> field of Generic Error Data Entry.
> Hi Tyler.
>
> Trivia: (probably not worth resubmitting for this)
>
> There's a slight mismatch between logging output and commit
> message.  Now there's an ASCII block after the output.
>
> Another suggestion below.
True, I can update this commit for the next patch set.
>
>> Following is a sample output from dmesg:
>> [  115.771702] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
>> [  115.779042] {1}[Hardware Error]: It has been corrected by h/w and requires no further action
>> [  115.787456] {1}[Hardware Error]: event severity: corrected
>> [  115.792927] {1}[Hardware Error]:  Error 0, type: corrected
>> [  115.798415] {1}[Hardware Error]:  fru_id: 00000000-0000-0000-0000-000000000000
>> [  115.805596] {1}[Hardware Error]:  fru_text:
>> [  115.816105] {1}[Hardware Error]:  section type: d2e2621c-f936-468d-0d84-15a4ed015c8b
>> [  115.823880] {1}[Hardware Error]:  section length: 88
>> [  115.828779] {1}[Hardware Error]:   00000000: 01000001 00000002 5f434345 525f4543
>> [  115.836153] {1}[Hardware Error]:   00000010: 0000574d 00000000 00000000 00000000
>> [  115.843531] {1}[Hardware Error]:   00000020: 00000000 00000000 00000000 00000000
>> [  115.850908] {1}[Hardware Error]:   00000030: 00000000 00000000 00000000 00000000
>> [  115.858288] {1}[Hardware Error]:   00000040: fe800000 00000000 00000004 5f434345
>> [  115.865665] {1}[Hardware Error]:   00000050: 525f4543 0000574d
> []
>> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> []
>> @@ -591,8 +591,16 @@ static void cper_estatus_print_section(
>>   			cper_print_proc_arm(newpfx, arm_err);
>>   		else
>>   			goto err_section_too_small;
>> -	} else
>> -		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
>> +	} else {
>> +		const void *unknown_err;
>> +
>> +		unknown_err = acpi_hest_generic_data_payload(gdata);
>> +		printk("%ssection type: unknown, %pUl\n", newpfx, sec_type);
>> +		printk("%ssection length: %d\n", newpfx,
>> +		       gdata->error_data_length);
> It might be nice to output this as
>
> 		printk("%ssection length: %d (%#x)\n",
> 		       newpfx, gdata->error_data_length, gdata->error_data_length);
>
> so it's easy to know the appropriate hex buffer length too.
I will make this change in the next patch set.

Thanks,
Tyler
>
>> +		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
>> +			       unknown_err, gdata->error_data_length, true);
>> +	}
>>   
>>   	return;
>>

Patch

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 56aa516..545a6c2 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -591,8 +591,16 @@  static void cper_estatus_print_section(
 			cper_print_proc_arm(newpfx, arm_err);
 		else
 			goto err_section_too_small;
-	} else
-		printk("%s""section type: unknown, %pUl\n", newpfx, sec_type);
+	} else {
+		const void *unknown_err;
+
+		unknown_err = acpi_hest_generic_data_payload(gdata);
+		printk("%ssection type: unknown, %pUl\n", newpfx, sec_type);
+		printk("%ssection length: %d\n", newpfx,
+		       gdata->error_data_length);
+		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
+			       unknown_err, gdata->error_data_length, true);
+	}
 
 	return;
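
The row layout that the print_hex_dump() call above produces (offset prefix, 16 bytes per row, 4-byte groups) can be approximated in user space as follows. This is a sketch under stated assumptions, not the kernel implementation: format_dump_row() is a hypothetical helper, the ASCII column is omitted, and the length is assumed to be a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical approximation of one row of
 * print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, data, len, ...):
 * an 8-digit hex offset, then up to four native-endian 32-bit words.
 * Assumptions: len is a multiple of 4; no ASCII column is emitted.
 * Returns the number of characters written, or -1 if out is too small.
 */
static int format_dump_row(char *out, size_t size, const char *pfx,
			   const void *data, size_t len, size_t off)
{
	const uint8_t *p = data;
	int n;
	size_t i;

	assert(out && pfx && data);

	n = snprintf(out, size, "%s%08zx:", pfx, off);
	for (i = off; i < off + 16 && i + 4 <= len; i += 4) {
		uint32_t word;

		if (n < 0 || (size_t)n >= size)
			return -1;	/* output buffer too small */

		memcpy(&word, p + i, sizeof(word));	/* unaligned-safe load */
		n += snprintf(out + n, size - (size_t)n, " %08x",
			      (unsigned int)word);
	}
	if (n < 0 || (size_t)n >= size)
		return -1;

	return n;
}
```

Calling this once per 16-byte step of the offset, up to the section length, gives rows in the same shape as the hex lines in the sample dmesg output.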