efi/cper: Print correctable AER information

Message ID 20240823002422.3056599-1-avadhut.naik@amd.com (mailing list archive)
State Accepted
Commit d7171eb494353e03f3cde1a6f665e19c243c98e8
Series efi/cper: Print correctable AER information

Commit Message

Avadhut Naik Aug. 23, 2024, 12:24 a.m. UTC
From: Yazen Ghannam <yazen.ghannam@amd.com>

Currently, cper_print_pcie() only logs Uncorrectable Error Status, Mask
and Severity registers along with the TLP header.

If a correctable error is received immediately preceding or following an
Uncorrectable Fatal Error, its information is lost since Correctable
Error Status and Mask registers are not logged.

As such, to avoid skipping any possible error information, Correctable
Error Status and Mask registers should also be logged.

Additionally, ensure that AER information is also available through
cper_print_pcie() for Correctable and Uncorrectable Non-Fatal Errors.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Tested-by: Avadhut Naik <avadhut.naik@amd.com>
Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
---
 drivers/firmware/efi/cper.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)


base-commit: fdf969bbceb389f5a7c69e226daf2cb724ea66ba

Comments

Ard Biesheuvel Aug. 27, 2024, 10:23 a.m. UTC | #1
On Fri, 23 Aug 2024 at 02:24, Avadhut Naik <avadhut.naik@amd.com> wrote:
>
> From: Yazen Ghannam <yazen.ghannam@amd.com>
>
> Currently, cper_print_pcie() only logs Uncorrectable Error Status, Mask
> and Severity registers along with the TLP header.
>
> If a correctable error is received immediately preceding or following an
> Uncorrectable Fatal Error, its information is lost since Correctable
> Error Status and Mask registers are not logged.
>
> As such, to avoid skipping any possible error information, Correctable
> Error Status and Mask registers should also be logged.
>
> Additionally, ensure that AER information is also available through
> cper_print_pcie() for Correctable and Uncorrectable Non-Fatal Errors.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> Tested-by: Avadhut Naik <avadhut.naik@amd.com>
> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com>
> ---
>  drivers/firmware/efi/cper.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>

Queued for v6.12 - thanks.


> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 7d2cdd9e2227..b69e68ef3f02 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -434,12 +434,17 @@ static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
>         "%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
>         pfx, pcie->bridge.secondary_status, pcie->bridge.control);
>
> -       /* Fatal errors call __ghes_panic() before AER handler prints this */
> -       if ((pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) &&
> -           (gdata->error_severity & CPER_SEV_FATAL)) {
> +       /*
> +        * Print all valid AER info. Record may be from BERT (boot-time) or GHES (run-time).
> +        *
> +        * Fatal errors call __ghes_panic() before AER handler prints this.
> +        */
> +       if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
>                 struct aer_capability_regs *aer;
>
>                 aer = (struct aer_capability_regs *)pcie->aer_info;
> +               printk("%saer_cor_status: 0x%08x, aer_cor_mask: 0x%08x\n",
> +                      pfx, aer->cor_status, aer->cor_mask);
>                 printk("%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n",
>                        pfx, aer->uncor_status, aer->uncor_mask);
>                 printk("%saer_uncor_severity: 0x%08x\n",
>
> base-commit: fdf969bbceb389f5a7c69e226daf2cb724ea66ba
> --
> 2.34.1
>

Patch

diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index 7d2cdd9e2227..b69e68ef3f02 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -434,12 +434,17 @@  static void cper_print_pcie(const char *pfx, const struct cper_sec_pcie *pcie,
 	"%s""bridge: secondary_status: 0x%04x, control: 0x%04x\n",
 	pfx, pcie->bridge.secondary_status, pcie->bridge.control);
 
-	/* Fatal errors call __ghes_panic() before AER handler prints this */
-	if ((pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) &&
-	    (gdata->error_severity & CPER_SEV_FATAL)) {
+	/*
+	 * Print all valid AER info. Record may be from BERT (boot-time) or GHES (run-time).
+	 *
+	 * Fatal errors call __ghes_panic() before AER handler prints this.
+	 */
+	if (pcie->validation_bits & CPER_PCIE_VALID_AER_INFO) {
 		struct aer_capability_regs *aer;
 
 		aer = (struct aer_capability_regs *)pcie->aer_info;
+		printk("%saer_cor_status: 0x%08x, aer_cor_mask: 0x%08x\n",
+		       pfx, aer->cor_status, aer->cor_mask);
 		printk("%saer_uncor_status: 0x%08x, aer_uncor_mask: 0x%08x\n",
 		       pfx, aer->uncor_status, aer->uncor_mask);
 		printk("%saer_uncor_severity: 0x%08x\n",
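
To illustrate the gating change in the patch above, here is a minimal user-space sketch in C. It is not the kernel code: the sketch_ names, the constant values, and the snprintf()-based formatter are assumptions made so that the example compiles on its own. It compares the old condition (AER info valid and severity fatal) with the new one (AER info valid only), and formats the registers in the same layout as the patched printk() calls.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-ins for the kernel definitions. The names and values
 * are assumptions for this sketch and should not be read as the kernel's.
 */
#define SKETCH_VALID_AER_INFO  0x0080u
#define SKETCH_SEV_RECOVERABLE 0u
#define SKETCH_SEV_FATAL       1u
#define SKETCH_SEV_CORRECTED   2u

/* Only the AER registers that the patched code prints. */
struct sketch_aer_regs {
	uint32_t cor_status;
	uint32_t cor_mask;
	uint32_t uncor_status;
	uint32_t uncor_mask;
	uint32_t uncor_severity;
};

/* Old gate: AER info must be valid and the record severity must be fatal. */
static int sketch_old_gate(uint64_t validation_bits, uint32_t severity)
{
	return (validation_bits & SKETCH_VALID_AER_INFO) &&
	       (severity & SKETCH_SEV_FATAL);
}

/* New gate: valid AER info is sufficient, whatever the severity. */
static int sketch_new_gate(uint64_t validation_bits)
{
	return (validation_bits & SKETCH_VALID_AER_INFO) != 0;
}

/*
 * Format the registers in the order used after the patch: correctable
 * status and mask first, then the uncorrectable registers.
 * Returns the snprintf() result.
 */
static int sketch_format_aer(char *buf, size_t len, const char *pfx,
			     const struct sketch_aer_regs *aer)
{
	return snprintf(buf, len,
			"%saer_cor_status: 0x%08" PRIx32
			", aer_cor_mask: 0x%08" PRIx32 "\n"
			"%saer_uncor_status: 0x%08" PRIx32
			", aer_uncor_mask: 0x%08" PRIx32 "\n"
			"%saer_uncor_severity: 0x%08" PRIx32 "\n",
			pfx, aer->cor_status, aer->cor_mask,
			pfx, aer->uncor_status, aer->uncor_mask,
			pfx, aer->uncor_severity);
}
```

Under these assumed values, a record with valid AER info and corrected severity fails sketch_old_gate() but passes sketch_new_gate(), which is the case the commit message describes as previously skipped.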