[v4,1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info

Message ID	20240509084833.2147767-2-zhenzhong.duan@intel.com (mailing list archive)
State	Superseded
Delegated to:	Bjorn Helgaas
Headers	show Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 456F0127E1F; Thu, 9 May 2024 08:51:31 +0000 (UTC) From: Zhenzhong Duan <zhenzhong.duan@intel.com> To: linux-pci@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org, linux-acpi@vger.kernel.org, rafael@kernel.org, lenb@kernel.org, james.morse@arm.com, tony.luck@intel.com, bp@alien8.de, dave@stgolabs.net, jonathan.cameron@huawei.com, dave.jiang@intel.com, alison.schofield@intel.com, vishal.l.verma@intel.com, ira.weiny@intel.com, bhelgaas@google.com, helgaas@kernel.org, mahesh@linux.ibm.com, oohall@gmail.com, linmiaohe@huawei.com, shiju.jose@huawei.com, adam.c.preble@intel.com, lukas@wunner.de, Smita.KoralahalliChannabasappa@amd.com, rrichter@amd.com, linux-cxl@vger.kernel.org, linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org, erwin.tsaur@intel.com, sathyanarayanan.kuppuswamy@intel.com, dan.j.williams@intel.com, feiting.wanyan@intel.com, yudong.wang@intel.com, chao.p.peng@intel.com, qingshun.wang@linux.intel.com, Zhenzhong Duan <zhenzhong.duan@intel.com> Subject: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info Date: Thu, 9 May 2024 16:48:31 +0800 Message-Id: <20240509084833.2147767-2-zhenzhong.duan@intel.com> In-Reply-To: <20240509084833.2147767-1-zhenzhong.duan@intel.com> References: <20240509084833.2147767-1-zhenzhong.duan@intel.com> Precedence: bulk MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	PCI/AER: Handle Advisory Non-Fatal error \| expand [v4,0/3] PCI/AER: Handle Advisory Non-Fatal error [v4,1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info [v4,2/3] PCI/AER: Print UNCOR_STATUS bits that might be ANFE [v4,3/3] PCI/AER: Clear UNCOR_STATUS bits that might be ANFE

Zhenzhong Duan May 9, 2024, 8:48 a.m. UTC

In some cases the detector of a Non-Fatal Error(NFE) is not the most
appropriate agent to determine the type of the error. For example,
when software performs a configuration read from a non-existent
device or Function, completer will send an ERR_NONFATAL Message.
On some platforms, ERR_NONFATAL results in a System Error, which
breaks normal software probing.

Advisory Non-Fatal Error(ANFE) is a special case that can be used
in above scenario. It is predominantly determined by the role of the
detecting agent (Requester, Completer, or Receiver) and the specific
error. In such cases, an agent with AER signals the NFE (if enabled)
by sending an ERR_COR Message as an advisory to software, instead of
sending ERR_NONFATAL.

When processing an ANFE, ideally both correctable error(CE) status and
uncorrectable error(UE) status should be cleared. However, there is no
way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
treating NFE as ANFE will make us ignoring some UEs which need active
recover operation. To avoid clearing UEs that are not ANFE by accident,
the most conservative route is taken here: If any of the NFE Detected
bits is set in Device Status, do not touch UE status, they should be
cleared later by the UE handler. Otherwise, a specific set of UEs that
may be raised as ANFE according to the PCIe specification will be cleared
if their corresponding severity is Non-Fatal.

To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
in aer_err_info.anfe_status. So that those bits could be printed and
processed later.

Tested-by: Yudong Wang <yudong.wang@intel.com>
Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
---
 drivers/pci/pci.h      |  1 +
 drivers/pci/pcie/aer.c | 53 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

Jonathan Cameron June 6, 2024, 3:06 p.m. UTC | #1

On Thu,  9 May 2024 16:48:31 +0800
Zhenzhong Duan <zhenzhong.duan@intel.com> wrote:

> In some cases the detector of a Non-Fatal Error(NFE) is not the most
> appropriate agent to determine the type of the error. For example,
> when software performs a configuration read from a non-existent
> device or Function, completer will send an ERR_NONFATAL Message.
> On some platforms, ERR_NONFATAL results in a System Error, which
> breaks normal software probing.
> 
> Advisory Non-Fatal Error(ANFE) is a special case that can be used
> in above scenario. It is predominantly determined by the role of the
> detecting agent (Requester, Completer, or Receiver) and the specific
> error. In such cases, an agent with AER signals the NFE (if enabled)
> by sending an ERR_COR Message as an advisory to software, instead of
> sending ERR_NONFATAL.
> 
> When processing an ANFE, ideally both correctable error(CE) status and
> uncorrectable error(UE) status should be cleared. However, there is no
> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
> treating NFE as ANFE will make us ignoring some UEs which need active
> recover operation. To avoid clearing UEs that are not ANFE by accident,
> the most conservative route is taken here: If any of the NFE Detected
> bits is set in Device Status, do not touch UE status, they should be
> cleared later by the UE handler. Otherwise, a specific set of UEs that
> may be raised as ANFE according to the PCIe specification will be cleared
> if their corresponding severity is Non-Fatal.
> 
> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
> in aer_err_info.anfe_status. So that those bits could be printed and
> processed later.
> 
> Tested-by: Yudong Wang <yudong.wang@intel.com>
> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

Not my most confident review ever as this is nasty and gives
me a headache but your description is good and I think the
implementation looks reasonable.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Kuppuswamy Sathyanarayanan June 13, 2024, 9:26 p.m. UTC | #2

Hi,

On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
> In some cases the detector of a Non-Fatal Error(NFE) is not the most
> appropriate agent to determine the type of the error. For example,
> when software performs a configuration read from a non-existent
> device or Function, completer will send an ERR_NONFATAL Message.
> On some platforms, ERR_NONFATAL results in a System Error, which
> breaks normal software probing.
>
> Advisory Non-Fatal Error(ANFE) is a special case that can be used
> in above scenario. It is predominantly determined by the role of the
> detecting agent (Requester, Completer, or Receiver) and the specific
> error. In such cases, an agent with AER signals the NFE (if enabled)
> by sending an ERR_COR Message as an advisory to software, instead of
> sending ERR_NONFATAL.
>
> When processing an ANFE, ideally both correctable error(CE) status and
> uncorrectable error(UE) status should be cleared. However, there is no
> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
> treating NFE as ANFE will make us ignoring some UEs which need active
> recover operation. To avoid clearing UEs that are not ANFE by accident,
> the most conservative route is taken here: If any of the NFE Detected
> bits is set in Device Status, do not touch UE status, they should be
> cleared later by the UE handler. Otherwise, a specific set of UEs that
> may be raised as ANFE according to the PCIe specification will be cleared
> if their corresponding severity is Non-Fatal.
>
> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
> in aer_err_info.anfe_status. So that those bits could be printed and
> processed later.
>
> Tested-by: Yudong Wang <yudong.wang@intel.com>
> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
> ---
>  drivers/pci/pci.h      |  1 +
>  drivers/pci/pcie/aer.c | 53 ++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 17fed1846847..3f9eb807f9fd 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -412,6 +412,7 @@ struct aer_err_info {
>  
>  	unsigned int status;		/* COR/UNCOR Error Status */
>  	unsigned int mask;		/* COR/UNCOR Error Mask */
> +	unsigned int anfe_status;	/* UNCOR Error Status for ANFE */
>  	struct pcie_tlp_log tlp;	/* TLP Header */
>  };
>  
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index ac6293c24976..f2839b51321a 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -107,6 +107,12 @@ struct aer_stats {
>  					PCI_ERR_ROOT_MULTI_COR_RCV |	\
>  					PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>  
> +#define AER_ERR_ANFE_UNC_MASK		(PCI_ERR_UNC_POISON_TLP |	\
> +					PCI_ERR_UNC_COMP_TIME |		\
> +					PCI_ERR_UNC_COMP_ABORT |	\
> +					PCI_ERR_UNC_UNX_COMP |		\
> +					PCI_ERR_UNC_UNSUP)
> +
>  static int pcie_aer_disable;
>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>  
> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
>  EXPORT_SYMBOL_GPL(aer_recover_queue);
>  #endif
>  
> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info *info)
> +{
> +	u32 uncor_mask, uncor_status, anfe_status;
> +	u16 device_status;
> +	int aer = dev->aer_cap;
> +
> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS, &uncor_status);
> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK, &uncor_mask);
> +	/*
> +	 * According to PCIe Base Specification Revision 6.1,
> +	 * Section 6.2.3.2.4, if an UNCOR error is raised as
> +	 * Advisory Non-Fatal error, it will match the following
> +	 * conditions:
> +	 *	a. The severity of the error is Non-Fatal.
> +	 *	b. The error is one of the following:
> +	 *		1. Poisoned TLP           (Section 6.2.3.2.4.3)
> +	 *		2. Completion Timeout     (Section 6.2.3.2.4.4)
> +	 *		3. Completer Abort        (Section 6.2.3.2.4.1)
> +	 *		4. Unexpected Completion  (Section 6.2.3.2.4.5)
> +	 *		5. Unsupported Request    (Section 6.2.3.2.4.1)
> +	 */
> +	anfe_status = uncor_status & ~uncor_mask & ~info->severity &
> +		      AER_ERR_ANFE_UNC_MASK;
> +
> +	if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA, &device_status))
> +		return;
> +	/*
> +	 * Take the most conservative route here. If there are Non-Fatal errors
> +	 * detected, do not assume any bit in uncor_status is set by ANFE.
> +	 */
> +	if (device_status & PCI_EXP_DEVSTA_NFED)
> +		return;

You can move this check to the top of the function. You don't need to check
the rest if NFE error is detected in device status.

> +
> +	/*
> +	 * If there is another ANFE between reading uncor_status and clearing
> +	 * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE isn't
> +	 * recorded in info->anfe_status. It will be read out as NFE in
> +	 * following uncor_status register reading and processed by NFE
> +	 * handler.
> +	 */
> +	info->anfe_status = anfe_status;
> +}
> +
>  /**
>   * aer_get_device_error_info - read error status from dev and store it to info
>   * @dev: pointer to the device expected to have a error record
> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
>  
>  	/* Must reset in this function */
>  	info->status = 0;
> +	info->anfe_status = 0;
>  	info->tlp_header_valid = 0;
>  
>  	/* The device might not support AER */
> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev *dev, struct aer_err_info *info)
>  			&info->mask);
>  		if (!(info->status & ~info->mask))
>  			return 0;
> +
> +		if (info->status & PCI_ERR_COR_ADV_NFAT)
> +			anfe_get_uc_status(dev, info);
>  	} else if (type == PCI_EXP_TYPE_ROOT_PORT ||
>  		   type == PCI_EXP_TYPE_RC_EC ||
>  		   type == PCI_EXP_TYPE_DOWNSTREAM ||

Zhenzhong Duan June 14, 2024, 2:39 a.m. UTC | #3

Hi

>-----Original Message-----
>From: Kuppuswamy Sathyanarayanan
><sathyanarayanan.kuppuswamy@linux.intel.com>
>Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>be ANFE in aer_err_info
>
>Hi,
>
>On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>> appropriate agent to determine the type of the error. For example,
>> when software performs a configuration read from a non-existent
>> device or Function, completer will send an ERR_NONFATAL Message.
>> On some platforms, ERR_NONFATAL results in a System Error, which
>> breaks normal software probing.
>>
>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>> in above scenario. It is predominantly determined by the role of the
>> detecting agent (Requester, Completer, or Receiver) and the specific
>> error. In such cases, an agent with AER signals the NFE (if enabled)
>> by sending an ERR_COR Message as an advisory to software, instead of
>> sending ERR_NONFATAL.
>>
>> When processing an ANFE, ideally both correctable error(CE) status and
>> uncorrectable error(UE) status should be cleared. However, there is no
>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
>> treating NFE as ANFE will make us ignoring some UEs which need active
>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>> the most conservative route is taken here: If any of the NFE Detected
>> bits is set in Device Status, do not touch UE status, they should be
>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>> may be raised as ANFE according to the PCIe specification will be cleared
>> if their corresponding severity is Non-Fatal.
>>
>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
>> in aer_err_info.anfe_status. So that those bits could be printed and
>> processed later.
>>
>> Tested-by: Yudong Wang <yudong.wang@intel.com>
>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>> ---
>>  drivers/pci/pci.h      |  1 +
>>  drivers/pci/pcie/aer.c | 53
>++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 54 insertions(+)
>>
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index 17fed1846847..3f9eb807f9fd 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>
>>  	unsigned int status;		/* COR/UNCOR Error Status */
>>  	unsigned int mask;		/* COR/UNCOR Error Mask */
>> +	unsigned int anfe_status;	/* UNCOR Error Status for ANFE */
>>  	struct pcie_tlp_log tlp;	/* TLP Header */
>>  };
>>
>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>> index ac6293c24976..f2839b51321a 100644
>> --- a/drivers/pci/pcie/aer.c
>> +++ b/drivers/pci/pcie/aer.c
>> @@ -107,6 +107,12 @@ struct aer_stats {
>>  					PCI_ERR_ROOT_MULTI_COR_RCV |
>	\
>>  					PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>
>> +#define AER_ERR_ANFE_UNC_MASK
>	(PCI_ERR_UNC_POISON_TLP |	\
>> +					PCI_ERR_UNC_COMP_TIME |
>	\
>> +					PCI_ERR_UNC_COMP_ABORT |
>	\
>> +					PCI_ERR_UNC_UNX_COMP |
>	\
>> +					PCI_ERR_UNC_UNSUP)
>> +
>>  static int pcie_aer_disable;
>>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>
>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned
>int bus, unsigned int devfn,
>>  EXPORT_SYMBOL_GPL(aer_recover_queue);
>>  #endif
>>
>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>*info)
>> +{
>> +	u32 uncor_mask, uncor_status, anfe_status;
>> +	u16 device_status;
>> +	int aer = dev->aer_cap;
>> +
>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>&uncor_status);
>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>&uncor_mask);
>> +	/*
>> +	 * According to PCIe Base Specification Revision 6.1,
>> +	 * Section 6.2.3.2.4, if an UNCOR error is raised as
>> +	 * Advisory Non-Fatal error, it will match the following
>> +	 * conditions:
>> +	 *	a. The severity of the error is Non-Fatal.
>> +	 *	b. The error is one of the following:
>> +	 *		1. Poisoned TLP           (Section 6.2.3.2.4.3)
>> +	 *		2. Completion Timeout     (Section 6.2.3.2.4.4)
>> +	 *		3. Completer Abort        (Section 6.2.3.2.4.1)
>> +	 *		4. Unexpected Completion  (Section 6.2.3.2.4.5)
>> +	 *		5. Unsupported Request    (Section 6.2.3.2.4.1)
>> +	 */
>> +	anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>> +		      AER_ERR_ANFE_UNC_MASK;
>> +
>> +	if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>&device_status))
>> +		return;
>> +	/*
>> +	 * Take the most conservative route here. If there are Non-Fatal
>errors
>> +	 * detected, do not assume any bit in uncor_status is set by ANFE.
>> +	 */
>> +	if (device_status & PCI_EXP_DEVSTA_NFED)
>> +		return;
>
>You can move this check to the top of the function. You don't need to check
>the rest if NFE error is detected in device status.

The v3 just worked that way. Jonathan pointed a race that NFE triggered after
the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS
can avoid the race.

See https://lkml.org/lkml/2024/4/22/1011 for discussion details.

Thanks
Zhenzhong

>
>> +
>> +	/*
>> +	 * If there is another ANFE between reading uncor_status and
>clearing
>> +	 * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE
>isn't
>> +	 * recorded in info->anfe_status. It will be read out as NFE in
>> +	 * following uncor_status register reading and processed by NFE
>> +	 * handler.
>> +	 */
>> +	info->anfe_status = anfe_status;
>> +}
>> +
>>  /**
>>   * aer_get_device_error_info - read error status from dev and store it to
>info
>>   * @dev: pointer to the device expected to have a error record
>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev
>*dev, struct aer_err_info *info)
>>
>>  	/* Must reset in this function */
>>  	info->status = 0;
>> +	info->anfe_status = 0;
>>  	info->tlp_header_valid = 0;
>>
>>  	/* The device might not support AER */
>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev
>*dev, struct aer_err_info *info)
>>  			&info->mask);
>>  		if (!(info->status & ~info->mask))
>>  			return 0;
>> +
>> +		if (info->status & PCI_ERR_COR_ADV_NFAT)
>> +			anfe_get_uc_status(dev, info);
>>  	} else if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>  		   type == PCI_EXP_TYPE_RC_EC ||
>>  		   type == PCI_EXP_TYPE_DOWNSTREAM ||
>
>--
>Sathyanarayanan Kuppuswamy
>Linux Kernel Developer

Kuppuswamy Sathyanarayanan June 14, 2024, 3:05 a.m. UTC | #4

On 6/13/24 7:39 PM, Duan, Zhenzhong wrote:
> Hi
>
>> -----Original Message-----
>> From: Kuppuswamy Sathyanarayanan
>> <sathyanarayanan.kuppuswamy@linux.intel.com>
>> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>> be ANFE in aer_err_info
>>
>> Hi,
>>
>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>>> appropriate agent to determine the type of the error. For example,
>>> when software performs a configuration read from a non-existent
>>> device or Function, completer will send an ERR_NONFATAL Message.
>>> On some platforms, ERR_NONFATAL results in a System Error, which
>>> breaks normal software probing.
>>>
>>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>>> in above scenario. It is predominantly determined by the role of the
>>> detecting agent (Requester, Completer, or Receiver) and the specific
>>> error. In such cases, an agent with AER signals the NFE (if enabled)
>>> by sending an ERR_COR Message as an advisory to software, instead of
>>> sending ERR_NONFATAL.
>>>
>>> When processing an ANFE, ideally both correctable error(CE) status and
>>> uncorrectable error(UE) status should be cleared. However, there is no
>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>> NFE will reproduce above mentioned issue, i.e., breaking softwore probing;
>>> treating NFE as ANFE will make us ignoring some UEs which need active
>>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>>> the most conservative route is taken here: If any of the NFE Detected
>>> bits is set in Device Status, do not touch UE status, they should be
>>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>>> may be raised as ANFE according to the PCIe specification will be cleared
>>> if their corresponding severity is Non-Fatal.
>>>
>>> To achieve above purpose, store UNCOR_STATUS bits that might be ANFE
>>> in aer_err_info.anfe_status. So that those bits could be printed and
>>> processed later.
>>>
>>> Tested-by: Yudong Wang <yudong.wang@intel.com>
>>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>> ---
>>>  drivers/pci/pci.h      |  1 +
>>>  drivers/pci/pcie/aer.c | 53
>> ++++++++++++++++++++++++++++++++++++++++++
>>>  2 files changed, 54 insertions(+)
>>>
>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>> index 17fed1846847..3f9eb807f9fd 100644
>>> --- a/drivers/pci/pci.h
>>> +++ b/drivers/pci/pci.h
>>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>>
>>>  	unsigned int status;		/* COR/UNCOR Error Status */
>>>  	unsigned int mask;		/* COR/UNCOR Error Mask */
>>> +	unsigned int anfe_status;	/* UNCOR Error Status for ANFE */
>>>  	struct pcie_tlp_log tlp;	/* TLP Header */
>>>  };
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index ac6293c24976..f2839b51321a 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -107,6 +107,12 @@ struct aer_stats {
>>>  					PCI_ERR_ROOT_MULTI_COR_RCV |
>> 	\
>>>  					PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>>
>>> +#define AER_ERR_ANFE_UNC_MASK
>> 	(PCI_ERR_UNC_POISON_TLP |	\
>>> +					PCI_ERR_UNC_COMP_TIME |
>> 	\
>>> +					PCI_ERR_UNC_COMP_ABORT |
>> 	\
>>> +					PCI_ERR_UNC_UNX_COMP |
>> 	\
>>> +					PCI_ERR_UNC_UNSUP)
>>> +
>>>  static int pcie_aer_disable;
>>>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>>
>>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain, unsigned
>> int bus, unsigned int devfn,
>>>  EXPORT_SYMBOL_GPL(aer_recover_queue);
>>>  #endif
>>>
>>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>> *info)
>>> +{
>>> +	u32 uncor_mask, uncor_status, anfe_status;
>>> +	u16 device_status;
>>> +	int aer = dev->aer_cap;
>>> +
>>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>> &uncor_status);
>>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>> &uncor_mask);
>>> +	/*
>>> +	 * According to PCIe Base Specification Revision 6.1,
>>> +	 * Section 6.2.3.2.4, if an UNCOR error is raised as
>>> +	 * Advisory Non-Fatal error, it will match the following
>>> +	 * conditions:
>>> +	 *	a. The severity of the error is Non-Fatal.
>>> +	 *	b. The error is one of the following:
>>> +	 *		1. Poisoned TLP           (Section 6.2.3.2.4.3)
>>> +	 *		2. Completion Timeout     (Section 6.2.3.2.4.4)
>>> +	 *		3. Completer Abort        (Section 6.2.3.2.4.1)
>>> +	 *		4. Unexpected Completion  (Section 6.2.3.2.4.5)
>>> +	 *		5. Unsupported Request    (Section 6.2.3.2.4.1)
>>> +	 */
>>> +	anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>>> +		      AER_ERR_ANFE_UNC_MASK;
>>> +
>>> +	if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>> &device_status))
>>> +		return;
>>> +	/*
>>> +	 * Take the most conservative route here. If there are Non-Fatal
>> errors
>>> +	 * detected, do not assume any bit in uncor_status is set by ANFE.
>>> +	 */
>>> +	if (device_status & PCI_EXP_DEVSTA_NFED)
>>> +		return;
>> You can move this check to the top of the function. You don't need to check
>> the rest if NFE error is detected in device status.
> The v3 just worked that way. Jonathan pointed a race that NFE triggered after
> the check will be treated as ANFE and cleared. Check it after reading UNCOR_STATUS
> can avoid the race.
>
> See https://lkml.org/lkml/2024/4/22/1011 for discussion details.

Got it. I would recommend adding a comment about it in handler. May be
some thing like,

/*
 * To avoid race between device status read and error status register read, cache
 * uncorrectable error status before checking for NFE in device status * register. */
>
> Thanks
> Zhenzhong
>
>>> +
>>> +	/*
>>> +	 * If there is another ANFE between reading uncor_status and
>> clearing
>>> +	 * PCI_ERR_COR_ADV_NFAT bit in cor_status register, that ANFE
>> isn't
>>> +	 * recorded in info->anfe_status. It will be read out as NFE in
>>> +	 * following uncor_status register reading and processed by NFE
>>> +	 * handler.
>>> +	 */
>>> +	info->anfe_status = anfe_status;
>>> +}
>>> +
>>>  /**
>>>   * aer_get_device_error_info - read error status from dev and store it to
>> info
>>>   * @dev: pointer to the device expected to have a error record
>>> @@ -1213,6 +1262,7 @@ int aer_get_device_error_info(struct pci_dev
>> *dev, struct aer_err_info *info)
>>>  	/* Must reset in this function */
>>>  	info->status = 0;
>>> +	info->anfe_status = 0;
>>>  	info->tlp_header_valid = 0;
>>>
>>>  	/* The device might not support AER */
>>> @@ -1226,6 +1276,9 @@ int aer_get_device_error_info(struct pci_dev
>> *dev, struct aer_err_info *info)
>>>  			&info->mask);
>>>  		if (!(info->status & ~info->mask))
>>>  			return 0;
>>> +
>>> +		if (info->status & PCI_ERR_COR_ADV_NFAT)
>>> +			anfe_get_uc_status(dev, info);
>>>  	} else if (type == PCI_EXP_TYPE_ROOT_PORT ||
>>>  		   type == PCI_EXP_TYPE_RC_EC ||
>>>  		   type == PCI_EXP_TYPE_DOWNSTREAM ||
>> --
>> Sathyanarayanan Kuppuswamy
>> Linux Kernel Developer

Zhenzhong Duan June 14, 2024, 3:13 a.m. UTC | #5

>-----Original Message-----
>From: Kuppuswamy Sathyanarayanan
><sathyanarayanan.kuppuswamy@linux.intel.com>
>Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that might
>be ANFE in aer_err_info
>
>
>On 6/13/24 7:39 PM, Duan, Zhenzhong wrote:
>> Hi
>>
>>> -----Original Message-----
>>> From: Kuppuswamy Sathyanarayanan
>>> <sathyanarayanan.kuppuswamy@linux.intel.com>
>>> Subject: Re: [PATCH v4 1/3] PCI/AER: Store UNCOR_STATUS bits that
>might
>>> be ANFE in aer_err_info
>>>
>>> Hi,
>>>
>>> On 5/9/24 1:48 AM, Zhenzhong Duan wrote:
>>>> In some cases the detector of a Non-Fatal Error(NFE) is not the most
>>>> appropriate agent to determine the type of the error. For example,
>>>> when software performs a configuration read from a non-existent
>>>> device or Function, completer will send an ERR_NONFATAL Message.
>>>> On some platforms, ERR_NONFATAL results in a System Error, which
>>>> breaks normal software probing.
>>>>
>>>> Advisory Non-Fatal Error(ANFE) is a special case that can be used
>>>> in above scenario. It is predominantly determined by the role of the
>>>> detecting agent (Requester, Completer, or Receiver) and the specific
>>>> error. In such cases, an agent with AER signals the NFE (if enabled)
>>>> by sending an ERR_COR Message as an advisory to software, instead of
>>>> sending ERR_NONFATAL.
>>>>
>>>> When processing an ANFE, ideally both correctable error(CE) status and
>>>> uncorrectable error(UE) status should be cleared. However, there is no
>>>> way to fully identify the UE associated with ANFE. Even worse, Non-Fatal
>>>> Error(NFE) may set the same UE status bit as ANFE. Treating an ANFE as
>>>> NFE will reproduce above mentioned issue, i.e., breaking softwore
>probing;
>>>> treating NFE as ANFE will make us ignoring some UEs which need active
>>>> recover operation. To avoid clearing UEs that are not ANFE by accident,
>>>> the most conservative route is taken here: If any of the NFE Detected
>>>> bits is set in Device Status, do not touch UE status, they should be
>>>> cleared later by the UE handler. Otherwise, a specific set of UEs that
>>>> may be raised as ANFE according to the PCIe specification will be cleared
>>>> if their corresponding severity is Non-Fatal.
>>>>
>>>> To achieve above purpose, store UNCOR_STATUS bits that might be
>ANFE
>>>> in aer_err_info.anfe_status. So that those bits could be printed and
>>>> processed later.
>>>>
>>>> Tested-by: Yudong Wang <yudong.wang@intel.com>
>>>> Co-developed-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>>>> Signed-off-by: "Wang, Qingshun" <qingshun.wang@linux.intel.com>
>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>> ---
>>>>  drivers/pci/pci.h      |  1 +
>>>>  drivers/pci/pcie/aer.c | 53
>>> ++++++++++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 54 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>>>> index 17fed1846847..3f9eb807f9fd 100644
>>>> --- a/drivers/pci/pci.h
>>>> +++ b/drivers/pci/pci.h
>>>> @@ -412,6 +412,7 @@ struct aer_err_info {
>>>>
>>>>  	unsigned int status;		/* COR/UNCOR Error Status */
>>>>  	unsigned int mask;		/* COR/UNCOR Error Mask */
>>>> +	unsigned int anfe_status;	/* UNCOR Error Status for ANFE */
>>>>  	struct pcie_tlp_log tlp;	/* TLP Header */
>>>>  };
>>>>
>>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>>> index ac6293c24976..f2839b51321a 100644
>>>> --- a/drivers/pci/pcie/aer.c
>>>> +++ b/drivers/pci/pcie/aer.c
>>>> @@ -107,6 +107,12 @@ struct aer_stats {
>>>>  					PCI_ERR_ROOT_MULTI_COR_RCV |
>>> 	\
>>>>  					PCI_ERR_ROOT_MULTI_UNCOR_RCV)
>>>>
>>>> +#define AER_ERR_ANFE_UNC_MASK
>>> 	(PCI_ERR_UNC_POISON_TLP |	\
>>>> +					PCI_ERR_UNC_COMP_TIME |
>>> 	\
>>>> +					PCI_ERR_UNC_COMP_ABORT |
>>> 	\
>>>> +					PCI_ERR_UNC_UNX_COMP |
>>> 	\
>>>> +					PCI_ERR_UNC_UNSUP)
>>>> +
>>>>  static int pcie_aer_disable;
>>>>  static pci_ers_result_t aer_root_reset(struct pci_dev *dev);
>>>>
>>>> @@ -1196,6 +1202,49 @@ void aer_recover_queue(int domain,
>unsigned
>>> int bus, unsigned int devfn,
>>>>  EXPORT_SYMBOL_GPL(aer_recover_queue);
>>>>  #endif
>>>>
>>>> +static void anfe_get_uc_status(struct pci_dev *dev, struct aer_err_info
>>> *info)
>>>> +{
>>>> +	u32 uncor_mask, uncor_status, anfe_status;
>>>> +	u16 device_status;
>>>> +	int aer = dev->aer_cap;
>>>> +
>>>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_STATUS,
>>> &uncor_status);
>>>> +	pci_read_config_dword(dev, aer + PCI_ERR_UNCOR_MASK,
>>> &uncor_mask);
>>>> +	/*
>>>> +	 * According to PCIe Base Specification Revision 6.1,
>>>> +	 * Section 6.2.3.2.4, if an UNCOR error is raised as
>>>> +	 * Advisory Non-Fatal error, it will match the following
>>>> +	 * conditions:
>>>> +	 *	a. The severity of the error is Non-Fatal.
>>>> +	 *	b. The error is one of the following:
>>>> +	 *		1. Poisoned TLP           (Section 6.2.3.2.4.3)
>>>> +	 *		2. Completion Timeout     (Section 6.2.3.2.4.4)
>>>> +	 *		3. Completer Abort        (Section 6.2.3.2.4.1)
>>>> +	 *		4. Unexpected Completion  (Section 6.2.3.2.4.5)
>>>> +	 *		5. Unsupported Request    (Section 6.2.3.2.4.1)
>>>> +	 */
>>>> +	anfe_status = uncor_status & ~uncor_mask & ~info->severity &
>>>> +		      AER_ERR_ANFE_UNC_MASK;
>>>> +
>>>> +	if (pcie_capability_read_word(dev, PCI_EXP_DEVSTA,
>>> &device_status))
>>>> +		return;
>>>> +	/*
>>>> +	 * Take the most conservative route here. If there are Non-Fatal
>>> errors
>>>> +	 * detected, do not assume any bit in uncor_status is set by ANFE.
>>>> +	 */
>>>> +	if (device_status & PCI_EXP_DEVSTA_NFED)
>>>> +		return;
>>> You can move this check to the top of the function. You don't need to
>check
>>> the rest if NFE error is detected in device status.
>> The v3 just worked that way. Jonathan pointed a race that NFE triggered
>after
>> the check will be treated as ANFE and cleared. Check it after reading
>UNCOR_STATUS
>> can avoid the race.
>>
>> See https://lkml.org/lkml/2024/4/22/1011 for discussion details.
>
>Got it. I would recommend adding a comment about it in handler. May be
>some thing like,
>
>/*
> * To avoid race between device status read and error status register read,
>cache
> * uncorrectable error status before checking for NFE in device status *
>register. */

Good suggestion, will add.

Thanks
Zhenzhong

[v4,1/3] PCI/AER: Store UNCOR_STATUS bits that might be ANFE in aer_err_info

Commit Message

Comments

Patch