diff mbox

[1/3] Support for PCI Express reset type

Message ID 4A721FB1.4040903@us.ibm.com (mailing list archive)
State Accepted, archived
Headers show

Commit Message

Mike Mason July 30, 2009, 10:33 p.m. UTC
This is the first of three patches that implement a bit field that PCI Express device drivers can use to indicate they need a fundamental reset during error recovery.

By default, the EEH framework on powerpc does what's known as a "hot reset" during recovery of a PCI Express device.  We've found a case where the device needs a "fundamental reset" to recover properly.  The current PCI error recovery and EEH frameworks do not support this distinction.

The attached patch (courtesy of Richard Lary) adds a bit field to pci_dev that indicates whether the device requires a fundamental reset during recovery.

These patches supersede the previously submitted patch that implemented a fundamental reset bit field. 

Please review and let me know of any concerns.

Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
Signed-off-by: Richard Lary <rlary@us.ibm.com>

Comments

Andi Kleen July 31, 2009, 4:19 p.m. UTC | #1
Mike Mason <mmlnx@us.ibm.com> writes:
>
> These patches supersede the previously submitted patch that
> implemented a fundamental reset bit field. 
>
> Please review and let me know of any concerns.

Any plans to implement that for x86 too? Right now it seems to be a PPC
specific hack. And where is the driver that is using it?

-Andi
Linas Vepstas Aug. 1, 2009, 10:40 p.m. UTC | #2
Hi Andi,

2009/7/31 Andi Kleen <andi@firstfloor.org>:
> Mike Mason <mmlnx@us.ibm.com> writes:
>>
>> These patches supersede the previously submitted patch that
>> implemented a fundamental reset bit field.
>>
>> Please review and let me know of any concerns.
>
> Any plans to implement that for x86 too? Right now it seems to be a PPC
> specific hack.

I've found the PCIE chipsepc somewhat daunting, but was under the
impression that much if not most of what was needed was specified
there.

See, for example:
    Documentation/PCI/pcieaer-howto.txt

which states:
|||   The PCI Express Advanced Error Reporting Driver Guide HOWTO
|||                T. Long Nguyen  <tom.l.nguyen@intel.com>
|||                Yanmin Zhang    <yanmin.zhang@intel.com>
|||                                07/29/2006
[..]
||| The PCI Express AER driver provides the infrastructure to support PCI
||| Express Advanced Error Reporting capability. The PCI Express AER
||| driver provides three basic functions:
|||
||| -       Gathers the comprehensive error information if errors occurred.
||| -       Reports error to the users.
||| -       Performs error recovery actions.

I presume the last bullet point  means that the AER code works and
actually does more or less the same thing as the PPC EEH code,
but in a more architecture-independent way, as it only assumes
that PCI AER is there (and is correctly implemented in the CPI chipset)
The AER code uses the same core infrastructure as the EEH code,
at the time, I did exchange emails w/ the above authors discussing
this stuff.

As to whether the x86 server vendors are actually selling something
with AER in it, and whether any of them are actually testing this stuff
is unclear.

FWIW IBM has pretty much no incentive to lobby other server vendors
to get on the ball ...as this is viewed as one of those things that lets
IBM charge premium prices for PPC hardware.

--linas
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas Aug. 1, 2009, 10:43 p.m. UTC | #3
2009/7/30 Mike Mason <mmlnx@us.ibm.com>:
> This is the first of three patches that implement a bit field that PCI
> Express device drivers can use to indicate they need a fundamental reset
> during error recovery.
>
> By default, the EEH framework on powerpc does what's known as a "hot reset"
> during recovery of a PCI Express device.  We've found a case where the
> device needs a "fundamental reset" to recover properly.  The current PCI
> error recovery and EEH frameworks do not support this distinction.
>
> The attached patch (courtesy of Richard Lary) adds a bit field to pci_dev
> that indicates whether the device requires a fundamental reset during
> recovery.
>
> These patches supersede the previously submitted patch that implemented a
> fundamental reset bit field.
> Please review and let me know of any concerns.
>
> Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
> Signed-off-by: Richard Lary <rlary@us.ibm.com>


Signed-off-by: Linas Vepstas <linasvepstas@gmail.com>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jesse Barnes Aug. 14, 2009, 4:54 p.m. UTC | #4
On Thu, 30 Jul 2009 15:33:21 -0700
Mike Mason <mmlnx@us.ibm.com> wrote:

> This is the first of three patches that implement a bit field that
> PCI Express device drivers can use to indicate they need a
> fundamental reset during error recovery.
> 
> By default, the EEH framework on powerpc does what's known as a "hot
> reset" during recovery of a PCI Express device.  We've found a case
> where the device needs a "fundamental reset" to recover properly.
> The current PCI error recovery and EEH frameworks do not support this
> distinction.
> 
> The attached patch (courtesy of Richard Lary) adds a bit field to
> pci_dev that indicates whether the device requires a fundamental
> reset during recovery.
> 
> These patches supersede the previously submitted patch that
> implemented a fundamental reset bit field. 
> 
> Please review and let me know of any concerns.
> 
> Signed-off-by: Mike Mason <mmlnx@us.ibm.com>
> Signed-off-by: Richard Lary <rlary@us.ibm.com>

Ok, applied this series to my linux-next branch, it looks pretty
reasonable to me.

For future patches, please cc me, and include the subsystem in the
subject, along with a specific description of the patch, e.g. "PCI: add
PCIe fundamental reset interface", "PCI: document PCIe fundamental
reset", or for arch specific patches, "PCI/powerpc: implement support
for PCIe fundamental reset".

Thanks,
diff mbox

Patch

diff -uNrp a/include/linux/pci.h b/include/linux/pci.h
--- a/include/linux/pci.h	2009-07-13 14:25:37.000000000 -0700
+++ b/include/linux/pci.h	2009-07-15 10:25:37.000000000 -0700
@@ -273,6 +273,7 @@  struct pci_dev {
 	unsigned int	ari_enabled:1;	/* ARI forwarding */
 	unsigned int	is_managed:1;
 	unsigned int	is_pcie:1;
+	unsigned int    needs_freset:1; /* Dev requires fundamental reset */
 	unsigned int	state_saved:1;
 	unsigned int	is_physfn:1;
 	unsigned int	is_virtfn:1;