Message ID | 20191202233127.31160-1-ray.jui@broadcom.com (mailing list archive) |
---|---|
Series | Add iProc IDM device support |

On Mon, 2 Dec 2019 15:31:25 -0800
Ray Jui <ray.jui@broadcom.com> wrote:

> The Broadcom iProc IDM device allows control and monitoring of ASIC internal
> bus transactions. Most importantly, it can be configured to detect bus
> transaction timeout. In such case, critical information such as transaction
> address that caused the error, bus master ID of the transaction that caused
> the error, and etc., are made available from the IDM device.

This seems to have many of the features of an EDAC device reporting
uncorrectable errors.

Is there any reason why it is not implemented as such?

Thanks,

M.

On 12/7/19 9:39 AM, Marc Zyngier wrote:
> On Mon, 2 Dec 2019 15:31:25 -0800
> Ray Jui <ray.jui@broadcom.com> wrote:
>
>> The Broadcom iProc IDM device allows control and monitoring of ASIC internal
>> bus transactions. Most importantly, it can be configured to detect bus
>> transaction timeout. In such case, critical information such as transaction
>> address that caused the error, bus master ID of the transaction that caused
>> the error, and etc., are made available from the IDM device.
>
> This seems to have many of the features of an EDAC device reporting
> uncorrectable errors.
>
> Is there any reason why it is not implemented as such?
>
> Thanks,
>
> M.
>

I thought EDAC errors (in fact, in our case, that's fatal rather than
uncorrectable) are mostly for DDR. Is my understanding incorrect?

Thanks,

Ray

On Mon, 9 Dec 2019 10:02:53 -0800
Ray Jui <ray.jui@broadcom.com> wrote:

> On 12/7/19 9:39 AM, Marc Zyngier wrote:
> > On Mon, 2 Dec 2019 15:31:25 -0800
> > Ray Jui <ray.jui@broadcom.com> wrote:
> >
> >> The Broadcom iProc IDM device allows control and monitoring of ASIC internal
> >> bus transactions. Most importantly, it can be configured to detect bus
> >> transaction timeout. In such case, critical information such as transaction
> >> address that caused the error, bus master ID of the transaction that caused
> >> the error, and etc., are made available from the IDM device.
> >
> > This seems to have many of the features of an EDAC device reporting
> > uncorrectable errors.
> >
> > Is there any reason why it is not implemented as such?
> >
> > Thanks,
> >
> > M.
> >
>
> I thought EDAC errors (in fact, in our case, that's fatal rather than
> uncorrectable) are mostly for DDR. Is my understanding incorrect?

No, they are for HW errors in general. There is no real limitation of
scope, as far as I understand. Recently, the Annapurna guys came up
with a similar HW block, and were convinced to make it an EDAC device.

See [1] for details.

Thanks,

M.

[1] https://lore.kernel.org/linux-devicetree/1570707681-865-1-git-send-email-talel@amazon.com/

On 12/9/19 10:36 AM, Marc Zyngier wrote:
> On Mon, 9 Dec 2019 10:02:53 -0800
> Ray Jui <ray.jui@broadcom.com> wrote:
>
>> On 12/7/19 9:39 AM, Marc Zyngier wrote:
>>> On Mon, 2 Dec 2019 15:31:25 -0800
>>> Ray Jui <ray.jui@broadcom.com> wrote:
>>>
>>>> The Broadcom iProc IDM device allows control and monitoring of ASIC internal
>>>> bus transactions. Most importantly, it can be configured to detect bus
>>>> transaction timeout. In such case, critical information such as transaction
>>>> address that caused the error, bus master ID of the transaction that caused
>>>> the error, and etc., are made available from the IDM device.
>>>
>>> This seems to have many of the features of an EDAC device reporting
>>> uncorrectable errors.
>>>
>>> Is there any reason why it is not implemented as such?
>>>
>>> Thanks,
>>>
>>> M.
>>>
>>
>> I thought EDAC errors (in fact, in our case, that's fatal rather than
>> uncorrectable) are mostly for DDR. Is my understanding incorrect?
>
> No, they are for HW errors in general. There is no real limitation of
> scope, as far as I understand. Recently, the Annapurna guys came up
> with a similar HW block, and were convinced to make it an EDAC device.
>
> See [1] for details.
>
> Thanks,
>
> M.
>
> [1] https://lore.kernel.org/linux-devicetree/1570707681-865-1-git-send-email-talel@amazon.com/
>

Ah I see. It looks like memory controllers are the primary devices
supported by EDAC. In addition to that, EDAC also does seem to provide a
generic data structure to support other types of HW devices and error
events.

I'll look into this and get back.

Thanks,

Ray