[EDAC,07/13] edac: add support for raw error reports

On Fri, 15 Feb 2013 17:02:57 +0100,
Borislav Petkov <bp@alien8.de> wrote:

> On Fri, Feb 15, 2013 at 01:49:29PM -0200, Mauro Carvalho Chehab wrote:
> > Sure, but calling kmalloc while handling a memory error doesn't seem
> > a very good idea, IMHO. So, better to either use an already allocated
> > space (or the stack).
> 
> Either that, or prealloc a buffer on EDAC initialization. You probably
> won't need more than one in 99% of the cases so if you keep it simple
> with a single static buffer for starters, that would probably be the
> cleanest solution.
> 
> > Yes, I know, but, on the other hand, there's the additional cost of
> > copying almost all data into the structure.
> 
> That's very easily parallelizable on out-of-order CPUs (I'd say, all of
> them which need to run EDAC, can do that :-)) so it wouldn't hurt.
> 
> Also, you could allocate the struct in the callers and work directly
> with its members before sending it down to edac_raw_mc_handle_error() -
> that would probably simplify the code a bit more.

Yeah, pre-allocating a buffer is something that was already in my plans.
It seems it is time to do it in a clean way. I prefer to keep this
as a separate patch from 07/13, as it has a different rationale,
and folding it into 07/13 would just mix two different subjects.

Also, keeping it separate makes reviewing easier.

---

[PATCH] edac: put all arguments for the raw error handling call into a struct

The number of arguments for edac_raw_mc_handle_error() is too big;
put them into a structure and allocate space for it inside
edac_mc_alloc().

That significantly reduces stack usage and simplifies the raw API call.
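
As a rough illustration of the approach described above, here is a
minimal userspace sketch, not the actual kernel code. All names in it
(struct raw_error_desc, struct mem_ctl, mem_ctl_alloc(),
raw_handle_error(), report_error()) and the chosen fields are
hypothetical stand-ins. The descriptor is embedded in the controller
structure, so it is allocated once at setup time and the error path
neither allocates memory nor passes a long argument list:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error descriptor: replaces a long argument list. */
struct raw_error_desc {
	unsigned int error_count;
	unsigned long page_frame_number;
	unsigned long offset_in_page;
	unsigned long syndrome;
	char label[64];
	char msg[64];
};

/* Hypothetical controller; the descriptor lives inside it. */
struct mem_ctl {
	int idx;
	struct raw_error_desc error_desc;
};

/* Setup path: the only place where memory is allocated. */
static struct mem_ctl *mem_ctl_alloc(int idx)
{
	struct mem_ctl *mci = calloc(1, sizeof(*mci));

	if (mci)
		mci->idx = idx;
	return mci;
}

/* Raw handler: takes one descriptor pointer instead of many arguments. */
static int raw_handle_error(const struct mem_ctl *mci,
			    const struct raw_error_desc *e,
			    char *out, size_t len)
{
	return snprintf(out, len,
			"MC%d: %u CE %s on %s (page:0x%lx offset:0x%lx syndrome:0x%lx)",
			mci->idx, e->error_count, e->msg, e->label,
			e->page_frame_number, e->offset_in_page, e->syndrome);
}

/* Error path: fills the preallocated descriptor in place, no allocation. */
static int report_error(struct mem_ctl *mci, unsigned long pfn,
			char *out, size_t len)
{
	struct raw_error_desc *e = &mci->error_desc;

	memset(e, 0, sizeof(*e));
	e->error_count = 1;
	e->page_frame_number = pfn;
	snprintf(e->label, sizeof(e->label), "%s", "DIMM#0");
	snprintf(e->msg, sizeof(e->msg), "%s", "memory read error");

	return raw_handle_error(mci, e, out, len);
}
```

A single embedded descriptor assumes only one error is being handled
per controller at a time; concurrent reports would need locking or one
descriptor per context.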

Tested with sb_edac driver and MCE error injection. Worked as expected:

[  143.066100] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x320 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
[  143.086424] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x320 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
[  143.106570] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x320 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)
[  143.126712] EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Channel#0_DIMM#0 (channel:0 slot:0 page:0x320 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0090 socket:0 channel_mask:1 rank:0)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
