Message ID | 20211215155309.2711917-2-yazen.ghannam@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | AMD Family 19h Models 10h-1Fh Updates | expand |
On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote: > - if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5)) > + u32 umc_cfg = 0, dimm_cfg = 0, i = 0; > + > + for_each_umc(i) { > + umc_cfg |= pvt->umc[i].umc_cfg; > + dimm_cfg |= pvt->umc[i].dimm_cfg; > + } > + > + if (dimm_cfg & BIT(5)) > pvt->dram_type = MEM_LRDDR4; > - else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4)) > + else if (dimm_cfg & BIT(4)) You're working here under the assumption that bit 4 and 5 will have the same value on all those UMCs. You're probably going to say that that is how the BIOS is programming them so they should be all the same and any other configuration is invalid but lemme still ask about it explicitly. And if so, this would probably need a comment above it which I can add when applying... Hmm?
On Wed, Dec 15, 2021 at 07:01:22PM +0100, Borislav Petkov wrote: > On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote: > > - if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5)) > > + u32 umc_cfg = 0, dimm_cfg = 0, i = 0; > > + > > + for_each_umc(i) { > > + umc_cfg |= pvt->umc[i].umc_cfg; > > + dimm_cfg |= pvt->umc[i].dimm_cfg; > > + } > > + > > + if (dimm_cfg & BIT(5)) > > pvt->dram_type = MEM_LRDDR4; > > - else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4)) > > + else if (dimm_cfg & BIT(4)) > > You're working here under the assumption that bit 4 and 5 will have the > same value on all those UMCs. > > You're probably going to say that that is how the BIOS is programming > them so they should be all the same and any other configuration is > invalid but lemme still ask about it explicitly. > > And if so, this would probably need a comment above it which I can add > when applying... > > Hmm? > No, that's a good question. And actually the assumption is incorrect. It is allowed to have different DIMM types in a system though all DIMMs on a single UMC must match. Do you recommend a follow up patch or should this one be reworked? Thanks, Yazen
On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote: > No, that's a good question. And actually the assumption is incorrect. It is > allowed to have different DIMM types in a system though all DIMMs on a single > UMC must match. Oh fun, really?! So a single system can have DDR4 *and* DDR5 on the same board?! So then that pvt->dram_type is insufficient to store the DIMM type for a pvt. If you have multiple UMCs on a pvt and all have different type DIMMs, then you need the relevant DIMM type when you dump it in sysfs... Which then means, you need ->dram_type to be per UMC... And also, I'm assuming the hw already enforces that DIMMs on a single UMC must match - it simply won't boot if they don't so you don't have to enforce that, at least. > Do you recommend a follow up patch or should this one be reworked? This one is insufficient, I'm afraid. One way to address this is, you could use pvt->umc at the places where dram_type is used and assign directly to the dimm->mtype thing. But then you'd need a way to map each struct dimm_info *dimm to the UMC so that you can determine the correct DIMM type. Which would make pvt->dram_type redundant and can be removed. Or you might have a better idea... HTH.
On Thu, Dec 30, 2021 at 12:36:44PM +0100, Borislav Petkov wrote: > On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote: > > No, that's a good question. And actually the assumption is incorrect. It is > > allowed to have different DIMM types in a system though all DIMMs on a single > > UMC must match. > > Oh fun, really?! > > So a single system can have DDR4 *and* DDR5 on the same board?! > Well, I don't know about that specifically. There are some restrictions, but you could have UDIMMs and RDIMMs of the same generation, at least. > So then that > > pvt->dram_type > > is insufficient to store the DIMM type for a pvt. If you have multiple > UMCs on a pvt and all have different type DIMMs, then you need the > relevant DIMM type when you dump it in sysfs... > > Which then means, you need ->dram_type to be per UMC... > > And also, I'm assuming the hw already enforces that DIMMs on a single > UMC must match - it simply won't boot if they don't so you don't have to > enforce that, at least. > > > Do you recommend a follow up patch or should this one be reworked? > > This one is insufficient, I'm afraid. > > One way to address this is, you could use pvt->umc at the places where > dram_type is used and assign directly to the dimm->mtype thing. But then > you'd need a way to map each struct dimm_info *dimm to the UMC so that > you can determine the correct DIMM type. > I did send a patch that did something like this. https://lkml.kernel.org/r/20211228200615.412999-2-yazen.ghannam@amd.com Though this got a build warning report, so I need to follow up on that. > Which would make pvt->dram_type redundant and can be removed. > I kept this so as to not break legacy systems. But I'll look at it again. I think you may be right. Thanks, Yazen
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c index ff29267e46a6..1df763128483 100644 --- a/drivers/edac/amd64_edac.c +++ b/drivers/edac/amd64_edac.c @@ -1621,9 +1621,16 @@ static void determine_memory_type(struct amd64_pvt *pvt) u32 dram_ctrl, dcsm; if (pvt->umc) { - if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5)) + u32 umc_cfg = 0, dimm_cfg = 0, i = 0; + + for_each_umc(i) { + umc_cfg |= pvt->umc[i].umc_cfg; + dimm_cfg |= pvt->umc[i].dimm_cfg; + } + + if (dimm_cfg & BIT(5)) pvt->dram_type = MEM_LRDDR4; - else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4)) + else if (dimm_cfg & BIT(4)) pvt->dram_type = MEM_RDDR4; else pvt->dram_type = MEM_DDR4;
The initial support for Unified Memory Controllers (UMCs) was added to AMD64 EDAC for the first generation of Zen systems. These systems have two UMCs per Die, and the code originally assumed two UMCs in various places. Later systems have more than two UMCs, and this assumption was fixed in the following commits. commit bdcee7747f5c ("EDAC/amd64: Support more than two Unified Memory Controllers") commit d971e28e2ce4 ("EDAC/amd64: Support more than two controllers for chip selects handling") However, the determine_memory_type() function was missed in these changes, and two UMCs are still assumed. Update determine_memory_type() to account for all UMCs when checking the register values. Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> --- Link: https://lkml.kernel.org/r/20211208174356.1997855-4-yazen.ghannam@amd.com v1->v2: * Was patch 3 in v1. * Update commit message. drivers/edac/amd64_edac.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)