Message ID | 169646090522.666328.17608442776078591123.stgit@bgt-140510-bm03.eng.stellus.in |
---|---|
State | New, archived |
Headers | show |
Series | cxl: set root decoder granularity based on region params | expand |
Jim Harris wrote: > The granularity for downstream targets is all based on descending the > topology from the root. So we need to set the root decoder's granularity > based on the region params. > > Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch. > > Region created with 2048 granularity using following command line: > > cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ > -g 2048 -s 2048M > > Use "cxl list -PDE | grep granularity" to get a view of the granularity > set at each level of the topology. > > Before: > "interleave_granularity":2048, > "interleave_granularity":2048, > "interleave_granularity":512, > "interleave_granularity":2048, > "interleave_granularity":2048, > "interleave_granularity":512, > "interleave_granularity":256, > > After: > "interleave_granularity":2048, > "interleave_granularity":2048, > "interleave_granularity":4096, > "interleave_granularity":2048, > "interleave_granularity":2048, > "interleave_granularity":4096, > "interleave_granularity":2048, > > Signed-off-by: Jim Harris <jim.harris@samsung.com> > --- > drivers/cxl/core/region.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 6d63b8798c29..f14110e35f79 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -1691,6 +1691,11 @@ static int cxl_region_attach(struct cxl_region *cxlr, > return -EINVAL; > } > > + /* Set root decoder's granularity now, so that we can use it to calculate > + * granularity for the downstream targets in cxl_region_setup_targets(). > + */ Minor, and I mean minor, nit, the cxl subsystem is using: /* * Comment... */ ...block-comment style. Yes, it varies across subsystems, yes it's arbitrary, but general rule is follow local customs. > + cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity; > + So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1 case as interleave_granularity_store() forbids regions that do not match the granularity of the root. If a BIOS tries to ship such a config in production that's when I expect that policy needs to be revisited, but outside of being forced to reconsider that stance the complexity reduction is the benefit. In the meantime maybe add an "effective granularity" concept for x1 root decoders and x1 switches so the math can use that value?
> On Oct 4, 2023, at 6:18 PM, Dan Williams <dan.j.williams@intel.com> wrote: > > Jim Harris wrote: >> >> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c >> index 6d63b8798c29..f14110e35f79 100644 >> --- a/drivers/cxl/core/region.c >> +++ b/drivers/cxl/core/region.c >> @@ -1691,6 +1691,11 @@ static int cxl_region_attach(struct cxl_region *cxlr, >> return -EINVAL; >> } >> >> + /* Set root decoder's granularity now, so that we can use it to calculate >> + * granularity for the downstream targets in cxl_region_setup_targets(). >> + */ > > Minor, and I mean minor, nit, the cxl subsystem is using: > > /* > * Comment... > */ > > ...block-comment style. Yes, it varies across subsystems, yes it's > arbitrary, but general rule is follow local customs. Ack. >> + cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity; >> + > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1 > case as interleave_granularity_store() forbids regions that do not match > the granularity of the root. That’s correct. And now I think it’s better to put this assignment in interleave_granularity_store() instead since we are already doing root decoder related checking there, and we would then just be setting this granularity once instead of once per memdev. > If a BIOS tries to ship such a config in production that's when I expect > that policy needs to be revisited, but outside of being forced to > reconsider that stance the complexity reduction is the benefit. > > In the meantime maybe add an "effective granularity" concept for x1 root > decoders and x1 switches so the math can use that value? Ack. Use “effective_granularity” (EG) for the topology math and “interleave_granularity” (IG) for programming the decoders. For the x1 switches, is your intent here to set IG = 256 since the IG is really a don’t care from the decoder’s perspective? Or is there another reason we would need to track different IG v. EG for x1 switches?
Jim Harris wrote: [..] > > If a BIOS tries to ship such a config in production that's when I expect > > that policy needs to be revisited, but outside of being forced to > > reconsider that stance the complexity reduction is the benefit. > > > > In the meantime maybe add an "effective granularity" concept for x1 root > > decoders and x1 switches so the math can use that value? > > Ack. Use “effective_granularity” (EG) for the topology math and > “interleave_granularity” (IG) for programming the decoders. > > For the x1 switches, is your intent here to set IG = 256 since the IG is > really a don’t care from the decoder’s perspective? Or is there another > reason we would need to track different IG v. EG for x1 switches? Hmm, I see what you are saying. If the value is a don't care, might as well just update interleave_granularity to make the math friendly. So "effective granularity" is probably just a documentation update to the 'struct cxl_decoder' kdoc so people are not surprised about why this may differ from CFMWS value.
Jim Harris wrote: [..] > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1 > > case as interleave_granularity_store() forbids regions that do not match > > the granularity of the root. > > That’s correct. And now I think it’s better to put this assignment in > interleave_granularity_store() instead since we are already doing > root decoder related checking there, and we would then just be setting > this granularity once instead of once per memdev. Did you ever send the interleave_granularity_store() version of this patch? I can't seem to find it.
On Tue, Oct 24, 2023 at 05:55:53PM -0700, Dan Williams wrote: > Jim Harris wrote: > [..] > > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1 > > > case as interleave_granularity_store() forbids regions that do not match > > > the granularity of the root. > > > > That’s correct. And now I think it’s better to put this assignment in > > interleave_granularity_store() instead since we are already doing > > root decoder related checking there, and we would then just be setting > > this granularity once instead of once per memdev. > > Did you ever send the interleave_granularity_store() version of this > patch? I can't seem to find it. I found a third approach that was much simpler than the effective_granularity approach. It's in this thread with a completely different title. Should I have still marked it v2, even though the commit title and patch was radically different?
Jim Harris wrote: > On Tue, Oct 24, 2023 at 05:55:53PM -0700, Dan Williams wrote: > > Jim Harris wrote: > > [..] > > > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1 > > > > case as interleave_granularity_store() forbids regions that do not match > > > > the granularity of the root. > > > > > > That’s correct. And now I think it’s better to put this assignment in > > > interleave_granularity_store() instead since we are already doing > > > root decoder related checking there, and we would then just be setting > > > this granularity once instead of once per memdev. > > > > Did you ever send the interleave_granularity_store() version of this > > patch? I can't seem to find it. > > I found a third approach that was much simpler than the effective_granularity > approach. It's in this thread with a completely different title. Oh, this one http://lore.kernel.org/r/169660931834.684402.2774329392272976121.stgit@bgt-140510-bm03.eng.stellus.in? I'll go comment.
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 6d63b8798c29..f14110e35f79 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1691,6 +1691,11 @@ static int cxl_region_attach(struct cxl_region *cxlr, return -EINVAL; } + /* Set root decoder's granularity now, so that we can use it to calculate + * granularity for the downstream targets in cxl_region_setup_targets(). + */ + cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity; + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) { int i;
The granularity for downstream targets is all based on descending the topology from the root. So we need to set the root decoder's granularity based on the region params. Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch. Region created with 2048 granularity using following command line: cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \ -g 2048 -s 2048M Use "cxl list -PDE | grep granularity" to get a view of the granularity set at each level of the topology. Before: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":512, "interleave_granularity":256, After: "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, "interleave_granularity":2048, "interleave_granularity":4096, "interleave_granularity":2048, Signed-off-by: Jim Harris <jim.harris@samsung.com> --- drivers/cxl/core/region.c | 5 +++++ 1 file changed, 5 insertions(+)