diff mbox series

cxl: set root decoder granularity based on region params

Message ID 169646090522.666328.17608442776078591123.stgit@bgt-140510-bm03.eng.stellus.in
State New, archived
Headers show
Series cxl: set root decoder granularity based on region params | expand

Commit Message

Jim Harris Oct. 4, 2023, 11:08 p.m. UTC
The granularity for downstream targets is all based on descending the
topology from the root. So we need to set the root decoder's granularity
based on the region params.

Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.

Region created with 2048 granularity using following command line:

cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
		  -g 2048 -s 2048M

Use "cxl list -PDE | grep granularity" to get a view of the granularity
set at each level of the topology.

Before:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":512,
"interleave_granularity":256,

After:
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
        "interleave_granularity":2048,
        "interleave_granularity":2048,
    "interleave_granularity":4096,
"interleave_granularity":2048,

Signed-off-by: Jim Harris <jim.harris@samsung.com>
---
 drivers/cxl/core/region.c |    5 +++++
 1 file changed, 5 insertions(+)

Comments

Dan Williams Oct. 5, 2023, 1:18 a.m. UTC | #1
Jim Harris wrote:
> The granularity for downstream targets is all based on descending the
> topology from the root. So we need to set the root decoder's granularity
> based on the region params.
> 
> Test configuration is 1 host bridge * 2 switches * 2 endpoints per switch.
> 
> Region created with 2048 granularity using following command line:
> 
> cxl create-region -m -d decoder0.0 -w 4 mem0 mem2 mem1 mem3 \
> 		  -g 2048 -s 2048M
> 
> Use "cxl list -PDE | grep granularity" to get a view of the granularity
> set at each level of the topology.
> 
> Before:
>         "interleave_granularity":2048,
>         "interleave_granularity":2048,
>     "interleave_granularity":512,
>         "interleave_granularity":2048,
>         "interleave_granularity":2048,
>     "interleave_granularity":512,
> "interleave_granularity":256,
> 
> After:
>         "interleave_granularity":2048,
>         "interleave_granularity":2048,
>     "interleave_granularity":4096,
>         "interleave_granularity":2048,
>         "interleave_granularity":2048,
>     "interleave_granularity":4096,
> "interleave_granularity":2048,
> 
> Signed-off-by: Jim Harris <jim.harris@samsung.com>
> ---
>  drivers/cxl/core/region.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index 6d63b8798c29..f14110e35f79 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -1691,6 +1691,11 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>  		return -EINVAL;
>  	}
>  
> +	/* Set root decoder's granularity now, so that we can use it to calculate
> +	 * granularity for the downstream targets in cxl_region_setup_targets().
> +	 */

Minor, and I mean minor, nit, the cxl subsystem is using:

/*
 * Comment...
 */

...block-comment style. Yes, it varies across subsystems, yes it's
arbitrary, but general rule is follow local customs.


> +	cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity;
> +

So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1
case as interleave_granularity_store() forbids regions that do not match
the granularity of the root.

If a BIOS tries to ship such a config in production that's when I expect
that policy needs to be revisited, but outside of being forced to
reconsider that stance the complexity reduction is the benefit.

In the meantime maybe add an "effective granularity" concept for x1 root
decoders and x1 switches so the math can use that value?
Jim Harris Oct. 5, 2023, 4:52 p.m. UTC | #2
> On Oct 4, 2023, at 6:18 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> 
> Jim Harris wrote:
>> 
>> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
>> index 6d63b8798c29..f14110e35f79 100644
>> --- a/drivers/cxl/core/region.c
>> +++ b/drivers/cxl/core/region.c
>> @@ -1691,6 +1691,11 @@ static int cxl_region_attach(struct cxl_region *cxlr,
>> return -EINVAL;
>> }
>> 
>> + /* Set root decoder's granularity now, so that we can use it to calculate
>> + * granularity for the downstream targets in cxl_region_setup_targets().
>> + */
> 
> Minor, and I mean minor, nit, the cxl subsystem is using:
> 
> /*
> * Comment...
> */
> 
> ...block-comment style. Yes, it varies across subsystems, yes it's
> arbitrary, but general rule is follow local customs.

Ack.

>> + cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity;
>> +
> 
> So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1
> case as interleave_granularity_store() forbids regions that do not match
> the granularity of the root.

That’s correct. And now I think it’s better to put this assignment in
interleave_granularity_store() instead since we are already doing
root decoder related checking there, and we would then just be setting
this granularity once instead of once per memdev.

> If a BIOS tries to ship such a config in production that's when I expect
> that policy needs to be revisited, but outside of being forced to
> reconsider that stance the complexity reduction is the benefit.
> 
> In the meantime maybe add an "effective granularity" concept for x1 root
> decoders and x1 switches so the math can use that value?

Ack. Use “effective_granularity” (EG) for the topology math and
“interleave_granularity” (IG) for programming the decoders.

For the x1 switches, is your intent here to set IG = 256 since the IG is
really a don’t care from the decoder’s perspective? Or is there another
reason we would need to track different IG v. EG for x1 switches?
Dan Williams Oct. 5, 2023, 6:08 p.m. UTC | #3
Jim Harris wrote:
[..]
> > If a BIOS tries to ship such a config in production that's when I expect
> > that policy needs to be revisited, but outside of being forced to
> > reconsider that stance the complexity reduction is the benefit.
> > 
> > In the meantime maybe add an "effective granularity" concept for x1 root
> > decoders and x1 switches so the math can use that value?
> 
> Ack. Use “effective_granularity” (EG) for the topology math and
> “interleave_granularity” (IG) for programming the decoders.
> 
> For the x1 switches, is your intent here to set IG = 256 since the IG is
> really a don’t care from the decoder’s perspective? Or is there another
> reason we would need to track different IG v. EG for x1 switches?

Hmm, I see what you are saying. If the value is a don't care, might as well
just update interleave_granularity to make the math friendly. So "effective
granularity" is probably just a documentation update to the 'struct
cxl_decoder' kdoc so people are not surprised about why this may differ
from CFMWS value.
Dan Williams Oct. 25, 2023, 12:55 a.m. UTC | #4
Jim Harris wrote:
[..]
> > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1
> > case as interleave_granularity_store() forbids regions that do not match
> > the granularity of the root.
> 
> That’s correct. And now I think it’s better to put this assignment in
> interleave_granularity_store() instead since we are already doing
> root decoder related checking there, and we would then just be setting
> this granularity once instead of once per memdev.

Did you ever send the interleave_granularity_store() version of this
patch? I can't seem to find it.
Jim Harris Oct. 25, 2023, 1:33 a.m. UTC | #5
On Tue, Oct 24, 2023 at 05:55:53PM -0700, Dan Williams wrote:
> Jim Harris wrote:
> [..]
> > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1
> > > case as interleave_granularity_store() forbids regions that do not match
> > > the granularity of the root.
> > 
> > That’s correct. And now I think it’s better to put this assignment in
> > interleave_granularity_store() instead since we are already doing
> > root decoder related checking there, and we would then just be setting
> > this granularity once instead of once per memdev.
> 
> Did you ever send the interleave_granularity_store() version of this
> patch? I can't seem to find it.

I found a third approach that was much simpler than the effective_granularity
approach. It's in this thread with a completely different title.

Should I have still marked it v2, even though the commit title and patch was
radically different?
Dan Williams Oct. 25, 2023, 1:58 a.m. UTC | #6
Jim Harris wrote:
> On Tue, Oct 24, 2023 at 05:55:53PM -0700, Dan Williams wrote:
> > Jim Harris wrote:
> > [..]
> > > > So I think this is only valid in the cxlrd->cxlsd.cxld.interleave_ways == 1
> > > > case as interleave_granularity_store() forbids regions that do not match
> > > > the granularity of the root.
> > > 
> > > That’s correct. And now I think it’s better to put this assignment in
> > > interleave_granularity_store() instead since we are already doing
> > > root decoder related checking there, and we would then just be setting
> > > this granularity once instead of once per memdev.
> > 
> > Did you ever send the interleave_granularity_store() version of this
> > patch? I can't seem to find it.
> 
> I found a third approach that was much simpler than the effective_granularity
> approach. It's in this thread with a completely different title.

Oh, this one
http://lore.kernel.org/r/169660931834.684402.2774329392272976121.stgit@bgt-140510-bm03.eng.stellus.in?

I'll go comment.
diff mbox series

Patch

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 6d63b8798c29..f14110e35f79 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1691,6 +1691,11 @@  static int cxl_region_attach(struct cxl_region *cxlr,
 		return -EINVAL;
 	}
 
+	/* Set root decoder's granularity now, so that we can use it to calculate
+	 * granularity for the downstream targets in cxl_region_setup_targets().
+	 */
+	cxlrd->cxlsd.cxld.interleave_granularity = cxlr->params.interleave_granularity;
+
 	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
 		int i;