[RFC,v2,23/28] cxl/region: Implement XHB verification

Message ID 20211022183709.1199701-24-ben.widawsky@intel.com
State New, archived
Series CXL Region Creation / HDM decoder programming

Commit Message

Ben Widawsky Oct. 22, 2021, 6:37 p.m. UTC
Cross host bridge verification primarily determines if the requested
interleave ordering can be achieved by the root decoder, which isn't as
programmable as other decoders.

The algorithm implemented here is based on the CXL Type 3 Memory Device
Software Guide, chapter 2.13.14.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .clang-format        |  1 +
 drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/trace.h  |  3 ++
 3 files changed, 84 insertions(+), 1 deletion(-)

Comments

Jonathan Cameron Jan. 6, 2022, 4:55 p.m. UTC | #1
On Fri, 22 Oct 2021 11:37:04 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Cross host bridge verification primarily determines if the requested
> interleave ordering can be achieved by the root decoder, which isn't as
> programmable as other decoders.
> 
> The algorithm implemented here is based on the CXL Type 3 Memory Device
> Software Guide, chapter 2.13.14
> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  .clang-format        |  1 +
>  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
>  drivers/cxl/trace.h  |  3 ++
>  3 files changed, 84 insertions(+), 1 deletion(-)
> 
> diff --git a/.clang-format b/.clang-format
> index cb7c46371465..55f628f21722 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -169,6 +169,7 @@ ForEachMacros:
>    - 'for_each_cpu_and'
>    - 'for_each_cpu_not'
>    - 'for_each_cpu_wrap'
> +  - 'for_each_cxl_decoder_target'
>    - 'for_each_cxl_endpoint'
>    - 'for_each_dapm_widgets'
>    - 'for_each_dev_addr'
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index d127c9c69eef..53442de33d11 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -30,6 +30,11 @@
>  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
>  	     idx++, ep = (region)->targets[idx])
>  
> +#define for_each_cxl_decoder_target(target, decoder, idx)                      \
> +	for (idx = 0, target = (decoder)->target[idx];                         \
> +	     idx < (decoder)->nr_targets;                                      \
> +	     idx++, target = (decoder)->target[idx])
> +
"target" is used for too many things in this macro: it is both the macro
parameter and the struct member in (decoder)->target[idx].

I'm messing around with this to poke some of the QEMU stuff and noticed
this in passing...

Jonathan
Ben Widawsky Jan. 6, 2022, 4:58 p.m. UTC | #2
On 22-01-06 16:55:47, Jonathan Cameron wrote:
> On Fri, 22 Oct 2021 11:37:04 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > Cross host bridge verification primarily determines if the requested
> > interleave ordering can be achieved by the root decoder, which isn't as
> > programmable as other decoders.
> > 
> > The algorithm implemented here is based on the CXL Type 3 Memory Device
> > Software Guide, chapter 2.13.14
> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> >  .clang-format        |  1 +
> >  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
> >  drivers/cxl/trace.h  |  3 ++
> >  3 files changed, 84 insertions(+), 1 deletion(-)
> > 
> > diff --git a/.clang-format b/.clang-format
> > index cb7c46371465..55f628f21722 100644
> > --- a/.clang-format
> > +++ b/.clang-format
> > @@ -169,6 +169,7 @@ ForEachMacros:
> >    - 'for_each_cpu_and'
> >    - 'for_each_cpu_not'
> >    - 'for_each_cpu_wrap'
> > +  - 'for_each_cxl_decoder_target'
> >    - 'for_each_cxl_endpoint'
> >    - 'for_each_dapm_widgets'
> >    - 'for_each_dev_addr'
> > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > index d127c9c69eef..53442de33d11 100644
> > --- a/drivers/cxl/region.c
> > +++ b/drivers/cxl/region.c
> > @@ -30,6 +30,11 @@
> >  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
> >  	     idx++, ep = (region)->targets[idx])
> >  
> > +#define for_each_cxl_decoder_target(target, decoder, idx)                      \
> > +	for (idx = 0, target = (decoder)->target[idx];                         \
> > +	     idx < (decoder)->nr_targets;                                      \
> > +	     idx++, target = (decoder)->target[idx])
> > +
> target used for too many things in this macro.
> 
> I'm messing around with this to poke some of the Qemu stuff and noticed
> this in passing...
> 
> Jonathan

Thanks.

BTW, I have some rather large changes in flight. Might be good to check this
branch (I'm in force push mode):
https://gitlab.com/bwidawsk/linux/-/commits/cxl_region

Also, I have a minor QEMU change (HACK) to support multiple root ports.
https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494
Jonathan Cameron Jan. 6, 2022, 5:33 p.m. UTC | #3
On Thu, 6 Jan 2022 08:58:15 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 22-01-06 16:55:47, Jonathan Cameron wrote:
> > On Fri, 22 Oct 2021 11:37:04 -0700
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > Cross host bridge verification primarily determines if the requested
> > > interleave ordering can be achieved by the root decoder, which isn't as
> > > programmable as other decoders.
> > > 
> > > The algorithm implemented here is based on the CXL Type 3 Memory Device
> > > Software Guide, chapter 2.13.14
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > >  .clang-format        |  1 +
> > >  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
> > >  drivers/cxl/trace.h  |  3 ++
> > >  3 files changed, 84 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/.clang-format b/.clang-format
> > > index cb7c46371465..55f628f21722 100644
> > > --- a/.clang-format
> > > +++ b/.clang-format
> > > @@ -169,6 +169,7 @@ ForEachMacros:
> > >    - 'for_each_cpu_and'
> > >    - 'for_each_cpu_not'
> > >    - 'for_each_cpu_wrap'
> > > +  - 'for_each_cxl_decoder_target'
> > >    - 'for_each_cxl_endpoint'
> > >    - 'for_each_dapm_widgets'
> > >    - 'for_each_dev_addr'
> > > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > > index d127c9c69eef..53442de33d11 100644
> > > --- a/drivers/cxl/region.c
> > > +++ b/drivers/cxl/region.c
> > > @@ -30,6 +30,11 @@
> > >  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
> > >  	     idx++, ep = (region)->targets[idx])
> > >  
> > > +#define for_each_cxl_decoder_target(target, decoder, idx)                      \
> > > +	for (idx = 0, target = (decoder)->target[idx];                         \
> > > +	     idx < (decoder)->nr_targets;                                      \
> > > +	     idx++, target = (decoder)->target[idx])
> > > +  
> > target used for too many things in this macro.
> > 
> > I'm messing around with this to poke some of the Qemu stuff and noticed
> > this in passing...
> > 
> > Jonathan  
> 
> Thanks.
> 
> BTW, I have some rather large changes in flight. Might be good to check this
> branch (I'm in force push mode):
> https://gitlab.com/bwidawsk/linux/-/commits/cxl_region
> 
> Also, I have a minor QEMU change (HACK) to support multiple root ports.
> https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494
Thanks. Will take a look at both.

Mostly I'm interested in the QEMU side of things and trying to get a cleaner
command line working, but it's good to have a way to poke it and check the
CFMWS is correct etc.

Jonathan
Jonathan Cameron Jan. 6, 2022, 6:10 p.m. UTC | #4
On Thu, 6 Jan 2022 17:33:46 +0000
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 6 Jan 2022 08:58:15 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > On 22-01-06 16:55:47, Jonathan Cameron wrote:  
> > > On Fri, 22 Oct 2021 11:37:04 -0700
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >     
> > > > Cross host bridge verification primarily determines if the requested
> > > > interleave ordering can be achieved by the root decoder, which isn't as
> > > > programmable as other decoders.
> > > > 
> > > > The algorithm implemented here is based on the CXL Type 3 Memory Device
> > > > Software Guide, chapter 2.13.14
> > > > 
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > >  .clang-format        |  1 +
> > > >  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
> > > >  drivers/cxl/trace.h  |  3 ++
> > > >  3 files changed, 84 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/.clang-format b/.clang-format
> > > > index cb7c46371465..55f628f21722 100644
> > > > --- a/.clang-format
> > > > +++ b/.clang-format
> > > > @@ -169,6 +169,7 @@ ForEachMacros:
> > > >    - 'for_each_cpu_and'
> > > >    - 'for_each_cpu_not'
> > > >    - 'for_each_cpu_wrap'
> > > > +  - 'for_each_cxl_decoder_target'
> > > >    - 'for_each_cxl_endpoint'
> > > >    - 'for_each_dapm_widgets'
> > > >    - 'for_each_dev_addr'
> > > > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > > > index d127c9c69eef..53442de33d11 100644
> > > > --- a/drivers/cxl/region.c
> > > > +++ b/drivers/cxl/region.c
> > > > @@ -30,6 +30,11 @@
> > > >  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
> > > >  	     idx++, ep = (region)->targets[idx])
> > > >  
> > > > +#define for_each_cxl_decoder_target(target, decoder, idx)                      \
> > > > +	for (idx = 0, target = (decoder)->target[idx];                         \
> > > > +	     idx < (decoder)->nr_targets;                                      \
> > > > +	     idx++, target = (decoder)->target[idx])
> > > > +    
> > > target used for too many things in this macro.
> > > 
> > > I'm messing around with this to poke some of the Qemu stuff and noticed
> > > this in passing...
> > > 
> > > Jonathan    
> > 
> > Thanks.
> > 
> > BTW, I have some rather large changes in flight. Might be good to check this
> > branch (I'm in force push mode):
> > https://gitlab.com/bwidawsk/linux/-/commits/cxl_region
> > 
> > Also, I have a minor QEMU change (HACK) to support multiple root ports.
> > https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494  

If we were feeling lazy that could (I think) just be set to the maximum allowed and
be 'correct' in all cases.

> Thanks. Will take a look at both.
> 
> Mostly I'm interested in the QEMU side of things and trying to get a cleaner
> command line working but good to have a way to poke it an check the
> CFMWS is correct etc.

FYI, I'll leave feedback on your gitlab branches wherever I hit bugs.
The test setup that I'm trying to build regions on is
2 host bridges, 2 ports on each, 1 device directly connected to both.
The QEMU code will unfortunately take a bit of extracting from company internals,
so I want to get a bit further with it before going to the effort of doing that,
and I have a few other things on my todo list.

Jonathan

Ben Widawsky Jan. 6, 2022, 6:34 p.m. UTC | #5
On 22-01-06 18:10:33, Jonathan Cameron wrote:
> On Thu, 6 Jan 2022 17:33:46 +0000
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
> 
> > On Thu, 6 Jan 2022 08:58:15 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > 
> > > On 22-01-06 16:55:47, Jonathan Cameron wrote:  
> > > > On Fri, 22 Oct 2021 11:37:04 -0700
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >     
> > > > > Cross host bridge verification primarily determines if the requested
> > > > > interleave ordering can be achieved by the root decoder, which isn't as
> > > > > programmable as other decoders.
> > > > > 
> > > > > The algorithm implemented here is based on the CXL Type 3 Memory Device
> > > > > Software Guide, chapter 2.13.14
> > > > > 
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > ---
> > > > >  .clang-format        |  1 +
> > > > >  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
> > > > >  drivers/cxl/trace.h  |  3 ++
> > > > >  3 files changed, 84 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/.clang-format b/.clang-format
> > > > > index cb7c46371465..55f628f21722 100644
> > > > > --- a/.clang-format
> > > > > +++ b/.clang-format
> > > > > @@ -169,6 +169,7 @@ ForEachMacros:
> > > > >    - 'for_each_cpu_and'
> > > > >    - 'for_each_cpu_not'
> > > > >    - 'for_each_cpu_wrap'
> > > > > +  - 'for_each_cxl_decoder_target'
> > > > >    - 'for_each_cxl_endpoint'
> > > > >    - 'for_each_dapm_widgets'
> > > > >    - 'for_each_dev_addr'
> > > > > diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> > > > > index d127c9c69eef..53442de33d11 100644
> > > > > --- a/drivers/cxl/region.c
> > > > > +++ b/drivers/cxl/region.c
> > > > > @@ -30,6 +30,11 @@
> > > > >  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
> > > > >  	     idx++, ep = (region)->targets[idx])
> > > > >  
> > > > > +#define for_each_cxl_decoder_target(target, decoder, idx)                      \
> > > > > +	for (idx = 0, target = (decoder)->target[idx];                         \
> > > > > +	     idx < (decoder)->nr_targets;                                      \
> > > > > +	     idx++, target = (decoder)->target[idx])
> > > > > +    
> > > > target used for too many things in this macro.
> > > > 
> > > > I'm messing around with this to poke some of the Qemu stuff and noticed
> > > > this in passing...
> > > > 
> > > > Jonathan    
> > > 
> > > Thanks.
> > > 
> > > BTW, I have some rather large changes in flight. Might be good to check this
> > > branch (I'm in force push mode):
> > > https://gitlab.com/bwidawsk/linux/-/commits/cxl_region
> > > 
> > > Also, I have a minor QEMU change (HACK) to support multiple root ports.
> > > https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494  
> 
> If we were feeling lazy that could (I think) just be set to the maximum allowed and
> be 'correct' in all cases.
> 

Yeah. From a validation standpoint, having it be a property is nice, but I
think defaulting to the max rather than 1 is smart.

> > Thanks. Will take a look at both.
> > 
> > Mostly I'm interested in the QEMU side of things and trying to get a cleaner
> > command line working but good to have a way to poke it an check the
> > CFMWS is correct etc.
> 
> FYI. I'll leave feedback for where I'm hitting bugs on your gitlab branches.
> My test setup that I'm trying to build regions on is
> 2 host bridge, 2 ports on each, 1 device directly connected to both.
> The qemu code will unfortunately take a bit of extracting from company internals
> so I want to get a bit further with it before going the effort of doing that
> and I have a few other things on my todo list.

Okay, thanks.

Patch

diff --git a/.clang-format b/.clang-format
index cb7c46371465..55f628f21722 100644
--- a/.clang-format
+++ b/.clang-format
@@ -169,6 +169,7 @@  ForEachMacros:
   - 'for_each_cpu_and'
   - 'for_each_cpu_not'
   - 'for_each_cpu_wrap'
+  - 'for_each_cxl_decoder_target'
   - 'for_each_cxl_endpoint'
   - 'for_each_dapm_widgets'
   - 'for_each_dev_addr'
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index d127c9c69eef..53442de33d11 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -30,6 +30,11 @@ 
 	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region);  \
 	     idx++, ep = (region)->targets[idx])
 
+#define for_each_cxl_decoder_target(dport, decoder, idx)                       \
+	for (idx = 0, dport = (decoder)->target[idx];                          \
+	     idx < (decoder)->nr_targets;                                      \
+	     idx++, dport = (decoder)->target[idx])
+
 #define region_ways(region) ((region)->eniw)
 #define region_ig(region) (ilog2((region)->ig))
 
@@ -165,6 +170,28 @@  static bool qtg_match(const struct cxl_decoder *cfmws,
 	return true;
 }
 
+static int get_unique_hostbridges(const struct cxl_region *region,
+				  struct cxl_port **hbs)
+{
+	struct cxl_memdev *ep;
+	int i, hb_count = 0;
+
+	for_each_cxl_endpoint(ep, region, i) {
+		struct cxl_port *hb = get_hostbridge(ep);
+		bool found = false;
+		int j;
+
+		for (j = 0; j < hb_count; j++) {
+			if (hbs[j] == hb)
+				found = true;
+		}
+		if (!found)
+			hbs[hb_count++] = hb;
+	}
+
+	return hb_count;
+}
+
 /**
  * region_xhb_config_valid() - determine cross host bridge validity
  * @cfmws: The CFMWS to check against
@@ -178,7 +205,59 @@  static bool qtg_match(const struct cxl_decoder *cfmws,
 static bool region_xhb_config_valid(const struct cxl_region *region,
 				    const struct cxl_decoder *cfmws)
 {
-	/* TODO: */
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int cfmws_ig, i;
+	struct cxl_dport *target;
+
+	/* Are all devices in this region on the same CXL host bridge */
+	if (get_unique_hostbridges(region, hbs) == 1)
+		return true;
+
+	cfmws_ig = cfmws->interleave_granularity;
+
+	/* CFMWS.HBIG >= Device.Label.IG */
+	if (cfmws_ig < region_ig(region)) {
+		trace_xhb_valid(region,
+				"granularity does not support the region interleave granularity\n");
+		return false;
+	}
+
+	/* ((2^(CFMWS.HBIG - Device.RLabel.IG) * (2^CFMWS.ENIW)) > Device.RLabel.NLabel) */
+	if ((1 << (cfmws_ig - region_ig(region))) * (1 << cfmws->interleave_ways) >
+	    region_ways(region)) {
+		trace_xhb_valid(region,
+				"granularity to device granularity ratio requires a larger number of devices than currently configured");
+		return false;
+	}
+
+	/* Check that endpoints are hooked up in the correct order */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		struct cxl_memdev *endpoint = region->targets[i];
+
+		if (get_hostbridge(endpoint) != target->port) {
+			trace_xhb_valid(region, "device ordering bad\n");
+			return false;
+		}
+	}
+
+	/*
+	 * CFMWS.InterleaveTargetList[n] must contain all devices, x where:
+	 *	(Device[x].RegionLabel.Position >> (CFMWS.HBIG -
+	 *	Device[x].RegionLabel.InterleaveGranularity)) &
+	 *	((2^CFMWS.ENIW) - 1) = n
+	 *
+	 * Linux notes: All devices are known to have the same interleave
+	 * granularity at this point.
+	 */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		if (((i >> (cfmws_ig - region_ig(region))) &
+		     ((1 << cfmws->interleave_ways) - 1)) != target->port_id) {
+			trace_xhb_valid(region,
+					"One or more devices are not connected to the correct hostbridge.");
+			return false;
+		}
+	}
+
 	return true;
 }
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index a53f00ba5d0e..4de47d1111ac 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -38,6 +38,9 @@  DEFINE_EVENT(cxl_region_template, sanitize_failed,
 DEFINE_EVENT(cxl_region_template, allocation_failed,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, xhb_valid,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
 
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */