Message ID | 20211022183709.1199701-24-ben.widawsky@intel.com |
---|---|
State | New, archived |
Series | CXL Region Creation / HDM decoder programming |
On Fri, 22 Oct 2021 11:37:04 -0700
Ben Widawsky <ben.widawsky@intel.com> wrote:

> Cross host bridge verification primarily determines if the requested
> interleave ordering can be achieved by the root decoder, which isn't as
> programmable as other decoders.
>
> The algorithm implemented here is based on the CXL Type 3 Memory Device
> Software Guide, chapter 2.13.14
>
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
>  .clang-format        |  1 +
>  drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
>  drivers/cxl/trace.h  |  3 ++
>  3 files changed, 84 insertions(+), 1 deletion(-)
>
> diff --git a/.clang-format b/.clang-format
> index cb7c46371465..55f628f21722 100644
> --- a/.clang-format
> +++ b/.clang-format
> @@ -169,6 +169,7 @@ ForEachMacros:
>    - 'for_each_cpu_and'
>    - 'for_each_cpu_not'
>    - 'for_each_cpu_wrap'
> +  - 'for_each_cxl_decoder_target'
>    - 'for_each_cxl_endpoint'
>    - 'for_each_dapm_widgets'
>    - 'for_each_dev_addr'
> diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
> index d127c9c69eef..53442de33d11 100644
> --- a/drivers/cxl/region.c
> +++ b/drivers/cxl/region.c
> @@ -30,6 +30,11 @@
>  	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region); \
>  	     idx++, ep = (region)->targets[idx])
>
> +#define for_each_cxl_decoder_target(target, decoder, idx) \
> +	for (idx = 0, target = (decoder)->target[idx]; \
> +	     idx < (decoder)->nr_targets; \
> +	     idx++, target = (decoder)->target[idx])
> +

target used for too many things in this macro.

I'm messing around with this to poke some of the Qemu stuff and noticed
this in passing...

Jonathan
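One reading of the comment above: in the posted macro, `target` is both the cursor parameter and the name of the decoder's member array, so the preprocessor rewrites `(decoder)->target[idx]` with whatever cursor name the caller passes. A caller using a cursor called `dport` would end up with `(decoder)->dport[idx]`. The macro also reloads the cursor after the final increment, which may read one slot past the last valid target. The following is only a sketch of one way to avoid both; `struct dport`, `struct decoder`, `for_each_decoder_target` and `sum_port_ids` are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, for illustration only. */
struct dport {
	int port_id;
};

struct decoder {
	int nr_targets;
	struct dport *target[4];
};

/*
 * Sketch: the cursor parameter is named "dp", so it cannot collide with the
 * ->target member, and the element is only loaded while idx is in range.
 */
#define for_each_decoder_target(dp, dec, idx)                              \
	for ((idx) = 0;                                                    \
	     (idx) < (dec)->nr_targets && ((dp) = (dec)->target[idx], 1); \
	     (idx)++)

/* Example caller: the cursor may have any name, here "dport". */
static int sum_port_ids(const struct decoder *dec)
{
	struct dport *dport;
	int i, sum = 0;

	for_each_decoder_target(dport, dec, i)
		sum += dport->port_id;

	return sum;
}
```

With the posted macro, the caller above would not compile, because `struct decoder` has no member named `dport`.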
On 22-01-06 16:55:47, Jonathan Cameron wrote:
> On Fri, 22 Oct 2021 11:37:04 -0700
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > +#define for_each_cxl_decoder_target(target, decoder, idx) \
> > +	for (idx = 0, target = (decoder)->target[idx]; \
> > +	     idx < (decoder)->nr_targets; \
> > +	     idx++, target = (decoder)->target[idx])
> > +
> target used for too many things in this macro.
>
> I'm messing around with this to poke some of the Qemu stuff and noticed
> this in passing...
>
> Jonathan

Thanks.

BTW, I have some rather large changes in flight. Might be good to check this
branch (I'm in force push mode):
https://gitlab.com/bwidawsk/linux/-/commits/cxl_region

Also, I have a minor QEMU change (HACK) to support multiple root ports.
https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494
On Thu, 6 Jan 2022 08:58:15 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 22-01-06 16:55:47, Jonathan Cameron wrote:
> > target used for too many things in this macro.
> >
> > I'm messing around with this to poke some of the Qemu stuff and noticed
> > this in passing...
> >
> > Jonathan
>
> Thanks.
>
> BTW, I have some rather large changes in flight. Might be good to check this
> branch (I'm in force push mode):
> https://gitlab.com/bwidawsk/linux/-/commits/cxl_region
>
> Also, I have a minor QEMU change (HACK) to support multiple root ports.
> https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494

Thanks. Will take a look at both.

Mostly I'm interested in the QEMU side of things and trying to get a cleaner
command line working, but it's good to have a way to poke it and check the
CFMWS is correct etc.

Jonathan
On Thu, 6 Jan 2022 17:33:46 +0000
Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:

> On Thu, 6 Jan 2022 08:58:15 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > BTW, I have some rather large changes in flight. Might be good to check this
> > branch (I'm in force push mode):
> > https://gitlab.com/bwidawsk/linux/-/commits/cxl_region
> >
> > Also, I have a minor QEMU change (HACK) to support multiple root ports.
> > https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494

If we were feeling lazy that could (I think) just be set to the maximum allowed and
be 'correct' in all cases.

> Thanks. Will take a look at both.
>
> Mostly I'm interested in the QEMU side of things and trying to get a cleaner
> command line working, but it's good to have a way to poke it and check the
> CFMWS is correct etc.

FYI. I'll leave feedback for where I'm hitting bugs on your gitlab branches.
My test setup that I'm trying to build regions on is
2 host bridges, 2 ports on each, 1 device directly connected to both.
The qemu code will unfortunately take a bit of extracting from company internals,
so I want to get a bit further with it before going to the effort of doing that,
and I have a few other things on my todo list.

Jonathan
On 22-01-06 18:10:33, Jonathan Cameron wrote:
> On Thu, 6 Jan 2022 17:33:46 +0000
> Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> > On Thu, 6 Jan 2022 08:58:15 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > > Also, I have a minor QEMU change (HACK) to support multiple root ports.
> > > https://gitlab.com/bwidawsk/qemu/-/commit/7c76849f9a4d2bc5fc9c355ed06ea926fc7ab494
>
> If we were feeling lazy that could (I think) just be set to the maximum allowed and
> be 'correct' in all cases.
>

Yeah. From a validation standpoint, having it be a prop is nice, but I think
defaulting to max rather than 1 is smart.

> > Thanks. Will take a look at both.
> >
> > Mostly I'm interested in the QEMU side of things and trying to get a cleaner
> > command line working, but it's good to have a way to poke it and check the
> > CFMWS is correct etc.
>
> FYI. I'll leave feedback for where I'm hitting bugs on your gitlab branches.
> My test setup that I'm trying to build regions on is
> 2 host bridges, 2 ports on each, 1 device directly connected to both.
> The qemu code will unfortunately take a bit of extracting from company internals,
> so I want to get a bit further with it before going to the effort of doing that,
> and I have a few other things on my todo list.

Okay, thanks.
diff --git a/.clang-format b/.clang-format
index cb7c46371465..55f628f21722 100644
--- a/.clang-format
+++ b/.clang-format
@@ -169,6 +169,7 @@ ForEachMacros:
   - 'for_each_cpu_and'
   - 'for_each_cpu_not'
   - 'for_each_cpu_wrap'
+  - 'for_each_cxl_decoder_target'
   - 'for_each_cxl_endpoint'
   - 'for_each_dapm_widgets'
   - 'for_each_dev_addr'
diff --git a/drivers/cxl/region.c b/drivers/cxl/region.c
index d127c9c69eef..53442de33d11 100644
--- a/drivers/cxl/region.c
+++ b/drivers/cxl/region.c
@@ -30,6 +30,11 @@
 	for (idx = 0, ep = (region)->targets[idx]; idx < region_ways(region); \
 	     idx++, ep = (region)->targets[idx])
 
+#define for_each_cxl_decoder_target(target, decoder, idx) \
+	for (idx = 0, target = (decoder)->target[idx]; \
+	     idx < (decoder)->nr_targets; \
+	     idx++, target = (decoder)->target[idx])
+
 #define region_ways(region) ((region)->eniw)
 #define region_ig(region) (ilog2((region)->ig))
 
@@ -165,6 +170,28 @@ static bool qtg_match(const struct cxl_decoder *cfmws,
 	return true;
 }
 
+static int get_unique_hostbridges(const struct cxl_region *region,
+				  struct cxl_port **hbs)
+{
+	struct cxl_memdev *ep;
+	int i, hb_count = 0;
+
+	for_each_cxl_endpoint(ep, region, i) {
+		struct cxl_port *hb = get_hostbridge(ep);
+		bool found = false;
+		int j;
+
+		for (j = 0; j < hb_count; j++) {
+			if (hbs[j] == hb)
+				found = true;
+		}
+		if (!found)
+			hbs[hb_count++] = hb;
+	}
+
+	return hb_count;
+}
+
 /**
  * region_xhb_config_valid() - determine cross host bridge validity
  * @cfmws: The CFMWS to check against
@@ -178,7 +205,59 @@ static bool qtg_match(const struct cxl_decoder *cfmws,
 static bool region_xhb_config_valid(const struct cxl_region *region,
 				    const struct cxl_decoder *cfmws)
 {
-	/* TODO: */
+	struct cxl_port *hbs[CXL_DECODER_MAX_INTERLEAVE];
+	int cfmws_ig, i;
+	struct cxl_dport *target;
+
+	/* Are all devices in this region on the same CXL host bridge */
+	if (get_unique_hostbridges(region, hbs) == 1)
+		return true;
+
+	cfmws_ig = cfmws->interleave_granularity;
+
+	/* CFMWS.HBIG >= Device.Label.IG */
+	if (cfmws_ig < region_ig(region)) {
+		trace_xhb_valid(region,
+				"granularity does not support the region interleave granularity\n");
+		return false;
+	}
+
+	/* ((2^(CFMWS.HBIG - Device.RLabel.IG) * (2^CFMWS.ENIW)) > Device.RLabel.NLabel) */
+	if (1 << (cfmws_ig - region_ig(region)) * (1 << cfmws->interleave_ways) >
+	    region_ways(region)) {
+		trace_xhb_valid(region,
+				"granularity to device granularity ratio requires a larger number of devices than currently configured");
+		return false;
+	}
+
+	/* Check that endpoints are hooked up in the correct order */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		struct cxl_memdev *endpoint = region->targets[i];
+
+		if (get_hostbridge(endpoint) != target->port) {
+			trace_xhb_valid(region, "device ordering bad\n");
+			return false;
+		}
+	}
+
+	/*
+	 * CFMWS.InterleaveTargetList[n] must contain all devices, x where:
+	 *	(Device[x],RegionLabel.Position >> (CFMWS.HBIG -
+	 *	Device[x].RegionLabel.InterleaveGranularity)) &
+	 *	((2^CFMWS.ENIW) - 1) = n
+	 *
+	 * Linux notes: All devices are known to have the same interleave
+	 * granularity at this point.
+	 */
+	for_each_cxl_decoder_target(target, cfmws, i) {
+		if (((i >> (cfmws_ig - region_ig(region)))) &
+		    (((1 << cfmws->interleave_ways) - 1) != target->port_id)) {
+			trace_xhb_valid(region,
+					"One or more devices are not connected to the correct hostbridge.");
+			return false;
+		}
+	}
+
 	return true;
 }
 
diff --git a/drivers/cxl/trace.h b/drivers/cxl/trace.h
index a53f00ba5d0e..4de47d1111ac 100644
--- a/drivers/cxl/trace.h
+++ b/drivers/cxl/trace.h
@@ -38,6 +38,9 @@ DEFINE_EVENT(cxl_region_template, sanitize_failed,
 DEFINE_EVENT(cxl_region_template, allocation_failed,
 	     TP_PROTO(const struct cxl_region *region, char *status),
 	     TP_ARGS(region, status));
+DEFINE_EVENT(cxl_region_template, xhb_valid,
+	     TP_PROTO(const struct cxl_region *region, char *status),
+	     TP_ARGS(region, status));
 
 #endif /* if !defined (__CXL_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ) */
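A note on two expressions in `region_xhb_config_valid()` above, offered as a reading of the C precedence rules rather than as the author's intent. In C, `*` binds tighter than `<<`, and `<<` binds tighter than `>`, so the ratio check does not group the way its comment is written. In the final loop the `!=` sits inside the parentheses with the mask, so the shifted index is ANDed with a 0/1 comparison result. The sketch below puts each expression next to the grouping that the accompanying comment describes. All function names are hypothetical, and the parameters are plain integers standing in for the fields used in the patch.

```c
#include <stdbool.h>

/* The ratio check as the compiler parses it: 1 << (diff * 2^eniw). */
static bool ratio_check_as_written(int hbig, int ig, int eniw, int nlabel)
{
	return (1 << ((hbig - ig) * (1 << eniw))) > nlabel;
}

/* The ratio check as the comment states it: 2^(hbig - ig) * 2^eniw. */
static bool ratio_check_as_commented(int hbig, int ig, int eniw, int nlabel)
{
	return ((1 << (hbig - ig)) * (1 << eniw)) > nlabel;
}

/* The final check as parsed: shifted position AND (mask != port_id). */
static bool mismatch_as_written(int pos, int shift, int eniw, int port_id)
{
	return (pos >> shift) & (((1 << eniw) - 1) != port_id);
}

/* The final check as the comment states it: (position & mask) != n. */
static bool mismatch_as_commented(int pos, int shift, int eniw, int port_id)
{
	return ((pos >> shift) & ((1 << eniw) - 1)) != port_id;
}
```

For example, with `hbig - ig = 2`, `eniw = 1` and `nlabel = 8`, the first form compares 16 against 8 while the second compares 8 against 8, so the two disagree. Whether `interleave_ways` holds the encoded ENIW or a raw way count is not visible in the diff, so the sketch treats it as the exponent that the comment uses.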
Cross host bridge verification primarily determines if the requested
interleave ordering can be achieved by the root decoder, which isn't as
programmable as other decoders.

The algorithm implemented here is based on the CXL Type 3 Memory Device
Software Guide, chapter 2.13.14

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
 .clang-format        |  1 +
 drivers/cxl/region.c | 81 +++++++++++++++++++++++++++++++++++++++++++-
 drivers/cxl/trace.h  |  3 ++
 3 files changed, 84 insertions(+), 1 deletion(-)
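As a worked illustration of the ordering rule quoted in the patch's final code comment (`(Position >> (HBIG - IG)) & ((2^ENIW) - 1) = n`), the sketch below computes which root decoder target list entry a given region position should fall under. It assumes that HBIG and IG are exponents in the same encoding and that ENIW is the log2 of the number of host bridge ways. The function name is hypothetical and does not appear in the patch.

```c
/*
 * Sketch: index n of the CFMWS interleave target list entry that a device
 * at region position "position" is expected to sit behind.
 * Assumes hbig >= ig, which the patch checks before this rule is applied.
 */
static unsigned int expected_target_index(unsigned int position,
					  unsigned int hbig,
					  unsigned int ig,
					  unsigned int eniw)
{
	unsigned int mask = (1u << eniw) - 1;

	return (position >> (hbig - ig)) & mask;
}
```

With two host bridges (`eniw = 1`) and a host bridge granularity one step coarser than the device granularity (`hbig - ig = 1`), positions 0 and 1 map to target 0, positions 2 and 3 map to target 1, and position 4 wraps back to target 0.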