Message ID | 20230822014303.110509-1-alison.schofield@intel.com |
---|---|
State | Superseded |
Headers | show |
Series | [v2] cxl/region: Match auto-discovered region decoders by HPA range | expand |
On 8/21/23 18:43, alison.schofield@intel.com wrote: > From: Alison Schofield <alison.schofield@intel.com> > > Currently, when the region driver attaches a region to a port, it > selects the ports next available decoder to program. > > With the addition of auto-discovered regions, a port decoder has > already been programmed so grabbing the next available decoder can > be a mismatch when there is more than one region using the port. > > The failure appears like this with CXL DEBUG enabled: > > [] cxl_core:alloc_region_ref:754: cxl region0: endpoint9: HPA order violation region0:[mem 0x14780000000-0x1478fffffff flags 0x200] vs [mem 0x880000000-0x185fffffff flags 0x200] > [] cxl_core:cxl_port_attach_region:972: cxl region0: endpoint9: failed to allocate region reference > > When CXL DEBUG is not enabled, there is no failure message. The region > just never materializes. Users can suspect this issue if they know their > firmware has programmed decoders so that more than one region is using > a port. Note that the problem may appear intermittently, ie not on > every reboot. > > Add a matching method for auto-discovered regions that finds a decoder > based on an HPA range. The decoder range must exactly match the region > resource parameter. > > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") > Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > > Changes in v2: > - Use cxlr->params for HPA match rather than requiring cxled (Dan) > - dev_warn() if decoder already assigned to a region (Dan) > - Add failure footprint to commit log (Dan) > - Add Fixes Tag (Dan) > - v1: https://lore.kernel.org/linux-cxl/20230804213004.1669658-1-alison.schofield@intel.com/ > > drivers/cxl/core/region.c | 29 ++++++++++++++++++++++++++++- > 1 file changed, 28 insertions(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index e115ba382e04..b08aec9f0af8 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -717,13 +717,40 @@ static int match_free_decoder(struct device *dev, void *data) > return 0; > } > > +static int match_auto_decoder(struct device *dev, void *data) > +{ > + struct cxl_region_params *p = data; > + struct cxl_decoder *cxld; > + struct range *r; > + > + if (!is_switch_decoder(dev)) > + return 0; > + > + cxld = to_cxl_decoder(dev); > + r = &cxld->hpa_range; > + > + if (p->res && p->res->start == r->start && p->res->end == r->end) { > + if (cxld->region) { > + dev_WARN(dev, "decoder already attached to %s\n", > + dev_name(&cxld->region->dev)); > + return 0; > + } > + return 1; > + } > + return 0; > +} > + > static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > struct cxl_region *cxlr) > { > struct device *dev; > int id = 0; > > - dev = device_find_child(&port->dev, &id, match_free_decoder); > + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) > + dev = device_find_child(&port->dev, &cxlr->params, > + match_auto_decoder); > + else > + dev = device_find_child(&port->dev, &id, match_free_decoder); > if (!dev) > return NULL; > /* > > base-commit: fe77cc2e5a6a7c85f5c6ef8a39d7694ffc7f41c9
On Mon, 21 Aug 2023, alison.schofield@intel.com wrote: >From: Alison Schofield <alison.schofield@intel.com> > >Currently, when the region driver attaches a region to a port, it >selects the ports next available decoder to program. > >With the addition of auto-discovered regions, a port decoder has >already been programmed so grabbing the next available decoder can >be a mismatch when there is more than one region using the port. > >The failure appears like this with CXL DEBUG enabled: > >[] cxl_core:alloc_region_ref:754: cxl region0: endpoint9: HPA order violation region0:[mem 0x14780000000-0x1478fffffff flags 0x200] vs [mem 0x880000000-0x185fffffff flags 0x200] >[] cxl_core:cxl_port_attach_region:972: cxl region0: endpoint9: failed to allocate region reference > >When CXL DEBUG is not enabled, there is no failure message. The region >just never materializes. Users can suspect this issue if they know their >firmware has programmed decoders so that more than one region is using >a port. Note that the problem may appear intermittently, ie not on >every reboot. > >Add a matching method for auto-discovered regions that finds a decoder >based on an HPA range. The decoder range must exactly match the region >resource parameter. > >Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") >Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
On Mon, 21 Aug 2023 18:43:03 -0700 alison.schofield@intel.com wrote: > From: Alison Schofield <alison.schofield@intel.com> > > Currently, when the region driver attaches a region to a port, it > selects the ports next available decoder to program. > > With the addition of auto-discovered regions, a port decoder has > already been programmed so grabbing the next available decoder can > be a mismatch when there is more than one region using the port. > > The failure appears like this with CXL DEBUG enabled: > > [] cxl_core:alloc_region_ref:754: cxl region0: endpoint9: HPA order violation region0:[mem 0x14780000000-0x1478fffffff flags 0x200] vs [mem 0x880000000-0x185fffffff flags 0x200] > [] cxl_core:cxl_port_attach_region:972: cxl region0: endpoint9: failed to allocate region reference > > When CXL DEBUG is not enabled, there is no failure message. The region > just never materializes. Users can suspect this issue if they know their > firmware has programmed decoders so that more than one region is using > a port. Note that the problem may appear intermittently, ie not on > every reboot. > > Add a matching method for auto-discovered regions that finds a decoder > based on an HPA range. The decoder range must exactly match the region > resource parameter. > > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") > Signed-off-by: Alison Schofield <alison.schofield@intel.com> Bikeshed time... > --- > > Changes in v2: > - Use cxlr->params for HPA match rather than requiring cxled (Dan) > - dev_warn() if decoder already assigned to a region (Dan) > - Add failure footprint to commit log (Dan) > - Add Fixes Tag (Dan) > - v1: https://lore.kernel.org/linux-cxl/20230804213004.1669658-1-alison.schofield@intel.com/ > > drivers/cxl/core/region.c | 29 ++++++++++++++++++++++++++++- > 1 file changed, 28 insertions(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index e115ba382e04..b08aec9f0af8 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -717,13 +717,40 @@ static int match_free_decoder(struct device *dev, void *data) > return 0; > } > > +static int match_auto_decoder(struct device *dev, void *data) Does more than matching a decoder and in the warning path doesn't actually match it (not returning 1 so we carry on). So perhaps a rename? > +{ > + struct cxl_region_params *p = data; > + struct cxl_decoder *cxld; > + struct range *r; > + > + if (!is_switch_decoder(dev)) > + return 0; > + > + cxld = to_cxl_decoder(dev); > + r = &cxld->hpa_range; > + > + if (p->res && p->res->start == r->start && p->res->end == r->end) { > + if (cxld->region) { > + dev_WARN(dev, "decoder already attached to %s\n", > + dev_name(&cxld->region->dev)); > + return 0; Not obvious to me why we carry on looking for matching regions if we have found a precise match (even if it's already attached). I'd expect return 1 here. Maybe a comment on why it makes sense to keep trying? > + } > + return 1; > + } > + return 0; > +} > + > static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > struct cxl_region *cxlr) > { > struct device *dev; > int id = 0; > > - dev = device_find_child(&port->dev, &id, match_free_decoder); > + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) > + dev = device_find_child(&port->dev, &cxlr->params, > + match_auto_decoder); > + else > + dev = device_find_child(&port->dev, &id, match_free_decoder); > if (!dev) > return NULL; > /* > > base-commit: fe77cc2e5a6a7c85f5c6ef8a39d7694ffc7f41c9
On Tue, Aug 29, 2023 at 02:21:15PM +0100, Jonathan Cameron wrote: > On Mon, 21 Aug 2023 18:43:03 -0700 > alison.schofield@intel.com wrote: > > > From: Alison Schofield <alison.schofield@intel.com> > > > > Currently, when the region driver attaches a region to a port, it > > selects the ports next available decoder to program. > > > > With the addition of auto-discovered regions, a port decoder has > > already been programmed so grabbing the next available decoder can > > be a mismatch when there is more than one region using the port. > > > > The failure appears like this with CXL DEBUG enabled: > > > > [] cxl_core:alloc_region_ref:754: cxl region0: endpoint9: HPA order violation region0:[mem 0x14780000000-0x1478fffffff flags 0x200] vs [mem 0x880000000-0x185fffffff flags 0x200] > > [] cxl_core:cxl_port_attach_region:972: cxl region0: endpoint9: failed to allocate region reference > > > > When CXL DEBUG is not enabled, there is no failure message. The region > > just never materializes. Users can suspect this issue if they know their > > firmware has programmed decoders so that more than one region is using > > a port. Note that the problem may appear intermittently, ie not on > > every reboot. > > > > Add a matching method for auto-discovered regions that finds a decoder > > based on an HPA range. The decoder range must exactly match the region > > resource parameter. > > > > Fixes: a32320b71f08 ("cxl/region: Add region autodiscovery") > > Signed-off-by: Alison Schofield <alison.schofield@intel.com> > Bikeshed time... Not bikeshed, good catch - > > > --- > > > > Changes in v2: > > - Use cxlr->params for HPA match rather than requiring cxled (Dan) > > - dev_warn() if decoder already assigned to a region (Dan) > > - Add failure footprint to commit log (Dan) > > - Add Fixes Tag (Dan) > > - v1: https://lore.kernel.org/linux-cxl/20230804213004.1669658-1-alison.schofield@intel.com/ > > > > drivers/cxl/core/region.c | 29 ++++++++++++++++++++++++++++- > > 1 file changed, 28 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > > index e115ba382e04..b08aec9f0af8 100644 > > --- a/drivers/cxl/core/region.c > > +++ b/drivers/cxl/core/region.c > > @@ -717,13 +717,40 @@ static int match_free_decoder(struct device *dev, void *data) > > return 0; > > } > > > > +static int match_auto_decoder(struct device *dev, void *data) > > Does more than matching a decoder and in the warning path doesn't actually > match it (not returning 1 so we carry on). > > So perhaps a rename? I've updated to just do the match. Checking for an attached region was redundant as it's already done in the call path. Thanks for the review! > > > +{ > > + struct cxl_region_params *p = data; > > + struct cxl_decoder *cxld; > > + struct range *r; > > + > > + if (!is_switch_decoder(dev)) > > + return 0; > > + > > + cxld = to_cxl_decoder(dev); > > + r = &cxld->hpa_range; > > + > > + if (p->res && p->res->start == r->start && p->res->end == r->end) { > > + if (cxld->region) { > > + dev_WARN(dev, "decoder already attached to %s\n", > > + dev_name(&cxld->region->dev)); > > + return 0; > > Not obvious to me why we carry on looking for matching regions if we have > found a precise match (even if it's already attached). > I'd expect return 1 here. Maybe a comment on why it makes sense to keep trying? > > > + } > > + return 1; > > + } > > + return 0; > > +} > > + > > static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, > > struct cxl_region *cxlr) > > { > > struct device *dev; > > int id = 0; > > > > - dev = device_find_child(&port->dev, &id, match_free_decoder); > > + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) > > + dev = device_find_child(&port->dev, &cxlr->params, > > + match_auto_decoder); > > + else > > + dev = device_find_child(&port->dev, &id, match_free_decoder); > > if (!dev) > > return NULL; > > /* > > > > base-commit: fe77cc2e5a6a7c85f5c6ef8a39d7694ffc7f41c9 >
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index e115ba382e04..b08aec9f0af8 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -717,13 +717,40 @@ static int match_free_decoder(struct device *dev, void *data) return 0; } +static int match_auto_decoder(struct device *dev, void *data) +{ + struct cxl_region_params *p = data; + struct cxl_decoder *cxld; + struct range *r; + + if (!is_switch_decoder(dev)) + return 0; + + cxld = to_cxl_decoder(dev); + r = &cxld->hpa_range; + + if (p->res && p->res->start == r->start && p->res->end == r->end) { + if (cxld->region) { + dev_WARN(dev, "decoder already attached to %s\n", + dev_name(&cxld->region->dev)); + return 0; + } + return 1; + } + return 0; +} + static struct cxl_decoder *cxl_region_find_decoder(struct cxl_port *port, struct cxl_region *cxlr) { struct device *dev; int id = 0; - dev = device_find_child(&port->dev, &id, match_free_decoder); + if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) + dev = device_find_child(&port->dev, &cxlr->params, + match_auto_decoder); + else + dev = device_find_child(&port->dev, &id, match_free_decoder); if (!dev) return NULL; /*