diff mbox series

[v3] cxl: check decoder count for end device

Message ID 166662616015.232090.4970569004666131514.stgit@djiang5-desk3.ch.intel.com
State New, archived
Headers show
Series [v3] cxl: check decoder count for end device | expand

Commit Message

Dave Jiang Oct. 24, 2022, 3:43 p.m. UTC
CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also
indicates that for devices, only 10 decoders should be advertised. Add
check on number of decoders greater than 10 for devices and emit warning.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---

v3:
- Fix commit header and output message to reflect code change from v2.

v2:
- Remove decoder count reassignment from violation (Dan)

 drivers/cxl/core/hdm.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Jonathan Cameron Oct. 25, 2022, 11:08 a.m. UTC | #1
On Mon, 24 Oct 2022 08:43:30 -0700
Dave Jiang <dave.jiang@intel.com> wrote:

> CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also
> indicates that for devices, only 10 decoders should be advertised. Add
> check on number of decoders greater than 10 for devices and emit warning.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
I wonder... Should warning say CXL r3.0 spec violation?

Seems possible this might grow in future versions and so it
might be nice to give a hint that it 'might' be valid in a higher spec version?

I don't care strongly either way though.

FWIW
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

> ---
> 
> v3:
> - Fix commit header and output message to reflect code change from v2.
> 
> v2:
> - Remove decoder count reassignment from violation (Dan)
> 
>  drivers/cxl/core/hdm.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> index d1d2caea5c62..c3b756f93261 100644
> --- a/drivers/cxl/core/hdm.c
> +++ b/drivers/cxl/core/hdm.c
> @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL);
>  static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
>  {
>  	u32 hdm_cap;
> +	struct device *dev = &cxlhdm->port->dev;
>  
>  	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
>  	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> +	/*
> +	 * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise
> +	 * more than 10 decoders. Switches and Host Bridges may advertise up to
> +	 * 32 decoders. Set the decoders to 10 for devices if more than 10 are
> +	 * found.
> +	 */
> +	if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10)
> +		dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n",
> +			 cxlhdm->decoder_count);
>  	cxlhdm->target_count =
>  		FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
>  	if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))
> 
>
Dan Williams Oct. 25, 2022, 5:38 p.m. UTC | #2
Jonathan Cameron wrote:
> On Mon, 24 Oct 2022 08:43:30 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
> 
> > CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also
> > indicates that for devices, only 10 decoders should be advertised. Add
> > check on number of decoders greater than 10 for devices and emit warning.
> > 
> > Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> I wonder... Should warning say CXL r3.0 spec violation?
> 
> Seems possible this might grow in future versions and so it
> might be nice to give a hint that it 'might' be valid in a higher spec version?
> 
> I don't care strongly either way though.
> 
> FWIW
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> 
> > ---
> > 
> > v3:
> > - Fix commit header and output message to reflect code change from v2.
> > 
> > v2:
> > - Remove decoder count reassignment from violation (Dan)
> > 
> >  drivers/cxl/core/hdm.c |   10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> > index d1d2caea5c62..c3b756f93261 100644
> > --- a/drivers/cxl/core/hdm.c
> > +++ b/drivers/cxl/core/hdm.c
> > @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL);
> >  static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
> >  {
> >  	u32 hdm_cap;
> > +	struct device *dev = &cxlhdm->port->dev;
> >  
> >  	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> >  	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> > +	/*
> > +	 * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise
> > +	 * more than 10 decoders. Switches and Host Bridges may advertise up to
> > +	 * 32 decoders. Set the decoders to 10 for devices if more than 10 are
> > +	 * found.

Stale comment, the "Set the decoders to 10" is no longer done.

> > +	 */
> > +	if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10)
> > +		dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n",
> > +			 cxlhdm->decoder_count);

This is still too scary for how the kernel is going to handle it. I
would expect:

dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n")

...what downside are you seeing that's prompting this patch? Sometimes
Linux might want to constrain what hardware can do to avoid code
maintenance burden, but I do not see that harm here. Otherwise, there
are users that treat any log message above KERN_NOTICE priority as a
sign of a broken system. Is a platform with an endpoint with 16 decoders
broken? Also, if we do this bounds check for endpoints why are we not
doing it for bridges for the > 32 case?
Dave Jiang Oct. 25, 2022, 5:51 p.m. UTC | #3
On 10/25/2022 10:38 AM, Dan Williams wrote:
> Jonathan Cameron wrote:
>> On Mon, 24 Oct 2022 08:43:30 -0700
>> Dave Jiang <dave.jiang@intel.com> wrote:
>>
>>> CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also
>>> indicates that for devices, only 10 decoders should be advertised. Add
>>> check on number of decoders greater than 10 for devices and emit warning.
>>>
>>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>> I wonder... Should warning say CXL r3.0 spec violation?
>>
>> Seems possible this might grow in future versions and so it
>> might be nice to give a hint that it 'might' be valid in a higher spec version?
>>
>> I don't care strongly either way though.
>>
>> FWIW
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>
>>> ---
>>>
>>> v3:
>>> - Fix commit header and output message to reflect code change from v2.
>>>
>>> v2:
>>> - Remove decoder count reassignment from violation (Dan)
>>>
>>>   drivers/cxl/core/hdm.c |   10 ++++++++++
>>>   1 file changed, 10 insertions(+)
>>>
>>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
>>> index d1d2caea5c62..c3b756f93261 100644
>>> --- a/drivers/cxl/core/hdm.c
>>> +++ b/drivers/cxl/core/hdm.c
>>> @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL);
>>>   static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
>>>   {
>>>   	u32 hdm_cap;
>>> +	struct device *dev = &cxlhdm->port->dev;
>>>   
>>>   	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
>>>   	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
>>> +	/*
>>> +	 * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise
>>> +	 * more than 10 decoders. Switches and Host Bridges may advertise up to
>>> +	 * 32 decoders. Set the decoders to 10 for devices if more than 10 are
>>> +	 * found.
> Stale comment, the "Set the decoders to 10" is no longer done.
>
>>> +	 */
>>> +	if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10)
>>> +		dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n",
>>> +			 cxlhdm->decoder_count);
> This is still too scary for how the kernel is going to handle it. I
> would expect:
>
> dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n")
>
> ...what downside are you seeing that's prompting this patch? Sometimes
> Linux might want to constrain what hardware can do to avoid code
> maintenance burden, but I do not see that harm here. Otherwise, there
> are users that treat any log message above KERN_NOTICE priority as a
> sign of a broken system. Is a platform with an endpoint with 16 decoders
> broken? Also, if we do this bounds check for endpoints why are we not
> doing it for bridges for the > 32 case?

This is purely the spec changed for 3.0. Should I add check for bridges 
as well and add debug emit for both or just drop it entirely?
Dan Williams Oct. 25, 2022, 6:46 p.m. UTC | #4
Dave Jiang wrote:
> 
> On 10/25/2022 10:38 AM, Dan Williams wrote:
> > Jonathan Cameron wrote:
> >> On Mon, 24 Oct 2022 08:43:30 -0700
> >> Dave Jiang <dave.jiang@intel.com> wrote:
> >>
> >>> CXL spec rev3.0 8.2.4.19.1 added definition for up to 32 decoders. It also
> >>> indicates that for devices, only 10 decoders should be advertised. Add
> >>> check on number of decoders greater than 10 for devices and emit warning.
> >>>
> >>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> >> I wonder... Should warning say CXL r3.0 spec violation?
> >>
> >> Seems possible this might grow in future versions and so it
> >> might be nice to give a hint that it 'might' be valid in a higher spec version?
> >>
> >> I don't care strongly either way though.
> >>
> >> FWIW
> >> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> >>
> >>> ---
> >>>
> >>> v3:
> >>> - Fix commit header and output message to reflect code change from v2.
> >>>
> >>> v2:
> >>> - Remove decoder count reassignment from violation (Dan)
> >>>
> >>>   drivers/cxl/core/hdm.c |   10 ++++++++++
> >>>   1 file changed, 10 insertions(+)
> >>>
> >>> diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
> >>> index d1d2caea5c62..c3b756f93261 100644
> >>> --- a/drivers/cxl/core/hdm.c
> >>> +++ b/drivers/cxl/core/hdm.c
> >>> @@ -71,9 +71,19 @@ EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL);
> >>>   static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
> >>>   {
> >>>   	u32 hdm_cap;
> >>> +	struct device *dev = &cxlhdm->port->dev;
> >>>   
> >>>   	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
> >>>   	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
> >>> +	/*
> >>> +	 * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise
> >>> +	 * more than 10 decoders. Switches and Host Bridges may advertise up to
> >>> +	 * 32 decoders. Set the decoders to 10 for devices if more than 10 are
> >>> +	 * found.
> > Stale comment, the "Set the decoders to 10" is no longer done.
> >
> >>> +	 */
> >>> +	if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10)
> >>> +		dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n",
> >>> +			 cxlhdm->decoder_count);
> > This is still too scary for how the kernel is going to handle it. I
> > would expect:
> >
> > dev_dbg(dev, "Endpoint decoder count %d, expected max 10\n")
> >
> > ...what downside are you seeing that's prompting this patch? Sometimes
> > Linux might want to constrain what hardware can do to avoid code
> > maintenance burden, but I do not see that harm here. Otherwise, there
> > are users that treat any log message above KERN_NOTICE priority as a
> > sign of a broken system. Is a platform with an endpoint with 16 decoders
> > broken? Also, if we do this bounds check for endpoints why are we not
> > doing it for bridges for the > 32 case?
> 
> This is purely the spec changed for 3.0. Should I add check for bridges 
> as well and add debug emit for both or just drop it entirely?

I see your point about the decoders > 32 case since those are not even
defined in the 3.0 specification, however if the driver had been strict
about the 2.0 decoder count limits then 3.0 devices would start tripping
a dev_warn() unnecessarily. So I am skeptical of being noisy about spec
compliance in places that have no material affect on the behavior of the
driver. If it turned out that the decoders above 10 were broken we could
fix that with a quirk, but those decoders are otherwise avoidable via
region creation policy.

So I think just drop this, it does not add anything and ends up asking
more questions than it answers.
diff mbox series

Patch

diff --git a/drivers/cxl/core/hdm.c b/drivers/cxl/core/hdm.c
index d1d2caea5c62..c3b756f93261 100644
--- a/drivers/cxl/core/hdm.c
+++ b/drivers/cxl/core/hdm.c
@@ -71,9 +71,19 @@  EXPORT_SYMBOL_NS_GPL(devm_cxl_add_passthrough_decoder, CXL);
 static void parse_hdm_decoder_caps(struct cxl_hdm *cxlhdm)
 {
 	u32 hdm_cap;
+	struct device *dev = &cxlhdm->port->dev;
 
 	hdm_cap = readl(cxlhdm->regs.hdm_decoder + CXL_HDM_DECODER_CAP_OFFSET);
 	cxlhdm->decoder_count = cxl_hdm_decoder_count(hdm_cap);
+	/*
+	 * CXL spec rev3.0 8.2.4.19.1 indicates CXL devices shall not advertise
+	 * more than 10 decoders. Switches and Host Bridges may advertise up to
+	 * 32 decoders. Set the decoders to 10 for devices if more than 10 are
+	 * found.
+	 */
+	if (is_cxl_endpoint(cxlhdm->port) && cxlhdm->decoder_count > 10)
+		dev_warn(dev, "Endpoint decoder count (%d) > 10, spec violation!\n",
+			 cxlhdm->decoder_count);
 	cxlhdm->target_count =
 		FIELD_GET(CXL_HDM_DECODER_TARGET_COUNT_MASK, hdm_cap);
 	if (FIELD_GET(CXL_HDM_DECODER_INTERLEAVE_11_8, hdm_cap))