
[RFC,0/3] lspci: Display cxl1.1 device link status

Message ID: 20231220050738.178481-1-kobayashi.da-06@fujitsu.com

Message

Daisuke Kobayashi (Fujitsu) Dec. 20, 2023, 5:07 a.m. UTC
Hello.

This patch series adds a feature to lspci that displays the link status
of CXL 1.1 devices.

CXL devices are extensions of PCIe, so from CXL 2.0 onwards the link
status can be output in the same way as for traditional PCIe. CXL 1.1
devices, however, require a different method than traditional PCIe to
obtain the link status: their link status is not mapped into
configuration space (CXL 3.0 specification, Section 8.1). Instead, the
registers that hold the link status are mapped into a memory-mapped
register region, the RCRB (CXL 3.0 specification, Section 8.2,
Table 8-18). As a result, the current lspci does not display the link
status of CXL 1.1 devices. This patch series fixes that.

The link status is acquired in three steps: obtain the device UID,
obtain the RCRB base address from the ACPI CEDT, and read the link
status from the memory-mapped register region (a sketch of the CEDT
step follows below). We considered outputting this information through
the cxl command instead, since it is defined by the CXL specification,
but devices from CXL 2.0 onwards already report their link status
through lspci just like traditional PCIe devices, so for consistency it
is better to make lspci handle CXL 1.1 devices as well.
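
To make the CEDT step concrete, here is a minimal user-space sketch of
finding the RCRB base for a given host bridge UID by walking the CEDT's
CHBS records. The field layout follows the ACPI CEDT CHBS definition,
but treat the offsets, the table location, and the helper name as
assumptions to verify rather than a definitive implementation:

#include <stdint.h>
#include <string.h>

#define ACPI_HDR_LEN   36	/* standard ACPI table header */
#define CEDT_TYPE_CHBS  0	/* CXL Host Bridge Structure */

struct cedt_chbs {		/* layout per the ACPI CEDT CHBS */
	uint8_t  type;		/* 0 = CHBS */
	uint8_t  reserved;
	uint16_t record_length;
	uint32_t uid;		/* host bridge _UID */
	uint32_t cxl_version;	/* 0 = CXL 1.1 (RCRB present) */
	uint32_t reserved2;
	uint64_t base;		/* RCRB base physical address */
	uint64_t length;	/* 8 KiB for a CXL 1.1 RCRB */
} __attribute__((packed));

/* Walk a CEDT image (e.g. read from /sys/firmware/acpi/tables/CEDT)
 * and return the RCRB base for the bridge with the given UID. */
static uint64_t find_rcrb_base(const uint8_t *cedt, size_t len,
			       uint32_t uid)
{
	size_t off = ACPI_HDR_LEN;

	while (off + sizeof(struct cedt_chbs) <= len) {
		struct cedt_chbs rec;

		memcpy(&rec, cedt + off, sizeof(rec)); /* unaligned-safe */
		/* ACPI fields are little-endian; a portable reader
		 * converts with le32toh()/le64toh() on big-endian hosts. */
		if (rec.type == CEDT_TYPE_CHBS && rec.uid == uid &&
		    rec.cxl_version == 0)
			return rec.base;
		off += rec.record_length ? rec.record_length : sizeof(rec);
	}
	return 0;	/* not found */
}

The 8 KiB RCRB then holds the downstream-port registers in its first
4 KiB, where the PCIe capability containing the Link Status register
can be located through the usual capability list.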

I look forward to any comments you may have.

KobayashiDaisuke (3):
  Add function to display cxl1.1 device link status
  Implement a function to get cxl1.1 device uid
  Implement a function to get a RCRB Base address

 ls-caps.c | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 lspci.h   |  35 +++++++++
 2 files changed, 251 insertions(+)

Comments

Jonathan Cameron Jan. 9, 2024, 3:57 p.m. UTC | #1
On Wed, 20 Dec 2023 14:07:35 +0900
KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:

> Hello.
> 
> This patch series adds a feature to lspci that displays the link status
> of the CXL1.1 device.
> 
> CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> the link status can be output in the same way as traditional PCIe.
> However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> different method to obtain the link status from traditional PCIe.
> This is because the link status of the CXL1.1 device is not mapped
> in the configuration space (as per cxl3.0 specification 8.1).
> Instead, the configuration space containing the link status is mapped
> to the memory mapped register region (as per cxl3.0 specification 8.2,
> Table 8-18). Therefore, the current lspci has a problem where it does
> not display the link status of the CXL1.1 device. 
> This patch solves these issues.
> 
> The method of acquisition is in the order of obtaining the device UID,
> obtaining the base address from CEDT, and then obtaining the link
> status from memory mapped register. Considered outputting with the cxl
> command due to the scope of the CXL specification, but devices from
> CXL2.0 onwards can be output in the same way as traditional PCIe.
> Therefore, it would be better to make the lspci command compatible with
> the CXL1.1 device for compatibility reasons.
> 
> I look forward to any comments you may have.
Yikes. 

My gut feeling is that you shouldn't need to do this level of hackery.

If we need this information to be exposed to tooling then we should
add support to the kernel to export it somewhere in sysfs and read that
directly.  Do we need it to be available in the absence of the CXL
driver stack?

Jonathan
> 
> KobayashiDaisuke (3):
>   Add function to display cxl1.1 device link status
>   Implement a function to get cxl1.1 device uid
>   Implement a function to get a RCRB Base address
> 
>  ls-caps.c | 216 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  lspci.h   |  35 +++++++++
>  2 files changed, 251 insertions(+)
>
Dan Williams Jan. 11, 2024, 1:11 a.m. UTC | #2
Jonathan Cameron wrote:
> On Wed, 20 Dec 2023 14:07:35 +0900
> KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
> 
> > Hello.
> > 
> > This patch series adds a feature to lspci that displays the link status
> > of the CXL1.1 device.
> > 
> > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > the link status can be output in the same way as traditional PCIe.
> > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > different method to obtain the link status from traditional PCIe.
> > This is because the link status of the CXL1.1 device is not mapped
> > in the configuration space (as per cxl3.0 specification 8.1).
> > Instead, the configuration space containing the link status is mapped
> > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > Table 8-18). Therefore, the current lspci has a problem where it does
> > not display the link status of the CXL1.1 device. 
> > This patch solves these issues.
> > 
> > The method of acquisition is in the order of obtaining the device UID,
> > obtaining the base address from CEDT, and then obtaining the link
> > status from memory mapped register. Considered outputting with the cxl
> > command due to the scope of the CXL specification, but devices from
> > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > Therefore, it would be better to make the lspci command compatible with
> > the CXL1.1 device for compatibility reasons.
> > 
> > I look forward to any comments you may have.
> Yikes. 
> 
> My gut feeling is that you shouldn't need to do this level of hackery.
> 
> If we need this information to be exposed to tooling then we should
> add support to the kernel to export it somewhere in sysfs and read that
> directly.  Do we need it to be available in absence of the CXL driver
> stack? 

I am hoping that's a non-goal, if only because it makes it more
difficult for the kernel to provide some help here without polluting
the PCI core.

To date, RCRB handling is nothing that the PCI core needs to worry
about, and I am not sure I want to open that box.

I am wondering whether an approach like the one below is sufficient for lspci.

The idea here is that cxl_pci (or another PCI driver for Type-2 RCDs) can
opt in to publishing these hidden registers.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 4fd1f207c84e..ee63dff63b68 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
        .cor_error_detected     = cxl_cor_error_detected,
 };
 
+static struct attribute *cxl_rcd_attrs[] = {
+       &dev_attr_rcd_lnkcap.attr,
+       &dev_attr_rcd_lnkctl.attr,
+       NULL
+};
+
+static struct attribute_group cxl_rcd_group = {
+       .attrs = cxl_rcd_attrs,
+       .is_visible = cxl_rcd_visible,
+};
+
+__ATTRIBUTE_GROUPS(cxl_rcd);
+
 static struct pci_driver cxl_pci_driver = {
        .name                   = KBUILD_MODNAME,
        .id_table               = cxl_mem_pci_tbl,
@@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
        .err_handler            = &cxl_error_handlers,
        .driver = {
                .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
+               .dev_groups     = cxl_rcd_groups,
        },
 };
 
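The dev_attr_rcd_lnkcap/rcd_lnkctl attributes referenced above are not
defined in the hunk. A minimal sketch of what one show() callback might
look like, assuming a hypothetical CXL-core helper cxl_rcrb_read_reg()
that performs the RCRB-backed read of the hidden capability registers:

/* Sketch only: cxl_rcrb_read_reg() is a hypothetical helper, not an
 * existing kernel API. */
static ssize_t rcd_lnkcap_show(struct device *dev,
			       struct device_attribute *attr, char *buf)
{
	struct pci_dev *pdev = to_pci_dev(dev);
	u32 lnkcap;

	if (cxl_rcrb_read_reg(pdev, PCI_EXP_LNKCAP, &lnkcap))
		return -ENXIO;

	return sysfs_emit(buf, "0x%08x\n", lnkcap);
}
static DEVICE_ATTR_RO(rcd_lnkcap);

rcd_lnkctl_show() would follow the same pattern with PCI_EXP_LNKCTL.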

However, the problem, I believe, is that this will end up with:

/sys/bus/pci/devices/$pdev/rcd_lnkcap
/sys/bus/pci/devices/$pdev/rcd_lnkctl

...with valid values, but attributes like:

/sys/bus/pci/devices/$pdev/current_link_speed

...returning -EINVAL.

So I think the options are:

1/ Keep the status quo where RCRB knowledge lives only in drivers/cxl/,
   and piecemeal-enable specific lspci needs with RCD-specific attributes

...or:

2/ Hack pcie_capability_read_word() to figure out internally, based on
   a config offset, that a device may have a hidden capability, and switch
   over to RCRB-based config-cycle access for those.

Given that the CXL 1.1 RCH topology concept was immediately deprecated
in favor of VH topology in CXL 2.0, I am not inclined to pollute the
general Linux PCI core with that "aberration of history" as it were.
Jonathan Cameron Jan. 12, 2024, 11:24 a.m. UTC | #3
On Wed, 10 Jan 2024 17:11:38 -0800
Dan Williams <dan.j.williams@intel.com> wrote:

> Jonathan Cameron wrote:
> > On Wed, 20 Dec 2023 14:07:35 +0900
> > KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
> >   
> > > Hello.
> > > 
> > > This patch series adds a feature to lspci that displays the link status
> > > of the CXL1.1 device.
> > > 
> > > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > > the link status can be output in the same way as traditional PCIe.
> > > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > > different method to obtain the link status from traditional PCIe.
> > > This is because the link status of the CXL1.1 device is not mapped
> > > in the configuration space (as per cxl3.0 specification 8.1).
> > > Instead, the configuration space containing the link status is mapped
> > > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > > Table 8-18). Therefore, the current lspci has a problem where it does
> > > not display the link status of the CXL1.1 device. 
> > > This patch solves these issues.
> > > 
> > > The method of acquisition is in the order of obtaining the device UID,
> > > obtaining the base address from CEDT, and then obtaining the link
> > > status from memory mapped register. Considered outputting with the cxl
> > > command due to the scope of the CXL specification, but devices from
> > > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > > Therefore, it would be better to make the lspci command compatible with
> > > the CXL1.1 device for compatibility reasons.
> > > 
> > > I look forward to any comments you may have.  
> > Yikes. 
> > 
> > My gut feeling is that you shouldn't need to do this level of hackery.
> > 
> > If we need this information to be exposed to tooling then we should
> > add support to the kernel to export it somewhere in sysfs and read that
> > directly.  Do we need it to be available in absence of the CXL driver
> > stack?   
> 
> I am hoping that's a non-goal if only because that makes it more
> difficult for the kernel to provide some help here without polluting to
> the PCI core.
> 
> To date, RCRB handling is nothing that the PCI core needs to worry
> about, and I am not sure I want to open that box.
> 
> I am wondering about an approach like below is sufficient for lspci.
> 
> The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> opt-in to publishing these hidden registers.
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 4fd1f207c84e..ee63dff63b68 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -960,6 +960,19 @@ static const struct pci_error_handlers cxl_error_handlers = {
>         .cor_error_detected     = cxl_cor_error_detected,
>  };
>  
> +static struct attribute *cxl_rcd_attrs[] = {
> +       &dev_attr_rcd_lnkcp.attr,
> +       &dev_attr_rcd_lnkctl.attr,
> +       NULL
> +};
> +
> +static struct attribute_group cxl_rcd_group = {
> +       .attrs = cxl_rcd_attrs,
> +       .is_visible = cxl_rcd_visible,
> +};
> +
> +__ATTRIBUTE_GROUPS(cxl_pci);
> +
>  static struct pci_driver cxl_pci_driver = {
>         .name                   = KBUILD_MODNAME,
>         .id_table               = cxl_mem_pci_tbl,
> @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
>         .err_handler            = &cxl_error_handlers,
>         .driver = {
>                 .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
> +               .dev_groups     = cxl_rcd_groups,
>         },
>  };
>  
> 
> However, the problem I believe is this will end up with:
> 
> /sys/bus/pci/devices/$pdev/rcd_lnkcap
> /sys/bus/pci/devices/$pdev/rcd_lnkctl
> 
> ...with valid values, but attributes like:
> 
> /sys/bus/pci/devices/$pdev/current_link_speed
> 
> ...returning -EINVAL.
> 
> So I think the options are:
> 
> 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
>    piecemeal enable specific lspci needs with RCD-specific attributes

This one gets my vote.

> 
> ...or:
> 
> 2/ Hack pcie_capability_read_word() to internally figure out that based
>    on a config offset a device may have a hidden capability and switch over
>    to RCRB based config-cycle access for those.
> 
> Given that the CXL 1.1 RCH topology concept was immediately deprecated
> in favor of VH topology in CXL 2.0, I am not inclined to pollute the
> general Linux PCI core with that "aberration of history" as it were.
Agreed.
Daisuke Kobayashi (Fujitsu) Jan. 15, 2024, 9:09 a.m. UTC | #4
> -----Original Message-----
> From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Sent: Friday, January 12, 2024 8:24 PM
> To: Dan Williams <dan.j.williams@intel.com>
> Cc: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>;
> linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org; Gotou, Yasunori/五島 康
> 文 <y-goto@fujitsu.com>
> Subject: Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
> 
> On Wed, 10 Jan 2024 17:11:38 -0800
> Dan Williams <dan.j.williams@intel.com> wrote:
> 
> > Jonathan Cameron wrote:
> > > On Wed, 20 Dec 2023 14:07:35 +0900
> > > KobayashiDaisuke <kobayashi.da-06@fujitsu.com> wrote:
> > >
> > > > Hello.
> > > >
> > > > This patch series adds a feature to lspci that displays the link status
> > > > of the CXL1.1 device.
> > > >
> > > > CXL devices are extensions of PCIe. Therefore, from CXL2.0 onwards,
> > > > the link status can be output in the same way as traditional PCIe.
> > > > However, unlike devices from CXL2.0 onwards, CXL1.1 requires a
> > > > different method to obtain the link status from traditional PCIe.
> > > > This is because the link status of the CXL1.1 device is not mapped
> > > > in the configuration space (as per cxl3.0 specification 8.1).
> > > > Instead, the configuration space containing the link status is mapped
> > > > to the memory mapped register region (as per cxl3.0 specification 8.2,
> > > > Table 8-18). Therefore, the current lspci has a problem where it does
> > > > not display the link status of the CXL1.1 device.
> > > > This patch solves these issues.
> > > >
> > > > The method of acquisition is in the order of obtaining the device UID,
> > > > obtaining the base address from CEDT, and then obtaining the link
> > > > status from memory mapped register. Considered outputting with the cxl
> > > > command due to the scope of the CXL specification, but devices from
> > > > CXL2.0 onwards can be output in the same way as traditional PCIe.
> > > > Therefore, it would be better to make the lspci command compatible with
> > > > the CXL1.1 device for compatibility reasons.
> > > >
> > > > I look forward to any comments you may have.
> > > Yikes.
> > >
> > > My gut feeling is that you shouldn't need to do this level of hackery.
> > >
> > > If we need this information to be exposed to tooling then we should
> > > add support to the kernel to export it somewhere in sysfs and read that
> > > directly.  Do we need it to be available in absence of the CXL driver
> > > stack?
> >
> > I am hoping that's a non-goal if only because that makes it more
> > difficult for the kernel to provide some help here without polluting to
> > the PCI core.
> >
> > To date, RCRB handling is nothing that the PCI core needs to worry
> > about, and I am not sure I want to open that box.
> >
> > I am wondering about an approach like below is sufficient for lspci.
> >
> > The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> > opt-in to publishing these hidden registers.
> >
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 4fd1f207c84e..ee63dff63b68 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -960,6 +960,19 @@ static const struct pci_error_handlers
> cxl_error_handlers = {
> >         .cor_error_detected     = cxl_cor_error_detected,
> >  };
> >
> > +static struct attribute *cxl_rcd_attrs[] = {
> > +       &dev_attr_rcd_lnkcp.attr,
> > +       &dev_attr_rcd_lnkctl.attr,
> > +       NULL
> > +};
> > +
> > +static struct attribute_group cxl_rcd_group = {
> > +       .attrs = cxl_rcd_attrs,
> > +       .is_visible = cxl_rcd_visible,
> > +};
> > +
> > +__ATTRIBUTE_GROUPS(cxl_pci);
> > +
> >  static struct pci_driver cxl_pci_driver = {
> >         .name                   = KBUILD_MODNAME,
> >         .id_table               = cxl_mem_pci_tbl,
> > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> >         .err_handler            = &cxl_error_handlers,
> >         .driver = {
> >                 .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
> > +               .dev_groups     = cxl_rcd_groups,
> >         },
> >  };
> >
> >
> > However, the problem I believe is this will end up with:
> >
> > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> >
> > ...with valid values, but attributes like:
> >
> > /sys/bus/pci/devices/$pdev/current_link_speed
> >
> > ...returning -EINVAL.
> >
> > So I think the options are:
> >
> > 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> >    piecemeal enable specific lspci needs with RCD-specific attributes
> 
> This one gets my vote.

Thank you for your feedback.
Like Dan, I also believe that implementing this feature in the kernel may
not be appropriate, since it is only needed for CXL 1.1 devices.
Therefore, my understanding is that it would be better to implement
the link-status handling for CXL 1.1 devices directly in lspci.
Please tell me if my understanding is wrong.

> 
> >
> > ...or:
> >
> > 2/ Hack pcie_capability_read_word() to internally figure out that based
> >    on a config offset a device may have a hidden capability and switch over
> >    to RCRB based config-cycle access for those.
> >
> > Given that the CXL 1.1 RCH topology concept was immediately deprecated
> > in favor of VH topology in CXL 2.0, I am not inclined to pollute the
> > general Linux PCI core with that "aberration of history" as it were.
> Agreed.
>
Dan Williams Jan. 16, 2024, 9:29 p.m. UTC | #5
Daisuke Kobayashi (Fujitsu) wrote:
> > > I am wondering about an approach like below is sufficient for lspci.
> > >
> > > The idea here is that cxl_pci (or other PCI driver for Type-2 RCDs) can
> > > opt-in to publishing these hidden registers.
> > >
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 4fd1f207c84e..ee63dff63b68 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -960,6 +960,19 @@ static const struct pci_error_handlers
> > cxl_error_handlers = {
> > >         .cor_error_detected     = cxl_cor_error_detected,
> > >  };
> > >
> > > +static struct attribute *cxl_rcd_attrs[] = {
> > > +       &dev_attr_rcd_lnkcp.attr,
> > > +       &dev_attr_rcd_lnkctl.attr,
> > > +       NULL
> > > +};
> > > +
> > > +static struct attribute_group cxl_rcd_group = {
> > > +       .attrs = cxl_rcd_attrs,
> > > +       .is_visible = cxl_rcd_visible,
> > > +};
> > > +
> > > +__ATTRIBUTE_GROUPS(cxl_pci);
> > > +
> > >  static struct pci_driver cxl_pci_driver = {
> > >         .name                   = KBUILD_MODNAME,
> > >         .id_table               = cxl_mem_pci_tbl,
> > > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> > >         .err_handler            = &cxl_error_handlers,
> > >         .driver = {
> > >                 .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
> > > +               .dev_groups     = cxl_rcd_groups,
> > >         },
> > >  };
> > >
> > >
> > > However, the problem I believe is this will end up with:
> > >
> > > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> > >
> > > ...with valid values, but attributes like:
> > >
> > > /sys/bus/pci/devices/$pdev/current_link_speed
> > >
> > > ...returning -EINVAL.
> > >
> > > So I think the options are:
> > >
> > > 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> > >    piecemeal enable specific lspci needs with RCD-specific attributes
> > 
> > This one gets my vote.
> 
> Thank you for your feedback.
> Like Dan, I also believe that implementing this feature in the kernel may 
> not be appropriate, as it is specifically needed for CXL1.1 devices.
> Therefore, I understand that it would be better to implement 
> the link status of CXL1.1 devices directly in lspci.
> Please tell me if my understanding is wrong.

The proposal is to do a hybrid approach. The drivers/cxl/ subsystem
already handles RCRB register access internally, so it can go further
and expose a couple of attributes ("rcd_lnkcap" and "rcd_lnkctl") that
lspci can read. In other words, "/dev/mem" is not a reliable way to
access the RCRB, and it is too much work to make the existing sysfs
config-space access ABI understand the RCRB layout, since that
complication would only be useful for one hardware generation.

An additional idea here is to allow the CXL subsystem to take over
publishing PCIe attributes like "current_link_speed" that are currently
broken by the RCRB configuration, with a change like this:

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 2321fdfefd7d..982bbec721fd 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1613,7 +1613,7 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
        struct device *dev = kobj_to_dev(kobj);
        struct pci_dev *pdev = to_pci_dev(dev);
 
-       if (pci_is_pcie(pdev))
+       if (pci_is_pcie(pdev) && !is_cxl_rcd(pdev))
                return a->mode;
 
        return 0;

...then the CXL subsystem can produce its own attributes with the same
name, but backed by the RCRB lookup mechanism.
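
The is_cxl_rcd() helper in the hunk above does not exist yet. A minimal
sketch of one way it could be implemented, assuming the test "Root
Complex Integrated Endpoint that carries the CXL device DVSEC" and
reusing the DVSEC constants from drivers/cxl/cxlpci.h (the real test
would be up to the CXL core):

/* Sketch: is @pdev a CXL 1.1 Restricted CXL Device (RCD)? */
static bool is_cxl_rcd(struct pci_dev *pdev)
{
	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END)
		return false;

	/* An RCD carries the CXL PCIe-device DVSEC. */
	return pci_find_dvsec_capability(pdev, PCI_DVSEC_VENDOR_ID_CXL,
					 CXL_DVSEC_PCIE_DEVICE) != 0;
}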
Daisuke Kobayashi (Fujitsu) Jan. 17, 2024, 9:23 a.m. UTC | #6
> -----Original Message-----
> From: Dan Williams <dan.j.williams@intel.com>
> Sent: Wednesday, January 17, 2024 6:29 AM
> To: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>;
> 'Jonathan Cameron' <Jonathan.Cameron@huawei.com>; Dan Williams
> <dan.j.williams@intel.com>
> Cc: linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org; Gotou, Yasunori/五島
> 康文 <y-goto@fujitsu.com>
> Subject: RE: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
> 
> Daisuke Kobayashi (Fujitsu) wrote:
> > > > I am wondering about an approach like below is sufficient for lspci.
> > > >
> > > > The idea here is that cxl_pci (or other PCI driver for Type-2
> > > > RCDs) can opt-in to publishing these hidden registers.
> > > >
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index
> > > > 4fd1f207c84e..ee63dff63b68 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -960,6 +960,19 @@ static const struct pci_error_handlers
> > > cxl_error_handlers = {
> > > >         .cor_error_detected     = cxl_cor_error_detected,
> > > >  };
> > > >
> > > > +static struct attribute *cxl_rcd_attrs[] = {
> > > > +       &dev_attr_rcd_lnkcp.attr,
> > > > +       &dev_attr_rcd_lnkctl.attr,
> > > > +       NULL
> > > > +};
> > > > +
> > > > +static struct attribute_group cxl_rcd_group = {
> > > > +       .attrs = cxl_rcd_attrs,
> > > > +       .is_visible = cxl_rcd_visible, };
> > > > +
> > > > +__ATTRIBUTE_GROUPS(cxl_pci);
> > > > +
> > > >  static struct pci_driver cxl_pci_driver = {
> > > >         .name                   = KBUILD_MODNAME,
> > > >         .id_table               = cxl_mem_pci_tbl,
> > > > @@ -967,6 +980,7 @@ static struct pci_driver cxl_pci_driver = {
> > > >         .err_handler            = &cxl_error_handlers,
> > > >         .driver = {
> > > >                 .probe_type     =
> PROBE_PREFER_ASYNCHRONOUS,
> > > > +               .dev_groups     = cxl_rcd_groups,
> > > >         },
> > > >  };
> > > >
> > > >
> > > > However, the problem I believe is this will end up with:
> > > >
> > > > /sys/bus/pci/devices/$pdev/rcd_lnkcap
> > > > /sys/bus/pci/devices/$pdev/rcd_lnkctl
> > > >
> > > > ...with valid values, but attributes like:
> > > >
> > > > /sys/bus/pci/devices/$pdev/current_link_speed
> > > >
> > > > ...returning -EINVAL.
> > > >
> > > > So I think the options are:
> > > >
> > > > 1/ Keep the status quo of RCRB knowledge only lives in drivers/cxl/ and
> > > >    piecemeal enable specific lspci needs with RCD-specific
> > > > attributes
> > >
> > > This one gets my vote.
> >
> > Thank you for your feedback.
> > Like Dan, I also believe that implementing this feature in the kernel
> > may not be appropriate, as it is specifically needed for CXL1.1 devices.
> > Therefore, I understand that it would be better to implement the link
> > status of CXL1.1 devices directly in lspci.
> > Please tell me if my understanding is wrong.
> 
> The proposal is to do a hybrid approach. The drivers/cxl/ subsystem already
> handles RCRB register access internally, so it can go further and expose a
> couple attributes ("rcd_lnkcap" and "rcd_lnkctl") that lspci can go read. In
> other words, "/dev/mem" is not a reliable way to access the RCRB, and it is too
> much work to make the existing sysfs config-space access ABI understand the
> RCRB layout since that complication would only be useful for one hardware
> generation.
> 
> An additional idea here is to allow for the CXL subsystem to takeover
> publishing PCIe attributes like "current_link_speed", that are currently broken
> by the RCRB configuration, with a change like this:
> 

Thank you, it seems my understanding was incorrect.
I will consider splitting the implementation into three parts:
the visibility hook in the PCI core, the RCRB access in the cxl driver,
and the sysfs reads in lspci (sketched at the end of this message).

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index
> 2321fdfefd7d..982bbec721fd 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1613,7 +1613,7 @@ static umode_t pcie_dev_attrs_are_visible(struct
> kobject *kobj,
>         struct device *dev = kobj_to_dev(kobj);
>         struct pci_dev *pdev = to_pci_dev(dev);
> 
> -       if (pci_is_pcie(pdev))
> +       if (pci_is_pcie(pdev) && !is_cxl_rcd(pdev))
>                 return a->mode;
> 
>         return 0;
> 
> ...then the CXL subsystem can produce its own attributes with the same name,
> but backed by the RCRB lookup mechanism.
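
A minimal sketch of the lspci side, assuming the rcd_lnkcap/rcd_lnkctl
attribute names from Dan's proposal (not yet a kernel ABI) and lspci's
own struct device from lspci.h:

/* Read an RCD link register published by the kernel via sysfs,
 * instead of poking /dev/mem. Returns 1 on success. */
static int read_rcd_attr(struct device *d, const char *name,
			 unsigned int *val)
{
	char path[256];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path),
		 "/sys/bus/pci/devices/%04x:%02x:%02x.%d/%s",
		 d->dev->domain, d->dev->bus, d->dev->dev, d->dev->func,
		 name);
	f = fopen(path, "r");
	if (!f)
		return 0;	/* attribute absent: no CXL driver, or not an RCD */
	ok = (fscanf(f, "%x", val) == 1);
	fclose(f);
	return ok;
}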
Martin Mareš Jan. 17, 2024, 12:10 p.m. UTC | #7
Hello!

Sorry for the late reply, but these days I don't read linux-pci
regularly. Please Cc me on all patches for the pciutils.

Anyway...

I don't think this is the right approach. You poke things you shouldn't
in user space, and you also make some bold assumptions about the
endianness of the machine (you are using native C structs for data
provided by the hardware).

This belongs to the kernel.
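
To illustrate the endianness hazard: reading a little-endian hardware
field through a native C struct is only correct on little-endian hosts.
A portable reader converts explicitly, e.g. (le16toh() is a glibc
extension from <endian.h>):

#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Portable read of a little-endian 16-bit hardware field. */
static inline uint16_t read_le16(const void *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));	/* also avoids unaligned access */
	return le16toh(v);		/* no-op on LE, byte-swap on BE */
}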

				Have a nice fortnight
Daisuke Kobayashi (Fujitsu) Jan. 18, 2024, 5:07 a.m. UTC | #8
> -----Original Message-----
> From: Martin Mareš <mj@ucw.cz>
> Sent: Wednesday, January 17, 2024 9:11 PM
> To: Kobayashi, Daisuke/小林 大介 <kobayashi.da-06@fujitsu.com>
> Cc: linux-pci@vger.kernel.org; linux-cxl@vger.kernel.org; Gotou, Yasunori/五島
> 康文 <y-goto@fujitsu.com>
> Subject: Re: [RFC PATCH 0/3] lspci: Display cxl1.1 device link status
> 
> Hello!
> 
> Sorry for the late reply, but these days I don't read linux-pci regularly. Please Cc
> me on all patches for the pciutils.
> 

I see. I'll include you in CC.

> Anyway...
> 
> I don't think this is the right approach. You poke things you shouldn't in user
> space, you also make some bold assumptions on endianity of the machine (you
> are using native C structs for data provided by the hardware).
> 
> This belongs to the kernel.

Thank you, Martin, that is a good point. I had completely missed it,
so thanks for pointing it out.

> 
> 				Have a nice fortnight
> --
> Martin `MJ' Mareš                        <mj@ucw.cz>
> http://mj.ucw.cz/
> United Computer Wizards, Prague, Czech Republic, Europe, Earth, Universe