diff mbox series

[1/1] cxl/pci: Get AER capability address from RCRB only for RCH dport

Message ID 20240806041547.1958787-1-ming4.li@intel.com
State Changes Requested
Headers show
Series [1/1] cxl/pci: Get AER capability address from RCRB only for RCH dport | expand

Commit Message

Li, Ming4 Aug. 6, 2024, 4:15 a.m. UTC
cxl_setup_parent_dport() needs to get RCH dport AER capability address
from RCRB to disable AER interrupt. The function does not check if dport
is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev
in VH case because dport_dev points to a pci device(RP or switch DSP)
rather than a pci host bridge device.

Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the
RCH topology created by cxl-test, in the case, the RCH dport's dport_dev
is a platform device, and RCH dport with an available RCRB and AER
Capability is not supported yet.

Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery")
Signed-off-by: Li Ming <ming4.li@intel.com>
---
 drivers/cxl/core/pci.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

Comments

Alison Schofield Aug. 6, 2024, 5:29 a.m. UTC | #1
On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote:
> cxl_setup_parent_dport() needs to get RCH dport AER capability address
> from RCRB to disable AER interrupt. The function does not check if dport
> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev
> in VH case because dport_dev points to a pci device(RP or switch DSP)
> rather than a pci host bridge device.

Ming,

Please add that this leads to a page fault. (from off-list chat)

> 
> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the
> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev
> is a platform device, and RCH dport with an available RCRB and AER
> Capability is not supported yet.
> 
> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery")
> Signed-off-by: Li Ming <ming4.li@intel.com>
> ---
>  drivers/cxl/core/pci.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a663e7566c48..f8a3188f5b17 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -4,6 +4,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/delay.h>
> +#include <linux/platform_device.h>
>  #include <linux/pci.h>
>  #include <linux/pci-doe.h>
>  #include <linux/aer.h>
> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
>  void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>  {
>  	struct device *dport_dev = dport->dport_dev;
> -	struct pci_host_bridge *host_bridge;
>  
> -	host_bridge = to_pci_host_bridge(dport_dev);
> -	if (host_bridge->native_aer)
> -		dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> +	if (dev_is_platform(dport_dev))
> +		return;

The above, dev_is_plaform(), is the cxl-test only case - right?
I'm wondering if we need to mock something differently in cxl-test
because it is unusual to have a cxl-test special case in the driver.

Let's see what others say.

--Alison



> +
> +	if (dport->rch) {
> +		struct pci_host_bridge *host_bridge  = to_pci_host_bridge(dport_dev);
> +
> +		if (host_bridge->native_aer)
> +			dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> +	}
>  
>  	dport->reg_map.host = host;
>  	cxl_dport_map_regs(dport);
> -- 
> 2.40.1
> 
>
Pengfei Xu Aug. 6, 2024, 7:34 a.m. UTC | #2
Hi Ming and all,

I met "BUG: slab-out-of-bounds in cxl_setup_parent_dport" problem in v6.10
kernel in cxl-test qemu environment:
https://lore.kernel.org/linux-cxl/ZrGFWYKwxa1ld9Iz@xpf.sh.intel.com/T/#u

And this patch could fix this problem.

Reported-by: Pengfei Xu <pengfei.xu@intel.com>
Tested-by: Pengfei Xu <pengfei.xu@intel.com>

Best Regards,
Thanks!

On 2024-08-06 at 04:15:47 +0000, Li Ming wrote:
> cxl_setup_parent_dport() needs to get RCH dport AER capability address
> from RCRB to disable AER interrupt. The function does not check if dport
> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev
> in VH case because dport_dev points to a pci device(RP or switch DSP)
> rather than a pci host bridge device.
> 
> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the
> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev
> is a platform device, and RCH dport with an available RCRB and AER
> Capability is not supported yet.
> 
> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery")
> Signed-off-by: Li Ming <ming4.li@intel.com>
> ---
>  drivers/cxl/core/pci.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a663e7566c48..f8a3188f5b17 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -4,6 +4,7 @@
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/device.h>
>  #include <linux/delay.h>
> +#include <linux/platform_device.h>
>  #include <linux/pci.h>
>  #include <linux/pci-doe.h>
>  #include <linux/aer.h>
> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
>  void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>  {
>  	struct device *dport_dev = dport->dport_dev;
> -	struct pci_host_bridge *host_bridge;
>  
> -	host_bridge = to_pci_host_bridge(dport_dev);
> -	if (host_bridge->native_aer)
> -		dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> +	if (dev_is_platform(dport_dev))
> +		return;
> +
> +	if (dport->rch) {
> +		struct pci_host_bridge *host_bridge  = to_pci_host_bridge(dport_dev);
> +
> +		if (host_bridge->native_aer)
> +			dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> +	}
>  
>  	dport->reg_map.host = host;
>  	cxl_dport_map_regs(dport);
> -- 
> 2.40.1
>
Li, Ming4 Aug. 8, 2024, 1:26 a.m. UTC | #3
On 8/6/2024 1:29 PM, Alison Schofield wrote:
> On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote:
>> cxl_setup_parent_dport() needs to get RCH dport AER capability address
>> from RCRB to disable AER interrupt. The function does not check if dport
>> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev
>> in VH case because dport_dev points to a pci device(RP or switch DSP)
>> rather than a pci host bridge device.
> Ming,
>
> Please add that this leads to a page fault. (from off-list chat)

Sure, will do it.


>
>> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the
>> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev
>> is a platform device, and RCH dport with an available RCRB and AER
>> Capability is not supported yet.
>>
>> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery")
>> Signed-off-by: Li Ming <ming4.li@intel.com>
>> ---
>>  drivers/cxl/core/pci.c | 14 ++++++++++----
>>  1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index a663e7566c48..f8a3188f5b17 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -4,6 +4,7 @@
>>  #include <linux/io-64-nonatomic-lo-hi.h>
>>  #include <linux/device.h>
>>  #include <linux/delay.h>
>> +#include <linux/platform_device.h>
>>  #include <linux/pci.h>
>>  #include <linux/pci-doe.h>
>>  #include <linux/aer.h>
>> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
>>  void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
>>  {
>>  	struct device *dport_dev = dport->dport_dev;
>> -	struct pci_host_bridge *host_bridge;
>>  
>> -	host_bridge = to_pci_host_bridge(dport_dev);
>> -	if (host_bridge->native_aer)
>> -		dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
>> +	if (dev_is_platform(dport_dev))
>> +		return;
> The above, dev_is_plaform(), is the cxl-test only case - right?

Thanks for review.

Right, only dports created by cxl-test using platform_device as dport->dport_dev.


> I'm wondering if we need to mock something differently in cxl-test
> because it is unusual to have a cxl-test special case in the driver.

Make sense, what do you think if I implement a mock function like mock_cxl_setup_parent_dport() for the case? The mock function can check if the dport is created by cxl-test then skip it.


Ming

>
> Let's see what others say.
>
> --Alison
>
>
>
>> +
>> +	if (dport->rch) {
>> +		struct pci_host_bridge *host_bridge  = to_pci_host_bridge(dport_dev);
>> +
>> +		if (host_bridge->native_aer)
>> +			dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
>> +	}
>>  
>>  	dport->reg_map.host = host;
>>  	cxl_dport_map_regs(dport);
>> -- 
>> 2.40.1
>>
>>
Alison Schofield Aug. 8, 2024, 6:57 p.m. UTC | #4
On Thu, Aug 08, 2024 at 09:26:23AM +0800, Li, Ming4 wrote:
> On 8/6/2024 1:29 PM, Alison Schofield wrote:
> > On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote:
> >> cxl_setup_parent_dport() needs to get RCH dport AER capability address
> >> from RCRB to disable AER interrupt. The function does not check if dport
> >> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev
> >> in VH case because dport_dev points to a pci device(RP or switch DSP)
> >> rather than a pci host bridge device.
> > Ming,
> >
> > Please add that this leads to a page fault. (from off-list chat)
> 
> Sure, will do it.
> 
> 
> >
> >> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the
> >> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev
> >> is a platform device, and RCH dport with an available RCRB and AER
> >> Capability is not supported yet.
> >>
> >> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery")
> >> Signed-off-by: Li Ming <ming4.li@intel.com>
> >> ---
> >>  drivers/cxl/core/pci.c | 14 ++++++++++----
> >>  1 file changed, 10 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> >> index a663e7566c48..f8a3188f5b17 100644
> >> --- a/drivers/cxl/core/pci.c
> >> +++ b/drivers/cxl/core/pci.c
> >> @@ -4,6 +4,7 @@
> >>  #include <linux/io-64-nonatomic-lo-hi.h>
> >>  #include <linux/device.h>
> >>  #include <linux/delay.h>
> >> +#include <linux/platform_device.h>
> >>  #include <linux/pci.h>
> >>  #include <linux/pci-doe.h>
> >>  #include <linux/aer.h>
> >> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
> >>  void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
> >>  {
> >>  	struct device *dport_dev = dport->dport_dev;
> >> -	struct pci_host_bridge *host_bridge;
> >>  
> >> -	host_bridge = to_pci_host_bridge(dport_dev);
> >> -	if (host_bridge->native_aer)
> >> -		dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> >> +	if (dev_is_platform(dport_dev))
> >> +		return;
> > The above, dev_is_plaform(), is the cxl-test only case - right?
> 
> Thanks for review.
> 
> Right, only dports created by cxl-test using platform_device as dport->dport_dev.
> 
> 
> > I'm wondering if we need to mock something differently in cxl-test
> > because it is unusual to have a cxl-test special case in the driver.
> 
> Make sense, what do you think if I implement a mock function like mock_cxl_setup_parent_dport() for the case? The mock function can check if the dport is created by cxl-test then skip it.
>

Sounds good to me :)


> 
> Ming
> 
> >
> > Let's see what others say.
> >
> > --Alison
> >
> >
> >
> >> +
> >> +	if (dport->rch) {
> >> +		struct pci_host_bridge *host_bridge  = to_pci_host_bridge(dport_dev);
> >> +
> >> +		if (host_bridge->native_aer)
> >> +			dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
> >> +	}
> >>  
> >>  	dport->reg_map.host = host;
> >>  	cxl_dport_map_regs(dport);
> >> -- 
> >> 2.40.1
> >>
> >>
>
diff mbox series

Patch

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a663e7566c48..f8a3188f5b17 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -4,6 +4,7 @@ 
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/device.h>
 #include <linux/delay.h>
+#include <linux/platform_device.h>
 #include <linux/pci.h>
 #include <linux/pci-doe.h>
 #include <linux/aer.h>
@@ -834,11 +835,16 @@  static void cxl_disable_rch_root_ints(struct cxl_dport *dport)
 void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport)
 {
 	struct device *dport_dev = dport->dport_dev;
-	struct pci_host_bridge *host_bridge;
 
-	host_bridge = to_pci_host_bridge(dport_dev);
-	if (host_bridge->native_aer)
-		dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
+	if (dev_is_platform(dport_dev))
+		return;
+
+	if (dport->rch) {
+		struct pci_host_bridge *host_bridge  = to_pci_host_bridge(dport_dev);
+
+		if (host_bridge->native_aer)
+			dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base);
+	}
 
 	dport->reg_map.host = host;
 	cxl_dport_map_regs(dport);