Message ID | 20240806041547.1958787-1-ming4.li@intel.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [1/1] cxl/pci: Get AER capability address from RCRB only for RCH dport | expand |
On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote: > cxl_setup_parent_dport() needs to get RCH dport AER capability address > from RCRB to disable AER interrupt. The function does not check if dport > is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev > in VH case because dport_dev points to a pci device(RP or switch DSP) > rather than a pci host bridge device. Ming, Please add that this leads to a page fault. (from off-list chat) > > Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the > RCH topology created by cxl-test, in the case, the RCH dport's dport_dev > is a platform device, and RCH dport with an available RCRB and AER > Capability is not supported yet. > > Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") > Signed-off-by: Li Ming <ming4.li@intel.com> > --- > drivers/cxl/core/pci.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index a663e7566c48..f8a3188f5b17 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -4,6 +4,7 @@ > #include <linux/io-64-nonatomic-lo-hi.h> > #include <linux/device.h> > #include <linux/delay.h> > +#include <linux/platform_device.h> > #include <linux/pci.h> > #include <linux/pci-doe.h> > #include <linux/aer.h> > @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport) > void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) > { > struct device *dport_dev = dport->dport_dev; > - struct pci_host_bridge *host_bridge; > > - host_bridge = to_pci_host_bridge(dport_dev); > - if (host_bridge->native_aer) > - dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > + if (dev_is_platform(dport_dev)) > + return; The above, dev_is_plaform(), is the cxl-test only case - right? I'm wondering if we need to mock something differently in cxl-test because it is unusual to have a cxl-test special case in the driver. Let's see what others say. --Alison > + > + if (dport->rch) { > + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev); > + > + if (host_bridge->native_aer) > + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > + } > > dport->reg_map.host = host; > cxl_dport_map_regs(dport); > -- > 2.40.1 > >
Hi Ming and all, I met "BUG: slab-out-of-bounds in cxl_setup_parent_dport" problem in v6.10 kernel in cxl-test qemu environment: https://lore.kernel.org/linux-cxl/ZrGFWYKwxa1ld9Iz@xpf.sh.intel.com/T/#u And this patch could fix this problem. Reported-by: Pengfei Xu <pengfei.xu@intel.com> Tested-by: Pengfei Xu <pengfei.xu@intel.com> Best Regards, Thanks! On 2024-08-06 at 04:15:47 +0000, Li Ming wrote: > cxl_setup_parent_dport() needs to get RCH dport AER capability address > from RCRB to disable AER interrupt. The function does not check if dport > is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev > in VH case because dport_dev points to a pci device(RP or switch DSP) > rather than a pci host bridge device. > > Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the > RCH topology created by cxl-test, in the case, the RCH dport's dport_dev > is a platform device, and RCH dport with an available RCRB and AER > Capability is not supported yet. > > Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") > Signed-off-by: Li Ming <ming4.li@intel.com> > --- > drivers/cxl/core/pci.c | 14 ++++++++++---- > 1 file changed, 10 insertions(+), 4 deletions(-) > > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > index a663e7566c48..f8a3188f5b17 100644 > --- a/drivers/cxl/core/pci.c > +++ b/drivers/cxl/core/pci.c > @@ -4,6 +4,7 @@ > #include <linux/io-64-nonatomic-lo-hi.h> > #include <linux/device.h> > #include <linux/delay.h> > +#include <linux/platform_device.h> > #include <linux/pci.h> > #include <linux/pci-doe.h> > #include <linux/aer.h> > @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport) > void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) > { > struct device *dport_dev = dport->dport_dev; > - struct pci_host_bridge *host_bridge; > > - host_bridge = to_pci_host_bridge(dport_dev); > - if (host_bridge->native_aer) > - dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > + if (dev_is_platform(dport_dev)) > + return; > + > + if (dport->rch) { > + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev); > + > + if (host_bridge->native_aer) > + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > + } > > dport->reg_map.host = host; > cxl_dport_map_regs(dport); > -- > 2.40.1 >
On 8/6/2024 1:29 PM, Alison Schofield wrote: > On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote: >> cxl_setup_parent_dport() needs to get RCH dport AER capability address >> from RCRB to disable AER interrupt. The function does not check if dport >> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev >> in VH case because dport_dev points to a pci device(RP or switch DSP) >> rather than a pci host bridge device. > Ming, > > Please add that this leads to a page fault. (from off-list chat) Sure, will do it. > >> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the >> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev >> is a platform device, and RCH dport with an available RCRB and AER >> Capability is not supported yet. >> >> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") >> Signed-off-by: Li Ming <ming4.li@intel.com> >> --- >> drivers/cxl/core/pci.c | 14 ++++++++++---- >> 1 file changed, 10 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c >> index a663e7566c48..f8a3188f5b17 100644 >> --- a/drivers/cxl/core/pci.c >> +++ b/drivers/cxl/core/pci.c >> @@ -4,6 +4,7 @@ >> #include <linux/io-64-nonatomic-lo-hi.h> >> #include <linux/device.h> >> #include <linux/delay.h> >> +#include <linux/platform_device.h> >> #include <linux/pci.h> >> #include <linux/pci-doe.h> >> #include <linux/aer.h> >> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport) >> void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) >> { >> struct device *dport_dev = dport->dport_dev; >> - struct pci_host_bridge *host_bridge; >> >> - host_bridge = to_pci_host_bridge(dport_dev); >> - if (host_bridge->native_aer) >> - dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); >> + if (dev_is_platform(dport_dev)) >> + return; > The above, dev_is_plaform(), is the cxl-test only case - right? Thanks for review. Right, only dports created by cxl-test using platform_device as dport->dport_dev. > I'm wondering if we need to mock something differently in cxl-test > because it is unusual to have a cxl-test special case in the driver. Make sense, what do you think if I implement a mock function like mock_cxl_setup_parent_dport() for the case? The mock function can check if the dport is created by cxl-test then skip it. Ming > > Let's see what others say. > > --Alison > > > >> + >> + if (dport->rch) { >> + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev); >> + >> + if (host_bridge->native_aer) >> + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); >> + } >> >> dport->reg_map.host = host; >> cxl_dport_map_regs(dport); >> -- >> 2.40.1 >> >>
On Thu, Aug 08, 2024 at 09:26:23AM +0800, Li, Ming4 wrote: > On 8/6/2024 1:29 PM, Alison Schofield wrote: > > On Tue, Aug 06, 2024 at 04:15:47AM +0000, Li Ming wrote: > >> cxl_setup_parent_dport() needs to get RCH dport AER capability address > >> from RCRB to disable AER interrupt. The function does not check if dport > >> is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev > >> in VH case because dport_dev points to a pci device(RP or switch DSP) > >> rather than a pci host bridge device. > > Ming, > > > > Please add that this leads to a page fault. (from off-list chat) > > Sure, will do it. > > > > > >> Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the > >> RCH topology created by cxl-test, in the case, the RCH dport's dport_dev > >> is a platform device, and RCH dport with an available RCRB and AER > >> Capability is not supported yet. > >> > >> Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") > >> Signed-off-by: Li Ming <ming4.li@intel.com> > >> --- > >> drivers/cxl/core/pci.c | 14 ++++++++++---- > >> 1 file changed, 10 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c > >> index a663e7566c48..f8a3188f5b17 100644 > >> --- a/drivers/cxl/core/pci.c > >> +++ b/drivers/cxl/core/pci.c > >> @@ -4,6 +4,7 @@ > >> #include <linux/io-64-nonatomic-lo-hi.h> > >> #include <linux/device.h> > >> #include <linux/delay.h> > >> +#include <linux/platform_device.h> > >> #include <linux/pci.h> > >> #include <linux/pci-doe.h> > >> #include <linux/aer.h> > >> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport) > >> void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) > >> { > >> struct device *dport_dev = dport->dport_dev; > >> - struct pci_host_bridge *host_bridge; > >> > >> - host_bridge = to_pci_host_bridge(dport_dev); > >> - if (host_bridge->native_aer) > >> - dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > >> + if (dev_is_platform(dport_dev)) > >> + return; > > The above, dev_is_plaform(), is the cxl-test only case - right? > > Thanks for review. > > Right, only dports created by cxl-test using platform_device as dport->dport_dev. > > > > I'm wondering if we need to mock something differently in cxl-test > > because it is unusual to have a cxl-test special case in the driver. > > Make sense, what do you think if I implement a mock function like mock_cxl_setup_parent_dport() for the case? The mock function can check if the dport is created by cxl-test then skip it. > Sounds good to me :) > > Ming > > > > > Let's see what others say. > > > > --Alison > > > > > > > >> + > >> + if (dport->rch) { > >> + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev); > >> + > >> + if (host_bridge->native_aer) > >> + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); > >> + } > >> > >> dport->reg_map.host = host; > >> cxl_dport_map_regs(dport); > >> -- > >> 2.40.1 > >> > >> >
diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c index a663e7566c48..f8a3188f5b17 100644 --- a/drivers/cxl/core/pci.c +++ b/drivers/cxl/core/pci.c @@ -4,6 +4,7 @@ #include <linux/io-64-nonatomic-lo-hi.h> #include <linux/device.h> #include <linux/delay.h> +#include <linux/platform_device.h> #include <linux/pci.h> #include <linux/pci-doe.h> #include <linux/aer.h> @@ -834,11 +835,16 @@ static void cxl_disable_rch_root_ints(struct cxl_dport *dport) void cxl_setup_parent_dport(struct device *host, struct cxl_dport *dport) { struct device *dport_dev = dport->dport_dev; - struct pci_host_bridge *host_bridge; - host_bridge = to_pci_host_bridge(dport_dev); - if (host_bridge->native_aer) - dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); + if (dev_is_platform(dport_dev)) + return; + + if (dport->rch) { + struct pci_host_bridge *host_bridge = to_pci_host_bridge(dport_dev); + + if (host_bridge->native_aer) + dport->rcrb.aer_cap = cxl_rcrb_to_aer(dport_dev, dport->rcrb.base); + } dport->reg_map.host = host; cxl_dport_map_regs(dport);
cxl_setup_parent_dport() needs to get RCH dport AER capability address from RCRB to disable AER interrupt. The function does not check if dport is RCH dport, it will get a wrong pci_host_bridge structure by dport_dev in VH case because dport_dev points to a pci device(RP or switch DSP) rather than a pci host bridge device. Besides, cxl_setup_parent_dport() can exit simply if a RCH dport in the RCH topology created by cxl-test, in the case, the RCH dport's dport_dev is a platform device, and RCH dport with an available RCRB and AER Capability is not supported yet. Fixes: f05fd10d138d ("cxl/pci: Add RCH downstream port AER register discovery") Signed-off-by: Li Ming <ming4.li@intel.com> --- drivers/cxl/core/pci.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)