Message ID | 20250319035516.222054-1-ming.li@zohomail.com |
---|---|
Headers | show |
Series | Fix using wrong GPF DVSEC location issue | expand |
On Wed, 19 Mar 2025, Li Ming wrote: >But I am not sure if all dports under a same port will have same >configuration space layout, if yes, that will not be a problem. If I am >wrong, please let me know, thanks. Yes, when caching the dvsec was suggested, it was my assumption that the config space would be the same. Thanks, Davidlohr
On Thu, 20 Mar 2025, Davidlohr Bueso wrote: >On Wed, 19 Mar 2025, Li Ming wrote: > >>But I am not sure if all dports under a same port will have same >>configuration space layout, if yes, that will not be a problem. If I am >>wrong, please let me know, thanks. > >Yes, when caching the dvsec was suggested, it was my assumption that the >config space would be the same. Ultimately I don't know what the expectation is here, but your updates do allow more flexibility from vendors, I guess(?). It's a bit late in the cycle, unfortunately, so if these are to go in for v6.15, they would be considered a fix imo, otherwise perhaps they are wanted for v6.16 or not at all (patch 3 does look useful regardless)? Based on some of the topologies listed in qemu, I did some testing (and this was also why the same dvsec config layout) and see things working as expected. https://www.qemu.org/docs/master/system/devices/cxl.html#example-command-lines (i) 2 direct-attached Type3 devices: -next [ 3.938238] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 3.939323] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs [ 4.003074] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:01.0: Port GPF phase 1 timeout: 20 secs [ 4.003676] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:01.0: Port GPF phase 2 timeout: 20 secs -next+fix [ 3.969841] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 3.970957] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs [ 3.971622] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:01.0: Port GPF phase 1 timeout: 20 secs [ 3.972664] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:01.0: Port GPF phase 2 timeout: 20 secs (ii) 2 CXL host bridges. Each host bridge has 2 CXL Root Ports, with the CXL Type3 device directly attached: -next [ 6.182333] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:de:00.0: Port GPF phase 1 timeout: 20 secs [ 6.182352] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:de:00.0: Port GPF phase 2 timeout: 20 secs [ 6.183938] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:01.0: Port GPF phase 1 timeout: 20 secs [ 6.183955] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:01.0: Port GPF phase 2 timeout: 20 secs [ 6.204324] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 6.205407] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs [ 6.210006] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:de:01.0: Port GPF phase 1 timeout: 20 secs [ 6.210921] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:de:01.0: Port GPF phase 2 timeout: 20 secs -next+fix [ 6.153093] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:01.0: Port GPF phase 1 timeout: 20 secs [ 6.153107] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:01.0: Port GPF phase 2 timeout: 20 secs [ 6.154170] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:de:01.0: Port GPF phase 1 timeout: 20 secs [ 6.154187] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:de:01.0: Port GPF phase 2 timeout: 20 secs [ 6.195279] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:de:00.0: Port GPF phase 1 timeout: 20 secs [ 6.195859] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:de:00.0: Port GPF phase 2 timeout: 20 secs [ 6.255782] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 6.257152] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs (iii) 4 Type3 devices below a CXL Switch: -next [ 3.940200] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:01.0: Port GPF phase 1 timeout: 20 secs [ 3.940218] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:01.0: Port GPF phase 2 timeout: 20 secs [ 3.940231] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:00.0: Port GPF phase 1 timeout: 20 secs [ 3.940245] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:00.0: Port GPF phase 2 timeout: 20 secs [ 3.940340] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 3.940350] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs [ 3.948114] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:02.0: Port GPF phase 1 timeout: 20 secs [ 3.949203] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:02.0: Port GPF phase 2 timeout: 20 secs [ 3.997620] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:03.0: Port GPF phase 1 timeout: 20 secs [ 3.997641] cxl_core:update_gpf_port_dvsec:1125: pcieport 0000:0e:03.0: Port GPF phase 2 timeout: 20 secs -next+fix [ 3.983632] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:01.0: Port GPF phase 1 timeout: 20 secs [ 3.983649] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:01.0: Port GPF phase 2 timeout: 20 secs [ 3.984525] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 1 timeout: 20 secs [ 3.984539] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0c:00.0: Port GPF phase 2 timeout: 20 secs [ 3.984667] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:03.0: Port GPF phase 1 timeout: 20 secs [ 3.984692] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:03.0: Port GPF phase 2 timeout: 20 secs [ 3.985074] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:02.0: Port GPF phase 1 timeout: 20 secs [ 3.985090] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:02.0: Port GPF phase 2 timeout: 20 secs [ 3.988954] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:00.0: Port GPF phase 1 timeout: 20 secs [ 3.990464] cxl_core:update_gpf_port_dvsec:1131: pcieport 0000:0e:00.0: Port GPF phase 2 timeout: 20 secs Thanks, Davidlohr
On 3/21/2025 11:59 AM, Davidlohr Bueso wrote: > On Thu, 20 Mar 2025, Davidlohr Bueso wrote: > >> On Wed, 19 Mar 2025, Li Ming wrote: >> >>> But I am not sure if all dports under a same port will have same >>> configuration space layout, if yes, that will not be a problem. If I am >>> wrong, please let me know, thanks. >> >> Yes, when caching the dvsec was suggested, it was my assumption that the >> config space would be the same. > > Ultimately I don't know what the expectation is here, but your updates > do allow more flexibility from vendors, I guess(?). It's a bit late > in the cycle, unfortunately, so if these are to go in for v6.15, they > would be considered a fix imo, otherwise perhaps they are wanted for > v6.16 or not at all (patch 3 does look useful regardless)? My understanding is that the expectation of the patchset is to avoid using a wrong GPF DVSEC in case of dports under a same port have different config space layout. And I think the change is more closely to the description of CXL spec. If the case(dports under a same port have different config space layout) would not happen, maybe add a comment in cxl_gpf_port_setup() is another option. Yes, if patch 1 & 2 are considered to be merged, they are worth a fix tag. And patch 3 is an obvious cleanup change. > > Based on some of the topologies listed in qemu, I did some testing (and > this was also why the same dvsec config layout) and see things working as > expected. Thanks for testing. Ming [snip]
On Fri, 21 Mar 2025 14:55:42 +0800 Li Ming <ming.li@zohomail.com> wrote: > On 3/21/2025 11:59 AM, Davidlohr Bueso wrote: > > On Thu, 20 Mar 2025, Davidlohr Bueso wrote: > > > >> On Wed, 19 Mar 2025, Li Ming wrote: > >> > >>> But I am not sure if all dports under a same port will have same > >>> configuration space layout, if yes, that will not be a problem. If I am > >>> wrong, please let me know, thanks. > >> > >> Yes, when caching the dvsec was suggested, it was my assumption that the > >> config space would be the same. > > > > Ultimately I don't know what the expectation is here, but your updates > > do allow more flexibility from vendors, I guess(?). It's a bit late > > in the cycle, unfortunately, so if these are to go in for v6.15, they > > would be considered a fix imo, otherwise perhaps they are wanted for > > v6.16 or not at all (patch 3 does look useful regardless)? > > My understanding is that the expectation of the patchset is to avoid using a wrong GPF DVSEC in case of dports under a same port have different config space layout. And I think the change is more closely to the description of CXL spec. > > If the case(dports under a same port have different config space layout) would not happen, maybe add a comment in cxl_gpf_port_setup() is another option. > > Yes, if patch 1 & 2 are considered to be merged, they are worth a fix tag. And patch 3 is an obvious cleanup change. I think they can indeed have different layout (in theory). Seems moderately unlikely to occur in real devices, but you never know. So I think a fixes tag would be valid. Jonathan > > > > > Based on some of the topologies listed in qemu, I did some testing (and > > this was also why the same dvsec config layout) and see things working as > > expected. > > Thanks for testing. > > > Ming > > [snip] >
Li Ming wrote: > During review all new patches in branch cxl/next. I noticed there may be > a problem in below commit. > > commit a52b6a2c1c99 ("cxl/pci: Support Global Persistent Flush (GPF)") > > There is a new field gpf_dvsec in struct cxl_port to cache GPF DVSEC for > Port(DVSEC ID 04h) location. When the first EP attaching to a cxl port, > it will trigger locating GPF DVSEC on the cxl dport which the first EP > is under, then the location is cached in port->gpf_dvsec. So if another > EP under another dport is attaching, it will reuse the value of > port->gpf_dvsec as GPF DVSEC location for this another dport. The > problem is the cached location may be a wrong location for other dports > of the port. > > Per Table 8-2 in CXL r3.2 section 8.1.1 and CXL r3.2 section 8.1.6, Each > CXL Downstream switch ports and CXL root ports have their own GPF DVSEC > for Port(DVSEC ID 04h). So my understanding is that CXL subsystem should > locate GPF DVSEC for Port for each dport rather than using the cached > location in CXL port. > > But I am not sure if all dports under a same port will have same > configuration space layout, if yes, that will not be a problem. If I am > wrong, please let me know, thanks. > > base-commit: 3b5d43245f0a56390baaa670e1b6d898772266b3 cxl/next > > Li Ming (3): > cxl/core: Fix caching dport GPF DVSEC issue > cxl/pci: Update Port GPF timeout only when the first EP attaching > cxl/pci: Drop the parameter is_port of cxl_gpf_get_dvsec() These look good. I doubt a device would build different offsets per-dport, but the real risk is more in confusing future code readers that dports are uniform. For the series, Reviewed-by: Dan Williams <dan.j.williams@intel.com> A Fixes tag seems reasonable.