mbox series

[0/3] PCI: Add Secondary Bus Reset (SBR) support for CXL

Message ID 20240311204132.62757-1-dave.jiang@intel.com
Headers show
Series PCI: Add Secondary Bus Reset (SBR) support for CXL | expand

Message

Dave Jiang March 11, 2024, 8:39 p.m. UTC
Hi Bjorn,
Please consider this series for kernel 6.10. The series attempt to
add secondary bus reset (SBR) support to CXL. By default, SBR for CXL is
masked. Per CXL specification r3.1 8.1.5.2, the Port Control Extensions
register bit 0 (Unmask SBR) in the host bridge controls the masking of CXL SBR.
"When 0, SBR bit in Bridge Control register of this Port has no effect. When 1,
the Port shall generate hot reset when SBR in Bridge Control gets set to 1.
Default value of this bit is 0. When the Port is operating in PCIe mode or RCD
mode, this field has no effect on SBR functionality and Port shall follow PCIe
Base Specification."

Patch 1:
Add check to PCI bus_reset path for CXL device and return with error if "Unmask
SBR" bit is set to 0. This allows user to realize that SBR is masked for this
CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
setpci, then the bus_reset will proceed.

Patch2:
Add a new PCI reset method "cxl_bus_force" in order to allow the user an
intetional way to perform SBR. The code will set the "Unmask SBR" bit to
1 and proceed with bus_reset. The original value of the bit will be restored
after the reset operation.

Patch3:
CXL driver change that provides a ->reset_done() callback. It compares the
hardware decoder settings with the existing software configuration and emit
warning if they differ. The difference indicates that decoders were programmed
before the reset and are now cleared after reset. There may be onlined system
memory backed by CXL memory device that are now violently ripped away from
kernel mapping.

Patch series stemmed from this [1] patch. With comments [2] from Bjorn.

[1]: https://lore.kernel.org/linux-cxl/20240215232307.2793530-1-dave.jiang@intel.com/
[2]: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/

Comments

Lukas Wunner March 12, 2024, 7:46 a.m. UTC | #1
On Mon, Mar 11, 2024 at 01:39:52PM -0700, Dave Jiang wrote:
> Patch 1:
> Add check to PCI bus_reset path for CXL device and return with error if "Unmask
> SBR" bit is set to 0. This allows user to realize that SBR is masked for this
> CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
> setpci, then the bus_reset will proceed.

So is the point of patch 1 only to inform the user that the SBR has
no effect?  Or does the SBR have any negative side effects that you
want to avoid (e.g. due to the config space save/restore)?

If you only want to inform the user, then this functionality could
live in a ->reset_prepare() callback exposed by the cxl subsystem
and the pci subsystem wouldn't have to be touched at all.

Thanks,

Lukas
Dave Jiang March 12, 2024, 9:31 p.m. UTC | #2
On 3/12/24 12:46 AM, Lukas Wunner wrote:
> On Mon, Mar 11, 2024 at 01:39:52PM -0700, Dave Jiang wrote:
>> Patch 1:
>> Add check to PCI bus_reset path for CXL device and return with error if "Unmask
>> SBR" bit is set to 0. This allows user to realize that SBR is masked for this
>> CXL device. However, if the user sets the "Unmask SBR" bit via a tool such as
>> setpci, then the bus_reset will proceed.
> 
> So is the point of patch 1 only to inform the user that the SBR has
> no effect?  Or does the SBR have any negative side effects that you
> want to avoid (e.g. due to the config space save/restore)?
> 
> If you only want to inform the user, then this functionality could
> live in a ->reset_prepare() callback exposed by the cxl subsystem
> and the pci subsystem wouldn't have to be touched at all.

Patch 1 is to inform the user that SBR has no effect. The user needs to be informed via direct errno feedback that reset is masked and isn't going to happen. In this [1] response, I believe Bjorn wanted pci_reset_function() to fail. ->reset_prepare() is only helpful if the cxl_pci driver is loaded. CXL mem devices can be programmed by the BIOS and be active without CXL driver being loaded. Also, cxl_pci only picks up PCI_CLASS_MEMORY_CXL. So other CXL devices would not be managed by this driver.

[1]: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/


> 
> Thanks,
> 
> Lukas