diff mbox series

[v1,10/12] PCI: hotplug: implement the hotplug_slot_ops callback for fpgahp

Message ID 20230119013602.607466-11-tianfei.zhang@intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show
Series add FPGA hotplug manager driver | expand

Commit Message

Zhang, Tianfei Jan. 19, 2023, 1:36 a.m. UTC
Implement the image_load and available_images callback functions
for fpgahp driver. This patch leverages some APIs from pciehp
driver to implement the device reconfiguration below the PCI hotplug
bridge.

Here are the steps for a process of image load.
1. remove all PFs and VFs except the PF0.
2. remove all non-reserved devices of PF0.
3. trigger a image load via BMC.
4. disable the link of the hotplug bridge.
5. remove all reserved devices under PF0 and PCI devices
   below the hotplug bridge.
6. wait for image load done via BMC, e.g. 10s.
7. re-enable the link of the hotplug bridge.
8. re-enumerate PCI devices below the hotplug bridge.

Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
---
 Documentation/ABI/testing/sysfs-driver-fpgahp |  21 ++
 MAINTAINERS                                   |   1 +
 drivers/pci/hotplug/fpgahp.c                  | 179 ++++++++++++++++++
 3 files changed, 201 insertions(+)
 create mode 100644 Documentation/ABI/testing/sysfs-driver-fpgahp

Comments

Greg Kroah-Hartman Jan. 19, 2023, 1:28 p.m. UTC | #1
On Wed, Jan 18, 2023 at 08:36:00PM -0500, Tianfei Zhang wrote:
> Implement the image_load and available_images callback functions
> for fpgahp driver. This patch leverages some APIs from pciehp
> driver to implement the device reconfiguration below the PCI hotplug
> bridge.
> 
> Here are the steps for a process of image load.
> 1. remove all PFs and VFs except the PF0.
> 2. remove all non-reserved devices of PF0.
> 3. trigger a image load via BMC.
> 4. disable the link of the hotplug bridge.
> 5. remove all reserved devices under PF0 and PCI devices
>    below the hotplug bridge.
> 6. wait for image load done via BMC, e.g. 10s.
> 7. re-enable the link of the hotplug bridge.
> 8. re-enumerate PCI devices below the hotplug bridge.
> 
> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> ---
>  Documentation/ABI/testing/sysfs-driver-fpgahp |  21 ++
>  MAINTAINERS                                   |   1 +
>  drivers/pci/hotplug/fpgahp.c                  | 179 ++++++++++++++++++
>  3 files changed, 201 insertions(+)
>  create mode 100644 Documentation/ABI/testing/sysfs-driver-fpgahp
> 
> diff --git a/Documentation/ABI/testing/sysfs-driver-fpgahp b/Documentation/ABI/testing/sysfs-driver-fpgahp
> new file mode 100644
> index 000000000000..8d4b1bfc4012
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-driver-fpgahp
> @@ -0,0 +1,21 @@
> +What:		/sys/bus/pci/slots/X-X/available_images
> +Date:		May 2023
> +KernelVersion:	6.3
> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
> +Description:	Read-only. This file returns a space separated list of
> +		key words that may be written into the image_load file
> +		described below. These keywords decribe an FPGA, BMC,
> +		or firmware image in FLASH or EEPROM storage that may
> +		be loaded.

No, sysfs is "one value per file", why is this a list?

And what exactly defines the values in this list?

> +
> +What:		/sys/bus/pci/slots/X-X/image_load
> +Date:		May 2023
> +KernelVersion:	6.3
> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
> +Description:	Write-only. A key word may be written to this file to
> +		trigger a new image loading of an FPGA, BMC, or firmware
> +		image from FLASH or EEPROM. Refer to the available_images
> +		file for a list of supported key words for the underlying
> +		device.
> +		Writing an unsupported string to this file will result in
> +		EINVAL being returned.

Why is this a separate file from the "read the list" file?

That feels wrong.

thanks,

greg k-h
Russ Weight Jan. 20, 2023, 10:38 p.m. UTC | #2
On 1/19/23 05:28, Greg KH wrote:
> On Wed, Jan 18, 2023 at 08:36:00PM -0500, Tianfei Zhang wrote:
>> Implement the image_load and available_images callback functions
>> for fpgahp driver. This patch leverages some APIs from pciehp
>> driver to implement the device reconfiguration below the PCI hotplug
>> bridge.
>>
>> Here are the steps for a process of image load.
>> 1. remove all PFs and VFs except the PF0.
>> 2. remove all non-reserved devices of PF0.
>> 3. trigger a image load via BMC.
>> 4. disable the link of the hotplug bridge.
>> 5. remove all reserved devices under PF0 and PCI devices
>>    below the hotplug bridge.
>> 6. wait for image load done via BMC, e.g. 10s.
>> 7. re-enable the link of the hotplug bridge.
>> 8. re-enumerate PCI devices below the hotplug bridge.
>>
>> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
>> ---
>>  Documentation/ABI/testing/sysfs-driver-fpgahp |  21 ++
>>  MAINTAINERS                                   |   1 +
>>  drivers/pci/hotplug/fpgahp.c                  | 179 ++++++++++++++++++
>>  3 files changed, 201 insertions(+)
>>  create mode 100644 Documentation/ABI/testing/sysfs-driver-fpgahp
>>
>> diff --git a/Documentation/ABI/testing/sysfs-driver-fpgahp b/Documentation/ABI/testing/sysfs-driver-fpgahp
>> new file mode 100644
>> index 000000000000..8d4b1bfc4012
>> --- /dev/null
>> +++ b/Documentation/ABI/testing/sysfs-driver-fpgahp
>> @@ -0,0 +1,21 @@
>> +What:		/sys/bus/pci/slots/X-X/available_images
>> +Date:		May 2023
>> +KernelVersion:	6.3
>> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
>> +Description:	Read-only. This file returns a space separated list of
>> +		key words that may be written into the image_load file
>> +		described below. These keywords decribe an FPGA, BMC,
>> +		or firmware image in FLASH or EEPROM storage that may
>> +		be loaded.
> No, sysfs is "one value per file", why is this a list?
>
> And what exactly defines the values in this list?
>
>> +
>> +What:		/sys/bus/pci/slots/X-X/image_load
>> +Date:		May 2023
>> +KernelVersion:	6.3
>> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
>> +Description:	Write-only. A key word may be written to this file to
>> +		trigger a new image loading of an FPGA, BMC, or firmware
>> +		image from FLASH or EEPROM. Refer to the available_images
>> +		file for a list of supported key words for the underlying
>> +		device.
>> +		Writing an unsupported string to this file will result in
>> +		EINVAL being returned.
> Why is this a separate file from the "read the list" file?

The intended usage is like this:

$ cat available_images
bmc_factory bmc_user fpga_factory fpga_user1 fpga_user2
$ echo bmc_user > image_load

This specifies which image stored in flash that you want to have activated
on the device.

An existing example of something like this is in the tracing code:
available_tracers and current_tracer

Would it be preferable to just create a file for each possible image,
and echo 1 to trigger the event? (echo 1 > bmc_user)

Thanks,
- Russ

> That feels wrong.
>
> thanks,
>
> greg k-h
Greg Kroah-Hartman Jan. 21, 2023, 7:35 a.m. UTC | #3
On Fri, Jan 20, 2023 at 02:38:43PM -0800, Russ Weight wrote:
> 
> 
> On 1/19/23 05:28, Greg KH wrote:
> > On Wed, Jan 18, 2023 at 08:36:00PM -0500, Tianfei Zhang wrote:
> >> Implement the image_load and available_images callback functions
> >> for fpgahp driver. This patch leverages some APIs from pciehp
> >> driver to implement the device reconfiguration below the PCI hotplug
> >> bridge.
> >>
> >> Here are the steps for a process of image load.
> >> 1. remove all PFs and VFs except the PF0.
> >> 2. remove all non-reserved devices of PF0.
> >> 3. trigger a image load via BMC.
> >> 4. disable the link of the hotplug bridge.
> >> 5. remove all reserved devices under PF0 and PCI devices
> >>    below the hotplug bridge.
> >> 6. wait for image load done via BMC, e.g. 10s.
> >> 7. re-enable the link of the hotplug bridge.
> >> 8. re-enumerate PCI devices below the hotplug bridge.
> >>
> >> Signed-off-by: Tianfei Zhang <tianfei.zhang@intel.com>
> >> ---
> >>  Documentation/ABI/testing/sysfs-driver-fpgahp |  21 ++
> >>  MAINTAINERS                                   |   1 +
> >>  drivers/pci/hotplug/fpgahp.c                  | 179 ++++++++++++++++++
> >>  3 files changed, 201 insertions(+)
> >>  create mode 100644 Documentation/ABI/testing/sysfs-driver-fpgahp
> >>
> >> diff --git a/Documentation/ABI/testing/sysfs-driver-fpgahp b/Documentation/ABI/testing/sysfs-driver-fpgahp
> >> new file mode 100644
> >> index 000000000000..8d4b1bfc4012
> >> --- /dev/null
> >> +++ b/Documentation/ABI/testing/sysfs-driver-fpgahp
> >> @@ -0,0 +1,21 @@
> >> +What:		/sys/bus/pci/slots/X-X/available_images
> >> +Date:		May 2023
> >> +KernelVersion:	6.3
> >> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
> >> +Description:	Read-only. This file returns a space separated list of
> >> +		key words that may be written into the image_load file
> >> +		described below. These keywords decribe an FPGA, BMC,
> >> +		or firmware image in FLASH or EEPROM storage that may
> >> +		be loaded.
> > No, sysfs is "one value per file", why is this a list?
> >
> > And what exactly defines the values in this list?
> >
> >> +
> >> +What:		/sys/bus/pci/slots/X-X/image_load
> >> +Date:		May 2023
> >> +KernelVersion:	6.3
> >> +Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
> >> +Description:	Write-only. A key word may be written to this file to
> >> +		trigger a new image loading of an FPGA, BMC, or firmware
> >> +		image from FLASH or EEPROM. Refer to the available_images
> >> +		file for a list of supported key words for the underlying
> >> +		device.
> >> +		Writing an unsupported string to this file will result in
> >> +		EINVAL being returned.
> > Why is this a separate file from the "read the list" file?
> 
> The intended usage is like this:
> 
> $ cat available_images
> bmc_factory bmc_user fpga_factory fpga_user1 fpga_user2
> $ echo bmc_user > image_load
> 
> This specifies which image stored in flash that you want to have activated
> on the device.
> 
> An existing example of something like this is in the tracing code:
> available_tracers and current_tracer
> 
> Would it be preferable to just create a file for each possible image,
> and echo 1 to trigger the event? (echo 1 > bmc_user)

That would make things much more simpler overall and not force people to
have to parse a sysfs file, which is the main reason we created sysfs in
the first place.

thanks,

greg k-h
diff mbox series

Patch

diff --git a/Documentation/ABI/testing/sysfs-driver-fpgahp b/Documentation/ABI/testing/sysfs-driver-fpgahp
new file mode 100644
index 000000000000..8d4b1bfc4012
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-fpgahp
@@ -0,0 +1,21 @@ 
+What:		/sys/bus/pci/slots/X-X/available_images
+Date:		May 2023
+KernelVersion:	6.3
+Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
+Description:	Read-only. This file returns a space separated list of
+		key words that may be written into the image_load file
+		described below. These keywords decribe an FPGA, BMC,
+		or firmware image in FLASH or EEPROM storage that may
+		be loaded.
+
+What:		/sys/bus/pci/slots/X-X/image_load
+Date:		May 2023
+KernelVersion:	6.3
+Contact:	Tianfei Zhang <tianfei.zhang@intel.com>
+Description:	Write-only. A key word may be written to this file to
+		trigger a new image loading of an FPGA, BMC, or firmware
+		image from FLASH or EEPROM. Refer to the available_images
+		file for a list of supported key words for the underlying
+		device.
+		Writing an unsupported string to this file will result in
+		EINVAL being returned.
diff --git a/MAINTAINERS b/MAINTAINERS
index 7ac38b7cc44c..85d4e3a0e986 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8168,6 +8168,7 @@  M:	Tianfei Zhang <tianfei.zhang@intel.com>
 L:	linux-fpga@vger.kernel.org
 L:	linux-pci@vger.kernel.org
 S:	Maintained
+F:	Documentation/ABI/testing/sysfs-driver-fpgahp
 F:	drivers/pci/hotplug/fpgahp.c
 F:	include/linux/fpga/fpgahp_manager.h
 
diff --git a/drivers/pci/hotplug/fpgahp.c b/drivers/pci/hotplug/fpgahp.c
index 79bae97a1d39..3fdf37b238c6 100644
--- a/drivers/pci/hotplug/fpgahp.c
+++ b/drivers/pci/hotplug/fpgahp.c
@@ -9,6 +9,7 @@ 
 #include <linux/module.h>
 #include <linux/pci.h>
 #include <linux/pci_hotplug.h>
+#include <linux/pm_runtime.h>
 
 #include "pciehp.h"
 
@@ -25,8 +26,182 @@  struct fpgahp_controller {
 	struct pcie_device *pcie;
 	struct controller ctrl;
 	struct pci_dev *hotplug_bridge;
+	struct mutex lock;  /* parallel access into image_load callback */
 };
 
+static inline struct fpgahp_controller *to_fhpc(struct controller *ctrl)
+{
+	return container_of(ctrl, struct fpgahp_controller, ctrl);
+}
+
+static int fpgahp_available_images(struct hotplug_slot *slot, char *buf)
+{
+	struct controller *ctrl = to_ctrl(slot);
+	struct fpgahp_controller *fhpc = to_fhpc(ctrl);
+	struct fpgahp_manager *mgr = &fhpc->mgr;
+	struct fpgahp_bmc_device *bmc = &mgr->bmc;
+	ssize_t count;
+
+	mutex_lock(&mgr->lock);
+
+	if (!mgr->registered || !bmc->registered)
+		goto err_unlock;
+
+	if (!bmc->ops->available_images)
+		goto err_unlock;
+
+	count = bmc->ops->available_images(bmc, buf);
+
+	mutex_unlock(&mgr->lock);
+
+	return count;
+
+err_unlock:
+	mutex_unlock(&mgr->lock);
+	return -EINVAL;
+}
+
+static void fpgahp_remove_sibling_pci_dev(struct pci_dev *pcidev)
+{
+	struct pci_bus *bus = pcidev->bus;
+	struct pci_dev *sibling, *tmp;
+
+	if (!bus)
+		return;
+
+	list_for_each_entry_safe_reverse(sibling, tmp, &bus->devices, bus_list)
+		if (sibling != pcidev)
+			pci_stop_and_remove_bus_device_locked(sibling);
+}
+
+static int fpgahp_link_enable(struct controller *ctrl)
+{
+	int retval;
+
+	retval = pciehp_link_enable(ctrl);
+	if (retval) {
+		ctrl_err(ctrl, "Can not enable the link!\n");
+		return retval;
+	}
+
+	retval = pciehp_check_link_status(ctrl);
+	if (retval) {
+		ctrl_err(ctrl, "Check link status fail!\n");
+		return retval;
+	}
+
+	retval = pciehp_query_power_fault(ctrl);
+	if (retval)
+		ctrl_err(ctrl, "Slot(%s): Power fault\n", slot_name(ctrl));
+
+	return retval;
+}
+
+static int fpgahp_rescan_slot(struct controller *ctrl)
+{
+	int retval;
+	struct pci_bus *parent = ctrl->pcie->port->subordinate;
+
+	retval = pciehp_configure_device(ctrl);
+	if (retval && retval != -EEXIST)
+		ctrl_err(ctrl, "Cannot add device at %04x:%02x:00\n",
+			 pci_domain_nr(parent), parent->number);
+
+	return retval;
+}
+
+static int __fpgahp_image_load(struct fpgahp_controller *fhpc, const char *buf)
+{
+	struct pci_dev *hotplug_bridge = fhpc->hotplug_bridge;
+	struct fpgahp_manager *mgr = &fhpc->mgr;
+	struct fpgahp_bmc_device *bmc = &mgr->bmc;
+	struct controller *ctrl = &fhpc->ctrl;
+	struct pci_dev *pcidev = mgr->priv;
+	u32 wait_time_msec;
+	int ret;
+
+	ret = pm_runtime_resume_and_get(&hotplug_bridge->dev);
+	if (ret)
+		return ret;
+
+	mutex_lock(&mgr->lock);
+
+	if (!pcidev || !mgr->registered || !bmc->registered || !bmc->ops->image_trigger) {
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	mgr->state = FPGAHP_MGR_LOADING;
+
+	/* 1. remove all PFs and VFs except the PF0 */
+	fpgahp_remove_sibling_pci_dev(pcidev);
+
+	/* 2. remove all non-reserved devices */
+	if (mgr->ops->hotplug_prepare) {
+		ret = mgr->ops->hotplug_prepare(mgr);
+		if (ret) {
+			ctrl_err(ctrl, "Prepare hotplug failed\n");
+			goto out_unlock;
+		}
+	}
+
+	/* 3. trigger loading a new image of BMC */
+	ret = bmc->ops->image_trigger(bmc, buf, &wait_time_msec);
+	if (ret) {
+		ctrl_err(ctrl, "Image trigger failed\n");
+		goto out_unlock;
+	}
+
+	/* 4. disable link of hotplug bridge */
+	pciehp_link_disable(ctrl);
+
+	/*
+	 * unlock the mrg->lock temporarily to avoid the dead lock while re-gain
+	 * the same lock on fpgahp_unregister() during remove PCI devices below the
+	 * hotplug bridge
+	 */
+	mutex_unlock(&mgr->lock);
+
+	/* 5. remove PCI devices below hotplug bridge */
+	pciehp_unconfigure_device(ctrl, true);
+
+	/* 6. wait for FPGA/BMC load done */
+	msleep(wait_time_msec);
+
+	mutex_lock(&mgr->lock);
+
+	/* 7. re-enable link */
+	ret = fpgahp_link_enable(ctrl);
+
+out_unlock:
+	if (ret)
+		mgr->state = FPGAHP_MGR_HP_FAIL;
+	else
+		mgr->state = FPGAHP_MGR_LOAD_DONE;
+
+	mutex_unlock(&mgr->lock);
+
+	/* re-enumerate PCI devices below hotplug bridge */
+	if (!ret)
+		ret = fpgahp_rescan_slot(ctrl);
+
+	pm_runtime_put(&hotplug_bridge->dev);
+	return ret;
+}
+
+static int fpgahp_image_load(struct hotplug_slot *slot, const char *buf)
+{
+	struct controller *ctrl = to_ctrl(slot);
+	struct fpgahp_controller *fhpc = to_fhpc(ctrl);
+	int ret;
+
+	mutex_lock(&fhpc->lock);
+	ret = __fpgahp_image_load(fhpc, buf);
+	mutex_unlock(&fhpc->lock);
+
+	return ret;
+}
+
 static void fpgahp_add_fhpc(struct fpgahp_controller *fhpc)
 {
 	mutex_lock(&fhpc_lock);
@@ -128,6 +303,8 @@  static int fpgahp_init_controller(struct controller *ctrl, struct pcie_device *d
 }
 
 static const struct hotplug_slot_ops fpgahp_slot_ops = {
+	.available_images	= fpgahp_available_images,
+	.image_load		= fpgahp_image_load,
 };
 
 static int fpgahp_init_slot(struct controller *ctrl)
@@ -183,6 +360,7 @@  fpgahp_create_new_fhpc(struct fpgahp_controller *fhpc, struct pci_dev *hotplug_b
 	}
 
 	mutex_init(&mgr->lock);
+	mutex_init(&fhpc->lock);
 
 	fpgahp_add_fhpc(fhpc);
 
@@ -345,3 +523,4 @@  module_exit(fpgahp_exit);
 MODULE_DESCRIPTION("FPGA PCI Hotplug Manager Driver");
 MODULE_AUTHOR("Tianfei Zhang <tianfei.zhang@intel.com>");
 MODULE_LICENSE("GPL");
+MODULE_IMPORT_NS(PCIEHP);