diff mbox

[-v2] PCI, pciehp: make every slot have its own workqueue to avoid deadlock

Message ID 1352450600-1956-1-git-send-email-wangyijing@huawei.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Yijing Wang Nov. 9, 2012, 8:43 a.m. UTC
Currently, pciehp use global pciehp_wq to handle hotplug event from hardware.
Hot remove path will be blocked if a hotplug slot connected a IO-BOX(composed of PCIe
Switch and some slots which support hotplug). The hot removed work was queued into
pciehp_wq. But in the hot-remove path, pciehp driver would flush pciehp_wq when
the pcie port(support pciehp) was removed. In this case the hot-remove path blocked.
This patch remove the global pciehp_wq and create a new workqueue for every slot to
avoid above problem.

-+-[0000:40]-+-00.0-[0000:41]--
 |           +-01.0-[0000:42]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
 |           |                 \-00.1  Intel Corporation 82576 Gigabit Network Connection
 |           +-03.0-[0000:43]----00.0  LSI Logic / Symbios Logic SAS1064ET PCI-Express Fusion-MPT SAS
 |           +-04.0-[0000:44]--
 |           +-05.0-[0000:45]--
 |           +-07.0-[0000:46-4f]----00.0-[0000:47-4f]--+-04.0-[0000:48-49]----00.0-[0000:49]--
 |           |(hotplug slot)                           +-08.0-[0000:4a]--
 |           |                                         +-09.0-[0000:4b]--
 |           |                                         +-10.0-[0000:4c]--
 |           |                                         +-11.0-[0000:4d]--
 |           |                                         +-14.0-[0000:4e]--
 |           |                                         \-15.0-[0000:4f]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
 |           |                                         (hotplug slot)    \-00.1  Intel Corporation 82576 Gigabit Network Connection


Signed-off-by: Yijing Wang <wangyijing@huawei.com>
---
 drivers/pci/hotplug/pciehp.h      |    2 +-
 drivers/pci/hotplug/pciehp_core.c |   26 +++++++++++++++++---------
 drivers/pci/hotplug/pciehp_ctrl.c |    8 ++++----
 drivers/pci/hotplug/pciehp_hpc.c  |    2 +-
 4 files changed, 23 insertions(+), 15 deletions(-)

Comments

Kenji Kaneshige Nov. 12, 2012, 7:17 a.m. UTC | #1
> @@ -94,6 +93,8 @@ static int init_slot(struct controller *ctrl)
>  	struct hotplug_slot_info *info = NULL;
>  	struct hotplug_slot_ops *ops = NULL;
>  	char name[SLOT_NAME_SIZE];
> +	char *buffer;
> +	int len;
>  	int retval = -ENOMEM;
> 
>  	hotplug = kzalloc(sizeof(*hotplug), GFP_KERNEL);
> @@ -135,6 +136,19 @@ static int init_slot(struct controller *ctrl)
>  	if (retval)
>  		ctrl_err(ctrl,
>  			 "pci_hp_register failed with error %d\n", retval);

I think it's natural to go to out: here if retval != 0.
I guess you intentionally didn't do that because you might want to do
workqueue cleanup in pciehp_release_ctrl() code path. But it is confusing
and hard to understand.

In the previous patch, you created the work queue in pcie_init_slot().
I think it's better. Maybe the reason you moved it from pcie_init_slot()
to init_slot() was that you needed the slot name by pci_hp_register().

How about using physical slot number, which is same as pci slot name by
pci_hp_register() on the normal platform, for the workqueue name instead?

> +
> +	len = strlen(slot_name(slot)) + 16;
> +	buffer = kzalloc(len, GFP_KERNEL);
> +	if (!buffer) {
> +		retval = -ENOMEM;
> +		goto out;
> +	}
> +	snprintf(buffer, len, "pciehp_slot(%s)", slot_name(slot));
> +	slot->wq = create_singlethread_workqueue(buffer);
> +	if (!slot->wq)
> +		retval = -ENOMEM;
> +
> +	kfree(buffer);
>  out:
>  	if (retval) {
>  		kfree(ops);

Regards,
Kenji Kaneshige

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang Nov. 12, 2012, 8:27 a.m. UTC | #2
On 2012/11/12 15:17, Kaneshige, Kenji wrote:
>> @@ -94,6 +93,8 @@ static int init_slot(struct controller *ctrl)
>>  	struct hotplug_slot_info *info = NULL;
>>  	struct hotplug_slot_ops *ops = NULL;
>>  	char name[SLOT_NAME_SIZE];
>> +	char *buffer;
>> +	int len;
>>  	int retval = -ENOMEM;
>>
>>  	hotplug = kzalloc(sizeof(*hotplug), GFP_KERNEL);
>> @@ -135,6 +136,19 @@ static int init_slot(struct controller *ctrl)
>>  	if (retval)
>>  		ctrl_err(ctrl,
>>  			 "pci_hp_register failed with error %d\n", retval);
> 
> I think it's natural to go to out: here if retval != 0.
> I guess you intentionally didn't do that because you might want to do
> workqueue cleanup in pciehp_release_ctrl() code path. But it is confusing
> and hard to understand.

Hi Kaneshige,

You are right, go to out here is better.

> In the previous patch, you created the work queue in pcie_init_slot().
> I think it's better. Maybe the reason you moved it from pcie_init_slot()
> to init_slot() was that you needed the slot name by pci_hp_register().
> 
> How about using physical slot number, which is same as pci slot name by
> pci_hp_register() on the normal platform, for the workqueue name instead?
>

I think use physical slot number is ok, so what about workqueue name like slot(physical slot number) format ?

root     45808  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(0)]
root     45815  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(4)]
root     45836  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(5)]
root     45840  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(3)]
root     45842  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(7)]
root     45844  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(228)]
root     45846  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(232)]
root     45848  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(233)]
root     45850  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(240)]
root     45852  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(241)]
root     45854  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(244)]
root     45856  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(245)]
root     45879  0.0  0.0   4416  2304 pts/0    S+   16:25   0:00 grep slot


Thanks!
Yijing

>> +
>> +	len = strlen(slot_name(slot)) + 16;
>> +	buffer = kzalloc(len, GFP_KERNEL);
>> +	if (!buffer) {
>> +		retval = -ENOMEM;
>> +		goto out;
>> +	}
>> +	snprintf(buffer, len, "pciehp_slot(%s)", slot_name(slot));
>> +	slot->wq = create_singlethread_workqueue(buffer);
>> +	if (!slot->wq)
>> +		retval = -ENOMEM;
>> +
>> +	kfree(buffer);
>>  out:
>>  	if (retval) {
>>  		kfree(ops);
> 
> Regards,
> Kenji Kaneshige
> 
> 
> .
>
Kenji Kaneshige Nov. 12, 2012, 8:51 a.m. UTC | #3
> -----Original Message-----
> From: Yijing Wang [mailto:wangyijing@huawei.com]
> Sent: Monday, November 12, 2012 5:28 PM
> To: Kaneshige, Kenji/?? ??
> Cc: Bjorn Helgaas; Yinghai Lu; Rafael; Rusty Russell; Mauro Carvalho Chehab; Oliver Neukum; jiang.liu@huawei.com;
> linux-pci@vger.kernel.org; Hanjun Guo
> Subject: Re: [PATCH -v2] PCI, pciehp: make every slot have its own workqueue to avoid deadlock
> 
> On 2012/11/12 15:17, Kaneshige, Kenji wrote:
> >> @@ -94,6 +93,8 @@ static int init_slot(struct controller *ctrl)
> >>  	struct hotplug_slot_info *info = NULL;
> >>  	struct hotplug_slot_ops *ops = NULL;
> >>  	char name[SLOT_NAME_SIZE];
> >> +	char *buffer;
> >> +	int len;
> >>  	int retval = -ENOMEM;
> >>
> >>  	hotplug = kzalloc(sizeof(*hotplug), GFP_KERNEL);
> >> @@ -135,6 +136,19 @@ static int init_slot(struct controller *ctrl)
> >>  	if (retval)
> >>  		ctrl_err(ctrl,
> >>  			 "pci_hp_register failed with error %d\n", retval);
> >
> > I think it's natural to go to out: here if retval != 0.
> > I guess you intentionally didn't do that because you might want to do
> > workqueue cleanup in pciehp_release_ctrl() code path. But it is confusing
> > and hard to understand.
> 
> Hi Kaneshige,
> 
> You are right, go to out here is better.

Just in case, my proposal is the creating workqueue in pcie_init_slot()
as you did in your first patch.

> 
> > In the previous patch, you created the work queue in pcie_init_slot().
> > I think it's better. Maybe the reason you moved it from pcie_init_slot()
> > to init_slot() was that you needed the slot name by pci_hp_register().
> >
> > How about using physical slot number, which is same as pci slot name by
> > pci_hp_register() on the normal platform, for the workqueue name instead?
> >
> 
> I think use physical slot number is ok, so what about workqueue name like slot(physical slot number) format ?
> 
> root     45808  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(0)]

I think we need "pciehp" in the name. I prefer "pciehp-%u".

Regards,
Kenji Kaneshige

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yijing Wang Nov. 12, 2012, 9:04 a.m. UTC | #4
On 2012/11/12 16:51, Kaneshige, Kenji wrote:
>> -----Original Message-----
>> From: Yijing Wang [mailto:wangyijing@huawei.com]
>> Sent: Monday, November 12, 2012 5:28 PM
>> To: Kaneshige, Kenji/?? ??
>> Cc: Bjorn Helgaas; Yinghai Lu; Rafael; Rusty Russell; Mauro Carvalho Chehab; Oliver Neukum; jiang.liu@huawei.com;
>> linux-pci@vger.kernel.org; Hanjun Guo
>> Subject: Re: [PATCH -v2] PCI, pciehp: make every slot have its own workqueue to avoid deadlock
>>
>> On 2012/11/12 15:17, Kaneshige, Kenji wrote:
>>>> @@ -94,6 +93,8 @@ static int init_slot(struct controller *ctrl)
>>>>  	struct hotplug_slot_info *info = NULL;
>>>>  	struct hotplug_slot_ops *ops = NULL;
>>>>  	char name[SLOT_NAME_SIZE];
>>>> +	char *buffer;
>>>> +	int len;
>>>>  	int retval = -ENOMEM;
>>>>
>>>>  	hotplug = kzalloc(sizeof(*hotplug), GFP_KERNEL);
>>>> @@ -135,6 +136,19 @@ static int init_slot(struct controller *ctrl)
>>>>  	if (retval)
>>>>  		ctrl_err(ctrl,
>>>>  			 "pci_hp_register failed with error %d\n", retval);
>>>
>>> I think it's natural to go to out: here if retval != 0.
>>> I guess you intentionally didn't do that because you might want to do
>>> workqueue cleanup in pciehp_release_ctrl() code path. But it is confusing
>>> and hard to understand.
>>
>> Hi Kaneshige,
>>
>> You are right, go to out here is better.
> 
> Just in case, my proposal is the creating workqueue in pcie_init_slot()
> as you did in your first patch.

ok, i will create workqueue in pcie_init_slot() and use pciehp-%u format as workqueue name.

I will send out the next version patch later.

Thanks!
Yijing.

> 
>>
>>> In the previous patch, you created the work queue in pcie_init_slot().
>>> I think it's better. Maybe the reason you moved it from pcie_init_slot()
>>> to init_slot() was that you needed the slot name by pci_hp_register().
>>>
>>> How about using physical slot number, which is same as pci slot name by
>>> pci_hp_register() on the normal platform, for the workqueue name instead?
>>>
>>
>> I think use physical slot number is ok, so what about workqueue name like slot(physical slot number) format ?
>>
>> root     45808  0.0  0.0      0     0 ?        S<   16:25   0:00 [slot(0)]
> 
> I think we need "pciehp" in the name. I prefer "pciehp-%u".
> 
> Regards,
> Kenji Kaneshige
> 
> 
> .
>
diff mbox

Patch

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 26ffd3e..2c113de 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -44,7 +44,6 @@  extern bool pciehp_poll_mode;
 extern int pciehp_poll_time;
 extern bool pciehp_debug;
 extern bool pciehp_force;
-extern struct workqueue_struct *pciehp_wq;
 
 #define dbg(format, arg...)						\
 do {									\
@@ -78,6 +77,7 @@  struct slot {
 	struct hotplug_slot *hotplug_slot;
 	struct delayed_work work;	/* work for button event */
 	struct mutex lock;
+	struct workqueue_struct *wq;
 };
 
 struct event_info {
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index 916bf4f..86a7ffb 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -42,7 +42,6 @@  bool pciehp_debug;
 bool pciehp_poll_mode;
 int pciehp_poll_time;
 bool pciehp_force;
-struct workqueue_struct *pciehp_wq;
 
 #define DRIVER_VERSION	"0.4"
 #define DRIVER_AUTHOR	"Dan Zink <dan.zink@compaq.com>, Greg Kroah-Hartman <greg@kroah.com>, Dely Sy <dely.l.sy@intel.com>"
@@ -94,6 +93,8 @@  static int init_slot(struct controller *ctrl)
 	struct hotplug_slot_info *info = NULL;
 	struct hotplug_slot_ops *ops = NULL;
 	char name[SLOT_NAME_SIZE];
+	char *buffer;
+	int len;
 	int retval = -ENOMEM;
 
 	hotplug = kzalloc(sizeof(*hotplug), GFP_KERNEL);
@@ -135,6 +136,19 @@  static int init_slot(struct controller *ctrl)
 	if (retval)
 		ctrl_err(ctrl,
 			 "pci_hp_register failed with error %d\n", retval);
+
+	len = strlen(slot_name(slot)) + 16;
+	buffer = kzalloc(len, GFP_KERNEL);
+	if (!buffer) {
+		retval = -ENOMEM;
+		goto out;
+	}
+	snprintf(buffer, len, "pciehp_slot(%s)", slot_name(slot));
+	slot->wq = create_singlethread_workqueue(buffer);
+	if (!slot->wq)
+		retval = -ENOMEM;
+
+	kfree(buffer);
 out:
 	if (retval) {
 		kfree(ops);
@@ -340,18 +354,13 @@  static int __init pcied_init(void)
 {
 	int retval = 0;
 
-	pciehp_wq = alloc_workqueue("pciehp", 0, 0);
-	if (!pciehp_wq)
-		return -ENOMEM;
-
 	pciehp_firmware_init();
 	retval = pcie_port_service_register(&hpdriver_portdrv);
  	dbg("pcie_port_service_register = %d\n", retval);
   	info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
- 	if (retval) {
-		destroy_workqueue(pciehp_wq);
+	if (retval)
 		dbg("Failure to register service\n");
-	}
+
 	return retval;
 }
 
@@ -359,7 +368,6 @@  static void __exit pcied_cleanup(void)
 {
 	dbg("unload_pciehpd()\n");
 	pcie_port_service_unregister(&hpdriver_portdrv);
-	destroy_workqueue(pciehp_wq);
 	info(DRIVER_DESC " version: " DRIVER_VERSION " unloaded\n");
 }
 
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index 27f4429..38f0186 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -49,7 +49,7 @@  static int queue_interrupt_event(struct slot *p_slot, u32 event_type)
 	info->p_slot = p_slot;
 	INIT_WORK(&info->work, interrupt_event_handler);
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(p_slot->wq, &info->work);
 
 	return 0;
 }
@@ -344,7 +344,7 @@  void pciehp_queue_pushbutton_work(struct work_struct *work)
 		kfree(info);
 		goto out;
 	}
-	queue_work(pciehp_wq, &info->work);
+	queue_work(p_slot->wq, &info->work);
  out:
 	mutex_unlock(&p_slot->lock);
 }
@@ -377,7 +377,7 @@  static void handle_button_press_event(struct slot *p_slot)
 		if (ATTN_LED(ctrl))
 			pciehp_set_attention_status(p_slot, 0);
 
-		queue_delayed_work(pciehp_wq, &p_slot->work, 5*HZ);
+		queue_delayed_work(p_slot->wq, &p_slot->work, 5*HZ);
 		break;
 	case BLINKINGOFF_STATE:
 	case BLINKINGON_STATE:
@@ -439,7 +439,7 @@  static void handle_surprise_event(struct slot *p_slot)
 	else
 		p_slot->state = POWERON_STATE;
 
-	queue_work(pciehp_wq, &info->work);
+	queue_work(p_slot->wq, &info->work);
 }
 
 static void interrupt_event_handler(struct work_struct *work)
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 13b2eaf..1bbbfd9 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -789,7 +789,7 @@  static void pcie_cleanup_slot(struct controller *ctrl)
 {
 	struct slot *slot = ctrl->slot;
 	cancel_delayed_work(&slot->work);
-	flush_workqueue(pciehp_wq);
+	destroy_workqueue(slot->wq);
 	kfree(slot);
 }