
pci: Avoid reentrant calls to work_on_cpu

Message ID 20130514221748.6180.30597.stgit@ahduyck-cp1.jf.intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas

Commit Message

Duyck, Alexander H May 14, 2013, 10:26 p.m. UTC
This change is meant to fix a deadlock seen when pci_enable_sriov was
called from within a driver's probe routine.  The issue was that
work_on_cpu calls flush_work which attempts to flush a work queue for a
cpu that we are currently working in.  In order to avoid the reentrant
path we just skip the call to work_on_cpu in the case that the device
node matches our current node.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
---

This patch is meant to address the issue pointed out in an earlier patch
sent by Yinghai Lu titled:
  [PATCH 6/7] PCI: Make sure VF's driver get attached after PF's

 drivers/pci/pci-driver.c |   14 +++++++++-----
 1 files changed, 9 insertions(+), 5 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Or Gerlitz May 15, 2013, 12:32 a.m. UTC | #1
On Tue, May 14, 2013 at 6:26 PM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
>
> This change is meant to fix a deadlock seen when pci_enable_sriov was
> called from within a driver's probe routine.  The issue was that
> work_on_cpu calls flush_work which attempts to flush a work queue for a
> cpu that we are currently working in.  In order to avoid the reentrant
> path we just skip the call to work_on_cpu in the case that the device
> node matches our current node.
>
> Reported-by: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>
> This patch is meant to address the issue pointed out in an earlier patch
> sent by Yinghai Lu titled:
>   [PATCH 6/7] PCI: Make sure VF's driver get attached after PF's
>
>  drivers/pci/pci-driver.c |   14 +++++++++-----
>  1 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 79277fb..caeb1c0 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -277,12 +277,16 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>         int error, node;
>         struct drv_dev_and_id ddi = { drv, dev, id };
>
> -       /* Execute driver initialization on node where the device's
> -          bus is attached to.  This way the driver likely allocates
> -          its local memory on the right node without any need to
> -          change it. */
> +       /*
> +        * Execute driver initialization on the node where the device's
> +        * bus is attached.  This way the driver likely allocates
> +        * its local memory on the right node without any need to
> +        * change it.  If the node is the current node just call
> +        * local_pci_probe and avoid the possibility of reentrant
> +        * calls to work_on_cpu.
> +        */
>         node = dev_to_node(&dev->dev);
> -       if (node >= 0) {
> +       if ((node >= 0) && (node != numa_node_id())) {
>                 int cpu;
>
>                 get_online_cpus();


Alex, FWIW a similar patch was posted by Michael during the last rc
cycles of 3.9 see
http://marc.info/?l=linux-netdev&m=136569426119644&w=2
Alexander Duyck May 15, 2013, 1:58 a.m. UTC | #2
On 05/14/2013 05:32 PM, Or Gerlitz wrote:
> On Tue, May 14, 2013 at 6:26 PM, Alexander Duyck
> <alexander.h.duyck@intel.com> wrote:
>>
>> This change is meant to fix a deadlock seen when pci_enable_sriov was
>> called from within a driver's probe routine.  The issue was that
>> work_on_cpu calls flush_work which attempts to flush a work queue for a
>> cpu that we are currently working in.  In order to avoid the reentrant
>> path we just skip the call to work_on_cpu in the case that the device
>> node matches our current node.
>>
>> Reported-by: Yinghai Lu <yinghai@kernel.org>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>>
>> This patch is meant to address the issue pointed out in an earlier patch
>> sent by Yinghai Lu titled:
>>   [PATCH 6/7] PCI: Make sure VF's driver get attached after PF's
>>
>>  drivers/pci/pci-driver.c |   14 +++++++++-----
>>  1 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 79277fb..caeb1c0 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -277,12 +277,16 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>>         int error, node;
>>         struct drv_dev_and_id ddi = { drv, dev, id };
>>
>> -       /* Execute driver initialization on node where the device's
>> -          bus is attached to.  This way the driver likely allocates
>> -          its local memory on the right node without any need to
>> -          change it. */
>> +       /*
>> +        * Execute driver initialization on the node where the device's
>> +        * bus is attached.  This way the driver likely allocates
>> +        * its local memory on the right node without any need to
>> +        * change it.  If the node is the current node just call
>> +        * local_pci_probe and avoid the possibility of reentrant
>> +        * calls to work_on_cpu.
>> +        */
>>         node = dev_to_node(&dev->dev);
>> -       if (node >= 0) {
>> +       if ((node >= 0) && (node != numa_node_id())) {
>>                 int cpu;
>>
>>                 get_online_cpus();
> 
> 
> Alex, FWIW a similar patch was posted by Michael during the last rc
> cycles of 3.9 see
> http://marc.info/?l=linux-netdev&m=136569426119644&w=2

Did his patch ever get applied anywhere?  I don't see it in any of the
trees.

The advantage this approach has over the one in the similar patch is
that this covers a broader set of CPUs since anything on the same node
is local versus just the first CPU in a given NUMA node.

Thanks,

Alex
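
To illustrate the coverage argument above, here is a minimal userspace
sketch that compares a node-based locality test with a first-CPU-based
one. It is not kernel code. The 8-CPU, 2-node topology and every model_*
name are assumptions made up for the example, and the first-CPU test
only reflects how the other patch is characterized in this thread, not
its actual contents.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical topology for illustration: 8 CPUs, 2 nodes, 4 CPUs each. */
#define MODEL_NR_CPUS 8
static const int model_cpu_node[MODEL_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Lowest-numbered CPU on a node, or -1 if the node has no CPUs. */
static int model_first_cpu_of_node(int node)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_cpu_node[cpu] == node)
			return cpu;
	return -1;
}

/* Node-based test: any CPU on the device's node may probe directly. */
static bool model_local_by_node(int cur_cpu, int dev_node)
{
	assert(cur_cpu >= 0 && cur_cpu < MODEL_NR_CPUS);
	return dev_node >= 0 && model_cpu_node[cur_cpu] == dev_node;
}

/* CPU-based test: only the node's first CPU may probe directly. */
static bool model_local_by_first_cpu(int cur_cpu, int dev_node)
{
	assert(cur_cpu >= 0 && cur_cpu < MODEL_NR_CPUS);
	return dev_node >= 0 && cur_cpu == model_first_cpu_of_node(dev_node);
}

/* Count how many CPUs a given test treats as local to dev_node. */
static int model_count_local(bool (*is_local)(int, int), int dev_node)
{
	int n = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (is_local(cpu, dev_node))
			n++;
	return n;
}
```

In this assumed topology, model_count_local(model_local_by_node, 1)
counts every CPU on node 1, while
model_count_local(model_local_by_first_cpu, 1) counts only CPU 4.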



Yinghai Lu May 15, 2013, 2:50 a.m. UTC | #3
On Tue, May 14, 2013 at 3:26 PM, Alexander Duyck
<alexander.h.duyck@intel.com> wrote:
> This change is meant to fix a deadlock seen when pci_enable_sriov was
> called from within a driver's probe routine.  The issue was that
> work_on_cpu calls flush_work which attempts to flush a work queue for a
> cpu that we are currently working in.  In order to avoid the reentrant
> path we just skip the call to work_on_cpu in the case that the device
> node matches our current node.
>
> Reported-by: Yinghai Lu <yinghai@kernel.org>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
> ---
>
> This patch is meant to address the issue pointed out in an earlier patch
> sent by Yinghai Lu titled:
>   [PATCH 6/7] PCI: Make sure VF's driver get attached after PF's

Yes, that helps. My v2 patch will not need device schedule and
device_initcall to wait until the first work_on_cpu is done.

Tested-by: Yinghai Lu <yinghai@kernel.org>

>
>  drivers/pci/pci-driver.c |   14 +++++++++-----
>  1 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 79277fb..caeb1c0 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -277,12 +277,16 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>         int error, node;
>         struct drv_dev_and_id ddi = { drv, dev, id };
>
> -       /* Execute driver initialization on node where the device's
> -          bus is attached to.  This way the driver likely allocates
> -          its local memory on the right node without any need to
> -          change it. */
> +       /*
> +        * Execute driver initialization on the node where the device's
> +        * bus is attached.  This way the driver likely allocates
> +        * its local memory on the right node without any need to
> +        * change it.  If the node is the current node just call
> +        * local_pci_probe and avoid the possibility of reentrant
> +        * calls to work_on_cpu.
> +        */
>         node = dev_to_node(&dev->dev);
> -       if (node >= 0) {
> +       if ((node >= 0) && (node != numa_node_id())) {
>                 int cpu;
>
>                 get_online_cpus();
>
Duyck, Alexander H June 12, 2013, 5:58 p.m. UTC | #4
On 05/14/2013 07:50 PM, Yinghai Lu wrote:
> On Tue, May 14, 2013 at 3:26 PM, Alexander Duyck
> <alexander.h.duyck@intel.com> wrote:
>> This change is meant to fix a deadlock seen when pci_enable_sriov was
>> called from within a driver's probe routine.  The issue was that
>> work_on_cpu calls flush_work which attempts to flush a work queue for a
>> cpu that we are currently working in.  In order to avoid the reentrant
>> path we just skip the call to work_on_cpu in the case that the device
>> node matches our current node.
>>
>> Reported-by: Yinghai Lu <yinghai@kernel.org>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
>> ---
>>
>> This patch is meant to address the issue pointed out in an earlier patch
>> sent by Yinghai Lu titled:
>>   [PATCH 6/7] PCI: Make sure VF's driver get attached after PF's
> Yes, that helps. My v2 patch will not need device schedule and
> device_initcall to wait until the first work_on_cpu is done.
>
> Tested-by: Yinghai Lu <yinghai@kernel.org>

So what ever happened with this patch?  It doesn't look like it was
applied anywhere.  Was there some objection to it?  If so I can update
and resubmit if necessary.

Thanks,

Alex


>
>>  drivers/pci/pci-driver.c |   14 +++++++++-----
>>  1 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 79277fb..caeb1c0 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -277,12 +277,16 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>>         int error, node;
>>         struct drv_dev_and_id ddi = { drv, dev, id };
>>
>> -       /* Execute driver initialization on node where the device's
>> -          bus is attached to.  This way the driver likely allocates
>> -          its local memory on the right node without any need to
>> -          change it. */
>> +       /*
>> +        * Execute driver initialization on the node where the device's
>> +        * bus is attached.  This way the driver likely allocates
>> +        * its local memory on the right node without any need to
>> +        * change it.  If the node is the current node just call
>> +        * local_pci_probe and avoid the possibility of reentrant
>> +        * calls to work_on_cpu.
>> +        */
>>         node = dev_to_node(&dev->dev);
>> -       if (node >= 0) {
>> +       if ((node >= 0) && (node != numa_node_id())) {
>>                 int cpu;
>>
>>                 get_online_cpus();
>>


Patch

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 79277fb..caeb1c0 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -277,12 +277,16 @@  static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 	int error, node;
 	struct drv_dev_and_id ddi = { drv, dev, id };
 
-	/* Execute driver initialization on node where the device's
-	   bus is attached to.  This way the driver likely allocates
-	   its local memory on the right node without any need to
-	   change it. */
+	/*
+	 * Execute driver initialization on the node where the device's
+	 * bus is attached.  This way the driver likely allocates
+	 * its local memory on the right node without any need to
+	 * change it.  If the node is the current node just call
+	 * local_pci_probe and avoid the possibility of reentrant
+	 * calls to work_on_cpu.
+	 */
 	node = dev_to_node(&dev->dev);
-	if (node >= 0) {
+	if ((node >= 0) && (node != numa_node_id())) {
 		int cpu;
 
 		get_online_cpus();
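
To make the effect of the changed condition concrete, here is a minimal
userspace sketch of the branch decision in pci_call_probe. It is not
kernel code: model_should_offload is a hypothetical name, and the node
numbers are passed in as plain integers rather than read from
dev_to_node() and numa_node_id(). The "patched" flag selects between the
old condition (node >= 0) and the new one.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model only; the name and signature are hypothetical.
 *
 * Returns true when the probe would be handed to work_on_cpu() on a CPU
 * of the device's node, and false when it would run directly in the
 * caller via local_pci_probe().
 */
static bool model_should_offload(int dev_node, int cur_node, bool patched)
{
	assert(cur_node >= 0);		/* the caller always runs on some node */

	if (dev_node < 0)
		return false;		/* no node affinity: probe locally */
	if (patched && dev_node == cur_node)
		return false;		/* already on the right node */
	return true;			/* different node: offload the probe */
}
```

With the old condition, a probe for a device on node 0 is offloaded even
when the caller is already on node 0, which is the reentrant case the
commit message describes. With the new condition that case runs
directly, and only a device on a different node is offloaded.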