[RFC,v1,3/3] PCI: Limit pci_alloc_irq_vectors as per housekeeping CPUs

Message ID 20200909150818.313699-4-nitesh@redhat.com (mailing list archive)
State Superseded, archived
Series isolation: limit msix vectors based on housekeeping CPUs

Commit Message

Nitesh Narayan Lal Sept. 9, 2020, 3:08 p.m. UTC
This patch limits the max_vecs value that a caller passes to
pci_alloc_irq_vectors() to the number of available housekeeping CPUs, by
using the minimum of the two.

Taking the minimum of the passed max_vecs and the number of available
housekeeping CPUs ensures that we don't create excess vectors, which can be
problematic specifically in an RT environment. This is because in an RT
environment unwanted IRQs are moved from the isolated CPUs to the
housekeeping CPUs to keep the latency overhead to a minimum. If the number
of housekeeping CPUs is significantly lower than the number of isolated
CPUs, we can run into failures while moving these IRQs to the housekeeping
CPUs due to the per-CPU vector limit.

Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
---
 include/linux/pci.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
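
For illustration only, the clamping rule described in the commit message can
be modeled in plain userspace C. This is a minimal sketch, not kernel code:
the function name clamp_max_vecs_v1 and the explicit num_housekeeping and
num_online parameters are assumptions made here so the rule can be exercised
in isolation (the patch obtains these values inside pci_alloc_irq_vectors()).
The assert encodes a further assumption that callers pass min_vecs <= max_vecs.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the v1 clamp. Returns the max_vecs value
 * that would be handed on to the allocator.
 */
unsigned int clamp_max_vecs_v1(unsigned int min_vecs, unsigned int max_vecs,
			       unsigned int num_housekeeping,
			       unsigned int num_online)
{
	/* Assumption of this sketch: the request range is well formed. */
	assert(min_vecs <= max_vecs);

	/*
	 * Clamp only when some CPUs are isolated and the caller's minimum
	 * is still below the housekeeping CPU count.
	 */
	if (min_vecs < num_housekeeping && num_housekeeping != num_online)
		return max_vecs < num_housekeeping ? max_vecs : num_housekeeping;

	/* Otherwise leave the caller's max_vecs untouched. */
	return max_vecs;
}
```

Under this model, with 2 housekeeping CPUs out of 8 online, a request for 1 to
8 vectors is limited to 2, while with no isolated CPUs (8 of 8) the request
stays at 8.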

Comments

Marcelo Tosatti Sept. 10, 2020, 7:22 p.m. UTC | #1
On Wed, Sep 09, 2020 at 11:08:18AM -0400, Nitesh Narayan Lal wrote:
> This patch limits the pci_alloc_irq_vectors max vectors that is passed on
> by the caller based on the available housekeeping CPUs by only using the
> minimum of the two.
> 
> A minimum of the max_vecs passed and available housekeeping CPUs is
> derived to ensure that we don't create excess vectors which can be
> problematic specifically in an RT environment. This is because for an RT
> environment unwanted IRQs are moved to the housekeeping CPUs from
> isolated CPUs to keep the latency overhead to a minimum. If the number of
> housekeeping CPUs are significantly lower than that of the isolated CPUs
> we can run into failures while moving these IRQs to housekeeping due to
> per CPU vector limit.
> 
> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
> ---
>  include/linux/pci.h | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 835530605c0d..750ba927d963 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -38,6 +38,7 @@
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
>  #include <linux/resource_ext.h>
> +#include <linux/sched/isolation.h>
>  #include <uapi/linux/pci.h>
>  
>  #include <linux/pci_ids.h>
> @@ -1797,6 +1798,21 @@ static inline int
>  pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
>  		      unsigned int max_vecs, unsigned int flags)
>  {
> +	unsigned int num_housekeeping = num_housekeeping_cpus();
> +	unsigned int num_online = num_online_cpus();
> +
> +	/*
> +	 * Try to be conservative and at max only ask for the same number of
> +	 * vectors as there are housekeeping CPUs. However, skip any
> +	 * modification to the of max vectors in two conditions:
> +	 * 1. If the min_vecs requested are higher than that of the
> +	 *    housekeeping CPUs as we don't want to prevent the initialization
> +	 *    of a device.
> +	 * 2. If there are no isolated CPUs as in this case the driver should
> +	 *    already have taken online CPUs into consideration.
> +	 */
> +	if (min_vecs < num_housekeeping && num_housekeeping != num_online)
> +		max_vecs = min_t(int, max_vecs, num_housekeeping);
>  	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs, flags,
>  					      NULL);
>  }

If min_vecs > num_housekeeping, for example:

/* PCI MSI/MSIx support */
#define XGBE_MSI_BASE_COUNT     4
#define XGBE_MSI_MIN_COUNT      (XGBE_MSI_BASE_COUNT + 1)

Then the protection fails.

How about reducing max_vecs down to min_vecs, if min_vecs >
num_housekeeping ?
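
One possible reading of this suggestion, sketched as a userspace C model
rather than as the actual follow-up patch: when min_vecs exceeds the number
of housekeeping CPUs, fall back to min_vecs instead of leaving max_vecs
unlimited. The function name clamp_max_vecs_v2 and its explicit parameters
are hypothetical, and keeping the "no isolated CPUs" exemption from v1 is an
assumption of this sketch.

```c
#include <assert.h>

/*
 * Hypothetical userspace model of the suggested fallback. Returns the
 * max_vecs value that would be handed on to the allocator.
 */
unsigned int clamp_max_vecs_v2(unsigned int min_vecs, unsigned int max_vecs,
			       unsigned int num_housekeeping,
			       unsigned int num_online)
{
	/* Assumption of this sketch: the request range is well formed. */
	assert(min_vecs <= max_vecs);

	/* No isolated CPUs: keep the caller's request as is. */
	if (num_housekeeping == num_online)
		return max_vecs;

	/*
	 * The device cannot initialize with fewer than min_vecs, so that is
	 * the lowest value max_vecs can be reduced to.
	 */
	if (min_vecs > num_housekeeping)
		return min_vecs;

	/* Otherwise limit max_vecs to the housekeeping CPU count. */
	return max_vecs < num_housekeeping ? max_vecs : num_housekeeping;
}
```

With the XGBE-style minimum of 5 and only 2 housekeeping CPUs out of 8 online,
a request for 5 to 32 vectors would be reduced to 5 under this model, instead
of passing through at 32.
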
Nitesh Narayan Lal Sept. 10, 2020, 7:31 p.m. UTC | #2
On 9/10/20 3:22 PM, Marcelo Tosatti wrote:
> On Wed, Sep 09, 2020 at 11:08:18AM -0400, Nitesh Narayan Lal wrote:
>> This patch limits the pci_alloc_irq_vectors max vectors that is passed on
>> by the caller based on the available housekeeping CPUs by only using the
>> minimum of the two.
>>
>> A minimum of the max_vecs passed and available housekeeping CPUs is
>> derived to ensure that we don't create excess vectors which can be
>> problematic specifically in an RT environment. This is because for an RT
>> environment unwanted IRQs are moved to the housekeeping CPUs from
>> isolated CPUs to keep the latency overhead to a minimum. If the number of
>> housekeeping CPUs are significantly lower than that of the isolated CPUs
>> we can run into failures while moving these IRQs to housekeeping due to
>> per CPU vector limit.
>>
>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
>> ---
>>  include/linux/pci.h | 16 ++++++++++++++++
>>  1 file changed, 16 insertions(+)
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 835530605c0d..750ba927d963 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -38,6 +38,7 @@
>>  #include <linux/interrupt.h>
>>  #include <linux/io.h>
>>  #include <linux/resource_ext.h>
>> +#include <linux/sched/isolation.h>
>>  #include <uapi/linux/pci.h>
>>  
>>  #include <linux/pci_ids.h>
>> @@ -1797,6 +1798,21 @@ static inline int
>>  pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
>>  		      unsigned int max_vecs, unsigned int flags)
>>  {
>> +	unsigned int num_housekeeping = num_housekeeping_cpus();
>> +	unsigned int num_online = num_online_cpus();
>> +
>> +	/*
>> +	 * Try to be conservative and at max only ask for the same number of
>> +	 * vectors as there are housekeeping CPUs. However, skip any
>> +	 * modification to the of max vectors in two conditions:
>> +	 * 1. If the min_vecs requested are higher than that of the
>> +	 *    housekeeping CPUs as we don't want to prevent the initialization
>> +	 *    of a device.
>> +	 * 2. If there are no isolated CPUs as in this case the driver should
>> +	 *    already have taken online CPUs into consideration.
>> +	 */
>> +	if (min_vecs < num_housekeeping && num_housekeeping != num_online)
>> +		max_vecs = min_t(int, max_vecs, num_housekeeping);
>>  	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs, flags,
>>  					      NULL);
>>  }
> If min_vecs > num_housekeeping, for example:
>
> /* PCI MSI/MSIx support */
> #define XGBE_MSI_BASE_COUNT     4
> #define XGBE_MSI_MIN_COUNT      (XGBE_MSI_BASE_COUNT + 1)
>
> Then the protection fails.

Right, I was ignoring that case.

>
> How about reducing max_vecs down to min_vecs, if min_vecs >
> num_housekeeping ?

Yes, I think this makes sense.
I will wait a bit to see if anyone else has any other comment and will post
the next version then.

>
Nitesh Narayan Lal Sept. 22, 2020, 1:54 p.m. UTC | #3
On 9/10/20 3:31 PM, Nitesh Narayan Lal wrote:
> On 9/10/20 3:22 PM, Marcelo Tosatti wrote:
>> On Wed, Sep 09, 2020 at 11:08:18AM -0400, Nitesh Narayan Lal wrote:
>>> This patch limits the pci_alloc_irq_vectors max vectors that is passed on
>>> by the caller based on the available housekeeping CPUs by only using the
>>> minimum of the two.
>>>
>>> A minimum of the max_vecs passed and available housekeeping CPUs is
>>> derived to ensure that we don't create excess vectors which can be
>>> problematic specifically in an RT environment. This is because for an RT
>>> environment unwanted IRQs are moved to the housekeeping CPUs from
>>> isolated CPUs to keep the latency overhead to a minimum. If the number of
>>> housekeeping CPUs are significantly lower than that of the isolated CPUs
>>> we can run into failures while moving these IRQs to housekeeping due to
>>> per CPU vector limit.
>>>
>>> Signed-off-by: Nitesh Narayan Lal <nitesh@redhat.com>
>>> ---
>>>  include/linux/pci.h | 16 ++++++++++++++++
>>>  1 file changed, 16 insertions(+)
>>>
>>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>>> index 835530605c0d..750ba927d963 100644
>>> --- a/include/linux/pci.h
>>> +++ b/include/linux/pci.h
>>> @@ -38,6 +38,7 @@
>>>  #include <linux/interrupt.h>
>>>  #include <linux/io.h>
>>>  #include <linux/resource_ext.h>
>>> +#include <linux/sched/isolation.h>
>>>  #include <uapi/linux/pci.h>
>>>  
>>>  #include <linux/pci_ids.h>
>>> @@ -1797,6 +1798,21 @@ static inline int
>>>  pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
>>>  		      unsigned int max_vecs, unsigned int flags)
>>>  {
>>> +	unsigned int num_housekeeping = num_housekeeping_cpus();
>>> +	unsigned int num_online = num_online_cpus();
>>> +
>>> +	/*
>>> +	 * Try to be conservative and at max only ask for the same number of
>>> +	 * vectors as there are housekeeping CPUs. However, skip any
>>> +	 * modification to the of max vectors in two conditions:
>>> +	 * 1. If the min_vecs requested are higher than that of the
>>> +	 *    housekeeping CPUs as we don't want to prevent the initialization
>>> +	 *    of a device.
>>> +	 * 2. If there are no isolated CPUs as in this case the driver should
>>> +	 *    already have taken online CPUs into consideration.
>>> +	 */
>>> +	if (min_vecs < num_housekeeping && num_housekeeping != num_online)
>>> +		max_vecs = min_t(int, max_vecs, num_housekeeping);
>>>  	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs, flags,
>>>  					      NULL);
>>>  }
>> If min_vecs > num_housekeeping, for example:
>>
>> /* PCI MSI/MSIx support */
>> #define XGBE_MSI_BASE_COUNT     4
>> #define XGBE_MSI_MIN_COUNT      (XGBE_MSI_BASE_COUNT + 1)
>>
>> Then the protection fails.
> Right, I was ignoring that case.
>
>> How about reducing max_vecs down to min_vecs, if min_vecs >
>> num_housekeeping ?
> Yes, I think this makes sense.
> I will wait a bit to see if anyone else has any other comment and will post
> the next version then.
>

Are there any other comments/concerns on this patch that I need to address in
the next posting?
Frederic Weisbecker Sept. 22, 2020, 9:08 p.m. UTC | #4
On Tue, Sep 22, 2020 at 09:54:58AM -0400, Nitesh Narayan Lal wrote:
> >> If min_vecs > num_housekeeping, for example:
> >>
> >> /* PCI MSI/MSIx support */
> >> #define XGBE_MSI_BASE_COUNT     4
> >> #define XGBE_MSI_MIN_COUNT      (XGBE_MSI_BASE_COUNT + 1)
> >>
> >> Then the protection fails.
> > Right, I was ignoring that case.
> >
> >> How about reducing max_vecs down to min_vecs, if min_vecs >
> >> num_housekeeping ?
> > Yes, I think this makes sense.
> > I will wait a bit to see if anyone else has any other comment and will post
> > the next version then.
> >
> 
> Are there any other comments/concerns on this patch that I need to address in
> the next posting?

No objection from me, I don't know much about this area anyway.

> -- 
> Nitesh
>

Patch

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 835530605c0d..750ba927d963 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -38,6 +38,7 @@ 
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/resource_ext.h>
+#include <linux/sched/isolation.h>
 #include <uapi/linux/pci.h>
 
 #include <linux/pci_ids.h>
@@ -1797,6 +1798,21 @@  static inline int
 pci_alloc_irq_vectors(struct pci_dev *dev, unsigned int min_vecs,
 		      unsigned int max_vecs, unsigned int flags)
 {
+	unsigned int num_housekeeping = num_housekeeping_cpus();
+	unsigned int num_online = num_online_cpus();
+
+	/*
+	 * Try to be conservative and at max only ask for the same number of
+	 * vectors as there are housekeeping CPUs. However, skip any
+	 * modification of max_vecs in two conditions:
+	 * 1. If the min_vecs requested are higher than that of the
+	 *    housekeeping CPUs as we don't want to prevent the initialization
+	 *    of a device.
+	 * 2. If there are no isolated CPUs as in this case the driver should
+	 *    already have taken online CPUs into consideration.
+	 */
+	if (min_vecs < num_housekeeping && num_housekeeping != num_online)
+		max_vecs = min_t(int, max_vecs, num_housekeeping);
 	return pci_alloc_irq_vectors_affinity(dev, min_vecs, max_vecs, flags,
 					      NULL);
 }