diff mbox

incremental fix for NIU MSI-X problem

Message ID 4A0CD816.7020500@jp.fujitsu.com (mailing list archive)
State Not Applicable, archived
Headers show

Commit Message

Hidetoshi Seto May 15, 2009, 2:48 a.m. UTC
Michael Ellerman wrote:
> On Fri, 2009-05-15 at 09:49 +0900, Hidetoshi Seto wrote:
>> It is definitely cleanup, not for NIU MSI-X problem, and not for .30
>>
>> One point...
>>
>> Matthew Wilcox wrote:
>>> +	nr_entries = msix_setup_entries(dev, pos, entries, nvec);
>>>  
>>>  	ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
>>> -	if (ret < 0) {
>>> -		/* If we had some success report the number of irqs
>>> -		 * we succeeded in setting up. */
>>> -		int avail = 0;
>>> -		list_for_each_entry(entry, &dev->msi_list, list) {
>>> -			if (entry->irq != 0) {
>>> -				avail++;
>>> -			}
>>> -		}
>>> -
>>> -		if (avail != 0)
>>> -			ret = avail;
>>> -	}
>>> +	if (ret < 0 && nr_entries > 0)
>>> +		ret = nr_entries;
>>>  
>>>  	if (ret) {
>>>  		msi_free_irqs(dev);
>> I think this changes the logic badly.
>> nr_entries is number of allocated entry, while avail is number of irq
>> successfully allocated.  I suppose usually nr_entries > avail.
> 
> Yeah that bit's broken. If the arch routine fails (< 0) then we'll
> return nr_entries to the driver, which will then try again with nvec =
> nr_entries.
> 
> And you're passing nvec to the arch routine even though
> msix_setup_entries() might not have allocated that many entries - which
> means the value of nvec and the contents of the entry list don't match.

Hum, it seems it's an another bug.  Then we need fix like this?


One concern is if we don't have enough memory to have number of entries
requested at first time, the driver will get -ENOMEM and will not do retry
even if we can have less number of entries.

i.e. if the driver can handle its device with 32 or 16 or 8 vectors, and
if there are only memory for 24 entries, and if there are only 30 vectors
available, the best return value of pci_enable_msix() will be 24.
It will let the driver to do retry with 16.

In current code, the return value of pci_enable_msix() in the above case
might be 0 even it allocates 24 vectors while requested is 32.
Yes, it is broken.

The above patch hunk will fix the broken behavior, but it cannot handle
low-memory condition very well.  However we could have another patch for
that as improve later.


Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Michael Ellerman May 15, 2009, 3:22 a.m. UTC | #1
On Fri, 2009-05-15 at 11:48 +0900, Hidetoshi Seto wrote:
> Michael Ellerman wrote:
> > On Fri, 2009-05-15 at 09:49 +0900, Hidetoshi Seto wrote:
> >> It is definitely cleanup, not for NIU MSI-X problem, and not for .30
> >>
> >> One point...
> >>
> >> Matthew Wilcox wrote:
> >>> +	nr_entries = msix_setup_entries(dev, pos, entries, nvec);
> >>>  
> >>>  	ret = arch_setup_msi_irqs(dev, nvec, PCI_CAP_ID_MSIX);
> >>> -	if (ret < 0) {
> >>> -		/* If we had some success report the number of irqs
> >>> -		 * we succeeded in setting up. */
> >>> -		int avail = 0;
> >>> -		list_for_each_entry(entry, &dev->msi_list, list) {
> >>> -			if (entry->irq != 0) {
> >>> -				avail++;
> >>> -			}
> >>> -		}
> >>> -
> >>> -		if (avail != 0)
> >>> -			ret = avail;
> >>> -	}
> >>> +	if (ret < 0 && nr_entries > 0)
> >>> +		ret = nr_entries;
> >>>  
> >>>  	if (ret) {
> >>>  		msi_free_irqs(dev);
> >> I think this changes the logic badly.
> >> nr_entries is number of allocated entry, while avail is number of irq
> >> successfully allocated.  I suppose usually nr_entries > avail.
> > 
> > Yeah that bit's broken. If the arch routine fails (< 0) then we'll
> > return nr_entries to the driver, which will then try again with nvec =
> > nr_entries.
> > 
> > And you're passing nvec to the arch routine even though
> > msix_setup_entries() might not have allocated that many entries - which
> > means the value of nvec and the contents of the entry list don't match.
> 
> Hum, it seems it's an another bug.  Then we need fix like this?

Ah geez you're right, never pays too look at the code too much :)

> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -447,8 +447,10 @@ static int msix_capability_init(struct pci_dev *dev,
>  
>  	for (i = 0; i < nvec; i++) {
>  		entry = alloc_msi_entry(dev);
> -		if (!entry)
> -			break;
> +		if (!entry) {
> +			msi_free_irqs(dev);
> +			return -ENOMEM;

Should be:
			 return i;

> +		}
>  
>  		j = entries[i].entry;
>  		entry->msi_attrib.is_msix = 1;
> 
> One concern is if we don't have enough memory to have number of entries
> requested at first time, the driver will get -ENOMEM and will not do retry
> even if we can have less number of entries.

That should be fixed by returning i, ie. the number of entries we had
memory to allocate.

cheers
Hidetoshi Seto May 15, 2009, 4:44 a.m. UTC | #2
Michael Ellerman wrote:
>>  		entry = alloc_msi_entry(dev);
>> -		if (!entry)
>> -			break;
>> +		if (!entry) {
>> +			msi_free_irqs(dev);
>> +			return -ENOMEM;
> 
> Should be:
> 			 return i;

Genius!

>> +		}
>>  
>>  		j = entries[i].entry;
>>  		entry->msi_attrib.is_msix = 1;
>>
>> One concern is if we don't have enough memory to have number of entries
>> requested at first time, the driver will get -ENOMEM and will not do retry
>> even if we can have less number of entries.
> 
> That should be fixed by returning i, ie. the number of entries we had
> memory to allocate.

Yes, if number of allocatable vector is less than that of allocatable entry,
pci_enable_msix() will be return again with >0 but it will never be problem,
because "allocatable number" changes from time to time.  As you once said,
"you might not be able to get the number of interrupts that pci_enable_msix
reported."

I'll post this fix in a patch format soon.
I'd appreciate it if you can provide Acked-by for it.


Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -447,8 +447,10 @@  static int msix_capability_init(struct pci_dev *dev,
 
 	for (i = 0; i < nvec; i++) {
 		entry = alloc_msi_entry(dev);
-		if (!entry)
-			break;
+		if (!entry) {
+			msi_free_irqs(dev);
+			return -ENOMEM;
+		}
 
 		j = entries[i].entry;
 		entry->msi_attrib.is_msix = 1;